Firefox PHP

Would like to make a suggestion on "paging" ...

Posted by Rob Granger 
Would like to make a suggestion on "paging" ...
August 28, 2009 09:51PM
I came up with a sliding window based on time instead of pages, when I was looking at building a system.
the main advantage is that it is something that can be optimized and direct reads can be used rather than this odd "limit" thing to break up pages.
the user could select the multiple of whatever "time block" is the default. the down size is that pages may not always be the same size, the up side is that getting to a time or page would be almost instant.
I just loaded up about 1.2 million records and to go from page 9359 to 9358 took, well over 2 minutes and counting. Granted this server isn't fast, but using limit is probably the most ineffectent way to do things.. 2:35 to change a at the back of the list. I think instead of pages, you could have "days" and i bet most would trade super fast for not always having pages the same size?

Thanks for the help getting my data loaded!
Rob in Houston.
Re: Would like to make a suggestion on "paging" ...
August 29, 2009 05:32AM
This slow paging towards the end of the list is a known issue and intrinsic to the type of data that we are handling here. Because sorting has to be done by modifystamp, filtering has to be done by forum id and extra handling is required for something like message status (visible or not), the queries depend on indexing. The index is build in such way that for creating message lists, the first pages will be at the front of the index. This arranges for the most recent pages in the paging being fast. However, for building the latest pages, the index has to be walked over completely, resulting in slow loading times.

More than 2 minutes is quite slow still. Just to be sure: did you enable the indexes after the import? You probably did, but it won't hurt to check.

Can you show us some queries to display what you mean with "direct read"? Can you also include fetching only visible messages for a certain forum id?

With the time frame based paging, I do see some impracticalities.

  • The number of messages per page might vary *a lot*. You are skipping over that fact telling that people would trade variable page lengths for speed, but I doubt that users would be happy if they would see two messages on the first page and a few hundred on the second one.
  • If there are many messages in a single timeframe, then using the forum could easily become unusable. If you have a page with a few hundred threads on it, then there is no fun in using it. Just too much scrolling and searching when you get at the list page. Myself, I really prefer having a static amount of messages per page with at most 20 or 30 messages, so my head can keep track too.
  • A possible work around for this would be to apply old-style-paging-in-time-frame-based-paging. Of course this is quite a theoretical solution, because this does not really reflect user-friendliness.

As a general thought: you as an admin are checking if you can easily go to the back of the list, but are you sure that you users will be doing that on a large scale? When we were fighting this issue in the past, I have been monitoring my own forums to see how my users were using the paging. It turned out that users that used the list page almost all were using only the very first list page. Some users were going back up to 10 pages, probably to read up on new messages. Older messages that were read by visitors were not found through paging, but through Google and the forum search instead. So for my forums, the fact that paging isn't fast for the oldest messages available is not an issue, since it's not a use case for my population.


Maurice Makaay
Phorum Development Team
my blog linkedin profile secret sauce
Re: Would like to make a suggestion on "paging" ...
August 29, 2009 08:43AM
No good answers. I was just playing around at one time wit time based indexed and the between (start, finish) was always near instant. your modifystamp would work as they can be used in the between statement, but you hit it, the page lengths could vary wildly.

this is still better than many of the other systems, so I am not going to worry. My board is more of a conversation about cars, folks do find a start and then page, but going back 10 years, I guess they can wait a little :)

indexed read: if you are using limit, mysql does a serial scan (though at least part of the result set)to get the data, if you use your time stamp (indexed) and between (a, b) then mysql is optimized to use the keys to pull the subset via keyed read and it is instant.

select * from phorum_messages where datestamp between 1250894726 and 1251511375;
1.1 seconds to select 971 records from ~120,000,000 - now that's smoking! :) but again, most folks do not have that much history. www.ppbb.com, oldest porsche boxster forum on the net, that is my story and I am sticking to it :)
Thanks for a nice bit of software, we re moving to it either way!
Re: Would like to make a suggestion on "paging" ...
August 29, 2009 11:11AM
We used to do this exact thing. And while it was faster, it does not allow paging. It only allows a next/previous model of moving around. The most common use for paging is jumping back 3 or 4 pages to find the last place you left off when you last visited the forums.

FWIW, I have 2,690,270 messages in my Phorum at forums.dealmac.com and going from page 10133 to 10134 takes 3 seconds.

Brian - Cowboy Ninja Coder - Personal Blog - Twitter
Wow, do you have any special indexing or uber processor?
August 29, 2009 11:38AM
going back one page on my new server takes about 10 seconds and going back one page took about 23 seconds. what are you doing? more memory, more cache, more cpu, voodoo :)
Thanks for the imput, that would be toally acceptable!!
Rob
Re: Wow, do you have any special indexing or uber processor?
September 03, 2009 10:40AM
Well, the forums there currently just run on our main web site stack. The primary database there is a Dual CPU / Quad core (8 cores) Intel(R) Xeon(TM) CPU 2.80GHz 2MB cache, 12GB RAM, 146GB SCSI 15k RAID 1. MySQL is configured with an INNODB buffer pool of 5MB (I need to increase that actually).

FWIW, most of that power is used by dealnews.com's main web site. The forum is not our dominant application. The forums there previously had a dedicated database server. It is in fact the box that phorum.org runs on now. The queries worked fine on it too. It is a dual cpu Intel(R) Xeon(TM) CPU 2.80GHz 512KB cache with 2GB RAM. It's INNODB buffer pool was likely somewhere around 1.5GB when it was in use. It has 72GB 10k SCSI drives in RAID1.

First, are you using INNODB or MyISAM. The MySQL default is MyISAM, but we recommend using INNODB for Phorum. But, you have to have it configured right. If you don't it is a waste. If you are trying to run a 2 million message Phorum from a shared host or some small virtual host, it will not be succesful. The software can only do so much with that much data. The hardware has to be there to support the load.

Brian - Cowboy Ninja Coder - Personal Blog - Twitter
Re: Wow, do you have any special indexing or uber processor?
September 03, 2009 11:05AM
Thanks Brian, i am pretty sure that I am MyISAM. I brought over an old install of mysql and databases and didn't convert anything. The BBS that i moved to phorum was an import, so I can always drop and re-create it, I think. I really apprecate your time. I had not consider using one format over the other. I will need to do some digging to see if I can flip things over.
I am running in VM and had actually thought about setting up another VM for the database so that it would have more memory. I also thought about installing a 64bit OS and trying to port things over.... not sure that I want to take that on right now though :)

Cheers,
Rob
Sorry, only registered users may post in this forum.

Click here to login