Firefox PHP

Phorum MYSQL query optimization

Posted by Skye N. 
All files from this thread

File Name File Size   Posted by Date  
Screen shot 2010-02-07 at 11.19.07 PM.png 19.5 KB open | download Skye N. 02/08/2010 Read message
Phorum MYSQL query optimization
February 08, 2010 02:20AM
Hello Phorumers,

I just started using the (free version) Jet Profiler for MySQL it's pretty great
[www.jetprofiler.com]

Anyway, it turns out the slowest query on my system is (example):

select thread, message_id, subject, author, datestamp
    from p5_messages
    where parent_id = 0
    order by thread desc LIMIT 127200, 50

The full EXPLAIN is attached.

This is, I can guess, list.php and with "floating threads" turned on as most of my users choose.

I tried adding an index on messages.parent_id but it didn't help it much, I suppose because of the ORDER BY?

I'm not a MySQL tuning expert, can anyone recommend a way to optimize this via indexes?

Thanks,
Skye

___
Skye Nott
Corvus Digital



Edited 1 time(s). Last edit at 02/08/2010 02:27AM by Skye.


Re: Phorum MYSQL query optimization
February 08, 2010 02:25AM
No, its because of the limit.
Its using an index but because someone went to 2500-something it has to go far down in the index to get the corresponding threads in the list.
I doubt that this case is hit often.

Oh and what does "slow" mean to you? How long does it take?


Thomas Seifert



Edited 1 time(s). Last edit at 02/08/2010 02:26AM by Thomas Seifert.
Re: Phorum MYSQL query optimization
February 08, 2010 02:28AM
It's not that running the query once is slow, it's the fact that the query is run many, many times that makes it the top cumulative CPU-consumer and therefore the #1 target for optimization.

___
Skye Nott
Corvus Digital
Re: Phorum MYSQL query optimization
February 08, 2010 02:31AM
Note that that is just the longest-running example of this "class" of query I should have wrote it as:

select thread, message_id, subject, author, datestamp
    from p5_messages
    where parent_id = ?
    order by thread desc LIMIT ?, ?

My hunch is that the vast majority of instances of this query will be for the "front page" of list.php

___
Skye Nott
Corvus Digital



Edited 2 time(s). Last edit at 02/08/2010 02:37AM by Skye.
Re: Phorum MYSQL query optimization
February 08, 2010 02:45AM
Well, this got the scan down from 1.3 million rows to just over 100 thousand...

mysql> create index parent_id_thread on p5_messages (parent_id, thread);
Query OK, 1339181 rows affected (4 min 29.98 sec)
Records: 1339181  Duplicates: 0  Warnings: 0

mysql> 
mysql> 
mysql> 
mysql> explain select thread, message_id, subject, author, datestamp
    ->     from p5_messages
    ->     where parent_id = 0
    ->     order by thread desc LIMIT 127200, 50
    -> 
    -> ;
+----+-------------+-------------+------+------------------+------------------+---------+-------+--------+-------------+
| id | select_type | table       | type | possible_keys    | key              | key_len | ref   | rows   | Extra       |
+----+-------------+-------------+------+------------------+------------------+---------+-------+--------+-------------+
|  1 | SIMPLE      | p5_messages | ref  | parent_id_thread | parent_id_thread | 4       | const | 117904 | Using where | 
+----+-------------+-------------+------+------------------+------------------+---------+-------+--------+-------------+
1 row in set (0.00 sec)

___
Skye Nott
Corvus Digital
Re: Phorum MYSQL query optimization
February 08, 2010 02:49AM
I'm missing the forum_id in your query - which is the case where there is already a key. Are you sure that is a native phorum query?


Thomas Seifert
Re: Phorum MYSQL query optimization
February 08, 2010 02:50AM
This query and the indexes that it uses *are* thought out and optimized, even though some automated program tells you it is not.

The fact that it's considered slow, is because of the limit 127200, 50 in there. If you run this query for more regular cases (my own users barely get to the second page of the index, let alone the 2544th), you will see that it executes fast. Even though it uses an index scan, for smaller limit cases, MySQL will quickly find enough messages to fill up the 50 requested messages. After is has found 50 messages, it will stop scanning. Therefore, on the first page, it executes like lightning. However, if the deep pages are requested, the index scan will take longer to first reach the requested offset.

Crawler bots will generally go down into history. On my forum, the "users" that go to these slower pages normally are named something like "Slurp" or "Googlebot". If most of your queries are queries like the one with the extreme limit values on it, then you might be serving more crawler bots than real users.

If you find a way to further optimize the query, then please tell us, because we have thought a lot about it and did not come up with anything smarter than the current code. As you found out already, it's not just a matter of adding a smart index.


Maurice Makaay
Phorum Development Team
my blog linkedin profile secret sauce
Re: Phorum MYSQL query optimization
February 08, 2010 02:56AM
Thomas is correct. The forum_id is indeed missing. And in the database layer, we use "ORDER BY" and "DESC" (upper case). I too doubt that we are looking at a core Phorum query here.


Maurice Makaay
Phorum Development Team
my blog linkedin profile secret sauce
Re: Phorum MYSQL query optimization
February 08, 2010 03:04AM
Thanks for your fast response, I'll look into it further tomorrow when load is back up to normal on my forum.

___
Skye Nott
Corvus Digital
Re: Phorum MYSQL query optimization
February 08, 2010 03:22PM
You're absolutely right, it was a web crawler going through very old posts on the site, via an old module that I put together to make crawling a little more lightweight.

Now that my load is back to "normal" usage, all the queries are very nicely optimized already.

Thanks for the great work.

___
Skye Nott
Corvus Digital
Sorry, only registered users may post in this forum.

Click here to login