Phorum MYSQL query optimization
Posted by Skye N.
All files from this thread
File Name | File Size | Posted by | Date
---|---|---|---
Screen shot 2010-02-07 at 11.19.07 PM.png | 19.5 KB | Skye N. | 02/08/2010
February 08, 2010 02:20AM |
Registered: 19 years ago Posts: 123 |
Hello Phorumers,
I just started using the (free version of) Jet Profiler for MySQL; it's pretty great:
[www.jetprofiler.com]
Anyway, it turns out the slowest query on my system is (example):
select thread, message_id, subject, author, datestamp from p5_messages where parent_id = 0 order by thread desc LIMIT 127200, 50
The full EXPLAIN is attached.
This is, I can guess, list.php with "floating threads" turned on, as most of my users choose.
I tried adding an index on messages.parent_id, but it didn't help much; I suppose because of the ORDER BY?
I'm not a MySQL tuning expert. Can anyone recommend a way to optimize this via indexes?
Thanks,
Skye
___
Skye Nott
Corvus Digital
Edited 1 time(s). Last edit at 02/08/2010 02:27AM by Skye.
Re: Phorum MYSQL query optimization February 08, 2010 02:25AM |
Admin Registered: 20 years ago Posts: 9,240 |
No, it's because of the LIMIT.
It's using an index, but because someone went to page 2500-something, it has to go far down the index to get the corresponding threads for the list.
I doubt that this case is hit often.
Oh and what does "slow" mean to you? How long does it take?
Thomas Seifert
Edited 1 time(s). Last edit at 02/08/2010 02:26AM by Thomas Seifert.
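(Editor's note.) Thomas's point about the LIMIT can be sketched outside MySQL. A minimal illustration, with the assumption that SQLite in Python stands in for MySQL and that p5_messages is reduced to just the columns this query touches:

```python
import sqlite3

# Assumption: SQLite as a stand-in for MySQL; the point here is OFFSET
# semantics, not timings. Threads are inserted as 0..9999, so the newest
# thread number is 9999.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE p5_messages "
    "(message_id INTEGER PRIMARY KEY, parent_id INTEGER, thread INTEGER)"
)
conn.execute("CREATE INDEX parent_id_idx ON p5_messages (parent_id)")
conn.executemany(
    "INSERT INTO p5_messages (parent_id, thread) VALUES (0, ?)",
    [(t,) for t in range(10_000)],
)

def page(offset, limit=50):
    # LIMIT/OFFSET returns rows *after* skipping `offset` of them in sorted
    # order, so the engine must walk past that many index entries first.
    return conn.execute(
        "SELECT thread FROM p5_messages WHERE parent_id = 0 "
        "ORDER BY thread DESC LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()

front_page = page(0)     # engine can stop after reading 50 index entries
deep_page = page(9_000)  # engine walks past 9,000 entries before returning 50
```

The first page stops as soon as 50 rows are found; the deep page pays for every skipped entry, which is exactly the LIMIT 127200, 50 cost in the original query.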
February 08, 2010 02:28AM |
Registered: 19 years ago Posts: 123 |
It's not that running the query once is slow, it's the fact that the query is run many, many times that makes it the top cumulative CPU-consumer and therefore the #1 target for optimization.
___
Skye Nott
Corvus Digital
February 08, 2010 02:31AM |
Registered: 19 years ago Posts: 123 |
Note that that is just the longest-running example of this "class" of query. I should have written it as:
select thread, message_id, subject, author, datestamp from p5_messages where parent_id = ? order by thread desc LIMIT ?, ?
My hunch is that the vast majority of instances of this query will be for the "front page" of list.php.
___
Skye Nott
Corvus Digital
Edited 2 time(s). Last edit at 02/08/2010 02:37AM by Skye.
February 08, 2010 02:45AM |
Registered: 19 years ago Posts: 123 |
Well, this got the scan down from 1.3 million rows to just over 100 thousand...

mysql> create index parent_id_thread on p5_messages (parent_id, thread);
Query OK, 1339181 rows affected (4 min 29.98 sec)
Records: 1339181  Duplicates: 0  Warnings: 0

mysql> explain select thread, message_id, subject, author, datestamp
    -> from p5_messages
    -> where parent_id = 0
    -> order by thread desc LIMIT 127200, 50;
+----+-------------+-------------+------+------------------+------------------+---------+-------+--------+-------------+
| id | select_type | table       | type | possible_keys    | key              | key_len | ref   | rows   | Extra       |
+----+-------------+-------------+------+------------------+------------------+---------+-------+--------+-------------+
|  1 | SIMPLE      | p5_messages | ref  | parent_id_thread | parent_id_thread | 4       | const | 117904 | Using where |
+----+-------------+-------------+------+------------------+------------------+---------+-------+--------+-------------+
1 row in set (0.00 sec)

___
Skye Nott
Corvus Digital
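(Editor's note.) The reason a composite (parent_id, thread) index helps can be seen in miniature with any B-tree engine: rows matching parent_id = 0 are already stored in thread order inside the index, so the ORDER BY needs no separate sort pass. A sketch, assuming SQLite in Python as a stand-in (SQLite's EXPLAIN QUERY PLAN wording differs from MySQL's EXPLAIN):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE p5_messages "
    "(message_id INTEGER PRIMARY KEY, parent_id INTEGER, thread INTEGER)"
)
# Composite index: the equality column first, then the ORDER BY column.
conn.execute("CREATE INDEX parent_id_thread ON p5_messages (parent_id, thread)")

plan = " ".join(
    row[-1]  # last column of each EXPLAIN QUERY PLAN row is the detail text
    for row in conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT thread FROM p5_messages WHERE parent_id = 0 "
        "ORDER BY thread DESC LIMIT 50 OFFSET 127200"
    )
)
# The index satisfies both the WHERE and the ORDER BY, so no sort step
# ("USE TEMP B-TREE FOR ORDER BY") appears in the plan.
print(plan)
```

The same reasoning explains the EXPLAIN above: the filesort disappears, but the rows count stays around 117,904 because the offset still has to be walked.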
Re: Phorum MYSQL query optimization February 08, 2010 02:49AM |
Admin Registered: 20 years ago Posts: 9,240 |
February 08, 2010 02:50AM |
Admin Registered: 19 years ago Posts: 8,532 |
This query and the indexes that it uses *are* thought out and optimized, even though some automated program tells you it is not.
The fact that it's considered slow is because of the LIMIT 127200, 50 in there. If you run this query for more regular cases (my own users barely get to the second page of the index, let alone the 2544th), you will see that it executes fast. Even though it uses an index scan, for smaller LIMIT cases MySQL will quickly find enough messages to fill the 50 requested. After it has found 50 messages, it stops scanning; therefore, on the first page, it executes like lightning. However, when deep pages are requested, the index scan takes longer to first reach the requested offset.
Crawler bots will generally go down into history. On my forum, the "users" that go to these slower pages normally are named something like "Slurp" or "Googlebot". If most of your queries are queries like the one with the extreme limit values on it, then you might be serving more crawler bots than real users.
If you find a way to further optimize the query, then please tell us, because we have thought a lot about it and did not come up with anything smarter than the current code. As you found out already, it's not just a matter of adding a smart index.
Maurice Makaay
Phorum Development Team
my blog
linkedin profile
secret sauce
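(Editor's note.) One approach that often comes up for the deep-page case is keyset (a.k.a. seek) pagination: instead of LIMIT offset, 50, remember the last thread value on the current page and seek past it, so every page is a cheap index range scan. This is not Phorum's code, just a sketch of the idea, again with SQLite standing in for MySQL; the trade-off, which may be why it does not fit list.php, is that it only supports next/previous navigation, not jumping straight to page 2544:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE p5_messages "
    "(message_id INTEGER PRIMARY KEY, parent_id INTEGER, thread INTEGER)"
)
conn.execute("CREATE INDEX parent_id_thread ON p5_messages (parent_id, thread)")
conn.executemany(
    "INSERT INTO p5_messages (parent_id, thread) VALUES (0, ?)",
    [(t,) for t in range(10_000)],
)

def page_after(last_thread=None, limit=50):
    # Seek directly past the previous page's last thread; the
    # (parent_id, thread) index turns this into a range scan of `limit`
    # rows, no matter how deep into the history we are.
    if last_thread is None:
        sql = ("SELECT thread FROM p5_messages WHERE parent_id = 0 "
               "ORDER BY thread DESC LIMIT ?")
        args = (limit,)
    else:
        sql = ("SELECT thread FROM p5_messages WHERE parent_id = 0 "
               "AND thread < ? ORDER BY thread DESC LIMIT ?")
        args = (last_thread, limit)
    return [r[0] for r in conn.execute(sql, args)]

first = page_after()            # threads 9999..9950
second = page_after(first[-1])  # threads 9949..9900
```

Each page costs the same regardless of depth, so crawlers paging through history no longer dominate the cumulative query time.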
February 08, 2010 02:56AM |
Admin Registered: 19 years ago Posts: 8,532 |
Thomas is correct. The forum_id is indeed missing. And in the database layer, we use "ORDER BY" and "DESC" (upper case). I too doubt that we are looking at a core Phorum query here.
Maurice Makaay
Phorum Development Team
my blog
linkedin profile
secret sauce
February 08, 2010 03:04AM |
Registered: 19 years ago Posts: 123 |
Thanks for your fast response, I'll look into it further tomorrow when load is back up to normal on my forum.
___
Skye Nott
Corvus Digital
February 08, 2010 03:22PM |
Registered: 19 years ago Posts: 123 |
You're absolutely right, it was a web crawler going through very old posts on the site, via an old module that I put together to make crawling a little more lightweight.
Now that my load is back to "normal" usage, all the queries are very nicely optimized already.
Thanks for the great work.
___
Skye Nott
Corvus Digital