Firefox PHP

Alternate search backend with sphinx fulltext search

Posted by Thomas Seifert 
Re: Alternate search backend with sphinx fulltext search
April 24, 2007 07:07PM
added a new release 0.9.2 (the mysqlconf-release ;)) which has some fixes which I had lurking around in my own install.

btw. maurice promised to add a settings page and sphinx-configuration generator right now ;).


Thomas Seifert
Re: Alternate search backend with sphinx fulltext search
May 23, 2007 03:22PM
Just bump up the version info in info.txt :)
Re: Alternate search backend with sphinx fulltext search
May 23, 2007 05:55PM
Well, there really are some extra's in the current development version. But as we were working on it, Brian overthrew the existing search and Thomas had to redo some things to make all the new search options possible with Sphinx too. Doing that, he ran into some cool extended query feature and extra knowledge on how to use a combined index.

What I did manage to build for the previous version was a command line configuration tool (creates the required sphinx config *and* automatically updates the Phorum module config for Sphinx) and a module settings screen, so no variables have to be hacked in the module php file. These will have to be updated for the new ideas and code.

Thus: much more going on than just a version bump ;)


Maurice Makaay
Phorum Development Team
my blog linkedin profile secret sauce
Re: Alternate search backend with sphinx fulltext search
June 21, 2007 08:49AM
Hi all,

I've been playing with this module for the last day or so on a site with over 2.5million posts and it has been a big help. Previously searches often caused the phorums to become almost unusable while they were running.

I have tweaked the Sphinx config a little to help with a large database, the sql_query_range option helps stop the message table from locking while the indexer is running and adding a USE INDEX element to the counter query makes that a lot faster...

source phorum5_msg : phorum5_base
{
        sql_query_pre           = REPLACE INTO sph_counter SELECT 1, 'message', MAX(message_id) FROM phorum_messages USE INDEX (forum_max_message) WHERE status=2
        sql_query_range         = SELECT MIN(message_id), MAX(message_id) FROM phorum_messages
        sql_range_step          = 1000
        sql_query               = \
                SELECT message_id, forum_id, datestamp, author, subject, body \
                FROM phorum_messages \
                WHERE message_id>=$start AND message_id<=$end AND status=2

}

source phorum5_author : phorum5_base
{
        sql_query_pre           = REPLACE INTO sph_counter SELECT 1, 'author', MAX(message_id) FROM phorum_messages USE INDEX (forum_max_message) WHERE status=2
        sql_query_range         = SELECT MIN(message_id), MAX(message_id) FROM phorum_messages
        sql_range_step          = 1000
        sql_query               = \
                SELECT message_id, forum_id, datestamp, author \
                FROM phorum_messages \
                WHERE message_id>=$start AND message_id<=$end AND status=2
}

The mod installed very easily, all I had to tweak was the IP address of the database/sphinx server.

I'll keep you updated with our experiences over the next few weeks!

Jon
Re: Alternate search backend with sphinx fulltext search
June 21, 2007 03:27PM
Hello joonster,

My forum running with sphinx from months and it improve a lot search queries.

I have about 1,2 million posts and your tweaks are very intersting.
Improvements are only for the first indexing process or also for delta indexing ?

PS : Sorry for my english

Momo
Re: Alternate search backend with sphinx fulltext search
June 21, 2007 03:33PM
delta indexing is only indexing a couple of new posts each time. should rarely hit the 1000 messages in a hour :).


Thomas Seifert
Re: Alternate search backend with sphinx fulltext search
June 21, 2007 03:49PM
Hello Thomas,

It's what i was thinking. ;)
I never had problem with delta indexing. :)

But for initial indexing tables are locked. Joonster's tweaks will improve that.

Joonster i have a question about $start and $end variables in your query. Where are they coming from ?
From this query : "sql_query_range = SELECT MIN(message_id), MAX(message_id) FROM phorum_messages" ?

Thanks
Re: Alternate search backend with sphinx fulltext search
June 21, 2007 07:03PM
Momo,

I've left the delta indexes as in the original config, I run the delta updates every half hour and the main update every night. The last delta update indexes around 10000 posts but only takes a second or two.

The $start and $end variables come from the sql_query_range query. Instead of returning all 2.5million posts in one query, Sphinx sets the start and end variables to return 1000 posts at a time, which allows the phorum_messages table to perform normally while the indexing is happening.

See the Sphinx docs on range queries - [www.sphinxsearch.com]

For me the sph_counter update query was taking a long time - over 5 minutes, once I added the USE INDEX it took around 2 seconds which also avoided the table locking.

Jon
Re: Alternate search backend with sphinx fulltext search
June 21, 2007 07:13PM
Thanks a lot Jon !
I'm going to try your tweaks this night. ;)

I will see the improvement tomorrow nfor daily full index. :)
Re: Alternate search backend with sphinx fulltext search
June 21, 2007 07:29PM
I should say that the full index does take a bit longer overall with the range query but I'd rather the forum remained usable rather than having problems with the table locking and that in turn causing a backlog of http connections etc etc...

Jon
Sorry, only registered users may post in this forum.

Click here to login