Spam Hurdles Module (CAPTCHA's and other anti-spam tools)

Posted by Maurice Makaay 
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
January 06, 2007 07:57AM
Yes. If you look at the spamhurdles.php script, then you'll find a percentage definition at the start of it. This percentage defines the chance for the automatic garbage collection to run. By default, I put it at 1%, so about once in every 100 requests, the expired entries are deleted. At least, that's what it is supposed to do ;-)

Deleting of old entries is done by a function from the db.php script (spamhurdles_db_remove_expired). This function will throw away entries that are expired.

There are two things that you can tweak in the garbage collection:

1) Increase the SPAMHURDLES_GARBAGE_COLLECTION_RATE define in spamhurdles.php to run garbage collection more often. Personally, I do not think that will do much, since on a busy forum 100 requests won't take long.

2) Decrease the TTL (time to live) for the spamhurdles data. This TTL defines how long it takes before the spamhurdles data expires. If you make this too small (let's say something really awful like 2 minutes), then the data may expire before users finish writing their post. They won't be able to post in that case. So be sure to leave it high enough. By default, this is 3600*8, so 8 hours. If you want to change this, then you can edit defaults.php, look for key_max_ttl and change the 3600*8 that's behind it.

If you have no fundamental problems with having 3000 entries in that table (it's not relevant for performance or anything), then you can leave everything as it is now. If the garbage collection is working as intended, then the number of entries shouldn't grow a lot more (since you looked over a period of 8 hours, which is the same as the TTL).
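
To make the mechanism concrete, here is a minimal sketch of percentage-based garbage collection with a TTL. It is not the module's actual code: all function and constant names are made up for illustration, and the entries live in a plain array instead of the database. Only the 1% rate and the 3600*8 TTL come from the description above.

```php
<?php
// Sketch only: every name below is hypothetical, not the module's real code.
// The 1% rate and the 8 hour TTL are the defaults mentioned in the post.
const EXAMPLE_GC_RATE_PERCENT = 1;
const EXAMPLE_TTL_SECONDS = 3600 * 8;

// Decide whether this request should trigger garbage collection.
// $roll is a number from 1 to 100; pass mt_rand(1, 100) in real use.
function example_should_collect(int $roll, int $ratePercent = EXAMPLE_GC_RATE_PERCENT): bool
{
    return $roll <= $ratePercent;
}

// Keep only the entries that are younger than the TTL.
// $entries maps an entry key to its creation timestamp.
function example_remove_expired(array $entries, int $now, int $ttl = EXAMPLE_TTL_SECONDS): array
{
    $kept = [];
    foreach ($entries as $key => $created) {
        if ($created + $ttl > $now) {
            $kept[$key] = $created;
        }
    }
    return $kept;
}

// Usage: on roughly 1 in 100 requests the stale entry is thrown away.
$now = time();
$entries = ['fresh' => $now - 60, 'stale' => $now - 3600 * 9];
if (example_should_collect(mt_rand(1, 100))) {
    $entries = example_remove_expired($entries, $now);
}
```

Raising the rate makes the cleanup run more often, while lowering the TTL changes which entries count as expired when it does run.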


Maurice Makaay
Phorum Development Team
my blog linkedin profile secret sauce
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
January 06, 2007 08:34AM
Hi Maurice,

Great, it does work! This morning the table is down to 996 entries. I understand now that each entry will stay for a minimum of 8 hours.

Thanks for this great improvement,

Yves
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
January 08, 2007 06:29AM
Maybe add protection to the Spam Hurdles Module against registering users whose login consists of characters from more than one language?
(Some "bad" users choose a mixed-language login which looks like the login of a "good" user: just replace the Latin "a" with the Russian "a", for example)
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
January 08, 2007 06:44AM
Won't some "good" users do the same? And do you have any full list of possible replacement characters available? I'm not yet sure how to program this cleanly for all systems, because of the encoding differences between systems.
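
As a rough illustration of what such a check could look like, here is a sketch that only flags logins mixing Latin and Cyrillic letters. It rests on assumptions that do not hold on every system: the login is UTF-8 encoded and PHP's PCRE is built with Unicode property support. The function name is made up, and as noted above a check like this would also flag "good" users who mix scripts on purpose.

```php
<?php
// Sketch only: assumes a UTF-8 encoded login and PCRE with Unicode
// property support. The function name is hypothetical.
function example_login_mixes_latin_and_cyrillic(string $login): bool
{
    // preg_match() returns 1 on a match, 0 on no match and false on error
    // (for example invalid UTF-8), so anything but 1 counts as "not found".
    $hasLatin    = preg_match('/\p{Latin}/u', $login) === 1;
    $hasCyrillic = preg_match('/\p{Cyrillic}/u', $login) === 1;

    return $hasLatin && $hasCyrillic;
}

// "\u{0430}" is the Cyrillic small letter a, which looks like the Latin "a".
var_dump(example_login_mixes_latin_and_cyrillic("m\u{0430}urice")); // mixed
var_dump(example_login_mixes_latin_and_cyrillic("maurice"));        // Latin only
```

This covers just one pair of scripts. A full list of look-alike characters, and systems that do not use UTF-8, are outside the scope of the sketch.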


Maurice Makaay
Phorum Development Team
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
January 10, 2007 10:55AM
I am now testing this on my busiest site (the amount of CAPTCHA files created by previous versions was causing me all sorts of problems).

Sincere thanks to Maurice for producing a new version so soon into the new year!

Am I correct in thinking I can now delete the mod_spamhurdles folder in my /tmp directory?

As an aside, I'm excited to see a new table in Phorum created by this module. I always thought that was a huge no-no for a module. I will have to look at the code to see how I can use this for some of my wishlist modules.

/\dam

--
My notable Phorum sites:
Movie Deaths Database - "review comments" system mostly powered by Phorum
Learn Chinese! - integrated forum quiz
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
January 10, 2007 11:50AM
You're right. The /tmp cache folder can be safely deleted now. All data is stored in the database.

Since the phorum database user has all rights (it can do the install of Phorum as well, which involves creating tables), it's perfectly possible to create tables from a module. Simply execute a CREATE TABLE statement and keep track of this so it will only be done on first access. How to implement it is a choice for the developer. The system that I wrote for the Spam Hurdles is probably only a first example of how this can be done.
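
A minimal sketch of the "create the table on first access" idea follows. It is not how Spam Hurdles implements it. The table name, the settings key and the query runner are all hypothetical: in a real module the flag would be saved with the module settings and the query would go through the active database layer.

```php
<?php
// Sketch only: example_module_data, 'table_created' and $runQuery are
// hypothetical stand-ins, not Phorum or Spam Hurdles names.
function example_ensure_table(array &$settings, callable $runQuery): bool
{
    // The table was already created on an earlier request: nothing to do.
    if (!empty($settings['table_created'])) {
        return false;
    }

    $runQuery(
        'CREATE TABLE example_module_data ('
        . 'id VARCHAR(32) NOT NULL PRIMARY KEY, '
        . 'data TEXT NOT NULL, '
        . 'expire_time INT NOT NULL)'
    );

    // Remember that the install step is done, so it only runs once.
    $settings['table_created'] = true;
    return true;
}

// Usage with a fake query runner that just records the SQL it receives.
$executed = [];
$settings = [];
$runQuery = function (string $sql) use (&$executed) {
    $executed[] = $sql;
};
example_ensure_table($settings, $runQuery); // runs CREATE TABLE
example_ensure_table($settings, $runQuery); // does nothing the second time
```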

One thing to keep in mind when using the database for storing module data, is that Phorum supports multiple database layers. It is good practice to include database functions for all standard available database layers. This means that a published module should include full support for mysql, mysqli and postgresql.

Because of the simplicity of the queries that I use in Spam Hurdles, I created a single layer file (db.php), which will execute different queries in a switch statement, which acts on the active database type. For more complicated database layers, it might be better to split the layer functions into separate layer files (like the Phorum core does).
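
The single layer file approach could look roughly like the sketch below: one function, with a switch on the active database type. The function name, the table name and the queries themselves are assumptions made for illustration. They are not copied from db.php, which may well build its queries differently.

```php
<?php
// Sketch only: the function, the table and the queries are hypothetical.
// It shows one function serving several database layers through a switch.
function example_delete_expired_query(string $dbType): string
{
    switch ($dbType) {
        case 'mysql':
        case 'mysqli':
            // Both MySQL layers can share the same SQL.
            return 'DELETE FROM example_module_data '
                 . 'WHERE expire_time < UNIX_TIMESTAMP()';

        case 'postgresql':
            // PostgreSQL has no UNIX_TIMESTAMP(), so the query differs.
            return 'DELETE FROM example_module_data '
                 . 'WHERE expire_time < EXTRACT(EPOCH FROM NOW())';

        default:
            throw new InvalidArgumentException(
                "Unsupported database layer: $dbType"
            );
    }
}

echo example_delete_expired_query('postgresql'), "\n";
```

Once the number of functions or the differences between the layers grow, a switch in every function gets hard to read, which is where separate layer files start to pay off.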


Maurice Makaay
Phorum Development Team
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
January 14, 2007 05:34PM
Now I have 7500 rows (10 MB; 15 MB before optimization) in phorum_spamhurdles :(

What should I do with SPAMHURDLES_GARBAGE_COLLECTION_RATE and TTL to decrease the table size? (and I didn't find a TTL variable in spamhurdles.php :(
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
January 14, 2007 06:09PM
Quote
Vit.A
Now I have 7500 rows (10 MB; 15 MB before optimization) in phorum_spamhurdles :(

What should I do with SPAMHURDLES_GARBAGE_COLLECTION_RATE and TTL to decrease the table size? (and I didn't find a TTL variable in spamhurdles.php :(

Quote
mmakaay
edit defaults.php, look for key_max_ttl and change the 3600*8 that's behind it

3600*8 means the garbage collector removes every row that is older than 8 hours. Make it 3600*4 and the garbage collector removes all rows older than 4 hours.

garbage_collection_rate is basically the percentage chance that garbage collection runs on a request. I.e. 1% means that on average every 100th use of the script deletes all old rows. If you use 2%, then on average every 50th request triggers garbage collection.

At least that is what I understand from Maurice's message above.
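
As a back-of-the-envelope check of those numbers, here is a small sketch. The function names are made up, and the row estimate assumes that new entries arrive at a constant rate, that garbage collection keeps up, and that the 7500 rows built up during one 8 hour TTL. None of that is confirmed in this thread.

```php
<?php
// Sketch only: hypothetical helper names and simplifying assumptions.

// Average number of requests between two garbage collection runs.
function example_requests_between_runs(float $ratePercent): float
{
    return 100 / $ratePercent;
}

// Rough row count once the table is stable: entries live for one TTL,
// so the table holds about one TTL's worth of new entries.
function example_steady_rows(float $rowsPerHour, float $ttlHours): float
{
    return $rowsPerHour * $ttlHours;
}

$rowsPerHour = 7500 / 8; // assumption: 7500 rows built up in 8 hours

echo example_requests_between_runs(2), "\n";     // 50
echo example_steady_rows($rowsPerHour, 4), "\n"; // 3750
```

Under these assumptions, halving the TTL roughly halves the table size, while changing the rate only changes how often the cleanup happens.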

---
-=[ Panu ]=-
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
January 14, 2007 06:13PM
The only logical thing to change for a busy site is the TTL, since garbage collection already runs often there. I already described where to change this a couple of posts above this one (actually it's in the post that you have already read). Be sure not to make the TTL too low, or people might run into trouble when writing a message takes longer than the TTL. Excerpt from the post above:

"If you want to change this, then you can edit defaults.php, look for key_max_ttl and change the 3600*8 that's behind it."

BTW: Having 7500 rows in that table shouldn't be a big problem, unless you're with some provider that is very tight on the space that you can use in the database. The number of entries in that table should become stable after 8 hours (the TTL).


Maurice Makaay
Phorum Development Team
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
January 14, 2007 06:16PM
Thanks to Rick (user eeek), the PostgreSQL support for Spam Hurdles is also working now. I did supply a db layer for PostgreSQL, but because I never tested that layer at all, it was... well... crappy, to say the least :-) I uploaded version 1.1.2 of Spam Hurdles, with the new working PostgreSQL support, to the first post of this thread. If you're already using Spam Hurdles in combination with MySQL, there's absolutely no reason to upgrade your installation.


Maurice Makaay
Phorum Development Team