
Spam Hurdles Module (CAPTCHA's and other anti-spam tools)

Posted by Maurice Makaay 
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
February 03, 2007 12:07PM
I put some of the ideas on my TODO list for this module and will get around to implementing them once I'm done with my currently active Phorum projects. Thanks for the comments and thinking along!


Maurice Makaay
Phorum Development Team
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
February 03, 2007 02:28PM
Quote
freedman
I'm happy with the module as it is
Me too, but that is no reason not to improve it. I find the logic of this module much more clever than any other solution I've seen, for Phorum as well as for the rest of my site, and that is one more reason to want it to work as efficiently as it can. So I think it should follow the founding principle of Phorum and be powerful by itself, regardless of the disk space available for storage or the processor/server capacity. That means the less data and the fewer operations we have, the better it is for everyone.

I agree with Sheik that reusing captchas is a good way to save a lot of them. On my site I use the tree listing of answers, so users have to view messages one by one. That means that, currently, for a single discussion, one user will generate as many captchas as there are answers in the discussion, although he will post only one or two messages.
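
To make the reuse idea concrete, here is a minimal sketch in Python (not the module's actual PHP code). It assumes that "reusing" means keeping one pending captcha per visitor session and handing it out again on every message view. The names `CaptchaStore`, `get_for_view` and `check_and_consume` are made up for illustration.

```python
import secrets


class CaptchaStore:
    """Keeps at most one pending captcha per visitor session (hypothetical design)."""

    def __init__(self):
        self._pending = {}   # session id -> captcha code waiting to be solved
        self.generated = 0   # number of captchas that were actually created

    def get_for_view(self, session_id):
        # Reuse the pending captcha if this visitor already has one,
        # so viewing 50 messages does not create 50 captchas.
        code = self._pending.get(session_id)
        if code is None:
            code = secrets.token_hex(3)
            self._pending[session_id] = code
            self.generated += 1
        return code

    def check_and_consume(self, session_id, answer):
        # A captcha is single-use: it is dropped as soon as a post is attempted.
        expected = self._pending.pop(session_id, None)
        return expected is not None and expected == answer
```

With this approach a visitor who reads a whole discussion in tree view and posts once costs one captcha instead of one per message viewed. The price is that a captcha stays valid for longer, so an expiry time would probably be needed as well.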

The other important issue, I think, is bot activity. As bots go through all the messages of the forum, they generate a lot of useless captchas. One way or another, this should be avoided, to conform to the principle: no useless data, no useless operations.
Quote
makaay
if the user agent field is matching a bot's name. If yes, then it virtually closes down viewed threads, so replying is not available, nor possible
This may be a solution: even if it is not a bot but a spammer imitating a bot, posting will not be possible. That's what we all want, no? But will these "closed" threads still be indexed by the bots?

And a last observation for today, concerning data quantity/redundancy:
Quote
makaay
If you guys are really this much into shrinking the size of the table for the spam hurdles mod, then I could create an option in the admin interface for configuring whether the object generated data should be cached or not
In general, I don't like things with a lot of tuning parameters. Some are useful, but too many are always a source of errors and misunderstanding. If redundant data can be avoided, that must be done in the module's code. Don't leave it to users to decide what developers hesitate to decide themselves!
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
February 03, 2007 02:33PM
Quote

That means the less data and the fewer operations we have, the better it is for everyone.

Not really. Less data (which is mostly cached data) means more processing / computation is needed, so it is a trade-off between hard disk space and processing time.
IMO hard disk space is much less expensive than processing time, so spending disk space is usually the way to go.
Going by everything you said, you will probably be one of the guys who needs to disable all the caching features that were added in 5.2, as they need additional space to give more speed.


Thomas Seifert
Phorum Development Team / Mysnip-Solutions.de
Custom Phorum and general software development
worry-free Phorum Hosting
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
February 03, 2007 02:55PM
Quote

This may be a solution: even if it is not a bot but a spammer imitating a bot, posting will not be possible.

A spammer imitating a bot would get a closed thread and cannot reply to the post for that reason. Whether a captcha would be generated is up to the Spam Hurdles module. I'll have to look into that to make sure that it skips that step on closed threads (maybe it already does so, but I can't remember putting that in the code).

Quote

will these "closed" threads still be indexed by the bots?

Sure, why not? The threads are closed, which means that replying is no longer possible. The bot is still able to read the pages for the discussion and index its contents. Only the reply option isn't available in that case.
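
As a rough illustration of that behaviour, here is a small Python sketch (not Phorum code). It assumes a simple substring match on the user agent. The marker list and the names `looks_like_bot` and `render_thread` are hypothetical.

```python
# Illustrative list only; a real setup would need a maintained list of bot names.
BOT_MARKERS = ("googlebot", "bingbot", "slurp", "crawler", "spider")


def looks_like_bot(user_agent):
    """Return True if the user agent string contains a known bot marker."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BOT_MARKERS)


def render_thread(thread, user_agent):
    """Decide what a visitor gets to see for a thread."""
    # Threads are "virtually" closed for bots; nothing is changed in storage.
    closed = thread["closed"] or looks_like_bot(user_agent)
    return {
        "messages": thread["messages"],  # always readable, so still indexable
        "reply_form": not closed,        # closed means no reply option
        "captcha_needed": not closed,    # no reply form, so no captcha to generate
    }
```

A spammer who fakes a bot's user agent lands in the same branch as a real bot: the messages are shown, but there is no reply form and no captcha is created.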

Quote

Don't leave it to users to decide what developers hesitate to decide themselves!

The dev could always program anything the way he likes it. But that ignores the fact that there are a lot of users who do want to tweak things. The option I was talking about really is one that can be used differently based on the type of site you have. Big sites with lots of visitors want the caching; small sites on shared hosting with tight disk space want no caching. I agree that coding choices are to be made by the dev, but this is a clear example of a choice that would upset some users either way (if not, why is this discussion about disk space going on here in the first place? ;-).


Maurice Makaay
Phorum Development Team
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
February 09, 2007 01:29PM
Hm, there *is* a problem with the cached data... Not so much because of its size, but because of the number of files:

We have a partition of 1 GB for /tmp, which is usually quite sufficient. Now we upgraded from Phorum 5.0.20 to Phorum 5.1.19 and switched from MathCapthca to Spam-hurdles. After 2 days our /tmp partition was full... not full of data, but all i-nodes had been used... I guess that especially the above-mentioned point of spiders/robots crawling through a big forum, all messages one by one, will generate quite a lot of files/directories... At the moment we just have to stop using spam-hurdles, which keeps me thinking about an inexpensive solution (changing partition sizes is *not* inexpensive in this definition ;-))... "purge expired cache data" is running and running and running... but doesn't seem to do much, as the number of used i-nodes doesn't change... (?)
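
For anyone who wants to watch for this condition, here is a minimal Python sketch that reads the i-node counters of a filesystem through `os.statvfs` (Unix only). It shows roughly the same numbers as `df -i`. The function name `inode_usage` is made up for this example.

```python
import os


def inode_usage(path="/tmp"):
    """Report i-node usage for the filesystem that contains `path`."""
    st = os.statvfs(path)
    total = st.f_files   # total number of i-nodes on the filesystem
    free = st.f_ffree    # number of free i-nodes
    used = total - free
    # Some filesystems report 0 total i-nodes; a percentage is meaningless there.
    percent = (used / total * 100) if total else None
    return {"total": total, "free": free, "used": used, "percent_used": percent}


if __name__ == "__main__":
    print(inode_usage("/tmp"))
```

A partition can report plenty of free bytes while `percent_used` here is at 100, which matches the situation described above: lots of tiny cache files and directories use up i-nodes long before they use up space.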
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
February 09, 2007 01:31PM
You're not running the latest version of the mod, are you? The latest version does not use the cache files, but the database instead.


Maurice Makaay
Phorum Development Team
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
February 09, 2007 01:45PM
Doh! I thought I had downloaded it 2 days ago, but info.txt says it's version 1.06! I guess I had an older version still hanging around... stupid me! :-)
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
February 09, 2007 07:12PM
Quote
barcino
...our /tmp partition was full... not full of data, but all i-nodes had been used...

I'd suggest, especially for tmp which you want to be fast, you use a filesystem with dynamic i-node creation and high-quality journaling.

tmp tends to get used for large numbers of small files, so this is especially important.

my personal preference is to allocate my tmp disk space to swap and then run tmp as a tmpfs -- I've found huge performance improvements with this configuration.
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
February 25, 2007 04:36PM
Czech (win1250) language file ;-)

Tom_CZAg
Phorum - STRIBRO.net
Attachments:
open | download - czech-win1250.zip (1.2 KB)
Re: Spam Hurdles Module (CAPTCHA's and other anti-spam tools)
February 25, 2007 07:13PM
Thank you! I'll include it in the next release.


Maurice Makaay
Phorum Development Team