A Patch for SpamLookup 2.0 bundled with MT 3.2
Of course I'm addicted to using SpamLookup 2.0, but its Keyword Filter doesn't recognize regular expressions that use multibyte strings or the Unicode properties/blocks/scripts supported by Perl 5.8. That makes it hard to train SpamLookup to reject comment/trackback spam in foreign languages, especially Asian languages.
The following is a patch for enabling Unicode support in SpamLookup 2.0.
If you are using a Linux box, it is easy to apply. Just download or copy it into your MT directory and type as follows:
patch -p0 < SpamLookup2.0-encode.patch
Once you have applied this patch, you can write regular expressions empowered by Unicode support as Keyword Filter rules.
For example, to reject comments/trackbacks containing Hiragana strings, write:
/\p{Hiragana}+/
Or, as a step toward accepting only Latin-1 comments/trackbacks, you can reject those made up entirely of characters outside Latin-1:
/^[^\x00-\xff]+$/
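To get a feel for what these two rules match, here is a rough sketch in Python rather than Perl (SpamLookup itself is a Perl plugin, so this is only an illustration, not how the plugin evaluates rules). Python's built-in `re` module has no `\p{Hiragana}`, so the sketch substitutes the Hiragana block range U+3040 to U+309F, which is an approximation. The helper `is_rejected` and the sample strings are made up for the example.

```python
import re

# Stand-in for Perl's \p{Hiragana}: Python's built-in re module has no
# Unicode script properties, so this uses the Hiragana block U+3040-U+309F.
HIRAGANA_RULE = re.compile(r"[\u3040-\u309f]+")

# Same pattern as the second rule in the post.
NON_LATIN1_RULE = re.compile(r"^[^\x00-\xff]+$")


def is_rejected(rule, text):
    """Hypothetical helper: treat a comment as rejected when the rule matches."""
    return rule.search(text) is not None


# Made-up sample comments, already decoded to Unicode strings.
samples = {
    "latin1": "Nice post, thanks!",
    "hiragana": "ありがとう",
    "mixed": "Thanks ありがとう",
}

for name, text in samples.items():
    print(name, is_rejected(HIRAGANA_RULE, text), is_rejected(NON_LATIN1_RULE, text))
```

Note that the anchored second pattern only matches text made up entirely of characters outside Latin-1, so the mixed sample is caught by the Hiragana rule but not by the second one.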
Comments
How does one go about applying this patch?
(and do sixapart know about this patch? they may like to incorporate it!)
Posted by: Murky | October 29, 2005 03:13 AM
If you are using a Linux box, it is easy to apply. Just download or copy it into your MT directory and type as follows:
patch -p0 < XXX.patch
And I'd like to say thank you for your advice. I've just posted this to the ProNet mailing list. If they think it's valuable, it will be picked up in a future release, I guess.
Posted by: (o) | November 3, 2005 04:10 PM
Thanks, though I do note that this info is now in the article. I do hope that you have since added it in, because if it was always there then it doesn't bode well for my observational skills!
Posted by: Murky | November 5, 2005 05:12 AM