
A Patch for SpamLookup 2.0 bundled with MT 3.2

I rely heavily on SpamLookup 2.0, but its Keyword Filter doesn't handle regular expressions that contain multi-byte strings or the Unicode properties, blocks, and scripts that Perl 5.8 supports. That makes it hard to train SpamLookup to reject comment and TrackBack spam in non-English languages, especially Asian languages.

The following is a patch for enabling Unicode support in SpamLookup 2.0.

SpamLookup2.0-encode.patch

If you are using a Linux box, the patch is easy to apply. Download or copy it into your MT directory and run:

patch -p0 < SpamLookup2.0-encode.patch

Once you have applied the patch, you can write Keyword Filter rules as regular expressions that use Perl's Unicode support.

For example, to reject comments/TrackBacks that contain Hiragana, use the following rule:

/\p{Hiragana}+/

Or, to let only Latin-1 comments/TrackBacks through by rejecting text written entirely in characters outside Latin-1, use:

/^[^\x00-\xff]+$/

Comments

How does one go about applying this patch?

(And does Six Apart know about this patch? They may like to incorporate it!)

If you are using a Linux box, it is easy to apply. Just download or copy it into your MT directory and type as follows:

patch -p0 < XXX.patch

Thank you for the advice. I've just posted this to the ProNet mailing list. If they think it's valuable, I guess it will be incorporated into a future release.

Thanks, though I do note that this info is now in the article. I do hope that you have since added it in, because if it was always there then it doesn't bode well for my observational skills!
