Stop spam, read books
Like everyone else, I get a lot of spam. Google mail generally does a good job of filtering it out, but even so 2 or 3 items of spam get into my inbox each day, and on bad days I'll find 200 emails sat waiting for me. I also keep getting grief about the amount of spam generated from websites I help manage, so when I heard about reCAPTCHA - a system that's designed to reduce website spam and help digitise books at the same time - I was interested.
Since most spam is automated - spammers send millions of emails at the
same time - a good strategy to avoid spam is to try to prove that the person sending
it is not a computer. CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart)
are designed to do exactly this. You'll have seen CAPTCHAs all over the
place - they're the warped, sometimes colourful text at the bottom of
the page which you need to identify before you can sign up to the
latest and greatest website or post comments on your favourite blog.
The idea is that the website offers you a word which is designed to be
hard for a computer to read. If you can see the word and type its
letters into the box accurately, you're more likely to be a human than
a computer. Although spammers can occasionally beat CAPTCHAs (e.g. if
the word is not warped or disguised in some way, a computer can use
Optical Character Recognition to decode it), they're generally pretty
effective at stopping spam.
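To make the idea concrete, here is a minimal sketch of the check a website performs, written in Python. The function names `make_challenge` and `is_probably_human` are my own inventions for illustration, and the warping of the word into an image (the part that actually defeats the computer) is left out entirely.

```python
import secrets
import string


def make_challenge(length=6):
    """Return a random challenge word.

    A real CAPTCHA would render this word as a warped image;
    this sketch only produces the text behind the image.
    """
    alphabet = string.ascii_lowercase
    return "".join(secrets.choice(alphabet) for _ in range(length))


def is_probably_human(challenge, typed):
    """Accept the answer if it matches the challenge word.

    Case and surrounding spaces are ignored (an assumption of
    this sketch, not a rule every CAPTCHA follows).
    """
    return typed.strip().lower() == challenge.lower()
```

The website keeps the challenge word on the server, shows the visitor only the image, and calls something like `is_probably_human` when the form is submitted.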
reCAPTCHA is a CAPTCHA system
designed by the clever people at Carnegie Mellon University. Their
idea is to use CAPTCHAs not only to stop spam, but also to do some good
for the world:
About 60 million CAPTCHAs are solved by humans around the world every day. In each case, roughly ten seconds of human time are being spent. Individually, that's not a lot of time, but in aggregate these little puzzles consume more than 150,000 hours of work each day. What if we could make positive use of this human effort? reCAPTCHA does exactly that by channeling the effort spent solving CAPTCHAs online into "reading" books.

There are lots of projects (e.g. Google Print and the Internet Archive) that are attempting to digitise out-of-copyright books to protect them and make them more widely available. Since computers can't read scanned pages without making mistakes (which is why CAPTCHAs work), the university boffins hit upon the idea of using humans to help the computers. Words that the computer cannot recognise on its own are piped into the reCAPTCHA system and humans are asked to help identify them. By combining the 'unknown' word with a known test word, reCAPTCHA can serve two purposes at once: stopping spam and 'reading' books. Because some of the words are hard to identify due to poor quality printing, several humans are asked to identify the same 'unknown' word to make sure it is identified correctly.
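The pairing of a known test word with an 'unknown' word can be sketched in a few lines of Python. This is only my illustration of the idea described above, not reCAPTCHA's actual code: the function names, the example words and the threshold of three matching answers are all assumptions (the description only says "several humans").

```python
from collections import Counter


def check_pair(control_word, control_answer, unknown_answer, votes):
    """Check the known test word; if it is right, record the guess
    for the unknown word. Returns True if the user passes."""
    if control_answer.strip().lower() != control_word.lower():
        return False  # failed the test word, so the other guess is ignored
    votes[unknown_answer.strip().lower()] += 1
    return True


def agreed_reading(votes, min_votes=3):
    """Return the most common guess once at least min_votes people
    have typed it, otherwise None. min_votes=3 is an assumed value."""
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


# Four visitors pass the test word "morning" and guess at the unknown word.
votes = Counter()
for guess in ["upon", "Upon", "upan", "upon"]:
    check_pair("morning", "morning", guess, votes)

print(agreed_reading(votes))  # prints: upon
```

Only the test word decides whether you get through. The guess for the 'unknown' word is simply counted, which is why one person's typo (the "upan" above) does no harm.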
reCAPTCHA is designed to be easy to implement on your own website or in your own application. It's available to developers in many languages (Perl, PHP, Ruby, Java, etc.) and also has plugins for some popular blogging tools. It certainly is easy to use. I used the Perl module Captcha::reCAPTCHA today to implement a CAPTCHA on our website's contact form. It took about 20 minutes including testing, and it works really well.
Movable Type 4.0 has reCAPTCHA built in, but just you try getting it to work without Josh Carter's excellent guide. Make sure you edit the plugin template as described by Josh. If you can't see your saved public and private reCAPTCHA keys, it probably won't work. Josh has also written a reCAPTCHA plugin for earlier versions of Movable Type.
Why not read a few more books right now, by commenting below?