Wednesday, June 23, 2010

A new video.

See it here.
A lot of changes which I need to make better use of.

A little rant about recaptcha (prompted by having to solve a recaptcha). In theory, they could do a noble thing: instead of wasting human attention, use it to read words in books, words which computer software can't read. That's what they claim they are doing. That would have been absolutely terrific. That would have been totally awesome.
Unfortunately, there's one little thing everyone sort of misses, even though it is absolutely right-in-your-face even on their homepage:


You see, recaptcha is, for the most part, using quite computer-recognizable scanned words, and resorting to adding extra distortion, blur, strikes or blobs so as to make those words readable only by humans and to stop the bots. It's mostly the distortion that's making them computer-unreadable, not the book's age. As the technology evolves, they are adding more and more distortion, which also harms human accuracy. Case closed. Sorry, guys, you've been duped, perhaps too easily, because it feels better to believe that your captcha is doing something good and noble rather than just wasting people's time.
(Another little detail that is always glossed over is that it's not 'books', it's New York Times newspaper archives and the like. "Stop the spam. Read newspapers." doesn't sound so noble.)

3 comments:

  1. I like the last point. But are you aware that, of the two words, one is already known to the system?

  2. What difference does it make to the usefulness of it? The point is that they have very OCR-recognizable text in both words, and then mangle it to stop the bots. The scanned paper merely provides some sort of peace of mind.

  3. This blog is dope.
