Posted by: Rowan | June 14, 2009

Things you don’t know you’re doing

reCaptcha in use at Last.fm

I know captchas are a pain and cause accessibility issues, but they’re one way of keeping the bots out of places meant for humans.

I came across a March 09 presentation from the guy who invented them, one Luis von Ahn from Carnegie Mellon University, which might cast them in a different light for you. The video (down below) is 12 minutes long if you want to watch it, but here it is in a nutshell.

Let’s start with the New York Times. Their archive stretches back to 1851 but it wasn’t until 1980 [correction: 1980-something] that they had digital records of their content. Right now they’re part way through digitising all that content and expect to be finished later this year. They make a scanned image of their paper pages and use OCR software to extract the words into a searchable electronic record, no surprise there. But they’re finding that the reliability of OCR for old material that may have yellowed or faded can be as low as 70%. Which leaves a whole lot of work to be done manually, on 129 years’ worth of NYT. No small task.

Enter Recaptcha. The next time you come across something like the above example from Last.fm’s sign-up screen, here’s what’s happening.

An image of the word that the NYT’s OCR software can’t decipher gets sent across to Recaptcha via a web service. Recaptcha munges up the unknown word a bit more and pairs it up with a known (munged) word that it can verify, then sends the pair out as a two-word captcha to places like Last.fm – or Twitter or Facebook or 100,000 other sites. Along comes Rowan Smith to open a Last.fm account; he enters the two words in the captcha and sends them off for verification. Recaptcha confirms I’m human by matching the known word with what I typed in, and hands me back to Last.fm to complete the account setup.

The other word I typed in was the one that NYT’s OCR software couldn’t recognise. Well, I just decoded it for them. Recaptcha thanks me immensely and sends the human-translated answer back to the NYT. Just like that. Dead simple. Brilliant, actually. And NYT is only one example – other digitisation projects like Google and The Internet Archive can and do use the same web service.
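For the code-minded, here’s a rough sketch of that two-word trick in Python. It is only my illustration of the flow described above, not Recaptcha’s actual code or API: the names `Challenge`, `RecaptchaSketch`, `verify` and `best_guess` are all made up, and tallying answers from several people before trusting one is my own assumption.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Challenge:
    """One two-word captcha (hypothetical structure, for illustration only)."""
    control_word: str      # the known word the service can check
    unknown_image_id: str  # id of the scanned word the OCR could not read


@dataclass
class RecaptchaSketch:
    """Toy stand-in for the service; not the real Recaptcha API."""
    # Maps each unknown image id to a tally of the readings humans typed for it.
    votes: dict = field(default_factory=dict)

    def verify(self, challenge: Challenge, typed_control: str, typed_unknown: str) -> bool:
        """Pass the user if the control word matches; if so, keep their other answer."""
        if typed_control.strip().lower() != challenge.control_word.lower():
            # Control word wrong: treat as a bot and throw away the second answer.
            return False
        tally = self.votes.setdefault(challenge.unknown_image_id, Counter())
        tally[typed_unknown.strip().lower()] += 1
        return True

    def best_guess(self, image_id: str):
        """Most common human reading so far (assumed voting step), or None."""
        tally = self.votes.get(image_id)
        if not tally:
            return None
        return tally.most_common(1)[0][0]


service = RecaptchaSketch()
challenge = Challenge(control_word="morning", unknown_image_id="nyt-scan-0001")
passed = service.verify(challenge, "morning", "overlooks")
decoded = service.best_guess("nyt-scan-0001")
```

Only the control word decides whether I get through; the second word is a free bonus for whoever is doing the digitising.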

The captcha might be a pain, but I do like the warm fuzzies I get from doing my bit for humanity, in the time it takes to deal with one. I hope the 400,000,000 other people who have contributed so far, at a rate of 35,000,000 newly digitised words per day, do too.

If you can handle watching a computer science type for 11:50, here’s the man himself. Don’t be put off – it’s actually quite good.

Posted by: Rowan | May 25, 2009

The trouble with geeks

Here’s a YouTube video of a clip from a 1969 movie presaging the internet (purportedly at least, and I have few reasons to doubt it):

They weren’t actually far off with the technology in terms of what it does – end to end transactions and all that. Glad I don’t have to use those big clunkers though, I’d need something the size of an aircraft hangar to do what I want to do now. And I’m quite happy with the humble interface I use to do that stuff already.

They didn’t anticipate the social mores changing underneath them though when they made their pitch, did they? That’s the trouble with geeks.

Posted by: Rowan | April 19, 2009

Who’d have thought it…

As both an RSA member (ex-president-of-Vice, in fact) and an online sort I’ve always been quite interested in what RNZRSA – the headquarters – does online. And so far it’s been a rather good web site built by Shift a few years back. (HQ might have had something to do with a couple of sites around the Vietnam commemorations as well, I don’t know, but I think that’s been about it).

The site does the job. It presents RNZRSA very professionally and, if you’re travelling, the “Find an RSA” lookup is clunky but it sort of works. It’s a good web site, but still just a web site.

So I was really impressed when I saw the online poppy field and Wall of Remembrance released last week. So were lots of other people, apparently, if the messages and donations so far are anything to go by.

I’m not going to waffle about it other than to say what a slick job online by RNZRSA. Nice one, Stephen.

So with Anzac Day just round the corner, now’s a perfect time to spare a moment and go and have a look for yourself … here it is. And it’s certainly worth a donation.

[And btw, it looks like it's built on a Silverstripe platform so hopefully they might've put it together fairly cheaply. These guys are onto it...]

Posted by: Rowan | April 1, 2009

Well done, those folks!

I have to say I’m impressed. The Eee netbook, out of the box, is a little bloody stunner.

At $399 all up, it got off to a pretty good start with me from the moment I paid for it. They’re normally $499 but I chanced on a special at Harvey Norman, who do great specials (well at least one) but have an utterly useless web site. Take it down or do it up guys, purleez!

Back to the Eee. ScrEeech (sorry) into the driveway, sprint inside. Rip it out of the box, plug it in and turn it on, and that’s it. No, wait – I had to assign myself a user name and password and timezone, and I had to tell it my network password when it asked me. Too easy. It’s almost disappointing if you like to play round configuring stuff.

It’s the size of an A4 page folded in half and if I chuck it in my backpack in the mornings with my other A4 pages folded in half, it’ll weigh me down by 0.9kg plus the power pack.

Battery life is meant to be around 5 hours – sounds great – but I haven’t found out for myself yet. The flash drive is only 8GB so I won’t be storing my photos on it but it’s big enough to hold the internet as far as I can tell. And it draws 12 watts so my running costs are a whopping 0.3c per hour.
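If you want to check the running-cost sum, here it is as a few lines of Python. The 12 watts is from above; the tariff of 25 cents per kWh is an assumption, picked because it is the rate that gives 0.3c per hour, so plug in your own power price.

```python
def cost_per_hour_cents(power_watts: float, tariff_cents_per_kwh: float) -> float:
    """Cost of running a device for one hour, in cents."""
    energy_kwh = power_watts / 1000.0  # energy used in one hour, in kWh
    return energy_kwh * tariff_cents_per_kwh


# Assumed electricity price (cents per kWh); not a figure from a power bill.
ASSUMED_TARIFF_CENTS_PER_KWH = 25.0

eee_cost = cost_per_hour_cents(12, ASSUMED_TARIFF_CENTS_PER_KWH)
print(f"{eee_cost:.1f}c per hour")
```

On the same assumed tariff, a 5-hour session works out to about 1.5 cents.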

It’s not entirely without drawbacks. The screen is limited to 800 pixels wide (on the Linux model at least, but maybe not), so you run into a bit of horizontal scrolling on sites hard-coded for bigger screens. But that’s precisely the reason for my occasional but impassioned rants about building fluid web sites. I’ll count that as a cross against web designers, not against the Eee. Sites like this work beautifully.

And the keyboard is pretty cramped so it’s fairly easy to make typos [note-to-self: Use spellchecker]. That’s the only real downside of the Eee in my eyes. I can live with it, I’m not a high-speed typist anyway. I’m typing this post on it and we seem to be getting along OK (in a text file, though – the WordPress editor crumbles at 800 pixels wide. Irony strikes unexpectedly…)

The Linux version’s default Xandros desktop is a class act. It’s so intuitive that I suspect my wife and friends who are maybe more in the mainstream of computer users than I am would use it quite happily. It’s about the nicest Linux desktop I’ve seen so far, for ease of use anyway.

Eee Work tab, Eee Play tab, Eee Internet tab

It feels a little too locked down for me though and my other Linux gear is Ubuntu, so Xandros is probably going overboard in favour of Eeebuntu.

But for the sake of four hundred bucks, I’m not arguing. I’ve got myself a cheap, lightweight, very portable and very usable web platform and, increasingly, that’s all I need. And Asus have got a very happy customer. Well done, Asus.

Posted by: Rowan | March 11, 2009

Sauvignon or chardonnay?

Every now and then, something opens your eyes, so to speak.

Now that Jonathan Mosen is back in Wellington we took the chance yesterday to catch up and shoot the breeze.

Jonathan’s got this annoying habit of flooring me with his use of technology. He does it every time we meet. I should have known he was going to do it to me again.

And sure enough, after free and frank discussions about the web and accessibility and Flash and Facebook and working for [insert variable] and ARIA and RSS feeds and what we’ve been up to and good ol’ DOS and accessible appliances and people hooking their washing machines up to Twitter etc … he pounced.

He’s got this neat piece of software loaded onto his phone, he told me. It does an OCR scan of the photos he takes with it, and reads out the words it finds in them. Ergo, he can pull two bottles of vino out of his fridge, photograph the labels, and find out which one’s the sauvignon and which one’s the chardonnay.

Brilliant.

I wish I knew how to do that. It would save me the time I spend back-tracking to find where I left my glasses.
