Captchas Promise to Keep Bots at Bay

In the tradition of the old Turing tests, captchas are tests for telling computers and humans apart online. This NYTimes article reports that captchas make use of users' vision or hearing to create a "puzzle" that current bots cannot solve. You may already be familiar with the one that Yahoo! uses for user registration, but at captcha.net you can attempt several types of captchas to see how they work and read about possible applications for these measures.

Link courtesy of Arts and Letters Daily.

I think it's interesting that right now, just as visual literacies and rhetorics are becoming more prominent in composition studies, we are turning to these aptitudes in order to screen out the many types of artificial intelligence and automated programs that have been developed to mimic actual human users. While we can create bots that reproduce the linguistic patterns of human speech well (see the recent Kairosnews post on the Alice bot), it seems more difficult to simulate the visual or audio elements of cognition. Could you imagine being fooled by a chatroom bot if you could enter a common drawing space and ask it to draw a cat?

The web is already arguably a highly visual medium (though of course a certain amount of text-browsing goes on as well), but our interactions with many sites are still highly text-based. Captchas may develop beyond visual security protocols. Soon we may be creating personal visual passwords rather than textual ones. This seems a return to what Gibson's Neuromancer and movies like Johnny Mnemonic envisioned all along: the net as an immersive visual and spatial interface. (These are just two commonly known examples; I think the visual "bias" in writing about the future of the net was and is fairly widespread. Even The Matrix has that catchy image of the cascading green numbers.) For instance, in the film Johnny Mnemonic, the courier played by Keanu Reeves uses a VR headset and special gloves to navigate an electronic landscape in which programs, sites, and security protocols are represented visually. Furthermore, he has information placed in his brain that only a three-image visual password can decrypt. Security captchas that ask us to determine which elements in a picture are closest to the viewer, for instance, ask us to judge depth and perspective rather than recall letters, shifting the focus from textual to visual literacy.

I was especially intrigued by the visual captchas at captcha.net, particularly Pix, which asks you to look at several pictures and identify the object they have in common. I actually got it wrong a few times, for instance by choosing "face" instead of "nose" or typing "tooth brush" instead of "toothbrush," but overall it does seem effective against a bot. I wondered, though, whether a bot could defeat such a captcha more easily with a "dictionary attack," since the pictures are all of concrete objects and the bot would only need to cycle through words that refer to visible, concrete things.
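To make the dictionary-attack worry a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the word list, the ToyCaptcha class, and the attempt limit are hypothetical stand-ins, not how Pix actually works. It only shows the general idea that a bot could cycle through a small vocabulary of concrete nouns, and that capping the number of guesses per challenge is one plausible defense.

```python
# Hypothetical vocabulary of visible, concrete objects (not Pix's real word list).
CONCRETE_NOUNS = ["cat", "nose", "face", "toothbrush", "chair", "tree", "car", "dog"]


def normalize(answer):
    """Lowercase and remove whitespace so 'Tooth Brush' matches 'toothbrush'."""
    return "".join(answer.lower().split())


class ToyCaptcha:
    """Stand-in for a Pix-style challenge: a hidden label and a guess limit."""

    def __init__(self, label, max_attempts):
        self._label = normalize(label)
        self.attempts_left = max_attempts

    def check(self, guess):
        """Use up one attempt and report whether the guess matches the label."""
        if self.attempts_left <= 0:
            return False
        self.attempts_left -= 1
        return normalize(guess) == self._label


def dictionary_attack(captcha, words):
    """Try each word in order; return the word that passes, or None."""
    for word in words:
        if captcha.attempts_left <= 0:
            return None  # locked out before finding the answer
        if captcha.check(word):
            return word
    return None


# With unlimited patience the bot eventually gets through...
open_captcha = ToyCaptcha("toothbrush", max_attempts=len(CONCRETE_NOUNS))
print(dictionary_attack(open_captcha, CONCRETE_NOUNS))

# ...but a cap of three guesses stops this particular word order.
strict_captcha = ToyCaptcha("toothbrush", max_attempts=3)
print(dictionary_attack(strict_captcha, CONCRETE_NOUNS))
```

In this toy version the attack succeeds only when the allowed number of guesses is large relative to the vocabulary, which suggests the real question is how many tries a site permits and how big the pool of possible answers is.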

Whether or not bots can be programmed to defeat captchas, this development speaks to the current focus on visual literacy in a way that shows how complex some of our most basic cognitive processes are. For a while now, it seems, we have been envisioning the net as a primarily visual landscape. Perhaps captchas bring us a little closer to making (that) vision matter.