We’ve all been asked to type in the letters and numbers in those funky pictures on websites, such as the one pictured to the right. These pictures are called CAPTCHAs and are used to try to determine whether a visitor to a website is a human or a computer program. Online banks, stores, news sites, chat boards and other sites want to weed out automated programs that try to steal information and spam message boards. The question you may have is how these pictures identify a user as human.
CAPTCHAs are designed around human psychology. Specifically, they rely on the human’s natural ability to guess at or pick out identities in ambiguous information.
The letters and numbers in the CAPTCHAs aren’t literally the letters and numbers, but abbreviated, mixed up, distorted versions, often with other information (visual noise) overlapping and/or surrounding them.
When told to pick out the symbols, the human picks out which letters and numbers they perceive. The symbols don’t perfectly match the textbook letters and numbers, but the human guesses what they most approximate. Computers can eventually, or at least sometimes, identify the symbols but it takes longer. They have more difficulty with the ambiguous, messy and mixed up images.
Humans can make out different individual letters in this CAPTCHA, but it’s really just one connected graphic.
This CAPTCHA tilts the symbols and places them on a background, both of which make it harder for a computer program to read.
If a computer program gets too good at solving CAPTCHAs, the CAPTCHAs can be made more difficult by distorting the letters more and adding more visual noise, though it is hoped not so difficult that they fool humans. CAPTCHAs also come in a wide variety of forms (shapes, angles, different backgrounds, colors and more), making it so a nefarious program can’t apply one cookie-cutter rule.
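The idea of raising the difficulty with more noise can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the tiny 5x5 `GLYPHS` bitmaps, the `distort` function and the `naive_solver` are all hypothetical stand-ins invented here, not how any real CAPTCHA or CAPTCHA-breaking program works. The "computer" simply picks whichever textbook letter differs from the image in the fewest pixels.

```python
import random

# Hypothetical 5x5 "textbook" letters; real CAPTCHAs use rendered fonts.
GLYPHS = {
    "A": ["01110", "10001", "11111", "10001", "10001"],
    "L": ["10000", "10000", "10000", "10000", "11111"],
    "T": ["11111", "00100", "00100", "00100", "00100"],
}

def distort(glyph, noise, rng):
    """Flip each pixel with probability `noise` to simulate visual noise."""
    return [
        "".join(
            ("1" if px == "0" else "0") if rng.random() < noise else px
            for px in row
        )
        for row in glyph
    ]

def hamming(a, b):
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def naive_solver(image):
    """Guess the letter whose textbook bitmap differs in the fewest pixels."""
    return min(GLYPHS, key=lambda ch: hamming(GLYPHS[ch], image))

def solver_accuracy(noise, trials=500, seed=0):
    """Fraction of noisy letters the naive solver identifies correctly."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        answer = rng.choice(sorted(GLYPHS))
        if naive_solver(distort(GLYPHS[answer], noise, rng)) == answer:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    # Assumed noise levels, chosen only to show the trend.
    for noise in (0.0, 0.1, 0.3, 0.5):
        print(noise, solver_accuracy(noise))
```

Running the script prints the solver's hit rate at each assumed noise level. With no noise the simple matching rule is always right, and as more pixels are flipped it has less and less to go on, which is the trade-off described above: the noise has to hurt the program more than it hurts the human.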
The Human Psychology Behind CAPTCHAs
Humans live in an environment that is filled with complex, ambiguous and distorted information. Humans have learned and inborn mental methods to nonconsciously identify things and judge the complex information in our daily lives. We compare side-by-side objects to judge size, distance and speed. We identify distant silhouetted objects by how their shapes match up with our memories. We ‘recognize’ objects and qualities in paintings, sketches and movies using these same nonconscious methods.
Realize that humans never see the entirety of an object or scene, any object or scene. Not only are things such as coffee cups and sticks and tree branches partially obscured by other overlapping objects, but we can never see all sides and parts of an object at once. Even with an apple you’ve turned over in your hands, you can’t be sure whether it’s fresh or rotten in the core until you bite or cut it apart. Humans live, learn, and learn how to process and judge information in an environment where information is always obscured or otherwise hidden from view.
Ambiguity is a concept essential to understanding humans, as humans constantly make choices in the face of ambiguous information. Often caused by missing or obscured information, ambiguity means there is more than one possible explanation to something, and the viewer doesn’t know, often can’t know, which one is correct. In the face of ambiguity, the mind will almost always pick the explanation that meets its expectations and experience.
CAPTCHAs are examples of ambiguous information where we guess the identity of the symbols.
The following is a closer look at some of the specific cognitive techniques we use to process ambiguous information, both in nature and in CAPTCHAs.
Shape, pattern and form biases
Human visual perception is profoundly influenced by biases about forms, shapes and patterns. Humans have ingrained and nonconscious attractions for specific forms, shapes and patterns. Some of these biases are genetic, while others are learned. These biases greatly influence how we perceive, organize and label, and are essential to the quick identifications needed to go through life.
You instantly perceived a dog in the black shape that started this chapter, even though the shape lacked fur, eyes, whiskers, correct size and other essential doggy details. You didn’t have to contemplate the shape. You perceived it instantly.
The problem for humans is that their biases for certain shapes, forms and patterns are so strong and ingrained that they will perceive these things when they don’t objectively exist. These biases lead to many visual illusions.
Our form and pattern biases are shown when we perceive horses or castles or hot rods or other familiar shapes in clouds. These ‘identifications’ are subjective to the viewer, and do not objectively exist in the cloud. There are thousands of possible connect-the-dot shapes in a cloud, but you perceive, or mentally pick out, that which matches your knowledge. The horse or castle is a projection of what exists in your mind. If there were no horses on earth or in fantasy books, you would not perceive a horse in the cloud, as you wouldn’t ‘recognize’ it.
Technically, our perception of letters and numbers in CAPTCHAs is a visual illusion, because they aren’t exact representations of those symbols.
Many see the figure of a lion cub in this cloud
People mentally assemble the lines and marks in this Rembrandt etching into a face
As with the sketch and cloud, we pick out recognizable figures amongst the myriad of information.
When looking at a scene, all humans have the natural and nonconscious ability to extrapolate beyond what is visible. When information is missing, or assumed to be missing, humans make it up in their minds.
This ability is essential to normal living, as we must regularly make quick guesses with limited information. When you step on a sturdy looking building step, you assume it will hold your weight. When you pull a book from the library shelf, you assume the pages are filled with words. When your waitress brings you a steaming mug, you assume it is filled with a hot liquid.
In many cases the extrapolation is accurate, or at least a fair estimate of reality. If your dog is standing on the other side of the open doorway, half hidden by the wall, you correctly assume an entire dog exists. As the dog steps forward into the room, your assumption is proven correct. When the waitress puts down your steaming coffee mug, you are far from surprised to see it’s filled with the hot coffee you ordered. Humans would be a dim, slow species if we couldn’t make these kinds of elemental deductions.
In many cases, however, the extrapolations are wrong. These bogus extrapolations involve the viewer nonconsciously perceiving what he wants to see or expects to see.
Though the overlapping prevents us from knowing, most will assume the above picture shows whole playing cards. I assume the cards are rectangular and whole.
This says ‘I love you’ many times
Now read it with the ruler removed. Your earlier reading was based on assumption.
Similar to the overlapping cards and ‘I Love You’ pictures, we imagine, or guess, there to be whole letters behind the line.
With all the ‘overlapping’ information, it takes imagination to guess the symbols.
Focusing and ignoring
Both in real life and when viewing art and CAPTCHAs, humans focus on some information in a scene while being oblivious to other information. The audience can get into a movie to the point that they forget they are sitting in a theater and watching a projected image showing paid actors seen in earlier movies. This explains why a movie shark can make the audience jump in a desert theater one thousand miles from the nearest ocean. Someone can get so into a book or music that he forgets where he is.
A human does not and cannot simultaneously focus on all information in a scene. Humans don’t have the mental capacity. Humans focus on some things and ignore others. When you enter a room, your eyes are drawn to something or things. Perhaps you focus on the gracious hosts, perhaps a statue to the side. If there is a rat in the middle of the floor, your immediate perception will be of the rat and not of the rose wallpaper.
If you enter the room and there is an attractive nude, you likely won’t notice what is on the coffee table. You might not even notice the coffee table. After blushingly excusing yourself and scooting out of the room, you may not recall the existence of a coffee table, but it was there right in front of your eyes.
This focus, and the resulting perception, is your creation.
This visual illusion involves both focusing and ignoring, and imagination. The viewer forms a perception about the whole from looking at just one end. When she looks at the rest of the graphic she realizes her extrapolation, or initial perception of the whole, was wrong.
When you label this picture as ‘a fox’ you ignore, or ‘bleep over,’ both the marks around the fox design and the fact that it has no front legs. The fox label involves human imagination and a choice about how the marks fit together, and about which marks matter to identity and which don’t. Of course, this isn’t a real fox, but a bunch of pen marks that you subjectively assemble into a fox.
You simply ignore the marks around the symbols, as if they don’t exist. The computer has a harder time ignoring the marks.
Getting back to CAPTCHAs
As already mentioned, CAPTCHAs aren’t foolproof, and internet bad guys are continually trying to break them. In fact, computer programs can solve CAPTCHAs; it just usually takes them longer, and they are less accurate than humans. Bad guys also hire humans to solve the CAPTCHAs. They literally have a room full of low-paid humans in India or wherever solving the puzzles. Websites usually use CAPTCHAs as one of numerous ways to block hackers and spammers. Passwords, monitoring of message language and other puzzles to solve are also often used. It’s a never-ending task to keep websites safe, and CAPTCHAs are just one tool.