How to solve AI? With our brain!
Our brain: The final frontier in science

The Final Frontier in Science: Our Brain
The final frontier in science is how our brain works. We know a lot about almost everything: biology, chemistry, physics, and medicine, for example. We even know a lot about our brain from some perspectives, but not how it works in detail. Patom theory changes that, but let’s work up to it. Today in 2025, most people assume that the brain is like a computer. Crikey! (An Australian expression meaning ‘golly!’ – at least it did in the 1980s)
Anyway, we always assume our brain is doing what the latest technology does. Brains have been likened to hydraulics, gears, holograms, and now computers. I mean, the study of neuroscience even has an arm called ‘computational neuroscience’!
My colleague, the late Marvin Minsky, who helped me focus on a path to support Patom brain theory in the 1980s, once said in a discussion about the closure of linguistics departments:
“… for some reason there had never been any mathematics in linguistics before and somehow this became immensely popular and, for maybe 30- or 40-years, support for semantics and understanding of how language works virtually vanished from the planet earth because of Chomsky’s influence and
“Most universities changed their linguistics departments to grammar departments and theories of formal syntax and so forth.
“It’s a phenomenon I hope I never see again, of an important field being replaced by an unimportant one. A wonderful, marvelous phenomenon.”
Reflections on science often follow this path. “Science progresses one funeral at a time,” and people live for a long time! In the case of modern chemistry, it wasn’t the current chemists who took on Antoine Lavoisier’s advice, but the next generation!
Today’s newsletter looks at what brains do and compares it with what computers do.
Brains are not computers
Computers work because humans write their programs. In contrast, nobody helps our brain to do what it does. We experience the world, imitate others, and make use of what we learn to do new things. Even today’s AI (generative AI) didn’t just write itself; instead, programmers designed and implemented a sweeping set of procedures at a granular level.
It is hard to imagine a greater difference! If AI is defined as being human-like, perhaps it needs to start from the premise that it cannot have humans involved in its programs or, especially, in reviewing its data. The idea that a human reads intermediate results in order to limit the amount of error being surfaced must stop. The reason is that once a human is in the loop, the AI is no longer human-like. It is a human-made product. That’s OK, but don’t call it AI. Call it what it is: a program.
What do brains do well?
People don’t speak the way they write. Speaking is more spontaneous and harder to edit because it runs in real time. In the quotes from Minsky above, you can see his sentences are really long and connected with ‘and.’ To make them read better in writing, we would typically replace the ‘and’ with a period.
But what is going on in natural speech is useful for clarifying how our brain generates language. A lot of the phrases we use stand alone. Speech is pragmatic: the context uses topic and focus to introduce new information (the focus) and to clarify what is already known, the old information (the topic).
More interesting is how languages permit variation. Languages also have levels, and meaning is seemingly connected to language at each of them.
These features are just standard capabilities of brains, but not of computers.
Variation
The primary developer of Role and Reference Grammar (RRG), Robert D. van Valin, Jr., pointed out to me in an email that:
“Variation is endemic in language” (meaning there is a lot of it!)
When we think of characters in a computer, they are encoded in some way, like with ASCII, for example. If the code is changed in the machine, the character is changed. In contrast, brains rarely deal with such fixed codes.
In any room, we can see a number of different objects. We recognize them with exquisite accuracy despite enormous variation. Is an object in shade or in sun? Where is the light source? How does it relate to other objects? Is it in front of them or behind? The conclusion is that brains are good at finding new and known elements in sensory input, while a computer requires codes built from opposite polarities: zeros and ones, current on or off.
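To make the contrast concrete, here is a minimal Python sketch (my illustration, not part of Patom theory): flipping a single bit in a character’s ASCII code silently produces a different character, while a brain shrugs off vastly larger variations in its input.

```python
# ASCII codes are brittle: flipping a single bit yields a different character.
ch = "A"                       # ASCII code 65, binary 0b1000001
code = ord(ch)
flipped = code ^ 0b0000010     # flip one low-order bit
print(ch, "->", chr(flipped))  # prints: A -> C
```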
This is a fundamental difference. Let’s call it the source of Moravec’s Paradox.
In the sections below, you can see the “allo-” (Greek for ‘other’) forms. For example, the other forms of phonemes in a language are therefore allophones.
Phonemes
Phonemes are the sounds that a language permits. In children’s brains, without intervention, these get narrowed down to the recognition and pronunciation of the sounds used in the language that they hear. Familiarity locks them in.
There are simple variations between dialects. A friend of mine says ‘breakfast’ rhyming with BRAKE + FAST, while most others say it rhyming with B + WRECK + FAST. Those very different sounds are effortless for me to understand. And English dialects such as those in Ireland, parts of England, Scotland, the south of the USA, and even the Philippines are all different. Different words and phrases are more or less popular as well, depending on the speaker.
An allophone is defined more narrowly as a sound variant that doesn’t change a word’s meaning (e.g., the aspirated [pʰ] in “pin” and the unaspirated [p] in “spin”). These variants are hard for a native speaker even to notice, and different languages may draw such distinctions differently!
Computer programmers prefer identical sounds and phrases, not variation. Brains, in contrast, revel in similar patterns.
Morphemes
Morphemes are the smallest units of meaning in a language (words or parts of them).
Allomorphs are different forms of the same morpheme that have the same meaning but different pronunciations or spellings, like the plural morpheme which has allomorphs /s/, /z/, and /ɪz/ in English.
In English, when you pronounce the plural forms of nouns, different sounds are made depending on the word, despite the spelling typically being -s.
For example, the English plural morpheme has several allomorphs: /s/ as in “cats”, /z/ as in “dogs”, and /ɪz/ as in “buses.” I never notice the sounds when I talk, as I just deal with the meaning of what is being said. Again, I was never taught to do this, suggesting that, as Patom theory proposes, our recognition differentiates these forms automatically.
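As an illustration, here is a minimal sketch of the textbook allomorph rule in Python. It uses spelling as a crude proxy for the final sound, a simplification I am adopting here; real phonology operates on phonemes, not letters.

```python
# Rough sketch of the English plural allomorph rule, using spelling as a
# crude stand-in for the final sound (real phonology works on phonemes).
def plural_allomorph(word: str) -> str:
    w = word.lower()
    sibilant_endings = ("s", "z", "x", "sh", "ch")  # bus, buzz, box, bush, church
    voiceless_endings = ("p", "t", "k", "f", "th")  # cap, cat, book, cliff, myth
    if w.endswith(sibilant_endings):
        return "/ɪz/"   # buses, churches
    if w.endswith(voiceless_endings):
        return "/s/"    # cats, books
    return "/z/"        # dogs, days

for word in ("cat", "dog", "bus"):
    print(word, "->", plural_allomorph(word))  # /s/, /z/, /ɪz/
```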
Sentences
A sentence can have allosentences: different forms that have the same semantic meaning but differing pragmatic interpretations, like topic and focus variation.
· The dog, walked on Monday, is happy.
· The happy dog walked on Monday.
And there are simple active and passive form variations (a sketch after these examples shows how both forms can map to one meaning):
· The man the dog chased saw the car.
· The car was seen by the man the dog chased.
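As a sketch of what “same value” means here, the toy Python below maps both surface forms to one invented predicate-argument structure. The representation and function names are my own illustration, not Patom theory’s internal format.

```python
# Hypothetical illustration: two surface forms, one semantic "value".
SEE_EVENT = ("see", {"seer": "the man the dog chased", "seen": "the car"})

def parse(sentence: str):
    """Toy normalizer covering only the two example sentences."""
    s = sentence.lower().rstrip(".")
    if s == "the man the dog chased saw the car":
        return SEE_EVENT   # active form
    if s == "the car was seen by the man the dog chased":
        return SEE_EVENT   # passive form: same stored value
    raise ValueError("outside this toy's tiny world")

# Active and passive surface forms resolve to the identical meaning.
assert parse("The man the dog chased saw the car.") == \
       parse("The car was seen by the man the dog chased.")
```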
Graphemes
Simple allographs (written variations of the same grapheme) follow the pattern. In type, the forms are a little more constrained than in handwriting. The following three examples can convey the same meaning, and our brains seem to recognize even complex and damaged forms effortlessly, unlike computer-based recognition.
Remember that typefaces are numerous and varied. Variations within a typeface are known as fonts. These variations include differences in size (e.g., 12-point, 14-point), weight (e.g., light, bold), slope (e.g., italic, oblique), width (e.g., condensed, extended), and other stylistic features.
· CAT
· Cat
· cat
The variations are allowed because our brains effortlessly recognize the key elements needed to read: the characters are recognized and then augmented with features such as size variation, bold/italics/underline, and other customizations.
It’s amazing that we not only easily recognize the letters and the words they form, but also the purpose of variations, like boldness used to emphasize a word or part. We are never taught these things directly, but we learn them nonetheless.
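Software approximates a sliver of this with explicit normalization. Here is a minimal sketch: case folding maps the three allographs to one stored value, though handwritten or damaged forms are far beyond such a trivial rule.

```python
# Case folding handles only the simplest allographic variation.
def normalize(grapheme_string: str) -> str:
    return grapheme_string.casefold()

forms = ["CAT", "Cat", "cat"]
assert len({normalize(f) for f in forms}) == 1  # all three share one value
print({normalize(f) for f in forms})            # {'cat'}
```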
Note that today’s language-based generative AI (LLMs) don’t start with the possible variations. They are given feeds of encoded data, not source material from physical books, for example. That’s one of the reasons calling it ‘AGI’ or strong AI (in which the machine is conscious, uses senses like a human, and moves like a human) is so egregious. Machines that emulate the brain should start by emulating our sensory inputs, because there is so much value in that to animals. Without sensory inputs and motor control, any machine remains distant from our capabilities.
Errors
When we talk, we often backtrack. We repeat words for emphasis, too. Sometimes again, and again and again! Try talking on a topic you aren’t too familiar with and don’t often talk about. What’s the bet that you say something and then correct it?
e.g. “It is clear that the banking, sorry no, the financial markets are strong.”
The message is clear, but notice that a model based on statistical word sequences is undermined by those pesky words (“the banking, sorry no”) in the dialog. By recognizing the error and adjusting the meaning, a machine can clarify the intended meaning rapidly, or set the correction aside until final validation is needed.
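Here is a minimal sketch of that kind of self-repair handling. The editing-phrase pattern and the “backtrack to the restart word” heuristic are my own simplifications; real repairs need prosody and syntax to delimit the correction properly.

```python
import re

# Toy self-repair handler: on an editing phrase like "sorry no", discard the
# words being corrected (the reparandum) and keep the correction.
REPAIR_MARKER = re.compile(r",?\s*sorry[, ]*no,?\s*", re.IGNORECASE)

def apply_repair(utterance: str) -> str:
    parts = REPAIR_MARKER.split(utterance)
    if len(parts) != 2:
        return utterance                     # no repair marker found
    before, correction = parts[0].split(), parts[1]
    restart = correction.split()[0].lower()  # word where the speaker restarts
    # Backtrack: the reparandum begins at the last earlier use of that word.
    for i in range(len(before) - 1, -1, -1):
        if before[i].lower() == restart:
            before = before[:i]
            break
    return " ".join(before) + " " + correction

print(apply_repair("It is clear that the banking, sorry no, the financial markets are strong."))
# -> It is clear that the financial markets are strong.
```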
Linguistic Identity
The key point about these kinds of variation is that our brains deal with them all. That isn’t surprising, because we equally need to recognize predators and prey against naturally diverse backgrounds. Animals in water, on land, and in the sky all have similar sensory organs to deal with real-world variation.
Brains in species from fish to humans have the capability to deal with the range of similarity in sensory inputs and motor control outputs. Brains deal with other forms of the same thing: it’s what they do and therefore what we must replicate in AI.
How will human-like natural language understanding (NLU) systems deal with the variation?
At their core, NLU systems must emulate humans. We must distinguish between different forms in a language and then represent their meaning the same way, i.e., give them the same “value.” The Greek “allo-” is a key concept!
Brains Use Multiple Senses
We have a number of senses. Traditionally, there are 5 senses – touch, taste, smell, vision and hearing. But that’s a little simplistic.
We have a sense of balance from our semicircular canals. This isn’t hearing.
Equally, we have rods and cones to detect low-light nighttime images and daytime color images. These are different kinds of sensors, and their outputs are recognized in different brain regions.
When we think of touch, remember that our skin has sensors for pain and temperature, as well as for pressure, vibration, stretch, and texture, and others that track motion and limb position. When you touch a tree branch in the night, the stretch of your skin signals your brain so it can deal with it.
Our brain isn’t dealing with 5 integrated senses, but with a myriad of different sensors that all get tied together as multisensory patterns.
Brains are Hierarchical
Our brain is hierarchical, meaning that it builds up patterns from simpler ones.
Let’s look at how a brain recognizes a friend. A friend can be spotted from visual recognition, but also from auditory recognition of their voice, or from olfactory recognition of a particular perfume, for example. Because objects are recognized by their multisensory constituents, the sensory recognition is low-level, while the combination of patterns is a higher level.
People with brain damage that causes loss of human face recognition can use voice to recognize their spouse. This is a hierarchical pattern.
How does this work, exactly? That level of exactness is not the goal of Patom theory. Just as a description of astronomy can work without referring to gravity or Newton’s F = ma, if a brain simply stores, matches, and uses hierarchical, bidirectional patterns, we can attempt to falsify that model and improve on it.
Let’s start with a high-level explanation and drill down if it works well, or try to falsify it proactively.
That’s how science works. Notice in the image below that you can recognize the characters visually.

We can look at recognition of language as a hierarchical pattern as well. Letters lead to words. Words lead to phrases. Phrases lead to sentences. To see ‘THE CAT,’ your brain is using a hierarchy: letters first, then words, then phrases. Perhaps the ambiguous characters mean ‘TAE CHT,’ but only context can validate an interpretation. The hierarchy lets each element validate its context, helping the brain see in a consistent way.
You can’t recognize a sentence without its lower-level recognition elements: words from letters or phonemes. That’s what we learn through experience, in hierarchical order.
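Here is a minimal sketch of that top-down/bottom-up interplay (the data structures are my own toy illustration): an ambiguous glyph matches both ‘H’ and ‘A’ bottom-up, and the word level settles it top-down, because only one reading forms a known word.

```python
# Toy bidirectional hierarchy: an ambiguous glyph could be 'H' or 'A'.
# Bottom-up, both letters match; top-down, the word level keeps only
# readings that form known words.
LEXICON = {"THE", "CAT"}
GLYPH_CANDIDATES = {"?": {"H", "A"}}  # the ambiguous middle character

def read_word(glyphs):
    readings = {""}
    for g in glyphs:
        options = GLYPH_CANDIDATES.get(g, {g})    # bottom-up candidates
        readings = {r + o for r in readings for o in options}
    return {r for r in readings if r in LEXICON}  # top-down validation

print(read_word(["T", "?", "E"]))  # {'THE'} -> '?' resolved as 'H'
print(read_word(["C", "?", "T"]))  # {'CAT'} -> same glyph resolved as 'A'
```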
This model is common across recognition tasks. Vision is decomposed into regions that recognize specific things: reading words is recognized in the visual temporal region, and recognition of objects is also in that region. Motion is handled by the visual V5 region. Our parietal lobes contain regions that connect parts of objects to whole ones and vice versa.
The range of human sensory brain regions and their deficits has been studied extensively and can be of great help in cognitive science.
Our motor control is similar. We learn to perform some action, perhaps by seeing it done, and then repeating that action to coordinate the sequences of muscles in order. Our brain also receives sensory feedback signals each time. We have “learned the skill” when we can perform the necessary variations with accuracy.
Watch a child learning to move blocks in some game. They pick up and move the pieces and see what happens with each motion and placement. Terry Winograd’s SHRDLU project at MIT, released in 1972, included language for block-world manipulations, but since then, systems that manipulate such artificial worlds with human language have remained elusive, because the computer model isn’t a good model for brains.
Robots today aren’t set up to learn like children do, nor do they have sensors and muscles in animal-like replication. Perhaps a closer replication of animal models can usher in more useful robots in the short term.
This alignment between sensory recognition and brain region loss is commonly seen in the medical literature, albeit with variation in the exact location of regions due to ‘plasticity.’
The hierarchical model allows whole objects to determine their parts and parts to determine their wholes. It is top-down recognition as well as bottom-up recognition.
Is Reasoning a Thing?
The model appears the same regardless of the pattern. Therefore, it follows that higher-level brain functions often named ‘reasoning’ and ‘planning’ can be implemented as a simple extension of the same approach.
Pattern-matching enables ‘reasoning’ by ensuring consistency across the current patterns aligned with similar, previously matched patterns.
If someone is pointing a gun at you, what do you want to do? Run? Grab it? Start singing? A ‘reasoned approach’ can be to consider similar experiences (TV, movies, for example) and take an action. The wrong action is painful (death?) while the right action allows you to live another day.
If this event has been seen before one or more times, and it can be matched and its results listed, emulation can be as simple as choosing an appropriate result and then following the same steps. It isn’t reasoning as we would define it, but it looks like it.
The inclusion of context is a high-level pattern that can be used to make a decision without effort, as long as a similar experience with an acceptable outcome can be applied to the current situation.
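Here is a minimal sketch of this experience matching (the memory format and scoring rule are invented for illustration): store (situation, action, outcome) experiences, then pick the action whose remembered outcome was best in the most similar stored situation.

```python
# Toy "reasoning" as pattern matching over stored experiences.
# The memory format and similarity score are invented for this sketch.
EXPERIENCES = [
    ({"threat": "gun", "distance": "near"}, "comply",  +1),
    ({"threat": "gun", "distance": "near"}, "grab it", -1),
    ({"threat": "gun", "distance": "far"},  "run",     +1),
]

def similarity(a: dict, b: dict) -> int:
    return sum(1 for k in a if b.get(k) == a[k])

def choose_action(situation: dict) -> str:
    scored = [(similarity(situation, stored) + outcome, action)
              for stored, action, outcome in EXPERIENCES]
    return max(scored)[1]  # best remembered outcome in the closest match

print(choose_action({"threat": "gun", "distance": "near"}))  # comply
```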
Brains are Bidirectional
A homunculus is a little person sitting inside your brain, watching the movies that are sent to it. That model is a consequence of thinking of the brain as the thing that receives our sensory signals, so something inside must act like a little version of us.
This brain model leads to the question of what controls the homunculus: inside its head is another, littler version, and so on. Therefore, the model isn’t taken seriously these days.
Don’t encode data for representation
The same thinking also leads to a model we use in computer science. We store messages by encoding them in some representation and send that ‘information’ to something that unpacks it and uses it. This typical computer model is unlike our brain’s model for many reasons.
Nobody has been able to build this kind of model for robots that approaches animal-level competence. Even as computers have become more and more commonplace, faster and smaller, they don’t enable this robotic model. Perhaps that’s because it’s not the right model?
Senses and motor control come from similar brain material
I mean, bats that live in caves of total darkness get by with echolocation, where their ears pick up reflected sound for navigation. The same principles that work for human brains are likely working for bats.
When an elephant walks on its four legs, somewhat like a human crawling on hands and knees, our different brains probably make use of similar functions in similar brain regions.
People with brain damage that causes loss of human face recognition will use voice to recognize their spouse. This is bidirectional. Why? Because recognition of the object (your spouse) enables recognition of the whole (high-level) which then recognizes the parts in other senses.
Getting to the point, the commonality amongst the world’s brains, with their highly diverse sensory inputs, shows how, regardless of the sensory input, the brain is able to function in the real world in real time without much training. So, unlike a model with programmed software to control each kind of recognition, a single algorithm (store, match, and use patterns) may be the best model, since it avoids the need for programming, logic, or intervention to make it work.
Building experience from a hierarchy aligns with what animal-brains do. Storing patterns where recognized is a valid approach that aligns with observed deficits in animals and humans. And therefore, a bidirectional model enables any of the senses to identify an object regardless of limitations in modality.
Lastly, in the brains of mammals in particular, our cortex is built with an anatomy that includes regions connected with bidirectional links – forward to more specific patterns and backwards towards sensory (or motor) patterns.
Brain Damage
The range of brain damage examples, through accident, disease, age, and so forth, enables the analysis of the resulting deficits.
Brain-damaged patients who can no longer see colors also report losing the ability to imagine colors, despite fully functioning eyes. This hints that recognition and recall of colors are stored in the same brain region: the loss of imagination is exactly what we would predict if access to the color recognition patterns is lost.
In this model, we learn the colors of objects by seeing them. We recall the colors of objects by recognizing the object in a high-level pattern and using the bidirectional links back to its constituent parts to remember its colors. Loss of that color region means loss of imagining colors.
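Here is a minimal sketch of that bidirectional linking and the deficit it predicts. The class and its structure are my own illustration of the idea, not Patom theory’s actual mechanism: each modality’s pattern links forward to the object, and the object links back to every constituent pattern, so “lesioning” the color store removes both recognition and recall of color.

```python
# Toy bidirectional object memory: per-modality patterns link to an object
# node and back again. Invented structures, for illustration only.
class ObjectMemory:
    def __init__(self):
        self.object_to_patterns = {}  # object -> {modality: pattern}
        self.pattern_to_object = {}   # (modality, pattern) -> object

    def learn(self, obj, modality, pattern):
        self.object_to_patterns.setdefault(obj, {})[modality] = pattern
        self.pattern_to_object[(modality, pattern)] = obj

    def recognize(self, modality, pattern):
        """Bottom-up: one modality is enough to identify the whole object."""
        return self.pattern_to_object.get((modality, pattern))

    def recall(self, obj, modality):
        """Top-down: the object re-activates a constituent pattern."""
        return self.object_to_patterns.get(obj, {}).get(modality)

    def lesion(self, modality):
        """Regional damage removes both recognition AND recall."""
        self.pattern_to_object = {k: v for k, v in self.pattern_to_object.items()
                                  if k[0] != modality}
        for patterns in self.object_to_patterns.values():
            patterns.pop(modality, None)

m = ObjectMemory()
m.learn("apple", "vision", "round red shape")
m.learn("apple", "color", "red")
print(m.recognize("vision", "round red shape"))  # apple: one sense suffices
print(m.recall("apple", "color"))                # red: imagining its color
m.lesion("color")
print(m.recall("apple", "color"))                # None: color can't be imagined
```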
In Patom theory, the results of brain damage play out with these kinds of observed deficits time and again.
Discussion
Brains are hierarchical, bidirectional pattern-matchers. The patterns themselves, although not discussed here, are stored where they are matched and are linked together with the other sensory patterns that reflect the same objects. That’s how we learn to recognize individuals regardless of sensory modality: your friend’s perfume, face, and voice can each be adequate to recognize them in full.
This is the high-level explanation of Patom theory, a 1990s brain model based on observations of brain damage and scanning technologies.
Recognition based on any sense could be taken as evidence that sensory inputs are encoded the way a computer would encode them, but that would require some kind of signaling of the encoded elements through the brain’s networks. That has never been shown, at least not with reference to compelling evidence.
A simpler model is to store patterns in situ where matched and link them together. This doesn’t require the same kind of designs we need in digital computer systems, and it explains what we observe with brain-damaged patients.
Deficits tend to be lost capabilities. These can be entire senses; parts of senses, like visual motion recognition; paralysis; or part/whole recognition problems, where parts of objects are recognized but not wholes, or wholes are recognized but not parts.
The better model takes what we see at face value. When a pattern from some object is detected in some modality, as consistently happens when interacting with that object (the object can be a person, a machine, or anything else), it is linked together with the patterns in the other modalities. To recognize the object from one sense, the reverse links are accessed to match the object’s other modalities. Simply following the bidirectional links explains full recognition from a single sense: it activates the other senses’ patterns.
When you recognize your spouse, even if only by voice, the whole object (your spouse) is recognized together. We can call them by name as a consequence.
Applying the scientific method in cognitive science, and brain science in particular, should help us agree on a common model of brain function first, and on brain emulation second, as the basis for the next generation of AI based on natural science. Once we start to refine capabilities based on observation, progress will be rapid and accurate enough to achieve the technology goals we set for ourselves.
References
Marvin Minsky, The Society of Mind, 1985
Do you want to read more?
If you want to read about the application of brain science to the problems of AI, you can read my latest book, “How to Solve AI with Our Brain: The Final Frontier in Science,” which explains the facets of brain science we can apply and why the best analogy today is the brain as a pattern-matcher. The book link is here on Amazon in the US (and elsewhere).
In the cover design below, you can see the human brain incorporating its senses, such as the eyes. The brain science is being applied to a human-like robot that is being improved towards full human emulation in looks and capability.



