Current large language models such as GPT are becoming increasingly sophisticated tools for text generation and question answering. OpenAI's GPT-4 and Anthropic's Claude, for example, have proven remarkably capable of generating coherent and varied text, from answering questions to creative writing. When extended with more advanced features, such as the "thinking" mode of arriving at answers demonstrated by o1, or additions like Canvas in GPT-4o and Artifacts in Claude, they can handle more complex logical tasks and deliver noticeably higher accuracy. These advances allow models to produce not only text that reads like human speech, but also answers that are far more contextually relevant and precise.
These models, built on neural networks and trained on huge corpora of text, have proven highly effective at mimicking human language. Having learned from billions of sentences, they can predict the most likely sequences of words in a given context, producing text that appears natural and is often comparable to human writing. Yet their impressive generative abilities lack two essential things: deep understanding and a genuine capacity for logical reasoning, both of which formal logical systems could supply. This reflection explores how the two approaches might be combined and the obstacles that await language modelers along the way.
How to teach artificial intelligence to truly understand language?
We see remarkable feats of natural language processing every day. Chatbots and virtual assistants such as ChatGPT can produce texts that are indistinguishable at first glance from those written by humans. They answer complex questions, write poems, and even write code. The Turing test has long since become obsolete.
This progress raises the question: does AI really understand what it is saying? The answer so far is "no". Current AI systems imitate human communication very well rather than really understanding it. However, researchers are working hard to change this and to develop AI systems that actually understand the meaning of the words and sentences they process and generate. The following text outlines one potential path.
To understand why current AI systems do not understand language as we humans do, we must first look at how these systems work. Today's most advanced AI language models, known as large language models (LLMs), work on the principle of statistical prediction. Think of them as an extremely sophisticated version of the autocomplete features we know from smartphones or email clients. These models learn from the vast amount of text available on the internet, from newspaper articles and scientific publications to social media posts. In the process, a model builds up complex statistical patterns of which words and phrases typically follow other words and phrases in different contexts.
Then, when given the beginning of a sentence or a question, the model uses the learned patterns to predict which words should follow to produce a meaningful answer. This approach has proven surprisingly effective. AI models can generate coherent and contextually relevant text, answer a wide range of questions, and even solve some types of logic problems. The results are often so convincing that they give many people the impression that the AI actually understands what it is talking about.
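The mechanics can be made concrete with a deliberately tiny sketch. The Python snippet below is a toy illustration only, not how production models are implemented: it counts which word follows which in a miniature, invented corpus and turns those counts into a probability distribution over the next word.

```python
from collections import Counter, defaultdict

# Toy "autocomplete": count which word follows which in a miniature corpus,
# then turn the counts into a probability distribution over the next word.
# Real LLMs use neural networks over subword tokens, but the core idea is
# the same kind of statistical prediction.
corpus = (
    "the glass fell and the glass broke . "
    "the glass fell and the glass broke . "
    "the glass fell and the glass cracked ."
).split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word_distribution(word):
    """Relative frequency of each continuation of `word` in the training text."""
    seen = follows[word]
    total = sum(seen.values())
    return {w: round(c / total, 2) for w, c in seen.items()}

print(next_word_distribution("glass"))
# {'fell': 0.5, 'broke': 0.33, 'cracked': 0.17}
```

A real model conditions on far more than the previous word and represents tokens as vectors, but the output is the same in kind: a probability for each possible continuation, not an assertion about what is true.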
The problem, however, is that these models don't actually "understand" the content in the sense in which humans understand things. Human understanding of language involves building mental models of the world, grasping causal relationships, reasoning logically, and abstracting. When a person reads a sentence such as "A child dropped a jar and it broke," they immediately understand that the breakage was caused by the jar hitting the ground. They can also recognize subtler layers of meaning such as sarcasm or irony, which AI models usually cannot interpret correctly. AI models do none of this: they simply mimic, very well, the patterns they have learned from the data.
Consider the following syllogism: "All humans are mortal. Socrates is human." Using logical reasoning, one easily concludes that Socrates is mortal. An AI would probably reach the same conclusion, but not because it performed a logical deduction. Rather, it would answer that way because it has often seen this type of question paired with this answer in its training data. If we changed the name to a less familiar one, or restructured the argument in a way not present in the training data, the AI might fail or give inconsistent answers.
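The difference between recalling a pattern and applying a rule can be shown in a few lines of code. The sketch below (the facts and names are invented for illustration) applies the general rule "every human is mortal" by simple forward chaining, so it reaches the right conclusion even for a name it has never seen paired with this question.

```python
# Pattern recall vs. rule application: the rule "every human is mortal" is
# applied by forward chaining, so the conclusion follows even for a name
# never seen in this context. Facts and names are invented for the sketch.

facts = {("human", "socrates"), ("human", "xenophilius")}
rules = [("human", "mortal")]  # if X is human, then X is mortal

def infer(facts, rules):
    """Forward chaining: keep applying the rules until no new facts appear."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, individual in list(derived):
                if predicate == premise and (conclusion, individual) not in derived:
                    derived.add((conclusion, individual))
                    changed = True
    return derived

derived = infer(facts, rules)
print(("mortal", "socrates") in derived)      # True
print(("mortal", "xenophilius") in derived)   # True: same rule, unfamiliar name
```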
This lack of true understanding and reasoning ability is a significant limitation of current AI systems. Although these systems excel at tasks that require pattern recognition and statistical prediction, they fail in situations that require deep understanding of context, abstract reasoning, or solving novel problems not encountered during training.
Formal logical analysis: accuracy vs. flexibility
Formal logical systems such as Transparent Intensional Logic (TIL) or Montague Grammar analyze the meaning of words and sentences according to fixed rules. TIL models the meaning of natural language with complex mathematical structures, while Montague Grammar is an approach to formal semantics that describes natural-language meaning using the methods of formal logic and mathematics. The meanings of words are precisely determined, and the logical relationships between them can be formally analyzed. This provides a high level of precision that allows us, for example, to infer the truth values of individual statements. Such systems work with an exact representation of semantic relations, which supports logical inference and ensures the logical consistency of statements.
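To give a flavour of what "precisely determined meanings" and "truth values" look like in practice, here is a heavily simplified, set-theoretic sketch in the spirit of formal semantics. It does not reproduce the actual machinery of TIL or Montague Grammar, and the domain and predicates are invented for the example.

```python
# Word meanings as fixed set-theoretic objects; sentence truth values are
# computed from them by fixed rules. A toy in the spirit of formal semantics.

domain = {"socrates", "plato", "fido"}          # the individuals we talk about
human = {"socrates", "plato"}                   # extension of "human"
mortal = {"socrates", "plato", "fido"}          # extension of "mortal"

def every(restrictor, scope):
    """'Every A is B' is true iff the set A is a subset of the set B."""
    return restrictor <= scope

def some(restrictor, scope):
    """'Some A is B' is true iff the sets A and B overlap."""
    return bool(restrictor & scope)

print(every(human, mortal))   # True: every human is mortal
print(every(mortal, human))   # False: not every mortal is human (fido is not)
print(some(mortal, human))    # True: some mortals are human
```

The appeal of this style of analysis is that every conclusion can be traced back to the definitions; its weakness, as the next paragraph argues, is how rigid those definitions are.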
However, formal logic suffers from a lack of flexibility. Sentences containing idiomatic expressions or culturally specific metaphors are a case in point: interpreted literally, they often make no sense, and formal logic cannot handle such nuances without additional context. Ordinary spoken language is a broader problem still, being dynamic, changeable, often inconsistent, and full of ambiguity and semantic nuance. Formal logic struggles wherever a single sentence admits several interpretations that only context can distinguish.
Take for example the sentence, "The man was looking at the woman with binoculars." It is immediately clear to a person that this sentence is ambiguous: it is not obvious whether the binoculars are held by the man or by the woman. A person would notice the ambiguity and might ask for clarification. An AI model would likely just pick the interpretation that occurred more often in its training data, without registering that an ambiguity exists. Such ambiguity is hard to resolve by strict logical analysis alone, without additional information or broader context. It is precisely here that neural models excel, using probabilistic information from vast amounts of data to predict the most likely meaning. If the surrounding text mentions that the man is a hunter, or that he picked up his binoculars before leaving, the LLM's answer will be not only probable but also correct. The ability to handle ambiguity and estimate meaning from context is a key difference between the statistical and the logical approach.
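How a statistical model might weigh the two readings against context can be sketched as follows. The cues and numbers are entirely made up and merely stand in for the probabilities a real LLM would assign.

```python
# Toy scoring of the two readings of
# "The man was looking at the woman with binoculars".

def score_readings(context):
    scores = {
        "instrument": 0.5,   # the man used binoculars to look
        "attachment": 0.5,   # the woman had the binoculars
    }
    # Boost whichever reading the surrounding text makes more plausible.
    if "hunter" in context or "his binoculars" in context:
        scores["instrument"] += 0.4
    if "she carried binoculars" in context:
        scores["attachment"] += 0.4
    total = sum(scores.values())
    return {reading: round(score / total, 2) for reading, score in scores.items()}

print(score_readings("The man, a hunter, packed his binoculars before leaving."))
# {'instrument': 0.64, 'attachment': 0.36}
print(score_readings("No further context."))
# {'instrument': 0.5, 'attachment': 0.5}
```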
Despite these differences, both approaches have strengths that could complement each other. Logical systems provide clear structure and explainability, while neural models bring the ability to respond flexibly to linguistic patterns and interpret complex contextual situations. Formal logic could provide neural networks with clearly defined rules, while neural networks could bring flexibility and the ability to adapt to different linguistic nuances and changing contexts.
The logical approach and the token-based nature of LLMs
One of the main differences between logical approaches and current large language models (LLMs) is the way they process and interpret text. LLMs work on the principle of tokenization: the text is divided into smaller units, called tokens, which can be whole words or parts of words, allowing the model to work with very fine textual detail. Based on these tokens, the model makes probabilistic predictions about which words or phrases are likely to come next.
Logical approaches, on the other hand, work with whole word and phrase meanings, focusing on the logical structures and relationships between parts of the text. As a result, they look for deeper semantic consistency, which is essential when analyzing logical implications or drawing conclusions from statements. The tokenizing nature of LLMs thus allows language to be handled at the micro level, but often misses the macro level of semantic understanding that logical systems provide. For the two approaches to be integrated, tokens must not only feed predictions of probable sequences but also capture the logical links and structures between parts of the text.
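The contrast between the two levels of representation can be stated very compactly. The subword split below is illustrative only, not the output of any particular tokenizer, and the logical form is a hypothetical notation.

```python
# The same sentence at the two levels of representation discussed above.

sentence = "Socrates is mortal"

# Micro level (the LLM view): a sequence of tokens to be predicted one by one.
tokens = ["Soc", "rates", " is", " mort", "al"]

# Macro level (the logical view): a structured assertion about an individual.
logical_form = ("assert", ("mortal", "socrates"))

print(tokens)
print(logical_form)
```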
Towards linking neural models with logic
In order to link the strengths of neural models and formal logic, we can try to introduce hybrid approaches that would allow models to generate text that not only looks human but is also logically consistent. One solution could be to integrate probabilistic components into formal logic. In this way, the logic system would work with different possible interpretations of a sentence during parsing, and the probabilistic model would assign weights to them based on the available context.
This approach would not only handle ambiguity better, but would also ensure that logical inference remains possible. Neural models could contribute the ability to interpret texts in light of a broad context, while logical systems would ensure that this interpretation is consistent and formally sound. For example, when interpreting more complex sentences in which the relationships between actors can be read in several ways, probabilistic logic could help select the most appropriate reading, and subsequent logical analysis would ensure consistency.
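In code, the division of labour might look roughly like this: a neural component proposes weighted interpretations, and a logical component discards those that contradict what is already established. Everything in the sketch (the facts, the readings, the probabilities) is invented for illustration.

```python
# Hybrid sketch: weighted interpretations filtered by a consistency check.

knowledge = {("holds_binoculars", "man")}  # established earlier in the text

def contradicts(interpretation, knowledge):
    """Toy consistency check: the binoculars cannot be held by two people at once."""
    holders = {who for pred, who in interpretation | knowledge if pred == "holds_binoculars"}
    return len(holders) > 1

candidates = [
    (0.55, {("holds_binoculars", "woman")}),  # reading slightly preferred by the LM
    (0.45, {("holds_binoculars", "man")}),    # alternative reading
]

consistent = [(p, reading) for p, reading in candidates if not contradicts(reading, knowledge)]
best_p, best_reading = max(consistent, key=lambda pair: pair[0])
print(best_p, best_reading)  # the second reading survives the consistency check
```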
Another way could be to integrate language models as an auxiliary layer of a logical system. In a legal setting, for example, such integration could help analyse complex legal documents: the language model would first identify the relevant parts of the text, and the logic system would then perform a formal analysis and draw conclusions. Similarly, in medicine, the combination could support a doctor in making a diagnosis: the language model would analyse the patient's symptoms and medical records, while the logic system would use this information to suggest possible diagnoses and treatments. Whenever the logic component encountered ambiguity, the language model would determine the most likely meaning from its learned patterns. This could yield not only better semantic relevance but also an explanation of how a given result was reached, which is crucial in fields such as law or scientific analysis. The result would be a system that responds flexibly to changes in context while keeping its conclusions precisely and logically structured.
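A very rough sketch of such a pipeline is given below. Both functions are placeholders: in a real system the first would call a language model and the second a formal reasoner, and the contract text and the "shall" heuristic are purely illustrative.

```python
# Sketch of the "language model as an auxiliary layer" pipeline.

def extract_relevant_clauses(document):
    """Stand-in for an LLM pass that picks out the passages worth analysing."""
    return [line for line in document.splitlines() if "shall" in line.lower()]

def formal_analysis(clauses):
    """Stand-in for a logic engine that turns each clause into a checked conclusion."""
    return [f"OBLIGATION: {clause.strip()}" for clause in clauses]

contract = """The supplier shall deliver the goods within 30 days.
Payment is due on receipt of the invoice.
The buyer shall notify the supplier of any defects."""

for conclusion in formal_analysis(extract_relevant_clauses(contract)):
    print(conclusion)
```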
Another possible way to achieve connectivity is through the use of explicit knowledge bases and ontologies. These structures can provide a deeper context for logical systems, allowing them to better interpret different meanings based on specific real-world knowledge. Ontologies could contain information about common relationships between objects and actions, making it easier to resolve ambiguities in natural language and allowing neural models to work with meanings based on real logical relationships.
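A toy example of how such an ontology could help with the binoculars sentence follows; the relations are hand-written stand-ins for what a real knowledge base or domain ontology would provide.

```python
# Hand-written toy ontology used to decide whether a noun is a plausible
# instrument for a verb. Relations are invented for illustration.

ontology = {
    ("binoculars", "is_a", "optical_instrument"),
    ("optical_instrument", "used_for", "looking"),
    ("hat", "is_a", "clothing"),
}

def plausible_instrument(noun, verb):
    """Does the ontology license `noun` as an instrument of `verb`?"""
    categories = {cat for (n, rel, cat) in ontology if n == noun and rel == "is_a"}
    return any((cat, "used_for", verb) in ontology for cat in categories)

print(plausible_instrument("binoculars", "looking"))  # True: instrument reading is sensible
print(plausible_instrument("hat", "looking"))         # False: that reading is not licensed
```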
Implications and questions
Combining these two approaches opens up new possibilities, but also a number of unanswered questions. An AI that can not only generate text but also understand it and derive its logical meaning raises new ethical issues. If a system can reason and argue on the basis of logical rules, the question arises of who is accountable for erroneous decisions with harmful consequences. Transparency and the ability to explain results are key requirements if such systems are to be deployed with confidence in areas that demand a high level of trust.
This matters all the more if we want to ensure that hybrid systems do not become a tool for manipulating information or for abuse. If models can generate and understand text grounded in formal logic, there is a risk that their outputs will be misused to create content that is false yet all the more convincing. The development of these systems must therefore be accompanied by adequate ethical rules and standards that ensure responsible use and keep the technology aligned with the interests of society.
These issues highlight the need for an interdisciplinary approach to AI development. In addition to computer scientists and linguists, philosophers, ethicists, psychologists and experts from other disciplines will play an important role. We will need not only technical solutions, but also new frameworks for evaluating and regulating AI systems with advanced cognitive capabilities.
Linking neural networks to logical systems is an exciting way forward that could extend the capabilities of AI systems in both generation and true understanding. In medicine, for example, hybrid systems could help improve diagnosis by combining the ability to analyse vast amounts of medical data with logical inference based on known cause and effect. In the legal field, such systems could provide more sophisticated analysis of legal cases and search for relevant precedents, helping lawyers find solutions to complex situations more quickly. In education, the combination of neural networks and logical models could enable the creation of instructional materials that are not only comprehensible but also logically coherent, helping students better understand complex concepts.
This direction may yield more robust, consistent, and explainable natural language systems, which would have a positive impact not only on scientific research but also on a wide range of practical applications. Applications in fields such as medicine, law or education could gain significantly in quality due to AI’s ability to truly understand context and provide logically derived and relevant answers.
The road to AI that truly understands language as well as we do will be a long one. It will require not only technological innovation, but also deep reflection on the nature of language, thinking and understanding. However, a future in which AI not only speaks but also thinks is no longer a distant science fiction idea, but a real possibility that researchers are actively working towards. With each new advance in this field, we are getting closer to creating truly intelligent systems, true artificial intelligence.