Fuzzy Turing Machines: Toward Explainable Artificial Intelligence

katoshi
Aug 9, 2023


As artificial intelligence systems permeate society, the notion of explainable AI has become increasingly crucial for safety.

Neural networks excel at pattern recognition, and certain neural networks are known to be Turing complete. This means that, in principle, they can simulate a traditional Turing machine.

This article examines large language models, which show inference abilities akin to a Turing machine, from the perspective of standard computer architecture. The analysis concludes that current AI systems behave like a Turing machine with ambiguity, or what this article terms a “Fuzzy Turing Machine.”

By using the term “Fuzzy Turing Machine,” this article aims to provide insights into understanding AI from a software engineering perspective. This understanding can enable us to interpret AI through paradigms like object-oriented programming, test-driven development, agile software development, and refactoring. From there, the article explores paths toward explainable AI, focusing on readability and maintainability.

Let’s delve deeper.

Contrasting Neural Nets with Turing Machines

Consider a neural network system for natural language processing that takes a string as input and produces a single character as output, using a trained model. In one cycle of operation, it computes with all of its nodes to produce one character. It can also choose not to output anything in a cycle, it holds several intermediate states within the network, and it uses previously output characters as part of the next input.
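As a rough illustration, here is a minimal sketch of that cycle in Python. The `predict_next` function is a hypothetical stand-in for the trained network; the point is only the loop structure: each cycle consumes the text produced so far, may emit one character or nothing, and carries intermediate state forward.

```python
# Minimal sketch of the character-by-character cycle described above.
# `predict_next` is a hypothetical stand-in for a trained network: it takes
# the text so far plus some hidden state and returns (next_char, new_state),
# where next_char may be None (the "output nothing this cycle" case).

def predict_next(text, state):
    # Placeholder logic: emit a fixed reply one character at a time.
    reply = "ok."
    pos = state.get("pos", 0)
    if pos >= len(reply):
        return None, state          # nothing more to output
    return reply[pos], {"pos": pos + 1}

def run(prompt, max_cycles=100):
    text = prompt                   # previously output characters become input
    state = {}                      # intermediate states held inside the network
    for _ in range(max_cycles):     # one iteration = one processing cycle
        char, state = predict_next(text, state)
        if char is None:
            break
        text += char                # feed the output back in on the next cycle
    return text

print(run("hello "))                # -> "hello ok."
```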

When viewed from the perspective of a Turing machine:
- The trained neural network corresponds to the processing system (the CPU's instruction set, OS, middleware, and libraries).
- Processing cycles correspond to CPU clock cycles.
- Nodes that hold intermediate states resemble data in the heap area.
- Input strings are analogous to the program and constant data.
- Output strings are akin to the program counter and data in the stack area.

From this analogy, training a neural network is comparable to designing the CPU's instruction set and building the processing system on top of it, that is, the OS, middleware, and libraries. In this sense, a neural network can be seen as programming itself during training.

Characteristics of Large Language Models (LLM)

Considering this, several traits of large language models become clearer:

1. **Network Size:** Once a network reaches a certain size, it becomes capable of advanced inference and processing; until it reaches that size, simply making it larger does not produce such capabilities. This is similar to embedded systems that, due to limited memory, cannot run an OS or middleware. With an OS or middleware, complex tasks become possible with simple programs; without them, advanced tasks require far more intricate programming.

2. **Prompt Engineering:** Inputs to an LLM can include instructions, set apart from the main text, that steer it toward desired outputs. If the input text is thought of as a program, the system sentence can be understood as the program's parameter settings or internal functions, enabling specific operations or functionalities (see the sketch after this list).

3. **Inference Ability:** Inference is akin to a simulation guided by logical rules: supply a state and some inputs, then let the system change states as the simulation proceeds. Step-by-step state transitions are exactly what Turing-machine-based computers excel at, so the inference ability of LLMs hints at their capacity for Turing-machine-like operation.
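To make point 2 concrete, here is a small illustration (my own sketch, not an actual API) of how a system sentence acts like a parameter setting that shapes how the same main text is processed.

```python
# Illustration only, not a real API: the system sentence behaves like a
# parameter setting that is combined with whatever main text follows.

def build_prompt(system_sentence, main_text):
    # The same main text yields different behavior depending on the system part.
    return f"{system_sentence}\n\n{main_text}"

summarize = "Summarize the following text in one sentence."
translate = "Translate the following text into French."
text = "Large language models can be viewed as fuzzy Turing machines."

print(build_prompt(summarize, text))
print(build_prompt(translate, text))
```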

Neural Networks as Fuzzy Turing Machines

Unlike a classical Turing machine, a neural network's operations carry ambiguity: each step is probabilistic rather than strictly deterministic. That is what makes it a Fuzzy Turing Machine. This ambiguity can be an advantage, but it is also a source of errors.
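One way to picture the difference is as transition rules. A classical Turing machine maps each (state, symbol) pair to exactly one next move, while the fuzzy version assigns probabilities to several candidates and samples one. The sketch below uses invented, purely illustrative numbers.

```python
import random

# Illustrative only: a classical transition function gives exactly one result,
# whereas the "fuzzy" version gives a probability distribution over candidates.
deterministic_rule = {("q0", "a"): "b"}                     # always the same move

fuzzy_rule = {("q0", "a"): {"b": 0.7, "c": 0.2, "d": 0.1}}  # made-up probabilities

def fuzzy_step(state, symbol):
    dist = fuzzy_rule[(state, symbol)]
    symbols = list(dist.keys())
    weights = list(dist.values())
    return random.choices(symbols, weights=weights, k=1)[0]

print(deterministic_rule[("q0", "a")])            # "b" every time
print([fuzzy_step("q0", "a") for _ in range(5)])  # varies from run to run
```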

As AI systems are trained more deeply and effectively, they will likely become more precise in logical and mathematical operations, while retaining the beneficial ambiguity where it is needed.

This is reminiscent of human cognition. Children might struggle with even simple calculations or logic, but with training, they refine their precision, albeit still making occasional errors.

Not only AI but also our brains are neural networks made of countless neurons. Although human brains differ significantly from current AI, both can be seen as Fuzzy Turing Machines in many respects.

Structure of a Fuzzy Turing Machine Program

Generally, a program consists of procedural and data parts.

The data part comes in two kinds. One is the container that receives data during processing, along with the structure of that container. The other is a static dataset embedded in the processing system.

In large language models, this static dataset can be thought of as containing dictionary-like data associating words and phrases with their concepts and meanings.

There are also two types of procedural parts. One is the framework. The other is the component.

The framework defines the flow and structure of processing. Components are parts that plug into that framework; they can be interchanged within a framework, and their combinations achieve a variety of processes and functions.

In large language models, the framework corresponds to grammar and the logical structure of sentences. Our ways of thinking, our perspectives, and our methods of organizing thoughts are also frameworks, in the everyday sense of a mental framework rather than a programming one. By learning or inventing new frameworks, we can streamline our thinking and expand the ways we think.

Components correspond to concepts and meanings. Concepts and meanings connect with words and phrases through dictionary data.
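As a loose software analogy (my own sketch, with made-up names), the dictionary is static data linking words to concepts, the framework fixes the flow of processing, and interchangeable components fill in the individual steps.

```python
# Loose analogy with made-up names: a framework fixes the processing flow,
# components are interchangeable parts, and a dictionary maps words to concepts.

dictionary = {"dog": "animal", "rose": "plant"}     # static, embedded data

def describe(concept):                              # one interchangeable component
    return f"This refers to a kind of {concept}."

def compare(concept):                               # another component, same slot
    return f"Is a {concept} similar to a machine?"

def answer_question(word, component):               # the framework: a fixed flow
    concept = dictionary[word]                      # 1. look up the concept
    return component(concept)                       # 2. hand it to the component

print(answer_question("dog", describe))             # components can be swapped freely
print(answer_question("rose", compare))
```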

Learning of Fuzzy Turing Machine = Software Development

The Fuzzy Turing Machine learns dictionary data, frameworks, and components. In large language models, these are self-organized by reading vast amounts of text.

In the learning process, the mechanism of regenerating text, as an autoencoder does, plays the role of the testing phase in software development. The test, checking whether the output matches the learned text, exists before the program is implemented, which resonates with the idea of Test-Driven Development (TDD).
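As a toy illustration of that analogy (not actual training code): the test is written first and demands that the training text be reproduced, and the implementation, here a stand-in model, is revised until the test passes.

```python
# Toy illustration of the TDD analogy; not how training is actually coded.
training_text = "the cat sat on the mat"

def test_reproduces_training_text(model):
    # The "test" is fixed before any implementation exists: given the start
    # of the text, the model must regenerate the whole thing.
    prompt = training_text[:8]
    assert model(prompt) == training_text

naive_model = lambda prompt: prompt              # first "implementation": fails
memorizing_model = lambda prompt: training_text  # stand-in for a trained model: passes

try:
    test_reproduces_training_text(naive_model)
except AssertionError:
    print("red: implementation does not satisfy the test yet")

test_reproduces_training_text(memorizing_model)
print("green: implementation now satisfies the test")
```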

Moreover, the trial-and-error process of machine learning looks like an extreme form of agile software development. The waterfall model, in which analysis, design, implementation, and testing are done sequentially, requires skilled engineers and programmers with prior knowledge. The machine learning process, which iterates implementation and testing in very short sprints, is agile in the truest sense.

One reason we can’t grasp the learning results of artificial intelligence, i.e., the instruction set of the Fuzzy Turing Machine’s CPU, OS, middleware, and libraries, is this. Of course, the distributed structure called neural networks and its probabilistic nature with ambiguity also make understanding challenging. In addition, imagining a program developed with an extreme agile method, its readability is despairing.

And once trained, when you feed a sentence into the Fuzzy Turing Machine, it processes that sentence like a program.

Software Design Perspective: Fuzzy Object-Oriented, Fuzzy Aspect-Oriented

Large language models understand the inheritance and containment relationships between words and phrases. They can also be interpreted as understanding design patterns, including frameworks and components. In that sense, they appear to be doing fuzzy object-oriented modeling.
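For example, the everyday facts "a dog is an animal" and "a dog has legs" carry the same inheritance (is-a) and containment (has-a) structure that object-oriented code makes explicit. A minimal sketch:

```python
# The kind of inheritance ("is-a") and containment ("has-a") structure that
# an LLM appears to capture implicitly, written out explicitly as classes.

class Animal:
    def breathe(self):
        return "breathing"

class Leg:
    pass

class Dog(Animal):                              # inheritance: a dog is an animal
    def __init__(self):
        self.legs = [Leg() for _ in range(4)]   # containment: a dog has legs

rex = Dog()
print(rex.breathe(), len(rex.legs))             # inherited behavior, contained parts
```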

Likewise, the system sentence in prompt engineering applies flexibly and broadly to whatever main text follows it. This is reminiscent of aspect-oriented programming, where a cross-cutting concern is woven into many operations at once, so the instructions in a system prompt can be seen as a fuzzy aspect.
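In mainstream languages, cross-cutting concerns are often expressed with decorators or interceptors. The sketch below uses a Python decorator, with invented names, to mimic how one system-level instruction wraps every request that follows it.

```python
import functools

# A decorator as a stand-in for an aspect: one cross-cutting instruction
# ("answer politely") is woven around every operation it wraps, much like a
# system sentence applies to every main text that follows it.

def politely(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return "Certainly. " + func(*args, **kwargs)
    return wrapper

@politely
def answer_weather():
    return "It will rain tomorrow."

@politely
def answer_time():
    return "It is three o'clock."

print(answer_weather())   # the same "aspect" modifies both answers
print(answer_time())
```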

The Near Future

Because natural language has a fuzzy structure, it made sense to learn it with neural networks, and the result is a human-level processing system realized as a Fuzzy Turing Machine.

On the other hand, it should also be possible to implement a processing system over a deterministic language, such as a programming language. At present, neural networks handle even the parts that require precise processing.

Eventually, if we can make conventional computers, which are deterministic Turing machines, handle such processes, both efficiency and accuracy should improve.
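A simple version of this idea is already visible in tool use: the fuzzy part decides what needs to be computed, and a deterministic routine, an ordinary program, computes it exactly. A rough sketch with invented names:

```python
# Rough sketch with invented names: the fuzzy part chooses WHAT to compute,
# and a deterministic Turing machine (ordinary code) computes it exactly.

def fuzzy_plan(question):
    # Stand-in for an LLM deciding that this question needs arithmetic.
    if "plus" in question:
        a, _, b = question.rstrip("?").split()[-3:]
        return ("add", int(a), int(b))
    return ("answer_directly", question)

def deterministic_execute(plan):
    kind, *args = plan
    if kind == "add":
        return args[0] + args[1]     # exact, repeatable, auditable
    return None

plan = fuzzy_plan("What is 17 plus 25?")
print(deterministic_execute(plan))   # 42
```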

Furthermore, we touched on similarities with object-oriented and aspect-oriented programming, as well as test-driven development and agile software development. Here, we can see the possibility of refactoring.

From the perspective of artificial intelligence safety, research on improving readability and maintainability through refactoring will likely become important. If the system becomes more structured and visible, it opens the way to enhancing safety and to building checking and auditing mechanisms.

In Conclusion

In this article, we discussed the perspective of viewing AI systems as “Fuzzy Turing Machines.”

By interpreting artificial intelligence from this perspective, we may be able to create AI systems that are more explainable and safer.

In software and system development generally, I believe the waterfall model may be more suitable than agile development in fields that demand high levels of safety and security. As AI's impact on society grows, it is imperative for us to have a comprehensive overview of its design and to improve the readability of its internal structure. To achieve this, we must deeply understand its underlying mechanisms.
