Another Genetic Algorithm: Digital Chain Structure and Pinhole World

katoshi
9 min read · Aug 21, 2023
Photo by Sangharsh Lohakare on Unsplash

I am conducting independent research on the origins of life. Because the origin of life offers few direct clues, my approach is to look for insights in the mechanisms at work in other evolutionary and developmental systems.

Until now, I’ve focused on extracting mechanisms related to intelligence, society, and so on. In this article, I will focus on biological evolution, specifically DNA.

Overview of Genetic Algorithms

A genetic algorithm is a computational method modeled on the genetic mechanism of biological DNA.

It is mainly used when trying to find the optimal parameters out of countless combinations. For example, consider determining the optimal temperature, pressure, and concentrations of substances A, B, and C in a solvent to produce a particular chemical compound most efficiently.

Initially, each parameter is set randomly to create the first set of parameters. From this, “children” are created. These children have almost the same parameters as their “parents,” but with slight modifications. Multiple children are made.

Then, for each “child” parameter set, the efficiency of chemical production is tested. The “children” with better production efficiencies produce more offspring in the next generation. Through this process, characteristics of individuals with efficient parameters are passed down to the next generation, gradually converging towards the optimal parameter set.

Just like the genetic mutations in organisms, occasionally, a major change is made to the parameters. This can possibly uncover surprising combinations of parameters that are optimal.
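The loop described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `efficiency` function is a hypothetical stand-in for the real chemical experiment (its optimum at 0.5 is invented), every parameter is scaled to the range 0 to 1, and the better half of each generation is kept alongside its children, which is just one of several possible selection schemes.

```python
import random

def efficiency(params):
    """Hypothetical stand-in for measured production efficiency.
    Peaks at 1.0 when every parameter equals 0.5 (an invented optimum)."""
    return 1.0 - sum((p - 0.5) ** 2 for p in params) / len(params)

def mutate(params, rng, small=0.02, big_rate=0.05):
    """Copy a parent with slight changes; occasionally make one major change."""
    child = [min(1.0, max(0.0, p + rng.gauss(0.0, small))) for p in params]
    if rng.random() < big_rate:
        # Rare large mutation: replace one parameter with a fresh random value.
        child[rng.randrange(len(child))] = rng.random()
    return child

def evolve(generations=200, pop_size=30, n_params=5, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        # The better half survives and leaves one mutated child each;
        # the worse half leaves no offspring.
        ranked = sorted(population, key=efficiency, reverse=True)
        survivors = ranked[: pop_size // 2]
        population = survivors + [mutate(p, rng) for p in survivors]
    return max(population, key=efficiency)

best = evolve()
print(efficiency(best))
```

With temperature, pressure, and the three concentrations mapped onto the five parameters, the returned list would be the candidate operating point.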

Digital Chain Structure

The description above is the typical concept of a genetic algorithm and its usual application. The evolution of actual organisms, however, differs from this example.

First, the genes of organisms are not sets of analog numeric parameters; they are DNA. DNA consists of sequences of nucleotides, forming what we might call a “digital chain structure.” If we label the four types of nucleotides as A, B, C, and D, they would form structures like ABBDCABDAC… and so on.

Slope-Type and Pinhole-Type Search

Next, the search target for DNA is more complex than in the chemical example above.

Case 1: Searching for the optimal parameter within an already expressed function.

Consider the length of a giraffe’s neck. If it’s too short, they can’t eat leaves from tall trees, but if it’s too long, it’s inconvenient for maintaining balance or escaping from predators. Thus, it’s believed they evolved to have an optimally long neck.

This case is somewhat analogous to the chemical example earlier. Here, one can imagine a slope where organisms on the lower parts of the slope have a higher survival rate, and evolution drives them to the lowest point. We’ll refer to this as “slope-type” searching.

Case 2: Searching for the expression of a new function.

This case differs from mere optimization. Let’s say a sequence like ABBCCCDDDD expresses a new function. If even one nucleotide in this sequence is different, the function won’t be expressed.

Regardless of how close or far the sequence is from this optimal combination, if it doesn’t match perfectly, the function doesn’t express. Hence, there won’t be a process like the earlier example where near-optimal organisms proliferate more. Unless the sequence matches the new function, all organisms have the same number of offspring.

Thus, this case can be visualized as searching aimed at just one point. As opposed to the slope-type search, we will term this as a “pinhole-type” combinatorial search.
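The difference between the two cases can be made concrete with two scoring functions. The sketch below is an illustration only: `slope_fitness` is an assumed way of giving partial credit to near matches, while `pinhole_fitness` follows the all-or-nothing rule described above, using the example sequence ABBCCCDDDD.

```python
TARGET = "ABBCCCDDDD"  # the example sequence from the text

def slope_fitness(seq, target=TARGET):
    """Slope-type: partial credit, so sequences nearer the target score higher."""
    matches = sum(a == b for a, b in zip(seq, target))
    return matches / max(len(seq), len(target))

def pinhole_fitness(seq, target=TARGET):
    """Pinhole-type: all or nothing; only an exact match expresses the function."""
    return 1.0 if seq == target else 0.0

one_off = "ABBCCCDDDA"  # differs from TARGET in the last position only
print(slope_fitness(one_off), pinhole_fitness(one_off))  # prints: 0.9 0.0
```

Under the slope-type score, the one-off sequence would already out-reproduce more distant ones; under the pinhole-type score it is indistinguishable from any other non-matching sequence.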

Slope World and Pinhole World

In the case of organisms, the expression of a function depends on the surrounding environment at the time it’s expressed.

If the environment is not conducive to the function when it’s expressed, the function is meaningless. For instance, expressing the function of lung respiration underwater is meaningless. Therefore, in interaction with the world, we search for optimal parameters or new functions meaningful for survival and reproduction.

In this context, I will call Case 1 Slope World-type exploration and Case 2 Pinhole World-type exploration.

It’s easier to understand this by imagining the flow of water.

Slope World is like water flowing on a terrain with ups and downs. Water naturally flows downward and eventually accumulates in depressions like lakes or ponds. These depressions correspond to the optimal parameters.

Pinhole World is a flat terrain. There are no ups and downs. However, there are small holes here and there. Water spreads in all directions from the starting point and eventually, when a nearby hole is found, it flows into it. This represents the expression of a new function.

Digital Chain Structure and Pinhole World

In the Pinhole World, the strategies of the Slope World cannot be used to express new functions. A new function essentially has to be found by random search.

For example, let’s assume the first gene is AAAA. Its offspring might slightly differ, becoming AAAB or ACAA. Other sequences with added bases, like AAAAA or AAAAD, might also emerge.

These offspring genes neither provide particular advantages nor disadvantages for survival. Therefore, each gene will produce the same number of offspring in the next generation. In this way, from the initial AAAA gene, various genes are created, and diversity gradually expands.

Within this expanding genetic diversity, if a specific gene sequence becomes advantageous for survival or reproduction, the proportion of individuals with that gene sequence increases, while the proportion without it decreases.

Once a new functional “pinhole” is found, the genes may converge on that form. After that, however, genetic diversity starts to spread again in search of the next new function. Functions or properties that can be realized with simpler sequences are discovered first, and each time the diversity expands again until the next function is found.

In this way, genetic sequences that are neutral with respect to survival and reproduction expand their diversity, and that diversity leads to the expression of new functions.
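The neutral spread described in this section can be simulated roughly as follows. This is a sketch under assumptions that are not in the text: a fixed population size, a 30% chance that a child is mutated, mutations limited to one substitution or one base appended at the end, and a single `target` sequence playing the role of the pinhole.

```python
import random

ALPHABET = "ABCD"

def mutate(gene, rng):
    """Return a child with one substituted base or one appended base."""
    if rng.random() < 0.5:
        i = rng.randrange(len(gene))
        return gene[:i] + rng.choice(ALPHABET) + gene[i + 1:]
    return gene + rng.choice(ALPHABET)

def neutral_drift(target, start="AAAA", pop_size=200, max_gen=500, seed=1):
    """Spread from `start` with no selection until `target` appears.
    Returns (generation found or None, number of distinct genes per generation)."""
    rng = random.Random(seed)
    population = [start] * pop_size
    history = []
    for gen in range(max_gen):
        history.append(len(set(population)))
        if target in population:
            return gen, history
        # No selection: every individual is equally likely to be a parent.
        parents = rng.choices(population, k=pop_size)
        population = [mutate(p, rng) if rng.random() < 0.3 else p for p in parents]
    return None, history

found_at, diversity = neutral_drift("AAAB")
print(found_at, diversity[:5])
```

The `diversity` list records how many distinct sequences exist in each generation, which is the expanding diversity the text refers to.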

Rational Strategies in Digital Chain Structure and Pinhole World

With this mechanism, there tends to be a progression where functions expressed by relatively simple base sequences are found in order. Considering the efficiency of evolution, this is quite rational.
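One way to see why shorter sequences tend to be found first is to count how many sequences exist at each length. With four nucleotide types there are 4^n sequences of length n, so the space a random search has to cover grows very quickly. The helper name below is only illustrative.

```python
def sequences_of_length(n, alphabet_size=4):
    """Number of distinct chains of length n over the given alphabet."""
    return alphabet_size ** n

for n in (4, 8, 12, 16):
    print(n, sequences_of_length(n))
# prints:
# 4 256
# 8 65536
# 12 16777216
# 16 4294967296
```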

Moreover, when a new function is expressed by a longer base sequence, it can be interpreted as a change in the environment. In that case, we can think of it as the birth of a new Pinhole World. Within it, a shorter base sequence might again hold meaningful functions.

Even after expressing new functions, by continuously exploring from shorter base sequences, new functions are explored solidly and efficiently.

Supplement 1 on the Evolution of DNA: OS, Middleware, App Model

In describing how genes explore the pinhole world, I explained that free gene sequences with no immediate advantage can later become meaningful for survival and reproduction.

On the other hand, functions that have already been acquired are essential for survival and reproduction. If this part of the DNA mutates, it becomes difficult for the organism to reproduce, and it becomes difficult to continue inheriting that DNA.

In software terms, I liken this part to the OS and middleware. The free gene sequence that is not advantageous for survival or reproduction is, in the same terms, the app part. On an ordinary computer or smartphone, an app crashing does not stop the OS. The same applies here.

Supplement 2 on the Evolution of DNA: Protection through Error Correction and Redundancy

As explained above, a mutation in the OS or middleware part will not persist unless it is an improvement. This creates a dilemma: as DNA gets longer, frequent mutations in the OS or middleware part make it difficult to leave offspring.

Therefore, DNA that has evolved has mechanisms to prevent mutations in the OS or middleware parts. This is primarily through error correction mechanisms and redundancy.

Error correction is a system that, even if an essential part of the DNA is damaged or mutated, automatically finds and repairs that damaged or mutated section. By applying this to the OS and middleware parts, any mutations in that section are automatically corrected.

Redundancy involves having multiple copies of a partial DNA sequence within the genes of one species. By having multiple copies of essential DNA sequences, the impact of damage or mutation on one part is minimized. The other copies function, ensuring the overall operation of the organism.
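Both ideas can be illustrated with a toy model. The sketch below is not a model of real DNA repair: it assumes that damage to each copy is independent, and it uses a simple majority vote across redundant copies as a stand-in for error correction. Function names are hypothetical.

```python
from collections import Counter

def prob_all_copies_damaged(p_damage, copies):
    """Chance that every copy is damaged, assuming independent damage."""
    return p_damage ** copies

def majority_repair(copies):
    """Rebuild a sequence by taking the most common base at each position.
    Assumes all copies have the same length."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*copies)
    )

copies = ["ABBD", "ABCD", "ABBD"]  # the middle copy carries one mutation
print(majority_repair(copies))  # prints: ABBD
print(prob_all_copies_damaged(0.01, 3))
```

With a 1% chance of damage per copy, three independent copies would all fail together only about once in a million cases, which is why a small amount of redundancy goes a long way.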

Supplement 3 on the Evolution of DNA: Library

Within genes, the same DNA sequence being copied in multiple places is not just for redundancy. There are believed to be shorter DNA sequences that can be commonly used for various new functions. Waiting for such DNA sequences to be generated again for another function is inefficient.

Therefore, there is thought to be a mechanism by which useful DNA sequences are either copied in their entirety during mutation or reused by linking new processes to the existing DNA sequences.

In software terms, this is like a library. Copying during mutations is like static-link libraries, and connecting processes to existing sequences is like dynamic-link libraries.
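The "static-link" style of copying could be modeled as an extra mutation operator that duplicates a stretch of the existing sequence. The following is a minimal sketch; the function name, the segment length limits, and the random choice of insertion point are all assumptions made for illustration.

```python
import random

def duplicate_segment(gene, rng, min_len=2, max_len=4):
    """Copy a random contiguous segment and insert the copy somewhere in the gene.
    Assumes len(gene) >= min_len."""
    length = rng.randint(min_len, min(max_len, len(gene)))
    start = rng.randrange(len(gene) - length + 1)
    segment = gene[start:start + length]
    insert_at = rng.randrange(len(gene) + 1)
    return gene[:insert_at] + segment + gene[insert_at:]

rng = random.Random(0)
print(duplicate_segment("ABBDCABDAC", rng))
```

A useful motif that already exists can then reappear elsewhere in one step, instead of having to be rediscovered base by base.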

Supplement 4 on the Evolution of DNA: Probability Distribution and Superposition

The diversity distribution of DNA in all individuals of a species can be viewed as a probability distribution. As generations progress, this probability distribution appears to transition in a self-feedback manner.

From this perspective, the diversity of the digital chain structure of DNA can be interpreted as a superposition. While it differs from superposition in quantum mechanics, both can be described by a common abstract mathematical model: a probability distribution.
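As a rough illustration of this view, a population can be turned into a probability distribution over sequences, and one generation of selection can be written as a reweighting of that distribution. The update rule below (multiply by fitness, then renormalize) is an assumed, simplified model and ignores mutation.

```python
from collections import Counter

def distribution(population):
    """Fraction of the population carrying each distinct sequence."""
    counts = Counter(population)
    total = len(population)
    return {seq: n / total for seq, n in counts.items()}

def next_distribution(dist, fitness):
    """One selection step: reweight each sequence by its fitness, then renormalize."""
    weighted = {seq: p * fitness(seq) for seq, p in dist.items()}
    total = sum(weighted.values())
    return {seq: w / total for seq, w in weighted.items()}

dist = distribution(["AAAA", "AAAA", "AAAB", "ACAA"])
# Hypothetical fitness: AAAB leaves twice as many offspring as the others.
after = next_distribution(dist, lambda s: 2.0 if s == "AAAB" else 1.0)
print(dist)
print(after)
```

Because the new distribution is computed from the current one, repeating the step gives the self-feedback transition described above.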

There are two benefits to viewing it from this perspective:

The first is the possibility of converting genetic algorithms into efficient algorithms for quantum computers. If genetic algorithms can be run on quantum computers, which excel at processing superpositions, computation might become possible at speeds far beyond those of conventional computers.

The second benefit is gaining an intuitive understanding of the mechanism by which two or more interdependent new functions are expressed.

There are cases where combinations of multiple new functions are expressed within the same individual or where new functions that lead to symbiotic relationships or co-evolution between multiple species emerge.

When considering two interacting functions, it is not clear which one emerged first. Viewed as a probability distribution, however, it becomes intuitive that pinholes exist where both functions emerge simultaneously, albeit at an extremely low probability. Over a long period, the probability distribution concentrates around such a pinhole once it is reached by chance.

Application Examples

A genetic algorithm built on the digital chain structure and the pinhole world can be applied to the study of artificial life.

Additionally, it could be applied to text generation. Text, too, has a digital chain structure, with characters connected one after another. And because most character sequences are nonsensical while only a few are meaningful, text resembles a pinhole world.

Starting from a single character and progressively adding characters or replacing existing ones will eventually lead to meaningful short sentences. Notably, this search can be run in parallel across multiple computers.

You can present such generated text candidates to a chat AI, for instance, to check if the text is meaningful or even have it scored as poetry. Consequently, short poems could be mechanically auto-generated using this algorithm. In theory, even longer texts can be generated.
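A toy version of this text search might look like the sketch below. It makes strong simplifying assumptions: `is_meaningful` is a hypothetical stand-in for the chat AI check and simply tests whether the text consists of at least two words from a tiny invented word list, and candidate texts are capped at a short length. A real setup would replace that function with a call to an actual scoring service.

```python
import random
import string

# Tiny invented word list standing in for a real judgment of meaning.
VOCAB = {"a", "i", "am", "an", "as", "at", "be", "do", "go", "he", "if",
         "in", "is", "it", "me", "my", "no", "of", "on", "so", "to", "up", "we"}

def is_meaningful(text):
    """Hypothetical stand-in for asking a chat AI whether the text is meaningful."""
    words = text.split(" ")
    return len(words) >= 2 and all(w in VOCAB for w in words)

def mutate(text, rng, alphabet=string.ascii_lowercase + " "):
    """Replace one character or append one character."""
    if rng.random() < 0.5:
        i = rng.randrange(len(text))
        return text[:i] + rng.choice(alphabet) + text[i + 1:]
    return text + rng.choice(alphabet)

def search(start="a", pop_size=500, max_gen=200, max_len=8, seed=0):
    """Neutral spread over texts until a candidate passes the check.
    Returns (generation, text), or (None, None) if nothing is found."""
    rng = random.Random(seed)
    population = [start] * pop_size
    for gen in range(max_gen):
        hits = [t for t in population if is_meaningful(t)]
        if hits:
            return gen, hits[0]
        parents = rng.choices(population, k=pop_size)
        population = [mutate(p, rng)[:max_len] for p in parents]
    return None, None

print(search())
```

Since each candidate is checked independently, the population could be split across machines, with only the hits reported back.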

Additionally, this model might be applied to the chemical evolution of organic compounds related to the origins of life. This is because organic compounds also have a digital chain structure, formed by molecular bonds.

Summary

The perspectives of the digital chain structure and the pinhole world are crucial in the actual evolution of DNA.

One of the points here is the mechanism by which freely-sequenced genes not related to survival or reproduction diversify over time. Considering the contrast with the converging tendencies of the “slope world”, it’s an intriguing point.

Another point is the trend where functions that are expressed by relatively simple base sequences are discovered in sequence. From the perspective of utilizing every precious pinhole, it’s a rational strategy.

This mechanism is not only for explaining or simulating the evolution of DNA but could potentially be applied to systems that similarly have a digital chain structure and a pinhole world.

It might cover text or program generation. And, as mentioned at the start, I believe there is a high possibility that this mechanism applies to the origins of life, that is, to the chemical evolution of organic compounds, which I am researching independently.


Written by katoshi

Software Engineer and System Architect with a Ph.D. I write articles exploring the common nature between life and intelligence from a system perspective.
