Tracing Back to the Origins | The MP Neuron "A Logical Calculus of the Ideas Immanent in Nervous Activity" (1943)
Introduction
This article discusses the paper A Logical Calculus of the Ideas Immanent in Nervous Activity [1], published in 1943 by neurophysiologist Warren McCulloch and logician Walter Pitts. Before reading this article, I suggest reading the introductory piece first, to better understand the symbolic systems used in the paper. Recently, I've been diving into some "classical" papers (often regarded as foundational to modern AI). Finding them challenging to read and short on detailed analysis elsewhere, I pieced together a basic understanding in my spare time. I hope this "back-to-basics" article can help learners with similar interests who don't know where to start. 😊
This article is around 5,000 words; estimated reading time is 20 minutes.
If we were attending a general introduction to artificial intelligence, the instructor would likely start with artificial neural networks (ANNs) and introduce the basic structure of input layers, hidden layers, and output layers, as shown on the right side of Figure 1. Each neuron takes a weighted sum of the outputs of the previous layer and passes it through an activation function to produce its own output.
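For readers who prefer code to diagrams, here is a minimal sketch of the modern artificial neuron just described: a weighted sum of the inputs plus a bias, passed through an activation function. The weights, bias, and sigmoid choice below are arbitrary illustrations, not anything taken from the paper.

```python
import math

def modern_neuron(inputs, weights, bias):
    """A modern artificial neuron: weighted sum of inputs plus bias, then a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes z into (0, 1)

# Arbitrary example: real-valued inputs and weights give a real-valued output.
print(modern_neuron([0.5, 0.8], weights=[1.2, -0.7], bias=0.1))
```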
The Paper's Introduction
The paper opens with several key points in its Introduction (not analyzed here word-for-word, but you can search online for a PDF copy; the content, being rather old, can be difficult to read, especially for non-biology majors like me):
- Neural activity has an “all-or-none” nature (neural activity is either present or absent, akin to 0 or 1, true or false).
- To trigger a pulse in the next neuron, a certain number of synapses must activate within a specific time frame (each neuron has a threshold independent of its position and previous activity).
- The only significant delay in the nervous system is synaptic conduction delay (axon delay can be ignored).
- Inhibitory synapses prevent neuron activation at certain times (explains the refractory period of neurons using inhibitory synapses).
- The structure of neural networks does not change over time (long-term phenomena such as learning and extinction can be treated as formally equivalent to a fixed network).
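The assumptions above translate almost directly into code. Below is a minimal sketch of a single McCulloch-Pitts unit under these assumptions; the function name and the 0/1 encoding are my own choices, not the paper's.

```python
def mp_neuron(excitatory, inhibitory, threshold):
    """One McCulloch-Pitts unit: output at time t from afferent firings at t-1.

    excitatory, inhibitory: lists of 0/1 firings of afferent neurons.
    Any active inhibitory synapse absolutely prevents firing; otherwise the
    unit fires iff at least `threshold` excitatory synapses are active.
    """
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0

# All-or-none output: always exactly 0 or 1.
print(mp_neuron([1, 1, 0], inhibitory=[0], threshold=2))  # -> 1
print(mp_neuron([1, 1, 0], inhibitory=[1], threshold=2))  # -> 0
```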
Based on these assumptions, we define some symbols. If this part seems hard to understand, consider reading the preliminary article.
Definitions
We first define a functor $S$, which shifts a predicate one step back in time: for a predicate $N$ of one time argument,

$$(S\,N)(t) \equiv N(t-1),$$

where $N(t)$ asserts that the corresponding neuron fires at time $t$. One application of $S$ therefore corresponds to one synaptic delay.
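In code, the functor can be thought of as a higher-order function that shifts a time-indexed predicate by one step; the toy predicate below is purely illustrative.

```python
def S(predicate):
    """Delay functor: (S N)(t) holds exactly when N(t - 1) holds."""
    return lambda t: predicate(t - 1)

# A toy input predicate: the neuron fires only at t = 3.
N1 = lambda t: t == 3

print(N1(3), S(N1)(4), S(S(N1))(5))  # True True True: each S adds one delay
```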
Now, we define the structure of a neural network. A net consists of neurons $c_1, c_2, \ldots, c_n$; those receiving no synapses from other neurons of the net are called peripheral afferents (the inputs), and to each neuron $c_i$ we assign a predicate $N_i$, where $N_i(t)$ asserts that $c_i$ fires at time $t$. A solution of the net is a set of sentences, one for each non-input neuron, expressing its predicate in terms of the peripheral-afferent predicates and the functor $S$.
Conversely, consider a predicate expression built from the peripheral-afferent predicates. It is realizable in the narrow sense if there exists a net, together with a choice of one of its neurons and a substitution of its inputs, for which the expression is true of that neuron's behavior; it is realizable in the extended sense if it becomes narrowly realizable after the functor $S$ is applied to it some number of times. To interpret these definitions, narrow realizability is a stringent condition requiring a specific neural network structure to directly achieve the state described by the predicate through specific neuron arrangements and input substitutions. Extended realizability is a more lenient condition, allowing multiple applications of the functor $S$, that is, additional synaptic delays, before the described state is produced.
Let's consider a simple example (Figure 3) of how an expression is realized by a concrete network.
Finally, we arrive at the last definition! The paper defines Temporal Propositional Expressions (TPE) using recursive rules.
First, a predicate of one argument, i.e., any single $N_i(t)$, is a TPE; second, if $P_1$ and $P_2$ are TPEs, then so are $S P_1$, $P_1 \vee P_2$, $P_1 \cdot P_2$, and $P_1 \cdot \neg P_2$; third, nothing else is a TPE.
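To make the recursion tangible, here is a small sketch that mirrors the four building rules as Python helpers over time-indexed predicates; the helper names are mine, not the paper's notation.

```python
# The four building rules of a TPE, expressed as helpers over time-indexed
# predicates (functions from an integer time step to a truth value).

def delay(p):            # if P is a TPE, so is S(P)
    return lambda t: p(t - 1)

def tpe_or(p, q):        # P1 v P2
    return lambda t: p(t) or q(t)

def tpe_and(p, q):       # P1 . P2
    return lambda t: p(t) and q(t)

def tpe_and_not(p, q):   # P1 . ~P2
    return lambda t: p(t) and not q(t)

# Base case: any single neuron predicate N_i(t) is a TPE.
N1 = lambda t: t in {2, 3}
N2 = lambda t: t == 3

# A composite TPE: "N1 fired one step ago, and N2 is not firing now."
expr = tpe_and_not(delay(N1), N2)
print([expr(t) for t in range(6)])  # [False, False, False, False, True, False]
```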
Review
We defined the functor $S$, the neuron predicates $N_i$, realizability in the narrow and extended senses, and TPEs. With these in hand, the paper poses two problems:
- Find an effective method to obtain a computable set of sentences that forms the solution of any given network (calculate the behavior of any network).
- Characterize the set of realizable sentences (describe which state descriptions some network can realize).
In simple terms, the problems are (1) to calculate any network's behavior and (2) to find networks that manifest specified states.
Theorems and Simple Proofs
Note 1: A 0th-order neural network refers to a network without circular structures. The latter half of the paper provides detailed explanations for nets with circles, which are not covered here.
Theorem 1. Every 0th-order net can be solved in terms of Temporal Propositional Expressions (TPEs).
Let $c_i$ be any neuron of the net that is not a peripheral afferent, with threshold $\theta_i$, excitatory afferents $c_{j_1}, \ldots, c_{j_m}$, and inhibitory afferents $c_{k_1}, \ldots, c_{k_l}$. By the assumptions above, $c_i$ fires at time $t$ exactly when no inhibitory afferent fired at $t-1$ and at least $\theta_i$ of its excitatory afferents fired at $t-1$, where the shift to $t-1$ reflects the single synaptic delay. Assuming a threshold of two, it's clear that

$$N_i(t) \equiv S\Big\{\Big[\bigvee_{1 \le p < q \le m} N_{j_p} \cdot N_{j_q}\Big] \cdot \neg N_{k_1} \cdot \ldots \cdot \neg N_{k_l}\Big\}(t),$$

i.e., a disjunction over all pairs of excitatory afferents, conjoined with the negations of the inhibitory afferents and delayed by one step, which is a TPE in the afferent predicates. Starting from the neurons closest to the peripheral afferents and substituting such expressions repeatedly, every neuron of a net without circles is expressed as a TPE of the input predicates, which is exactly what Theorem 1 claims.
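A sketch of this step of the construction, for a single neuron with threshold two and no inhibitory afferents, might look as follows; the helper name and toy firing patterns are my own.

```python
from itertools import combinations

def solve_threshold_two(afferents):
    """TPE for a neuron with threshold 2 and only excitatory afferents:
    it fires at t iff some pair of its afferents both fired at t - 1."""
    def expr(t):
        return any(a(t - 1) and b(t - 1) for a, b in combinations(afferents, 2))
    return expr

# Toy afferent firing patterns.
N1 = lambda t: t in {1, 2}
N2 = lambda t: t == 2
N3 = lambda t: False

Ni = solve_threshold_two([N1, N2, N3])
print([Ni(t) for t in range(5)])  # fires only at t = 3 (N1 and N2 both fired at t = 2)
```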
Theorem 2. Every TPE is realizable by a 0th-order network in the extended sense.
Note 2: The term “realizable in the extended sense” is abbreviated as “realizable” in the proof.
The second theorem states that any state description expressed as a TPE can be realized, in the extended sense, by some 0th-order network. The proof of Theorem 2 provides a recursive method for constructing such a network.
Since the functor $S$ merely adds one synaptic delay, applying it to a realizable expression can always be realized by interposing one extra neuron; likewise, a single predicate $N_i(t)$ is trivially realized by a single neuron. It therefore suffices to realize the three binary constructions in the recursive definition.
Consider two propositions $P_1$ and $P_2$ that are already realizable by 0th-order nets, and let $c_a$ and $c_b$ be the neurons that realize them.
Thus, if we feed $c_a$ and $c_b$ into a new neuron $c$ through two excitatory synapses with threshold one, $c$ realizes $S(P_1 \vee P_2)$; with threshold two, it realizes $S(P_1 \cdot P_2)$; and if the synapse from $c_b$ is instead inhibitory and the threshold is one, $c$ realizes $S(P_1 \cdot \neg P_2)$.
Based on the previous conclusions, any TPE can be built up from single neurons by repeating these constructions, and the extra applications of $S$ introduced along the way are exactly why the result is realizability in the extended sense. This proves Theorem 2.
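The three binary constructions can be sketched as single MP units; the gate names below are mine, and each output appears one synaptic delay after its inputs (here collapsed into a single evaluation).

```python
def fires(excitatory, inhibitory, threshold):
    """One MP unit evaluated for a single step (inputs are the firings at t - 1)."""
    return 0 if any(inhibitory) else int(sum(excitatory) >= threshold)

def or_gate(a, b):       # two excitatory synapses, threshold 1 -> realizes S(P1 v P2)
    return fires([a, b], [], threshold=1)

def and_gate(a, b):      # two excitatory synapses, threshold 2 -> realizes S(P1 . P2)
    return fires([a, b], [], threshold=2)

def and_not_gate(a, b):  # one excitatory, one inhibitory synapse, threshold 1 -> S(P1 . ~P2)
    return fires([a], [b], threshold=1)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, or_gate(a, b), and_gate(a, b), and_not_gate(a, b))
```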
Theorem 3. Let there be given a complex sentence $S_1$, built up in any manner out of elementary sentences of the form $p(z_1 - zz)$ (where $zz$ is any numeral) by the propositional connectives negation, disjunction, conjunction, implication, and equivalence. Then $S_1$ is a TPE if, and only if, it is false when all its constituent elementary sentences are assumed false.
Though complex at first glance, Theorem 3 provides a practical criterion for deciding whether a given expression is a TPE. There are three equivalent tests:
- The composite sentence is false when all its basic sentences are false.
- The last line of its truth table (the row in which every basic sentence is false) contains “false.”
- There is no term in its Hilbert Disjunctive Normal Form (HDNF) composed solely of negated terms.
Let’s examine a few examples for each condition:
- The first composite sentence satisfies the first condition: when all of its basic propositions are false, the sentence itself evaluates to false, so it is a TPE.
- The second composite sentence satisfies the second condition: the last row of its truth table, in which every basic proposition is false, contains “false.”
- The third composite sentence fails the third condition: converting it to HDNF yields an expression whose only term consists solely of negated terms, so it is not a TPE.
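The first criterion is easy to check mechanically: evaluate the composite sentence with every basic sentence set to false. The candidate sentences in the sketch below are illustrative, not the ones discussed above.

```python
def satisfies_first_condition(sentence):
    """Check Theorem 3's first criterion: the composite sentence must be false
    when every one of its basic sentences is assumed false."""
    n_args = sentence.__code__.co_argcount
    return sentence(*([False] * n_args)) is False

print(satisfies_first_condition(lambda p, q: p or q))        # True  -> may be a TPE
print(satisfies_first_condition(lambda p, q: p and not q))   # True  -> may be a TPE
print(satisfies_first_condition(lambda p, q: (not p) or q))  # False -> not a TPE (p implies q)
```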
Now, let's apply these theorems to implement a neural network. Consider a scenario: when an ice cube touches our skin for just a moment and is then removed, we feel heat before coolness; but if it stays longer, we only feel a chill. Following the paper, let neuron 1 be the heat receptor, neuron 2 the cold receptor, neuron 3 the neuron whose firing means "heat is felt," and neuron 4 the one whose firing means "cold is felt." The two sensations are then described by

$$N_3(t) \equiv N_1(t-1) \vee [N_2(t-3) \cdot \neg N_2(t-2)],$$
$$N_4(t) \equiv N_2(t-2) \cdot N_2(t-1).$$

Since these sentences are already in Hilbert Disjunctive Normal Form (HDNF), Theorem 3 indicates that both are TPEs, and the construction from Theorem 2 then yields a 0th-order network that realizes them.
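To see the illusion play out, here is a small simulation of these two sentences over discrete time steps; the input sequences and time window are my own illustration.

```python
def simulate(heat_times, cold_times, horizon=6):
    """Drive the two defining sentences with the given receptor firing times."""
    N1 = lambda t: t in heat_times            # heat receptor
    N2 = lambda t: t in cold_times            # cold receptor
    feel_heat = lambda t: N1(t - 1) or (N2(t - 3) and not N2(t - 2))   # N3
    feel_cold = lambda t: N2(t - 2) and N2(t - 1)                      # N4
    for t in range(horizon):
        felt = [name for name, p in (("heat", feel_heat), ("cold", feel_cold)) if p(t)]
        print(t, felt)

# Ice applied only at t = 1 and then removed: heat is (falsely) felt at t = 4.
simulate(heat_times=set(), cold_times={1})

# Ice applied and held from t = 1 onward: only cold is felt (from t = 3 on).
simulate(heat_times=set(), cold_times={1, 2, 3, 4, 5})
```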
Finally, the paper argues that (1) different forms of inhibition, such as relative inhibition that merely raises a neuron's threshold, are broadly equivalent to absolute inhibition, and (2) extinction and learning can be accounted for by formally equivalent networks with fixed structure. If these two conclusions hold, the simplified structure used here can stand in for actual neural networks, allowing us to compute any network's behavior and to determine networks that manifest specific states. This part is only briefly discussed in the original paper, so we will not explain the details here.
Summary
The McCulloch-Pitts neuron model simplifies neuron behavior into logical operations, executing basic logic through "all-or-none" responses. Its significance lies in formalizing the behavior of the nervous system and showing that networks of such neurons, given suitable means of storage, can compute anything a Turing machine can compute. This groundbreaking theory laid a solid foundation for the Perceptron and propelled the development of artificial neural networks (ANNs). (Spoiler alert: the next topic will be the Perceptron or theories related to emergence, expected within three days 🤞).
End
References
[1] McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 5(4), 115–133. https://doi.org/10.1007/bf02478259
[2] Marshall, M. (2021, June 7). Google has mapped a piece of the human brain in the most detail ever. New Scientist. https://www.newscientist.com/article/2279937-google-has-mapped-a-piece-of-human-brain-in-the-most-detail-ever/
[3] A friendly introduction to [deep] neural networks. (2021). KNIME Blog. https://www.knime.com/blog/a-friendly-introduction-to-deep-neural-networks