Minimum time required to satisfy
We define Etherealware as the concept of implementing the functionality of an algorithm by means of the clocking scheme of a cellular automaton (CA). We show, which functions can be implemented in this way, and by which CAs.
- cellular automaton
- update rule
- temporal order
Your task: Compute a lot of different functions on -bit inputs.
Your device: A (fixed!) cellular automaton (CA) on the (fixed!) ring or torus topology with cells, is capable of holding one bit each.
You may not change the CA (its update rule) nor the topology. You may not enter additional information in the form of parameters (there would be no space to store them anyway)—and yet you are supposed to evaluate many different functions. The available degree of freedom is the clocking scheme of the cells, anything from synchronous to completely asynchronous is allowed.
Can you do it?
The perhaps surprising answer is: yes!
Every bijective function on the set , which acts as an even permutation is clocking-computable, as well as many non-bijective functions.
1.1. What is etherealware?
Computation takes place in dedicated hardware or on general-purpose hardware by dedicated software. Different functionality requires either changing the hardware (think ASIC, FPGA) or changing the software running on it. Firmware is an intermediate concept, where the hardware is modified by microprogramming a CPU or personalizing an FPGA.
Etherealware is the first way to use fixed hardware (certain cellular automata (CA) in this case), run fixed software (the same update rule for all cells, for all time, for all purposes), and still deliver diversity in the resulting function: by changing only the clocking scheme, the order in which the CA’s cells are updated.
The study of synchronous CAs starts with Wolfram . We use asynchronous CAs as deterministic devices with a finite number of computation steps, which is a new point of view.
Previously, asynchronous CAs have been treated as dynamical systems, where infinite computations are considered, and the focus lies on concepts like orbits, fixed points, ergodicity, transients, cycles and their periods, and long-term behavior. Also, randomness can be introduced to average over many possible asynchronous schemes. Papers in this respect are:
Ingerson and Buvel , 1984, distinguish synchronous, random (completely asynchronous), and periodic clocking, which yield clearly distinguishable behavior.
Barrett et al. [3, 4, 5, 6], 1999–2003, consider sequential dynamical systems (SDS), including CAs with arbitrary toplogy and neighborhoods. They cover random graphs as topology and dynamical systems topics such as fixed points and invertibility.
Siwak , 2002, gives an overview of simulating machines, including CAs and SDSs, and unifies them under the concept of “filtrons.”
Lee et al. , 2003, give an asynchronous CA on the two-dimensional grid , which is Turing universal.
Laubenbacher and Pareigis , 2006, build upon [3, 4, 5, 6] and observe that not all permutations of the cells lead to different temporal rules. Their equivalence classes coincide—for our setting, CAs on the torus—with our result (, Thm. 1]).
Fatès et al. , 2006, consider ECAs with quiescent states (, i.e., with even Wolfram rule ). They show that 9 ECAs diverge, while the other 55 converge to a random fixed point, in 4 clearly distinguishable time frames or with characteristic behavior per time frame.
Macauley, McCammond, and Mortveit [12, 13], 2007–2010, also treat SDSes, in particular ECAs. For each ECA,  gives the periodic states and the dynamics group. Conjecture 5.10 in  about Wolfram rule 57 coincides with our finding that ECA-57 generates the alternating groups on patterns of bits. They verified this claim for up to 8, while in  this is extended to and in this paper up to .
Dennunzio et al. [15, 16], 2012–2013, consider ACAs, every turing machine can be simulated by an ACA, with quadratic slowdown. Introducing a certain fairness measure, they show that injectivity and surjectivity are equivalent (-a.a.), and the existence of a diamond is equivalent to not -a.a. injectivity.
Salo , 2014, shows that nonuniform CA generates all SFTs (subshifts of finite type) and several non-SFT sofic shifts.
2. Notations: ECA and update rules
Here, continuing the work in [10, 14], we again employ CAs as computing devices, whose work comes to an end, when the pattern transformation or function evaluation has been obtained. Also, the clocking, the temporal update rule, is completely deterministic and replaces the usual ways of representing an algorithm, either in software (initial data) or in hardware (choice of ECA and connecting graph). Thus, the algorithm resides exclusively in the clocking scheme. We therefore call functions computable in this way as “clocking-computable functions.” The main additional contribution of this paper is the introduction of unfair clocking schemes.
2.1. Cellular automata: Neighborhoods and local update rules
We consider cellular automata (CA) on a torus or ring of cells, that is index set , over the binary alphabet . Cell index wraparound, that is, for , and the canonical cell names are . We deal with elementary CAs (ECA) with three input cells, where the middle one is also the output cell.
The neighborhood (ci+1, ci, ci−1) can have eight different values from . Let . Then is replaced by , where are defined via Wolfram’s rule  .
The ECAs can be arranged into 88 equivalence classes under the symbolic symmetry 0/1 and the chiral symmetry left/right (); see (, Appendix A). It is sufficient to consider one member per class.
We considered quad CAs (QCAs) with four inputs and nonstandard neighborhoods in (, Section 1.2).
Local bijectivity requires that ECA ECA . This is equivalent to requiring that the hexadecimal digits of the rule be from
Example: The behavior of the ECA with Wolfram rule :
, , , , , , , , in other words , for all other contexts we have .
2.2. ECA: Global update rules on the torus
2.2.1. Fair update schemes
We repeat the definition of asynchronicity rules from (, Section 2).
The set of asynchronicity rules over consists in all words of length over the alphabet such that both “” and “” occur at least once. We also include the word “,” the synchronous case, and have.
A rule defines the firing order as follows:
|Cell fires after cell|
|Cell fires before cell|
|Cell fires simultaneously with cell|
To ensure bijectivity, we must first have a locally bijective CA, and, furthermore, no two adjacent cells may fire simultaneously. Why this is so will be dealt with in Chapter 5. There are exactly bijective fair rules, those from ; see also (, 4.1]).
A fair update step (bijective or not) can be decomposed into a sequence of elementary steps such that all cells fire exactly once during the execution of that sequence; see .
2.2.2. Unfair update schemes
We now include unfair updates, where some cells may fire less often than others (even not at all).
We start with elementary steps (steps in ). During one elementary step, any nonempty subset of indices may define the active cells.
These cells fire simultaneously, hence
We define elementary steps as words , with the meaning , if cell fires, and if cell is inactive in this step.
A sequence of such elementary steps is upper indexed by the time step .
A fair update rule consists in a number of elementary steps such that every cell fires exactly once. The fair rule “” for can be decomposed into (), while the sequence () is unfair, since cell does not fire at all.
The number of bijective elementary steps is the number of words of length over the alphabet , such that no adjacent 1’s occur to ensure bijectivity. For there are 3, 6, 10, 17 such steps, respectively. The sequence obeys the law . At least for , the sets are as follows: prepend a 0 to each pattern of length , prepend a 10 (a 01) to each pattern of size terminating in 0 (in 1), and add the new pattern . The sequence is used again in Section 5 and has been verified to coincide with OEIS A001610 up to .
We also can define unfair bijective rules (full steps), where we first fix some subset of size of active cells by a word from , meaning that cell is active.
We next order adjacent active cells by the usual signs. Hence, a run of consecutive 1’s (with wraparound) has ways to fix the internal firing order. This order is independent of the other 1-runs, since the cells are separated by at least an inactive cell with . The number of bijective unfair rules on a torus of size is
where the pattern avoids and and then the 1-runs in the pattern have lengths , considering wraparound.
For , we have and such unfair rules, respectively. The terms count how many patterns with active cells are feasible. These terms can be found in OEIS  as subsequences of A209697.
We consider pattern conversions , where can be identified with the set . From Definition (, Def. 3), by , we mean that the elementary CA with rule ECA maps to via the asynchronicity scheme AS. We define recursively for , starting with .
In [10, 14], we considered five universality properties to , where each makes use of a certain update rule AS applied several times. We only give a summary here. Property is ruled out for any , while properties to have only been verified experimentally, for .
Some is mapped to every by varying the rule AS and the required number of time steps. There are 44 ECAs doing this.
All are mapped to all by varying the rule AS and the required number of time steps. There are 6 ECAs (rules 19, 23 (for ), 37 (for ), 41, 57, 105 (for )) doing this (checked for ).
All are mapped to all at the same time, which time may vary for but not for , for different rules AS .
ECA-57 realizes this for (no further has been considered).
ECA-105 realizes this for odd (no further has been considered).
All are mapped to all at the same time, varying the rule AS. This is actually possible for the two survivors of property ; see Table 1. The required time roughly decreases with growing , since we have rules to choose from for patterns . Thus for higher the probability to meet the conversion early on increases.
Eventually, all conversions may happen at all times for some update scheme. This property cannot be satisfied, for no ECA; see (, Thm. 2).
4. Bijective functions
We first introduce the computer algebra system GAP and then give several examples.
4.1. GAP: Graphs, algorithms, programming
4.1.1. GAP and the alternating group
GAP  is a system for computational discrete algebra, in particular computational group theory. We use GAP to decide, whether certain fair or unfair update rules generate the full symmetric or alternating group or , respectively.
Our results so far:
The fair update rules for ECA-generate the full symmetric group for .
The fair update rules for ECA-generate the full alternating group for .
The unfairelementary update rules with exactly one active cell for ECA-generate the full alternating group for .
By exhaustive generation of all permutations on .
First observe that all these rules consist of an even number of transpositions. Hence, we will at most obtain the alternating group . This group comprises those bijective functions on which have positive sign as a permutation.
We checked with GAP for certain sets of 5 fair rules for each of these sizes , and GAP’s function
For we have the canonical set of elementary unfair update rules, with exactly one cell active in each rule. This set generates all fair and unfair update rules and hence is sufficient to decide on the group generated by all rules.
Again, GAP shows that indeed the alternating group is generated, for . We used 144 GB of RAM, which was sufficient for but not so for . □
We believe that, apart from the special case , we always obtain the alternating group.
For every torus size , both the fair update rules, as well as the elementary unfair update rules with a single active cell, are a generating set for the full alternating group on elements, using ECA-.
4.1.2. Example for GAP usage with
We replaced 0 by , since GAP only uses numbers from . P00 to P03 are the permutations generated by the elementary steps , and , respectively.
4.1.3. Example for GAP usage with
4.1.4. Example for GAP usage with
We define the permutations P00 to P27 via a C++ − program; see Appendix A. Using it like gap.out 28 > G28, its output, the file G28, is then read in by GAP.
Times are in milliseconds. Therefore, reading in the 56 GB of permutations P00 to P27 generating the group G28 took 1640 seconds, or about half an hour, while actually checking the resulting group for took another 7280 seconds or 2 hours. The same procedure for G29 resulted in lack of memory; 140 GB of RAM are not sufficient.
Recall that, according to Stirling’s formula, has elements, a number with about 1 billion (1000) digits. Amazingly, GAP gets it done!
4.2. Examples for bijective functions
4.2.1. Multiplication by 9 mod 16
We give two realizations of the function in Table 2.
|First realization in hexadecimal||1. Active cells||2. Active cells||Second realization in hexadecimal|
Observe that the first 23 steps only implement the permutation , consisting of two transpositions. The final 24th step is almost identical to the whole function.
4.2.2. Exponentiation and logarithm in
Identifying 16 with 0b0000, we can map to . Here is the exponentiation ; see Table 3.
|Active cells||FED…210 62C…931 some patterns in binary||All values F…0 in hexadecimal|
= [0001, 1010, 0101, 1010, 0101, 1000, 0100, 1000, 0001, 1000, 0100, 1010, 0100, 1010, 0001, 1000, 0101, 1000, 0100, 1000, 0100, 1010, 0100, 1010]
The inverse function is therefore computed by the inverse update sequence = [1010, 0100, 1010, 0100, 1000, 0100, 1000, 0101, 1000, 0001, 1010, 0100, 1010, 0100, 1000, 0001, 1000, 0100, 1000, 0101, 1010, 0101, 1010, 0001].
5. Non-bijective functions
5.1. Non-bijective global rules and in-degree distributions
In Section 2.2, we have said that, in order to ensure bijectivity, we must first have a locally bijective CA, and furthermore, no two adjacent cells may fire simultaneously. Here is why:
We start with the effect of AS = “” for the locally bijective ECA-57. We consider four adjacent cells and the effect of between the two middle cells, which are thus updated simultaneously, .
We obtain the patterns 0011 and 0101 twice, while missing 0001 and 0111. Hence the image is smaller than the full by 2 or by a factor 7/8 in general.
We have the following in-degree distribution (loss pattern 4 of [, Table 8]):
Twelve patterns map bijectively (1,1) to 12 patterns, 4 patterns map 2:1 to 2 patterns, while 2 patterns of the range are not met at all (0,1).
We have #(domain) = in-degree#(range) for every in-degree. Also, the overall sum is
Any additional simultaneous firing may increase the losses. Additional in-degree distributions (loss patterns) for ECAs and QCAs may be found in (, Table 8).
We restate Theorem 3 (i) from :
We assume an ECA that generates at least the alternating group , when using temporal rules from . Let be any non-bijective function on symbols, where we require for ECA.
Let be the number of configurations leading to configuration . Then is clocking-computable by an ECA with in-degree distribution or , if and only if
Proof. See , Theorem 3. □
5.2. Algorithm for non-bijective functions
The computation of a non-bijective function can be decomposed into 3 steps:
Step I. We start in the middle: Shrink the singletons of the domain to the desired distribution on the range. Find a sequence of elementary steps, necessarily including non-bijective ones, which generates the same distribution of counts in the image space as for the original function. Usually there are more than one of these sequences with the same complexity. Values/patterns as such do not yet play a role; therefore the requirements are usually easy to meet in a variety of ways, and the sequence of necessary elementary steps is short.
Step II. Bijectively map the input of the original function to the input of any of the results of Step I in such a way that the desired image counts are matched. Only the occurence counts must be observed, while, within this restriction, pattern values can be permuted.
Step III. Bijectively map the output from Step I to the output of the desired function, considering the flow within Steps II and I in that order. Now all values matter and cannot be permuted. However, this step takes place on a subset of size instead of .
In Steps II and III we use the meet-in-the-middle approach, applying update rules starting both from the input (identity) and the desired function values to take advantage of the birthday paradox: In this way, patterns are sufficient to generate potential matches in the middle (here ), a considerable saving in both space and time.
5.3. Example of a non-bijective function
multiplication up to on the torus of size
The 4-bit input is interpreted as a pair of numbers from the set , whose product, the output, lies within the set .
5.3.1. Step I
The image consists of 7 patterns with occurence counts 7 (0000), 2 (0010,0011,0110), and 1 (0001,0100,1001).
In Step I, we thus have to shrink the set of patterns present from 16 to 7 in such a way that (any) 7 patterns are mapped to the same one, and additionally 3 sets of two patterns are joined within each set. Finally, the three remaining input patterns stay on their own.
An exhaustive search over sequences with 4 update rules, starting and ending with a non-bijective one, yields 72 such sequences with the desired shrinking factor, one example is the sequence of active cells.
|All patterns in hex||Active cells||Occurrence counts|
5.3.2. Step II
After Step I, the output pattern 2 has frequency 7. Therefore, 2 has to match 0 in Step III, while 0,1,2,3,4,8,C are matched to 5,7,8,9,A,B,E in any order in Step II. Thus, we already have different possibilities for Step II.
Similarly, we can maps 0110 and 1001 to either , , or and so forth. Let run through the occurrence counts, here , 2 and 1 are actually taken, and let then be the number of output patterns with in-degree , here , . We get
bijective functions consistent with the count distribution to choose from in Step II. One of these is given in the left column of Table 4.
|Step II||Step I||Step III|
5.3.3. Step III
The outcome of Steps I + II completely fixes the necessary permutation for Step III. However, we do not have to deal with values, but only are relevant, in our example 7 instead of 16. Again, meet-in-the-middle yields a match. Be aware that the necessary effort does not drop from 16! to 7! but only to or in general.
In total, 13 + 4 + 15 = 32 steps are required to compute this multiplication function (Table 5).
|IN||Step II||Step I||Step III||IN||Step II||Step I||Step III|
In the case of unfair bijective functions, any subset of cells may fire simultaneously, provided that no adjacent cells are contained in the set.
We define local efficiency of a bijective update sequence by two properties:
No cell is active during two consecutive time steps (these two would cancel each other).
No active cell can be moved to the previous time step.
Global efficiency—which shortest update sequence generates a certain function/permutation—is beyond the scope of this paper (it is dealt with implicitly by brute force in a breadth-first manner).
A sequence of rules is bijective and efficient, if cell active in step implies that.
and both inactive,
at least one of and equals 1, active.
is required for and then ensures bijectivity.
If was active, both and could be removed, since each single-location action is an involution. Thus is required.
If and were both inactive, the active cell could be moved to , maintaining bijectivity. Hence is required. □
Using this lemma, we now can compute an upper bound on the number of bijective functions realizable with update steps: We define a matrix with rows and columns indexed by the bijective elementary update rules, here named . Set , if rule followed by rule is efficient, otherwise.
Let be the largest eigenvalue of .
Observe that 10,100 followed by 00001 is efficient, while the other way round 00001 leaves room to move up to . Hence, is not symmetrical for .
In Table 6, we denote several figures describing the number of efficient bijective update schemes (including unfair ones). The number of rules is denoted by #; it can be found as A001610 in OEIS ; see also Section 2.2.2. The number of nonzero entries in the matrix A is #. The quotient #/#is the average number of 1’s in each row/column.
|17||3570||688,024||192||91.321||96.473||1.413e + 06||313,092|
|18||5777||1,519,586||263||124.571||55.699||3.008e + 06||623,547|
|19||9348||3,354,944||358||156.098||156.314||6.380e + 06||1.26e + 06|
|20||15,126||7,405,058||489||212.932||81.313||1.348e + 07||2.51e + 06|
|21||24,475||16,341,254||667||266.822||250.502||2.842e + 07||5.08e + 06|
|22||39,602||36,056,154||910||363.969||118.336||5.976e + 07||1.01e + 07|
|23||64,078||79,547,616||1241||456.085||397.758||1.253e + 08||2.04e + 07|
|24||103,681||175,485,442||1692||622.138||171.942||2.623e + 08||4.07e + 07|
We denote by #the number of effective bijective update schemes of length on a torus of size . Essentially, .
Let . Then gives the constant hidden in the Landau symbol .
We have .
is the number of steps necessary such that . Invoking the birthday paradox by the meet-in-the-middle approach, we therefore expect to require steps each, starting from the initial identity permutation and backwards from the desired function, to achieve a match yielding the function evaluation.
The asymptotic behavior, as far as we can infer from the range , is
We can compare the asymptotic number of elementary steps necessary for unfair update rules with the lower bound on the (full) steps for fair rules.
Lemma 4 (, Section 5, Lemma 1(i)).
There are clocking-computable bijective functions that require at least
steps or evaluations per cell.
Remarkably, as far as the experimental results for indicate, the number of elementary steps is only a constant factor () higher than the lower bound for full update steps. If that result remains valid for all , and the reported values strongly suggest this, this would mean that the typical full step can be replaced by no more than 2..3 elementary unfair steps, independent of .
As can be appreciated, might be feasible for an attack by meet-in-the-middle, but and above is certainly no candidate for this brute-force approach. Dealing with these sizes will require a more intelligent approach, for instance using group-theoretic techniques like representation theory, applied to the alternating group.
We have seen that many functions are clocking-computable, namely, the even bijective ones and the non-bijective ones with enough “loss” in their image.
The elementary cellular automaton ECA-57 can be used to implement an etherealware computing device: The computed function is a result only of the clocking order.
Temporal order of activating the CA cells is thus a new way to encode algorithms, a “volatilization of information”.
From onwards, we shall need more mathematical concepts, e.g., from group theory.
My thanks go to Dr. Mónica del Pilar Canales Chacón for proofreading, commenting, and all the rest. Furthermore, the anonymous referee gave valuable comments concerning the first draft of the paper, pointing out a substantial error and several occasions for clarifying the intended meaning.
//remainder, what to put interactively in GAP, no use as part of file