Time is one of the most fascinating and mysterious subjects in physics. In June, about 60 experts gathered to discuss the nature of time at the Time in Cosmology conference, as Dan Falk reported in *Quanta*. Some of the questions they considered included: What makes the arrow of time push forward inexorably from the past to the present to the future? Why is the time we call “now” so special to us in the real world, but not at all special in any way in physics? Why does entropy, the technical measure of the amount of disorder in the universe, increase as time moves forward, as the second law of thermodynamics dictates, even though the equations of physics are time-reversible? The contradictory qualities of the many ideas, models, suggestions and speculations presented at the conference highlight how little even the experts understand these fundamental questions.

Since a little hands-on modeling might give us a better appreciation of the issues, I thought we should do a very simple simulation of a toy universe ourselves. But first, let’s look at some ideas we can explore.

One of the motivations for the Time in Cosmology conference was the dissatisfaction felt by some physicists, including co-organizer Lee Smolin, with the standard “block universe” of physics — a static four-dimensional block of space-time in which the flow of time is mere illusion, and in which the future already exists. Smolin rejects this frozen block universe; in his latest book, *Time Reborn*, he presents arguments in favor of the commonsense view that time flows toward an open-ended future.

While Smolin is known to have iconoclastic views on a lot of modern physics, there are two general ideas of his that I agree with wholeheartedly. The first one, as I pointed out in my “Is Infinity Real?” solution column, is that the mathematical models used in physics are merely mental constructions and cannot be expected to encompass all of physical reality: The map is not the territory. The second is the idea that the Darwinian principle of natural selection is the only underlying source for the truly novel complexity we find not just in the biological world but everywhere in the universe. Smolin has applied this latter idea to his speculative theory of “cosmological natural selection,” which he explains in this short video.

You do not have to subscribe to Smolin’s full-blown theory of cosmological natural selection to see how a simple form of “natural selection” can apply to nonbiological processes such as the creation of stars, galaxies and chemical compounds. This idea, called stratified stability, was expounded almost 50 years ago by Jacob Bronowski, the author of the BBC documentary series *The Ascent of Man*. Biological natural selection requires random variability of offspring, which causes differences in fitness, resulting in differential survival and reproduction. Stratified stability, on the other hand, simply requires that random combinations of nonliving elements, such as atoms or molecules, form more complex structures, some of which, thanks to forces like gravity or electromagnetism, are more stable than others. These stable structures, whether physical aggregations or chemical compounds, “survive” to form even more complex structures, of which again the most stable persist, thus giving rise to layers upon layers of complex structures and varied chemistry, all formed by the nonbiological “natural selection” of inherent stability. In fact, this process, in principle, can explain how large aggregations like galaxies, stars and planets can form due to gravity and how the chemicals required for life could have originated by prebiotic chemical evolution.

With these ideas in mind, let me describe the rules of my toy universe.

*Space*: The entire universe consists of a 5-by-6 grid of nodes (4-by-5 cells), as shown.

*Atoms*: The universe is populated by seven red bars, each one unit long, which span two neighboring nodes on the grid horizontally or vertically.

*“Central Galaxy” or LCD*: The central 3-by-2 portion of this universe can be thought of as the familiar seven-segment liquid crystal display (LCD) from an old-fashioned calculator.

*Initial State*: The bars are all arranged on the seven segments of the central LCD to form the number “8.”

*Dynamic Laws*: Every second, all the bars move randomly as follows: One of the two ends of each bar moves to a new position, picked at random from among the neighbors of the node it was on (like a random walk, but constrained by the grid’s border). The other end of the bar follows randomly to one of the neighbors of the new position.
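In code, one move might look like the following sketch in Python. The grid dimensions (5 nodes across, 6 down) and the rule that either end may lead are my reading of the description; the puzzle leaves some details open.

```python
import random

W, H = 5, 6  # a 5-by-6 grid of nodes (my reading: 5 across, 6 down)

def neighbors(node):
    """Orthogonally adjacent nodes, constrained by the grid's border."""
    x, y = node
    cand = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(a, b) for a, b in cand if 0 <= a < W and 0 <= b < H]

def move_bar(bar):
    """One random move: a randomly chosen end jumps to a random
    neighbor of its node; the other end follows to a random
    neighbor of that new position."""
    a, b = bar
    lead = random.choice([a, b])
    new_lead = random.choice(neighbors(lead))
    new_trail = random.choice(neighbors(new_lead))
    return (new_lead, new_trail)

random.seed(0)
print(move_bar(((2, 2), (2, 3))))  # one random step of a vertical bar
```

Whatever the exact conventions, each move keeps the bar one unit long and inside the grid.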

*Entropy (Measure of Disorder)*: The system is considered to be most orderly if all the segments of the central LCD are occupied. The “entropy” of this initial state is zero. Maximum entropy is achieved when no bars touch any of the nodes of the central LCD “8.”

The value of the entropy is calculated as follows. Count the numbers of the following: unoccupied central LCD segments (*u*); the number of bars fully occupying a segment of the central LCD (there may be more than one bar occupying the same segment) (*b*); the number of bars touching a node of the central LCD but otherwise outside it (*t*). Then calculate the entropy using the formula (14 + 4*u* – 2*b* – *t*)/42. Why divide by 42? Well, that’s the secret of the universe, don’t you know?
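As a quick sanity check, the formula can be coded directly (a minimal sketch; u, b and t are assumed to have already been counted as defined above):

```python
def entropy(u, b, t):
    """Entropy from the counts defined above: u unoccupied LCD
    segments, b bars fully occupying LCD segments, t bars touching
    an LCD node but otherwise outside the LCD."""
    return (14 + 4 * u - 2 * b - t) / 42

print(entropy(0, 7, 0))  # initial "8": all segments occupied -> 0.0
print(entropy(7, 0, 0))  # fully scrambled: nothing touching -> 1.0
```

The two extreme cases confirm that the formula spans exactly the range 0 to 1.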

Obviously, this model can be modified and improved in many ways, but I was aiming for something that had enough complexity to tell us something interesting, while at the same time remaining simple enough to allow for intelligent guessing and estimation regarding its evolution. To those of you who enjoy simulations and have the interest, skills and time to do it, you are welcome to do so. Please share your simulations and insights with us! For the rest of you, we may post a downloadable file within a couple of weeks that will allow you to play with the model yourselves.

For now, try to guess or estimate the answer to the questions below. It’ll be fun to compare the answers to the results of the simulations.

Question 1:

Here’s a passage from Dan Falk’s article, where he talks about how entropy increases in the real universe and why a whole object does not spontaneously re-form from its pieces.

Scrambled eggs always come after whole eggs, never the other way around. To make sense of this, physicists have proposed that the universe began in a very special low-entropy state. In this view … entropy increases because the Big Bang happened to produce an exceptionally low-entropy universe. There was nowhere to go but up.

An analogue of the egg in our toy universe is the central number formed on the LCD. From the initial zero-entropy state, how long do you think it will take to get to the completely “scrambled” or maximum-entropy state? What will be the approximate value of the entropy when the universe reaches a “steady state”? How long will this take? Can it ever get back to the original minimal-entropy or “whole egg” state? If so, how long do you think this scenario will take?

Question 2:

Let’s introduce stratified stability into our universe, by adding the following rule: If a bar lands on an unoccupied segment of the central LCD, the bar stops moving and occupies that position permanently. The initial state of the universe is also different: Let’s assume that it is the most disordered state (maximum entropy), where none of the bars are touching the central LCD.

From this disordered start, how long do you think it will take to reach the minimum-entropy state and form the number 8 on the LCD? Assume that the movement rules are the same as before — one move per second.
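For readers who want a head start on a simulation, here is one hedged Monte Carlo sketch of Question 2. The placement of the central "8" on the grid is my guess (the puzzle's figure is not reproduced here), as is the convention that either end of a bar may lead a move; treat the coordinates as illustrative.

```python
import random

W, H = 5, 6  # nodes across and down (an assumption about orientation)

# Guessed placement of the central seven-segment "8" on the grid.
LCD_SEGMENTS = {
    frozenset(s) for s in [
        [(1, 1), (2, 1)],                    # top
        [(1, 2), (2, 2)],                    # middle
        [(1, 3), (2, 3)],                    # bottom
        [(1, 1), (1, 2)], [(1, 2), (1, 3)],  # left verticals
        [(2, 1), (2, 2)], [(2, 2), (2, 3)],  # right verticals
    ]
}
LCD_NODES = {n for seg in LCD_SEGMENTS for n in seg}

def neighbors(node):
    x, y = node
    cand = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(a, b) for a, b in cand if 0 <= a < W and 0 <= b < H]

def all_segments():
    return {frozenset([(x, y), n])
            for x in range(W) for y in range(H)
            for n in neighbors((x, y))}

def move(seg):
    a, b = tuple(seg)
    lead = random.choice([a, b])
    new_lead = random.choice(neighbors(lead))
    return frozenset([new_lead, random.choice(neighbors(new_lead))])

def time_to_eight():
    """One trial: start with 7 bars touching no LCD node, run until
    every LCD segment holds a permanently stuck bar."""
    far = [s for s in all_segments() if not (set(s) & LCD_NODES)]
    bars = [random.choice(far) for _ in range(7)]
    stuck = [False] * 7
    t = 0
    while sum(stuck) < 7:
        t += 1
        for i in range(7):
            if stuck[i]:
                continue
            bars[i] = move(bars[i])
            frozen = {bars[j] for j in range(7) if stuck[j]}
            # the sticking rule: stop only on an UNOCCUPIED LCD segment
            if bars[i] in LCD_SEGMENTS and bars[i] not in frozen:
                stuck[i] = True
    return t

random.seed(1)
trials = [time_to_eight() for _ in range(100)]
print(sum(trials) / len(trials))  # rough mean time, in "seconds"
```

Since each stuck bar occupies a distinct segment and there are exactly seven segments, the loop terminates precisely when the "8" is complete.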

Question 3:

In the universe of question 2, of the following three shapes that could appear on the LCD — the numbers 6 and 9 and the letter *A* — which one is most likely to be the last step before the number 8 is reached? Which is least likely?

Question 4:

Leaving aside my jocular reference to the number 42, can you figure out the rationale behind the entropy calculation?

Suggested puzzle enhancements are welcome. Did this exercise stimulate any ideas or conclusions?

Happy puzzling. Have fun simulating the universe and unscrambling eggs!

*Editor’s note: The reader who submits the most interesting, creative or insightful solution (as judged by the columnist) in the comments section will receive a *Quanta Magazine* T-shirt. (Update: The solution is now available here.) And if you’d like to suggest a favorite puzzle for a future Insights column, submit it as a comment below, clearly marked “NEW PUZZLE SUGGESTION” (it will not appear online, so solutions to the puzzle above should be submitted separately).*

*Note that we may hold comments for the first day or two to allow for independent contributions by readers.*

"Scrambled eggs always come after whole eggs, never the other way around. To make sense of this, physicists have proposed that the universe began in a very special low-entropy state. In this view … entropy increases because the Big Bang happened to produce an exceptionally low-entropy universe. There was nowhere to go but up."

So where did the low-entropy Universe come from? For that matter, didn't the whole eggs start out kinda scrambled, in the sense that, traced back far enough, they started out as stardust? Scientists always try to use thermodynamic arguments to explain the so-called arrow of time but they just don't hold water. And I question the whole "Big Bang" cosmology anyway. The Big Bang cosmology rests on the Singularity Theorem which rests on the assumption that there are no negative energies; however, LIGO, which is nothing but a refined version of a Michelson interferometer, recently detected gravitational waves but what's waving? I'll tell you what's waving: the ether! LIGO, being a refined version of the famous (infamous) Michelson-Morley experiment, irrefutably detected the ether; this should be headline news! Why isn't it? Because if you have an ether you have negative energies – Dirac's negative energy sea – and the whole Big Bang cosmology comes tumbling down!

Lee Smolin, in his book, "The Trouble with Physics," presents himself as Mr. Honesty, but then he goes on to say that if Einstein taught him anything it was the need for a background-independent theory, but Einstein himself knew that wasn't true. The Lorentz transformations are necessary precisely because of the background – the ether. Light propagates at a constant velocity relative to an absolute background – the ether. Space contracts and time dilates precisely in the manner necessary to make "c" constant in any reference frame. This caused Poincaré to say that the ether was not detectable, and the community, primarily due to social reasons, decided to pretend it just doesn't exist; this is no longer possible! And scientists continue to argue against Intelligent Design knowing very well that the odds of such a Universe appearing by random chance are effectively zero!

All of the meditative spiritual traditions state that there is a Vast, Active, Living, Intelligence permeating our Universe; in the yogic traditions this is referred to as the cosmic prana or the ether. There is no arrow to time outside of the human mind!

From the wonderful and often hilarious book of essays, "The Book of Time" (https://www.amazon.com/Book-Time-John-Grant/dp/0715377647), I quote:

‘The argument regarding the ether went as follows: if the Earth is moving through the ether, and if light moves at a constant velocity through the ether, a ray of light sent in the direction of the Earth’s motion and then back to its starting point should arrive later than a ray sent over an equal distance at right angles to the direction of the Earth’s motion through the ether. The experimental apparatus is illustrated in Fig. 3. The Earth moves around the Sun at a speed of some 30 kilometres per second and, although that is very small compared with the velocity of light, the sensitivity of the apparatus was such that this sort of motion through the ether was well within its range. In practice, no difference whatsoever was measured between the travel-times of the two beams. […]

When the results of such experiments had reluctantly been accepted some notable physicists attempted to preserve the notion of the all-pervading ether by rather DEVIOUS means (emphasis mine). Independently, in the 1890s, the Irish physicist George Fitzgerald and the Dutch physicist Hendrik Lorentz suggested that motion through the ether would affect measuring instruments by just the right amount to PREVENT MOTION THROUGH THE ETHER FROM BEING DETECTED. In particular, measuring rods (rulers, or whatever was used to measure length) would shrink in the direction of motion, and clocks would run slow. Any instrument designed to detect motion through the ether would thus fail to achieve its goal. Although this very convenient hypothesis in a sense “explained” the failure of the Michelson-Morley experiment, it also demonstrated – as the French mathematician Henri Poincaré pointed out – that the ether, if it existed, must always remain undetectable. […] Something which, even if it exists, is wholly undetectable in principle as well as practice, is of no value to science. The ether turned out to be a wholly useless concept. […]

Einstein had been concerned that, although Newtonian mechanics was unaffected by the motion of inertial frames (i.e. mechanical experiments would give the same results inside laboratories no matter how fast the laboratories were moving), electromagnetic phenomena (such as the propagation of light) appeared to be based upon one particular frame of reference – the ether. It seemed to him that there was no good reason why one set of physical experiments should not be affected by uniform motion while another set should. The relativity principle abolished that distinction and accounted for the failure of the Michelson-Morley experiment; the speed of the laboratory (the Earth) clearly had no effect on the experiment (the measurement of the time taken for rays of light to cover equal distances in different directions). The second postulate was that light travels through a VACUUM (emphasis mine) at a constant velocity in all inertial frames. In other words, the velocity of light measured by an observer is the same regardless of the relative velocity of the observer and the source of light. This appears to be nonsense . . . […]

The fusion of the relativity principle and the constancy of the speed of light came in the special theory of relativity, and with its coming the whole idea of an ether, and the associated concept of absolute space, was discarded. The whole basis of more than two centuries of established physics was swept away at a stroke. […]

So now, who’s really being DEVIOUS?

http://www.physics.arizona.edu/~rafelski/PS/0903EcolPolColl.pdf

https://www.math.auckland.ac.nz/~king/Preprints/pdf/Transup.pdf

That's soooo easy…

There are 12 zodiac signs and the yin/yang tells us that there are 2 facets to everything… that makes 24…

and if you're dyslexic, that's 42… it's so simple…

Your definition of entropy is incorrect. Entropy is the logarithm of the number of microstates that satisfy a given macrostate description. Entropy is a measure of one's knowledge of the state of a system; it is not an intrinsic property of the state itself.

One could choose a measure for your macrostate description by counting the number of ends of the bars that match the central pattern you specify. There is only one configuration with all ends in the central pattern, so that macrostate description has an entropy of ln(1) = 0. The macrostate with all ends but one in the central pattern has an entropy of ln(14×3) = ln(42) ≈ 3.74, etc.

E.T. Jaynes has already explained how time-symmetric dynamics can still lead to time-asymmetric predictions of diffusion (e.g. diffusion of sugar in a glass of water) in his paper "Clearing Up Mysteries – The Original Goal", in the "Diffusion" section. While the physical dynamics may be symmetric, our probabilistic inferences are not. If we know where a molecule of sugar is in the water right now, we can predict where it will be in the future (we expect it to be right where it is now, on average) independent of wherever all the other sugar molecules have been. However, the reverse isn't true. If we know where a molecule of sugar is in the water right now, our prediction of where it was in the past does depend on our knowledge of the historic sugar density of the glass of water. "It is not the dynamics, but the prior information, that breaks the symmetry and leads us to predict non-zero flux."

The answer to the question (why does 'entropy' always increase with 'time' when 'time' is symmetric) is that it doesn't. If the Universe were in a state of highest entropy – which by the law itself is more expectable than its current one by hundreds of orders of magnitude – entropy would fluctuate up and down with 'time' around a point of thermodynamic equilibrium. The question is one of many illusory puzzles produced by our own subjective position as evolved dissipative structures on an entropy slope facing away 'in time' from the 'big bang'. In other words, a more objective question might be 'why are we on an entropy slope when the laws of entropy imply we should be close to equilibrium'; why the big bang?

1) Calculate probability for the 9 different kinds of points where the first end of the bar might land and use that to calculate the probability of any segment being selected.

> Probability of a bar not being in the eight = 127/180

> Probability of none of the bars being in the eight = (127/180)^7

> Expected time to max.entropy = (180/127)^7 = 11.48898 sec

Similar calculation for forming an eight back

> Expected time back to eight = 555428571.4 sec ~ 17.6 years

3) All seem equally likely with a 6/9/A converting to an 8 with probability 1/60.
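The arithmetic in the comment above checks out numerically (taking the commenter's probabilities as given; I have not re-derived 127/180):

```python
# If a configuration is "clear of the eight" with probability
# p = (127/180)^7, the expected waiting time for such a configuration
# is 1/p (geometric distribution), as the commenter computes.
p_clear = (127 / 180) ** 7
print(round(1 / p_clear, 3))  # about 11.489 seconds

# The quoted return time of ~5.55e8 seconds is indeed ~17.6 years.
print(round(555428571.4 / (365.25 * 24 * 3600), 1))  # 17.6
```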

I created a simulation structure in Python to explore the questions posed in this article, and I have made it available on Github if anyone wants to use it for their own exploration.

https://github.com/Ericlafevers/time_puzzle_simulation

The code is contained in an IPython notebook, so if you are familiar with those or willing to do a little bit of Internet searching, it should be straightforward to start using the simulation.

(I'm not sure if this is what was meant by "Please share your simulations and insights with us!" but hopefully this is useful and appropriate)

To avoid confusion, I make the following statement as a strict Materialist: I think that as long as we ignore the environment, or the surroundings, of a "closed" system, we'll forever misunderstand time's arrow and entropy. For as long as scientists are hovering over their experiments, taking measurements and making analyses, without including themselves in the system, a very significant variable will be ignored.

Consider this: suppose intelligent life (e.g., not human) were to multiply across the Universe to the extent that it could use various machines, and knowledge, to control the outcome of Expansion. Say, by preventing the Big Crunch or the Cold Death; how firm a law is entropy then?

Thanks for the interesting puzzle (I'm not much of a puzzle-doer myself, but I enjoy a good premise).

I'm doing research on how to apply ideas of natural selection to non-biological systems and I experienced minor whiplash when you mentioned a paper on the topic that I was unaware of. You write, "a simple form of “natural selection” can apply to nonbiological processes such as the creation of stars, galaxies and chemical compounds. This idea, called stratified stability…"

I had to read Bronowski's paper for myself. I don't think he puts it quite the way you do. While I find his use of terminology somewhat loose to begin with – more rigorous definitions of natural selection and evolution, often from the newer sub-discipline of philosophy of biology, started gaining more attention shortly after this paper was written – I think he explicitly distinguishes natural selection and stratified stability as different kinds of concepts that are both components of broader evolutionary theory. He does not say, or imply (as far as I can tell), that stratified stability is a "simple form" of natural selection. Whereas he takes stratified stability to be present in biological and non-biological systems, he seems to say that natural selection is what distinguishes the biological from the non-biological. For example, he writes, "There are evolutionary processes in nature which do not demand the intervention of selective forces. Characteristic is the evolution of the chemical elements, which are built up in different stars step by step, first hydrogen to helium, then helium to carbon, and on to heavier elements."

I agree with you about the theoretical potential of applying the concept of natural selection to non-biological systems, but I don't think that's what Bronowski himself is doing. I've been keeping tabs on instances of earlier attempts to use natural selection theory on non-biological systems, so if you know of any others, I'd be most curious to see them.

Nothing is 'random'; every action and reaction is governed by a finite (but large) number of predictable variables. That we do not currently have the intellectual capacity to understand that this means that there is no such thing as objective time – or entropy – does not make it any less factual.

There is a method to reduce discrete lattices like the above into one dimensional maps.

One first notes that replacing each edge (together with the two vertices it spans) by a dot maps the above onto a dual lattice, with each new central vertex, filled or not, corresponding to each bar being present or not. The original is thus reduced to a lattice with 8 symbols (an occupation number for each edge). Any such lattice can be put into exact correspondence with the set of all octal strings in the interval [0, …, 8^55 − 1] (2^165 strings in total). The reason for 55 is simply that each row in the original lattice contains 5 horizontal places for a bar and 6 vertical ones immediately below, hence one gets a sequence of {5, 6} × 5 total places.

Then the central "8" neighbors inside each 55-symbol string correspond to the positions {14, 19, 20, 25, 30, 31, 36}; the relevant positions for finding u and b lie inside this range, while t requires positions ±1 of the above. When projected onto the integers this way, the particular entropy function may present certain regularities due to the natural recursive structure of similar morphisms over integers as strings.

Estimation of the "transition time" in this picture is then associated with the combinatoric partition of the 8^55 strings into disjoint sets of different occupation numbers (as symbols in the range [0…8]), which requires multi-index notation as in the case of a multinomial distribution. Given that the original eight-shaped configuration corresponds to a single binary string with ones in positions {14, 19, …, 36}, the probability of it being reached by any sufficiently fair, flat random-number process appears to be negligibly small, and apparently it would need an enormous time to make a complete recurrence, although there is no meaning in a single "time" here, rather an ensemble of trajectories.

Some initial thoughts on the puzzle in general, which may help resolve some ambiguities and possible confusions.

1. The "maximum entropy state" is defined here to be any configuration of the bars that does not share any nodes with the initial configuration. Thus, "state", as used in this puzzle, is clearly a very different notion from "configuration". In particular, a state is some set of configurations.

One might use the terms "microstate" and "macrostate" to make this distinction, but to keep consistency with the language of the puzzle I will use "configuration" and "state".

2. Even the notion of a "bar" and a "configuration of bars" needs some care here, it seems. I note that a "bar" does not carry an "arrow", i.e., it is defined solely by the segment of the graph that it occupies. It is the same bar if one swaps its two ends. A configuration of the seven bars is therefore a list of seven segments. Some of these segments can be the same: for example, all bars can occupy the same segment.

The puzzle is ambiguous as to whether the bars are inherently distinguishable, e.g., each carrying a distinct number or colour, etc. I think it is intended that they are all inherently identical. This means that if any two bars in a given configuration are swapped, the configuration itself remains the same. This is like the notion of indistinguishable particles in quantum mechanics. In fact, since bars can occupy the same segment, they are like bosons in quantum mechanics.

It follows that the "initial state" consists of exactly one configuration (rather than, say, 7! configurations or even 7!×2^7 configurations). In contrast, the "maximum entropy state" corresponds to the set of configurations that have no segments touching the initial configuration. Since there are 26 such segments, and each of the 7 bars can occupy any one of these segments, it follows that the maximum entropy state corresponds to a set of up to 26^7 distinct configurations (fewer once the indistinguishability of the bars is taken into account).

3. Even making the above distinction between "configuration" and "state", there is still an ambiguity in "Question 2", which talks about taking the "maximum entropy state" as the initial state. The dynamics only specify how to update an initial configuration.

However, it seems natural that one is intended to update a "state" by updating each one of the configurations that it comprises. This seems to be sufficient for attacking Question 2.

Note that one could alternatively associate a state with a probability distribution over the set of all possible configurations. However, the "maximum entropy state" is then not well defined, since no such probability distribution is specified. The obvious choice for this state would be an equally probable distribution over the set of configurations that it comprises. However, the statistical entropy of this distribution is not the same as the entropy defined in the puzzle (see also item 5 below), and it seems best to ignore this issue as not important to the puzzle per se.

4. The dynamics refers to how the configuration changes from one time to the next. That is, the list of seven segments is updated to a new list of seven segments. The dynamics is not deterministic, as each bar moves to a new segment (or the same one) with some probability.

Since the rules specify that at least one end of a bar must move, and the other end must move to an adjacent node, the central bar in the initial state can move to any one of 23 segments (including staying where it is by swapping its ends). 9 of these can be reached in two ways, and 14 in only one way, with each way being equally probable. The first 9 segments thus each have probability 1/16 of being reached, and the other 14 each have probability 1/32.

A bar not occupying the central segment is limited in the number of possible moves due to the boundaries of the grid, and so has a different set of probabilities.
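As a quick check (taking the quoted counts as given), the probabilities for the central bar do sum to one:

```python
from fractions import Fraction

# 9 segments reachable two ways at 1/16 each, plus 14 reachable one
# way at 1/32 each, over the 32 equally likely two-step choices.
total = 9 * Fraction(1, 16) + 14 * Fraction(1, 32)
print(total)  # 1
```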

5. The definition of an entropy function as a property of a configuration, rather than as a property of a state, is not at all standard. Usually, the entropy of a state in physics is a measure of randomness, such as

(i) the logarithm of the number of (distinguishable) configurations compatible with the state, or

(ii) -\sum p_j log p_j where p_j is a probability distribution over a set of configurations associated with the state, or

(iii) the number of binary digits needed to identify or to specify a random configuration compatible with the state at a given time.

However, "entropy" in the puzzle is, on the one hand, defined to be a function of the configuration, but, on the on the hand, to be a property of a state (such as the "maximum entropy state").

It seems, to make sense of this, that only some "states", i.e., only some sets of configurations, have a well-defined "entropy". These are the states that consist of configurations that all have the same value of the entropy function.

I note this last issue doesn't seem to be important to the Questions per se, but it is likely to cause some confusion as "entropy" is not "statistical entropy".

Entropy is not always destructive. It could be perfectly constructive as well as destructive. You cannot compare the universe with an egg. First we have to get over this idea of the big bang. Starting wrong, everything is wrong. The universe came from outside to inside (without the big bang), not from the inside out (with the big bang). It started only with energy, which turned into matter. In this matter, everything changed: stars, star clusters, quasars, galaxies, galaxy clusters, etc. were created (positive entropy, later followed by the destructive entropy). See explanations on the blog "Looking at the universe."

In regards to your comment on natural selection, I believe that you seem to have fallen into the same trap almost everyone else does:

"Biological natural selection requires random variability of offspring, which causes differences in fitness, resulting in differential survival and reproduction."

"Survival of the fittest" has nothing to do with the health, agility, or robust nature of the offspring. Instead, "fittest" – in the Victorian vernacular – means "fit for purpose", as in "The thing most suited to its environment".

Ashish: That is impressive. However, those results are likely only if the LCD toy conforms to the mathematical transforms that are perceived by the bio-cognition of this era. What if that cognition is incomplete? It ought to be, otherwise why are the present day mathematicians not able to compute our next minute, hour, day or week?

6 and 9 have the same shape in that case.

Question 1:

There are 49 segments, and 7 distinguishable bars can occupy these in 49^7 ways (and 7 arrowed bars in 2^7 × 49^7 ways). But, as per my previous comment on the various ambiguities in the puzzle, it appears that configurations are to be treated as if the bars are indistinguishable (thus, the "initial state" corresponds to a unique configuration, not to 7! or 2^7 × 7! configurations).

This means each configuration of bars corresponds to M = 49 nonnegative integers, n1, n2, …, n49, that add to 7. The number of possible configurations for M segments, C(M), can be found by noting that 1/(1-x) = sum_n x^n, so that

f(x) = 1/(1-x)^M = sum_{n1, …,nM} x^{n1 +…+ nM} .

It follows that C(M) is the Taylor series coefficient of x^7, i.e.,

C(M) = (1/7!) d^7 f/dx^7|_x=0

= M(M+1)…(M+6)/7! .

So, the total number of distinct configurations is C(49), and the number of configurations with maximum "entropy" is C(32) (since there are 32 segments that don't touch the initial configuration). I would expect, given the random nature of the dynamics, that the probability of being in any particular configuration after a long time is equal to that of being in any other configuration, i.e.,

p = 1/C(49).

Hence, the probability of having a maximum "entropy" configuration (see previous comment on ambiguities), at any time following a long time, is estimated to be

p(max "ent") = C(32)/C(49).

Further, the probability of returning to the initial configuration (or to any other configuration) after a long time is similarly estimated to be

p(return) = 1/C(49).

I can only wave my hands as to the average recurrence time as thinking about passing on average through every configuration, so that it is on the order of

T_return ~ 1/p(return) = C(49).

Finally, to estimate how long the initial configuration might take to move to a maximum "entropy" configuration, I guess one can think of the dynamics as a random walk through configurations, and so it will take N steps to diffuse a radius of sqrt{N} configurations. I don't really know distances between configurations, so I will instead just guess

T_max"ent" ~ 1/p(max "ent") = C(49)/C(32).

Question 2:

Since there is a nonzero probability that two or more bars will "stick" onto the same segment of the 8 and never move, the time to get to the 8 state, averaged over all dynamics, is infinite.

Question 3:

I count 14 equally likely ways of moving from 6 or 9 to the 8 in one step, and 20 equally likely ways of moving from A to the 8 in one step. So, it appears that 6 and 9 are each less likely penultimate configurations than A.

Question 4:

I guess the rationale is to ensure the "entropy" ranges between 0 and 1, inclusive.

> — the numbers 6 and 9 and the letter A —

What about number 0?

@Michael > Since there is a nonzero probability that two or more bars will "stick" onto the same segment of the 8, and never move

The additional rule in question 2 says: “If a bar lands on an unoccupied segment of the central LCD, the bar stops moving and occupies that position permanently.”

I guess that the word “unoccupied” changes a lot of things! 😉

@Ethaniel

Thanks for pointing that out – oops!

So, I guess an upper bound for question 2 would be one half the return time from Question 1, which could be estimated as

C(49)/2 ~ 1 x 10^8 s.

Perhaps a better bound can be found by considering individual bars and lots of handwaving. Each bar moves randomly and independently among the 49 segments until becoming stuck in an "8" spot. If the probability that a given bar does not reach the 8 region after n moves is p(n), which clearly must decline with increasing n, then the probability that no bar has reached the 8 region is p(n)^7, which becomes tiny quite quickly. So I would estimate an upper bound of

T_1 ~ 32/7 ~ 4s

for the first bar to move from one of the 32 "maximum entropy" segments to one of the 7 "8" segments. During this time, the other 6 bars will become randomly spread over the 49 segments, and continue to move randomly (assuming no double stick), and so one can similarly estimate a further upper bound

T_2 ~ 49/6

for the average time for the next bar to become stuck on one of the remaining 6 segments, and so on, giving a final upper bound of

T_8 = T_1 + T_2 + … + T_7

= 32/7 + 49[ 1/6 + … + 1/1]

~ 125 s.

for the average time to coalesce back to an "8". This does seem a bit low though! Only 8 times longer than my previous estimate of

T_max"ent" = C(49)/C(32) ~ 16s

to move out from the 8 to a "maximum entropy" configuration.
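For what it's worth, the arithmetic of the harmonic-sum bound above checks out (a quick sketch of the handwaving estimate):

```python
# First bar: ~32/7 ticks; then ~49/k ticks for each remaining slot,
# k = 6 down to 1 (the handwaving upper bound described above).
T = 32 / 7 + 49 * sum(1 / k for k in range(1, 7))
print(round(T, 1))  # 124.6, i.e. ~125 s
```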

Maybe one should use ideas from the previous month's Drunkard's Walk problem and square everything… extend the grid to a repeated lattice… no, I don't really want to do that.

Q1A: How long do you think it will take to get to the maximum-entropy state?

Very quickly; usually within a few ticks the system will reach maximum "entropy".

Q1B: What will be the approximate value of the entropy when the universe reaches a “steady state”?

There will be no "Steady State" but fluctuations usually near maximal "entropy"

Yet rarely reaching low "entropy" states, including, very rarely, the zero "entropy" state.

Q1C: Can it ever get back to the original minimal-entropy or “whole egg” state?

Yes.

Q1D: If so, how long do you think this scenario will take?

On average 1/(PERMUT(42,7-b)*PERMUT(7,b)/PERMUT(49,7)) or 85,900,584 ticks.

Yet in some "universes" zero entropy will be restored at tick 3.

In other “universes” zero entropy is never restored.

Q2A From a [random] disordered start and "sticky" evolution,

how long do you think it will take to reach the minimum-entropy state?

Not very long, about 100 seconds on average.

Yet in some "universes" zero entropy will happen in 3 ticks.

And in other “universes” it will never happen.

Q3A: In the “Sticky” universe, of the following three shapes

the numbers 6 and 9 and the letter A

which one is most likely to be the last step before the number 8 is reached?

For a random maximal-entropy start, 6 and 9 must have the same likelihood by left-right symmetry.

It is not obvious that the A likelihood is identical with the 6/9 likelihood as the system lacks rotational symmetry.

I believe the A penultimate end state is less likely than the 6 or 9 state

as the average bar density will be slightly higher closer to the boundary

and the top and bottom bar will tend to fill in more quickly.

But there is another penultimate state, the O state, which is even more likely than 6 or 9. There are also the backwards 6 and 9 states that are the same likelihood as 6 and 9.

Q3B: Which is least likely?

Penultimate A

Rarer is two or more bars falling into place on the last tick.

The rarest of all is all 7 falling into place on the last tick.

Q4: can you figure out the rationale behind the entropy calculation?

Yes.

You want the central configuration to be zero “entropy”.

You want all bars totally away from central configuration to be one.

You want all central (but not all central covered) to be >0 but low.

You chose a not touching bar to have twice the entropy of a touching bar (arbitrary).

14 is just equal to 2b when b=7 and u=0 (so the all central configuration is zero “entropy”).

42 is just equal to the numerator at maximum “Entropy” of b=0,u=7,t=0

yielding 14+4*7=14+28=42…so the final "Entropy" is between 0 and 1.
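A tiny check of this reading of the rationale (assuming, as reconstructed from the coefficients above, that the measure is S = (14 + 4u − 2b − t)/42):

```python
from fractions import Fraction

# S = (14 + 4u - 2b - t)/42, with u = unoccupied LCD segments,
# b = bars on the LCD, t = bars touching the LCD (reconstructed form).
def S(u, b, t):
    return Fraction(14 + 4 * u - 2 * b - t, 42)

assert S(u=0, b=7, t=0) == 0  # the "8": zero "entropy"
assert S(u=7, b=0, t=0) == 1  # nothing on or touching the LCD: "entropy" 1
```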

I found playing with this got me thinking about reality in a few interesting ways.

Firstly related to Everett’s Theory of the Universal Wave Function and the fascinating issue of the computing of “probabilities” of various pathological universes. In your toy multiverse there are “some” pathological “universes” that never return to zero “entropy”, but these all have probability zero.

Next is the inherent randomness…This made me wonder about de Broglie–Bohm Theory with results indistinguishable from randomness, yet actually deterministic. A computer simulation of your toy multiverse would require a rand() function, which, although it may appear to be random, is not. A quantum source of randomness could be used, but it may actually be deterministic as well.

Finally, there were thoughts about the nature of time and space. Your toy had an inherent tick and a fixed background. It lacked background independence. It was non-relativistic. It had particles (the bars). All classical concepts that modern physics questions. This makes me wonder… is there a simpler toy model that is background independent, fully relativistic, deterministic/reversible, non-particle, and completely discrete?

Thanks for the puzzle!

Thanks for putting this puzzle Pradeep; it was really interesting! Here is my approach to the problem, mostly aimed at answering Question 1.

In the setup we have 7 bars on the edges of a grid, able to move randomly to nearby edges. Each bar is indistinguishable, non-interacting, not subject to any forces, and may occupy the same position as another. Physically we can think of these as 7 "free bosons" on a discrete spacetime (with boundary). Since the bars are non-interacting, we may decouple this system and study the dynamics of each bar separately (a huge simplification, as there are only 49 states for a single bar, whereas there are roughly 49^7/7! multi-particle states). We can then use this information to analyze statistical quantities which depend on the multi-particle state, such as the entropy.

To study the dynamics of a single bar, first note that there are 49 total states a bar may be in at a given time. These states are labeled by the edges of the grid. The update rule (crucially) does not depend on the previous state of the bar. We may therefore model the behavior using a finite state Markov chain (this is equivalent to describing the motion of a random walker on a graph where the nodes in the graph are the states and there is an edge from state a->b iff it's possible for the bar to move from edge a to edge b). We need to construct our probability transition matrix P, a 49×49 matrix whose entries are P_ab, the probability to go from state a to state b. This is given by (#moves ending on b)/(#moves available). This is easy to do with a computer (I used Mathematica).
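As a sketch of the same construction in Python (the commenter used Mathematica; the grid is taken to be the 6×5 array of nodes implied by the edge coordinates quoted in this comment):

```python
# Build the 49 edge-states of the 6x5 node grid and the transition matrix
# P_ab = (#paths a -> b)/(#paths available from a), where a path is:
# choose an end, move it to a neighboring node, then move the other end
# to a neighbor of that new node.
def neighbors(node):
    x, y = node
    return [(a, b) for a, b in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
            if 0 <= a <= 5 and 0 <= b <= 4]

nodes = [(x, y) for x in range(6) for y in range(5)]
edges = sorted({tuple(sorted((n, m))) for n in nodes for m in neighbors(n)})
assert len(edges) == 49

P = {}
for a in edges:
    counts = {}
    for end in a:
        for new_end in neighbors(end):
            for other in neighbors(new_end):
                b = tuple(sorted((new_end, other)))
                counts[b] = counts.get(b, 0) + 1
    total = sum(counts.values())
    P[a] = {b: c / total for b, c in counts.items()}

# Rows sum to 1, and a corner bar stays put with probability 2/15.
corner = ((0, 0), (0, 1))
print(P[corner][corner])  # 2/15 = 0.1333...
```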

With P in hand (we should check that the row sums are 1), we can compute its eigenvalues to determine the long term behavior. It turns out we have a rank 29 matrix with eigenvalues (abs. values listed in decreasing order):

{1., 0.842682, 0.77303, 0.60501, 0.534109, 0.408684, 0.322904, 0.270717, 0.258056, 0.147444, 0.12501, 0.116822, 0.109126, 0.108336, 0.10559, 0.0986584, 0.0973698, 0.09545, 0.082835, 0.0713484, 0.065473, 0.0571961, 0.0513396, 0.0395642, 0.039514, 0.0305805, 0.0298848, 0.0115633, 0.00986795, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.}

For us it's only important to note that we have a single eigenvalue of 1 (and we'll make use of the second eigenvalue later). This tells us that we will approach a stationary distribution, i.e. lim_{t->infty} P^t = 1*pi^T, where 1 is the all-ones column vector and pi is a vector whose entries (pi)_a tell us the probability of finding the bar in state a. Pi can be computed as e/sum(e_i), where e is the eigenvector of P^T with eigenvalue 1 (i.e. the left eigenvector of P). It turns out:

pi={0.0129758, 0.0129758, 0.016436, 0.0198962, 0.016436, 0.0216263, 0.0129758, 0.0198962, 0.0129758, 0.0198962, 0.016436, 0.0250865, 0.0250865, 0.0250865, 0.0268166, 0.0198962, 0.0250865, 0.016436, 0.0216263, 0.017301, 0.0268166, 0.0259516, 0.0268166, 0.0276817, 0.0216263, 0.0259516, 0.017301, 0.0216263, 0.016436, 0.0268166, 0.0250865, 0.0268166, 0.0268166, 0.0216263, 0.0250865, 0.016436, 0.0198962, 0.0129758, 0.0250865, 0.0198962, 0.0250865, 0.0216263, 0.0198962, 0.0198962, 0.0129758, 0.0129758, 0.016436, 0.016436, 0.0129758}

corresponding to edges

{{{0, 0}, {0, 1}}, {{0, 0}, {1, 0}}, {{0, 1}, {0, 2}}, {{0, 1}, {1, 1}}, {{0, 2}, {0, 3}}, {{0, 2}, {1, 2}}, {{0, 3}, {0, 4}}, {{0, 3}, {1, 3}}, {{0, 4}, {1, 4}}, {{1, 0}, {1, 1}}, {{1, 0}, {2, 0}}, {{1, 1}, {1, 2}}, {{1, 1}, {2, 1}}, {{1, 2}, {1, 3}}, {{1, 2}, {2, 2}}, {{1, 3}, {1, 4}}, {{1, 3}, {2, 3}}, {{1, 4}, {2, 4}}, {{2, 0}, {2, 1}}, {{2, 0}, {3, 0}}, {{2, 1}, {2, 2}}, {{2, 1}, {3, 1}}, {{2, 2}, {2, 3}}, {{2, 2}, {3, 2}}, {{2, 3}, {2, 4}}, {{2, 3}, {3, 3}}, {{2, 4}, {3, 4}}, {{3, 0}, {3, 1}}, {{3, 0}, {4, 0}}, {{3, 1}, {3, 2}}, {{3, 1}, {4, 1}}, {{3, 2}, {3, 3}}, {{3, 2}, {4, 2}}, {{3, 3}, {3, 4}}, {{3, 3}, {4, 3}}, {{3, 4}, {4, 4}}, {{4, 0}, {4, 1}}, {{4, 0}, {5, 0}}, {{4, 1}, {4, 2}}, {{4, 1}, {5, 1}}, {{4, 2}, {4, 3}}, {{4, 2}, {5, 2}}, {{4, 3}, {4, 4}}, {{4, 3}, {5, 3}}, {{4, 4}, {5, 4}}, {{5, 0}, {5, 1}}, {{5, 1}, {5, 2}}, {{5, 2}, {5, 3}}, {{5, 3}, {5, 4}}}

I can't post pictures, but to give an idea this looks like a smoothly varying distribution on the grid which is top/bottom & left/right symmetric, strongest in the middle, fades as you move towards the boundaries, and is weakest in the corners. This makes sense as the boundary can be thought of as exerting a sort of "pressure" – the number of available places to move is smaller when you are on a boundary, and less still if you're in a corner. Note also that since we converge to a stationary distribution, the long term behavior is *independent of the initial state*.

Let's make use of that second eigenvalue to answer the question "how long does it take to get to a stationary state?" First of all, note that we are never going to converge to a single state, but rather to this stationary distribution – a fixed statistical mixture of states. This makes sense as all states communicate. The time it takes to reach the mixed state is roughly the time it takes for |lambda_2|^t to go to zero, where |lambda_2| = 0.842682 is the second largest eigenvalue (if we analyze P^t using eigenvectors, the eigenvector corresponding to 1 will stay present but all others will decay to zero). This is a bit of a matter of opinion, but for t=27, |lambda_2|^t < .01. This can be verified by measuring the entropy in a simulation – you'll see it quickly climb from zero, and within about 20 time steps the entropy will start jumping around "randomly" (we'll see later that the distribution of entropy values is predictable).
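The t = 27 figure is just arithmetic on the second eigenvalue (a quick check):

```python
lam2 = 0.842682  # second-largest eigenvalue magnitude from above
# The subleading contribution to P^t decays like lam2**t;
# it first drops below .01 at t = 27.
print(lam2 ** 26, lam2 ** 27)  # ~0.0117 and ~0.0098
```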

On to the multi-particle states. The state of the system can be described by |phi>=|phi_1,…,phi_7> where phi_i is one of the 49 states (edges on the grid). We don't want to distinguish states which look the same i.e. |phi_\sigma(1),…,phi_\sigma(7)> describe the same state where \sigma is any permutation of 1,..,7. Since our system reaches equilibrium so quickly (we'll justify why t=27 is fast later), we can assume the system is already in equilibrium when computing p(|phi>) = probability of being in a multi-particle state.

p(|phi>) = 7!*p(phi_1)*…*p(phi_7) = 7!*pi_(phi_1)*…*pi_(phi_7) (when the phi_i are all distinct; states with repeated positions carry a smaller combinatorial factor, a correction we ignore here)

where again pi is the stationary probability distribution. Knowing the probability for each multi-particle state allows us to in principle compute the expectation value of the entropy.

E[S] = sum_|phi> S(|phi>)*p(|phi>)

It turns out we can compute this in practice as well. We rely on two properties of the entropy function:

S is linear in u, t, b

u, t, b can be split into sums/products of single particle states

b(|phi>) = #bars occupying central LCD = b_1(|phi>) + … + b_7(|phi>)

where b_i(|phi>) = {1 if phi_i is in LCD, 0 if not.

t(|phi>) = #bars touching LCD = t_1(|phi>) + … + t_7(|phi>)

where t_i(|phi>) = {1 if phi_i is touching the LCD, 0 if not. u is harder, but doable.

u(|phi>) = # unoccupied LCD segments = u_1(|phi>) + … + u_7(|phi>)

where u_i(|phi>) = (1 - ind(chi_i == phi_1))*…*(1 - ind(chi_i == phi_7)) and ind(chi_i == phi_j) = {1 if phi_j == chi_i, 0 else}. The chi_i are just labels for the edges in the LCD, so i runs from 1, …, 7. The u_i are constructed so that if any of the phi_j are on the spot chi_i, one of the factors in the product is zero, so the whole thing is zero. The only way to be nonzero is if all the phi_j are not equal to the LCD segment chi_i, in which case you get 1.

E[S] = (1/3)*E[1] + (2/21)*E[u] – (1/21)*E[b] – (1/42)*E[t]

E[b] = E[b_1] + … + E[b_7]

Let's compute E[b_1] for example – the others are identical.

E[b_1] = sum_|phi> p(|phi>)*b_1(|phi>) = sum_{phi_1,…,phi_7} pi_(phi_1)*…pi_(phi_7) * b_1(phi_1) = (sum_{phi_1} pi_(phi_1)*b_1(phi_1))*(sum_{phi_2} pi_(phi_2))*…*(sum_{phi_7} pi_(phi_7)) = sum_{phi_1} pi_(phi_1) b_1(phi_1) * 1 * … * 1 = sum_{edge in LCD} pi_edge

Therefore we just have to sum the probabilities of the edges in the LCD using our probability distribution pi. It turns out:

E[b] = 7E[b_1] = 378/289

E[t] = 7E[t_1] = 973/578

E[u] = 3994796502525175107469/689675577587306205184

Therefore E[S] = 5667379099208577310093/7241593564666715154432 ~ .782615
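The E[b] value can be reproduced exactly. In this chain the number of paths a -> b equals the number of paths b -> a, so by detailed balance the stationary weight of an edge is proportional to the number of moves available from it; the sketch below assumes this, and also assumes the central LCD sits on the 2×3 block of nodes with coordinates (2..3, 1..3). Both are reconstructions on my part, but together they reproduce the quoted fraction:

```python
from fractions import Fraction

def neighbors(node):
    x, y = node
    return [(a, b) for a, b in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
            if 0 <= a <= 5 and 0 <= b <= 4]

nodes = [(x, y) for x in range(6) for y in range(5)]
edges = {tuple(sorted((n, m))) for n in nodes for m in neighbors(n)}

# Moves available from edge a; stationary pi_a is proportional to this count
# (assumption: path counts are symmetric, so detailed balance applies).
def moves(a):
    return sum(len(neighbors(new_end)) for end in a for new_end in neighbors(end))

total = sum(moves(a) for a in edges)
pi = {a: Fraction(moves(a), total) for a in edges}

# Assumed LCD: the figure-8 on the central 2x3 block of nodes.
lcd = [((2, 1), (3, 1)), ((2, 2), (3, 2)), ((2, 3), (3, 3)),
       ((2, 1), (2, 2)), ((2, 2), (2, 3)), ((3, 1), (3, 2)), ((3, 2), (3, 3))]
Eb = 7 * sum(pi[e] for e in lcd)
print(Eb)  # 378/289
```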

One can confirm this using a simulation – if you average the entropy of the system over long periods of time you will approach this number. The numbers are exact because pi can be computed exactly. This method can in principle be used to compute the higher moments of S and get an idea of what the entropy distribution looks like. Simulations show that one obtains three large, separated peaks. The middle peak corresponds to the expected value. There is a fourth small peak around S=.55.

The probability of being in the LCD state is 7!*pi_(chi_1)*…*pi_(chi_7) ~ 4.8592 x 10^-8. This seems very small! However, there are a lot of states, so we should really compare this to an "average" state. In other words, we should compute the expectation value of the probability itself:

E[p] = sum_|phi> p(|phi>)^2 ~ 7!*(sum_a (pi_a)^2)^7 ~ 1.08518 x 10^-8, where the sum inside the parentheses runs over all 49 single-bar states a.

Therefore the LCD state is 4.47779 times more likely to occur than a typical state. We can use this principle to compute the higher order moments and approximate the distribution of multi-particle state probabilities. What one finds is a left-skewed distribution which peaks around the mean and quickly tails off. The LCD state belongs to the far right of this distribution, meaning it has an unusually high probability of occurring as compared to a random state.

Finally, knowing the probabilities allows us to compute the approximate return time. If the system has been running for time T, we expect the system to spend p(|phi>)*T of its time in state |phi>. The return time is thus 1/p(|phi>) ~ 2.05795 x 10^7 time steps. This justifies t=27 time steps as being extremely quick. Physically we may want to think of a time step as more like a nanosecond. Practically, this tells us that if we run a simulation on a computer it's *highly* unlikely that the system will return to the initial state. Keep in mind, however, that it is highly unlikely the system will return to any given state.
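Both of these numbers can be checked directly (a sketch; it assumes the stationary weight of an edge is proportional to the moves available from it, and that the LCD sits on the central 2×3 block of nodes, both reconstructions of the setup):

```python
from math import factorial

def neighbors(node):
    x, y = node
    return [(a, b) for a, b in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
            if 0 <= a <= 5 and 0 <= b <= 4]

nodes = [(x, y) for x in range(6) for y in range(5)]
edges = {tuple(sorted((n, m))) for n in nodes for m in neighbors(n)}

def moves(a):  # outgoing paths; stationary pi_a = moves(a)/total (assumption)
    return sum(len(neighbors(new_end)) for end in a for new_end in neighbors(end))

total = sum(moves(a) for a in edges)
lcd = [((2, 1), (3, 1)), ((2, 2), (3, 2)), ((2, 3), (3, 3)),
       ((2, 1), (2, 2)), ((2, 2), (2, 3)), ((3, 1), (3, 2)), ((3, 2), (3, 3))]

# p(LCD state) = 7! * product of the LCD edge probabilities.
p_lcd = factorial(7)
for e in lcd:
    p_lcd *= moves(e) / total
print(p_lcd, 1 / p_lcd)  # ~4.8592e-08 and ~2.058e+07 time steps
```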

Concluding comments: we see that this system quickly reaches a high entropy, probabilistic mixture of states. This makes sense as the motions are effectively random. The “boundary pressure” makes it a bit more likely to be in the LCD state, but it is still very unlikely to find a particle there. This framework can be immediately adapted to handle a larger grid. If the grid is made much larger, the stationary distribution will be closer to uniform, and so the boundary pressure won’t have much effect. This means that the LCD state will become a “typical” state, just as likely to occur as any other average state.

Responses

@Russell O'Connor,

You said: “Your definition of entropy is incorrect. Entropy is the logarithm of the number of microstates that satisfies a given macrostate description.”

You are technically correct, of course: famously, S = k log W, where S is the entropy, k a constant and W the number of microstates (the immortal formula carved on Boltzmann's grave), and that's why I said my formula was for "entropy" in quotes. My idea was to come up with a simple measure of disorder that still somewhat faithfully tracked actual entropy, while giving due importance to the visual salience of the central LCD area (with primacy to the display of the number 8) and its role in stratified stability. The actual statistical entropy would be lowest if all the bars clustered together undramatically in a single location anywhere in the toy universe, whether in the central LCD or not, unlike my toy measure. Nevertheless, I think my measure should vary similarly to actual entropy in the questions we are interested in: the ratio of the times from low to high entropy and back again, the role of stratified stability in creating low entropy structures, and so on.

@Max Hoiland

Thanks for your comment on stratified stability. I do think that Bronowski looks at stratified stability as being necessary to both biological evolution and the increase in complexity of non-biological systems over time. In non-biological systems, stratified stability performs a role analogous to selection — stable structures are “selected” by virtue of their persistence in time and become the basis for increasing complexity. Specifically, Bronowski explicitly explains the role of stratified stability in the formation of the chemical elements:

“Nature works by steps. The atoms form molecules, the molecules form bases, the bases direct the formation of amino acids, the amino acids form proteins, and proteins work in cells. The cells make up first of all the simple animals, and then sophisticated ones, climbing step by step. The stable units that compose one level or stratum are the raw material for random encounters which produce higher configurations, some of which will chance to be stable. So long as there remains a potential of stability which has not become actual, there is no other way for chance to go. Evolution is the climbing of a ladder from simple to complex by steps, each of which is stable in itself. Since this is very much my subject, I have a name for it: I call it Stratified Stability. That is what has brought life by slow steps but constantly up a ladder of increasing complexity – which is the central progress and problem in evolution.

And now we know that that is true not only of life but of matter. If the stars had to build a heavy element like iron, or a super-heavy element like uranium, by the instant assembly of all the parts, it would be virtually impossible. No. A star builds hydrogen to helium; then at another stage in a different star helium is assembled to carbon, to oxygen, to heavy elements; and so step by step up the whole ladder to make the ninety-two elements in nature.”

– Bronowski, Jacob (2011-07-31). The Ascent Of Man (Kindle Locations 3440-3450). Ebury Publishing. Kindle Edition.

Also, in the complexity paper, he says:

“Here then is a physical model which shows how simple units come together to make more complex configurations; how these configurations, if they are stable, serve as units to make higher configurations; and how these higher configurations again, provided they are stable, serve as units to build still more complex ones, and so on. Ultimately a heavy atom such as iron, and perhaps even a complex molecule containing iron (such as hemoglobin), simply fixes and expresses the potential of stability which lay hidden in the primitive building blocks of cosmic hydrogen.

The sequence of building up stratified stability is also clear in living forms. Atoms build the four base molecules, thymine and adenine, cytosine and guanine, which are very stable configurations. The bases are built into the nucleic acids, which are remarkably stable in their turn. And the genes are stable structures formed from the nucleic acids, and so on to the sub-units of a protein, to the proteins themselves, to the enzymes, and step by step to the complete cell. The cell is so stable as a topological structure in space and time that it can live as a self-contained unit. Still the cells in their turn build up the different organs which appear as stable structures in the higher organisms, arranged in different and more and more complex forms.”

– J. Bronowski, “New Concepts In The Evolution Of Complexity: Stratified Stability and Unbounded Plans” Synthese 21 (1970) Page 242.

I think that the subtle idea of stratified stability is something that is taken for granted in evolutionary theory where its role is not central, but in non-biological and pre-biological systems its role is quite crucial. Furthermore, I think that it can be extended to the formation of galaxies and stars under the influence of gravitational and other forces. As I said, natural selection (in a broader sense that includes selection of stable structures) is the only process that can create novel structure in the universe on reasonable time scales, effectively opposing the second law of thermodynamics.

@Colin,

I mean fitness exactly in the way you do, exactly as inclusive fitness is defined in standard evolutionary theory in respect to the organism's ecological niche. I didn’t explicitly mention any part of the fallacious idea of fitness that you highlighted, so I don't know where you got that from.

@Jackson Walters,

You have performed a giant's labor. The idea of "boundary pressure" is very interesting, but…

When you constructed your transition matrix did you use the exact dynamics of the bars as described, or did you use some variant of it?

The algorithm should be:

1. Choose a random end of a given bar.

2. Find a random neighbor of that node that is within the grid. This is the new position of the bar end.

3. Move the other end to a random neighbor node of the first end, within the grid.
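The three steps above can be sketched directly (a hypothetical `step` helper; the grid is assumed to be the 6×5 array of nodes used elsewhere in the thread):

```python
import random

def neighbors(node):
    x, y = node
    return [(a, b) for a, b in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
            if 0 <= a <= 5 and 0 <= b <= 4]

def step(bar):
    end = random.choice(bar)                   # 1. choose a random end
    new_end = random.choice(neighbors(end))    # 2. move it to a neighbor in the grid
    other = random.choice(neighbors(new_end))  # 3. other end follows to a neighbor
    return tuple(sorted((new_end, other)))

random.seed(0)
bar = ((0, 0), (0, 1))
for _ in range(1000):
    bar = step(bar)
    assert bar[1] in neighbors(bar[0])  # the bar always remains a unit edge
```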

Hi Pradeep, thanks! Wouldn't have done it if it wasn't fun. I tried to use the exact dynamics you described, and tried to stay within the spirit of the problem. The algorithm I used is what you describe. Here is how I computed P_ab in the probability transition matrix:

– Write edge a = (a1,a2), b = (b1,b2) where a1, a2, b1, b2 are nodes in the grid.

– On the next time step, there are two cases: i) a1 moves to a neighboring node a1', then a2 moves to a neighbor of a1'. ii) a2 moves to a2', then a1 moves to a neighbor of a2'.

– The simplest way to get the probability of a->b is compute #{moves ending in (b1,b2)}/#{total moves}.

– #total moves = #double neighbors of a1 + #double neighbors of a2, where a 'double neighbor' is a neighbor of a neighboring node.

I had some concerns since it is possible for a bar to 'jump': {(0,0),(1,0)} –> {(2,0),(3,0)} is possible via a2=(1,0) –> (2,0), a neighbor. Then a1 goes to (3,0) a neighbor of (2,0).

All this means is that the maximum distance to a new position is a bit bigger, the behavior is a bit less local. On a bigger grid it wouldn't matter at all.

@Jackson Walters

The reason I asked about the dynamics is the following. I can believe that there are some dynamics where boundary pressure can arise, but in the dynamics I described, the possibility of boundary pressure is countered by the fact that there is a probability that a bar can remain in place, and this tendency increases at the boundary. Thus, a bar on the boundary at {(0,0),(0,1)} is more than twice as likely to remain in place by flipping its ends (the ends are neighbors, after all) than a bar in the center at {(2,2),(3,2)}.

It is easy to calculate the probabilities directly using equations if we reduce the universe to the simplest case: the 2×3 node universe of the central LCD. Let us constrain the movement of a single bar to this reduced space. By symmetry, the 7 positions in this case are of 3 types: the upper and lower horizontal bars (probability = a), the 4 vertical border bars (b) and the central bar (c). The equations for these turn out to be:

a = a/4 + 7b/12 + c/6

c = a/3 + 5b/9 + c/9

2a + 4b + c = 1

Note the probabilities of a bar staying in place, which is more at the border for "a" rather than for "c".

The solution for this system is simply a = b = c = 1/7.

There is no boundary pressure in this case, and I think the same principle extends to the 6×5 grid as well.
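One can confirm the a = b = c = 1/7 result without solving the equations, by building the exact single-bar transition matrix on the 2×3-node LCD and checking that it is doubly stochastic (a sketch; each path is weighted 1/2 × 1/deg(end) × 1/deg(new end), my reading of the dynamics described above):

```python
def neighbors(node):
    x, y = node
    return [(a, b) for a, b in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
            if 0 <= a <= 1 and 0 <= b <= 2]

nodes = [(x, y) for x in range(2) for y in range(3)]
edges = sorted({tuple(sorted((n, m))) for n in nodes for m in neighbors(n)})
assert len(edges) == 7  # the 7 segments of the LCD

P = {a: {} for a in edges}
for a in edges:
    for end in a:
        for new_end in neighbors(end):
            w = 0.5 / (len(neighbors(end)) * len(neighbors(new_end)))
            for other in neighbors(new_end):
                b = tuple(sorted((new_end, other)))
                P[a][b] = P[a].get(b, 0) + w

# Rows and columns both sum to 1, so the uniform distribution 1/7 is
# stationary: no boundary pressure in the reduced universe.
for a in edges:
    assert abs(sum(P[a].values()) - 1) < 1e-12
for b in edges:
    assert abs(sum(P[a].get(b, 0) for a in edges) - 1) < 1e-12
```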

So I'm wondering if your method of calculating the #total moves includes the node itself as a double neighbor, and if not then the dynamics you modeled may be different.

@Pradeep Mutalik

Yes, a node is a double neighbor of itself in my method. This gives a nonzero probability for a bar to stay in place. More precisely, there are two ways for a bar to stay in place: (a1->a2 then a2->a1) or (a2->a1 then a1->a2) – there are two ways to flip. The diagonal of the probability transition matrix gives these probabilities:

diag(P) = {2/15, 2/15, 2/19, 2/23, 2/19, 2/25, 2/15, 2/23, 2/15, 2/23, 2/19, 2/29, 2/29, 2/29, 2/31, 2/23, 2/29, 2/19, 2/25, 1/10, 2/31, 1/15, 2/31, 1/16, 2/25, 1/15, 1/10, 2/25, 2/19, 2/31, 2/29, 2/31, 2/31, 2/25, 2/29, 2/19, 2/23, 2/15, 2/29, 2/23, 2/29, 2/25, 2/23, 2/23, 2/15, 2/15, 2/19, 2/19, 2/15}

I meant "boundary pressure" as a heuristic to describe the fact that the long term, stable probability distribution has lower probabilities for bars to be in the corners or on the walls. A bar in a corner *is* more likely to stay in place than a bar in the center, 2/15 vs. 2/31. However, a bar in the corner on average (take a weighted sum of the possible directions of motion) moves towards the center. A bar in the center can move in all directions, so its average movement on a single time step is close to zero.

It's a little unclear to me what those equations as well as 'a', 'b', 'c' represent. I would guess that they are the probabilities for the long term, stable distribution. If so, we would need to construct a 3×3 probability transition matrix P where P_aa represents the probability of 'a' staying put, etc.

P=[ (2/8, 4/8, 2/8), (2/8,5/8,1/8), (2/8,4/8,2/8)]

= [ (1/4,1/2,1/4), (1/4,5/8,1/8), (1/4,1/2,1/4)]

The stable solution to this system (lim_t->infty P^t) is (1/4, 4/7, 5/28). That is, probability 1/4 for top/bottom, 4/7 for border, 5/28 for center. We need to remember though that there are two a's, four b's, and one c:

a = 1/8 = .125

b = 1/7 ~ .142857

c = 5/28 ~ .178571

Almost a uniform distribution, but with a slightly higher probability of being in the center or on a border.
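The arithmetic here can be checked by iterating the 3×3 matrix as given (a quick sketch; it verifies only the stationary distribution of this particular matrix, not the underlying dynamics):

```python
P = [[1/4, 1/2, 1/4],   # from a (top/bottom)
     [1/4, 5/8, 1/8],   # from b (vertical border)
     [1/4, 1/2, 1/4]]   # from c (center)

pi = [1/3, 1/3, 1/3]
for _ in range(200):
    pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]

# Class totals (1/4, 4/7, 5/28); per bar there are two a's, four b's, one c.
print(pi[0] / 2, pi[1] / 4, pi[2])  # ~0.125  ~0.142857  ~0.178571
```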

@Jackson Walters,

For a corner bar, there are indeed 15 possible transitions, two of which end up in the same location. However, the probabilities of these transitions are not all equal. There are two ways, for example, that a bar at {(0,0),(0,1)} can stay in place, each with probability 1/12. On the other hand, the farthest transition, to {(1,1),(2,1)} only has a probability of 1/24. So I don't think you can use 15 in the denominator.

I have the diag(P) entry for the above corner bar to be 2/12 or 1/6, while that for the center bar to be 2/32 or 1/16.

You said, "A bar in a corner *is* more likely to stay in place than a bar in the center… However, a bar in the corner on average (take a weighted sum of the possible directions of motion) moves towards the center. A bar in the center can move in all directions, so its average movement on a single time step is close to zero."

However, since these are random walks, both the center bar and the corner bar should move away from their points of origin with time, the greater "self-stickiness" of the corner bar causing it to move more slowly, compensating for the fact that it can only move in constrained directions.

As for the simple LCD system with a single bar, I did a simple simulation for 100,000 ticks with the starting bar in different starting positions, and the probabilities always converge to 1/7.

So it seems to me that your intriguing idea of boundary pressure does not apply to this system.

If anyone has an independent opinion on this, please do weigh in.

@Pradeep Mutalik

I think I have clarified for myself where and why we are getting different answers for the transition probabilities – your method seems correct.

We agree that from the corner there are 15 possible transitions. We agree that there are two ways for a bar to stay in place. Here is where we diverge:

I was saying that the probability to stay in place, i.e. P_corner->corner = 2/15 since out of 15 possibilities, 2 of them end in "corner". We also agree that out of these 15 possibilities, not all are equally likely. For instance, there is only one way to get to {(0,2),(0,3)}, so P_{(0,0),(0,1)}->{(0,2),(0,3)} = 1/15.

I believe you are computing the probability via the following: choose a random end of the bar, so 1/2 for (0,0) and 1/2 for (0,1). Let's say we pick (0,0). (0,0) has two neighbors, (1,0) and (0,1). To stay in place we need (0,1), so this contributes a factor of 1/2. (0,1) has three neighbors, and we need the other end to move to (0,0), so this contributes a factor of 1/3. Overall, the probability to stay in place via this path is 1/12. We also have the transition: pick (0,1) w/ prob. 1/2, (0,1) -> (0,0) with prob. 1/3, followed by choosing the neighbor (0,1) of (0,0) w/ prob. 1/2, resulting in probability 1/12 for this way of staying put. Overall the probability to stay put is 1/6.

In a Markov model where the states are the positions of a bar in the grid, we need to compute P_ab, the probability to move from edge a to edge b. The key difference is that I was assuming each path a->b occurred with equal probability, whereas we really want each path a->b to have its own probability.

Thus, we arrive at two different probability transition matrices P for a Markov model. With your P, the limiting distribution is uniform. The way I was computing probabilities, we obtain an *almost* uniform distribution where the probabilities are close to 1/(#states for one bar) = 1/49, but not exactly. In your method, it seems we are keeping in line with the fundamental postulate of statistical mechanics, in the sense that all micro-states are equally likely to occur. Either way one can follow the method I laid out to compute the expectation value of the entropy E[S].

My previous method: E[S] ~ .782615

Clarified method: E[S] ~ .828766
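To make the difference concrete, here is a sketch that builds both transition matrices on the 6×5-node grid: the equal-path-weight version and the exact-dynamics version (path weight 1/2 × 1/deg(end) × 1/deg(new end)). The exact one turns out to be doubly stochastic, so its stationary distribution is exactly uniform, while the path-counting one settles on weights proportional to the number of moves available from each edge:

```python
def neighbors(node):
    x, y = node
    return [(a, b) for a, b in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
            if 0 <= a <= 5 and 0 <= b <= 4]

nodes = [(x, y) for x in range(6) for y in range(5)]
edges = sorted({tuple(sorted((n, m))) for n in nodes for m in neighbors(n)})

P_paths = {a: {} for a in edges}  # each path weighted equally
P_exact = {a: {} for a in edges}  # exact dynamics: 1/2 * 1/deg * 1/deg per path
for a in edges:
    for end in a:
        for new_end in neighbors(end):
            w = 0.5 / (len(neighbors(end)) * len(neighbors(new_end)))
            for other in neighbors(new_end):
                b = tuple(sorted((new_end, other)))
                P_paths[a][b] = P_paths[a].get(b, 0) + 1
                P_exact[a][b] = P_exact[a].get(b, 0) + w
for a in edges:
    n = sum(P_paths[a].values())
    P_paths[a] = {b: c / n for b, c in P_paths[a].items()}

# Exact dynamics: columns sum to 1, so the uniform distribution is stationary.
for b in edges:
    assert abs(sum(P_exact[a].get(b, 0) for a in edges) - 1) < 1e-12

# Path-counting dynamics: power iteration gives a non-uniform distribution.
pi = {a: 1 / 49 for a in edges}
for _ in range(500):
    new = {b: 0.0 for b in edges}
    for a in edges:
        for b, p in P_paths[a].items():
            new[b] += pi[a] * p
    pi = new
print(min(pi.values()), max(pi.values()))  # ~0.0130 (corner) vs ~0.0277 (center)
```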

Considering entropy as a logarithm only is about 40 years out of date; among the recently fashionable examples there are the Tsallis, Kaniadakis, Abe and other non-logarithmic entropies, which are in turn particular cases of the Sharma-Mittal entropy.

@DC

You are probably thinking of Shannon and his definition of entropy in 1948? But entropy as a logarithm is over 100 years old in thermodynamics (Boltzmann and Gibbs). The thermodynamic entropy of Gibbs is proportional to the Shannon statistical entropy. You are right that other measures of entropy could be used – and this includes, for this puzzle, the Kolmogorov-Sinai entropy for dynamical systems.

However, I think this puzzle fails to make contact with any of the measures mentioned, including Shannon, Tsallis, Renyi, etc. This is because it defines "entropy" as a function of a single configuration of the system, rather than of a measure over configurations (whether statistical or dynamical or coarse-grainings or whatever). It has no intrinsic connotation of randomness or uncertainty, and should really be called something else (e.g., a "distance" of the configuration from the initial configuration).

While reading through this problem, which clearly laid out the rules for a simulation, I really wanted to just push 'play' and see what would happen. So I went ahead and built the universe ("Let there be LCD light!")

You can play around with the model of this puzzle here: http://testtubegames.com/quanta_entropy.html

Does it behave like you'd expect? And do your mathematical predictions hold?

@Michael,

I concede your point about my "entropy" measure. The problem is that, in my attempt to simplify, I conflated two somewhat different things: the first is classical entropy as a measure of randomness, and the second is the creation of complex structures, which decreases entropy locally. Ideally, I should have used the standard logarithmic measure of statistical entropy, plus a separate measure of complexity based on structure formation, demonstrating locally decreasing entropy while entropy increased globally everywhere else. That other measure could then have been used to illustrate the idea of stratified stability.

But all that's hard to do in a toy universe of this size 🙂

@Pradeep

Yes, fair enough, your definition does what you want it to do to illustrate the idea.

I guess it also makes a connection with other definitions of entropy insofar as one can define a set of 'coarse-grained' regions in the space of configurations, corresponding to constant values of your definition. The system then spends an amount of time in each region of configuration space (which is different from a region of the grid) that roughly scales as the number of configurations in that region. Where it gets a bit confusing is that, by construction, there is a rough correlation between these regions in configuration space and corresponding regions in the grid. And, of course, there is the issue that an "8" in the centre has a different degree of "order" from an "8" anywhere else in the grid.

On the latter point, it is interesting that there are another 30 "8"s in addition to the initial configuration. The set of such "8"s forms a small region of the configuration space (since there are C(49)~10^12 distinct configurations). Thus, identifying an "8" as ordered, this is a natural way of seeing that the system will spend a relatively tiny amount of time in an ordered state.

One could perhaps try to "stratify" the configuration space via an "energy" function E – but chosen differently from your "entropy" function, so that it is constant under translation in the grid – thus, all "8"s would have the same, relatively small, energy. One possibility might be

E = (A-2)^2 + S^2,

where A is the smallest area (in grid cells) containing the configuration, and S is the number of bars sharing a segment. Minimum-energy configurations are then "8"s – it takes energy to compress them, to make them share a segment, and also to stretch them out. This energy will then fluctuate during the dynamics and achieve a long-term average. Importantly, the system will pass through an "8" configuration very rarely, and through most higher-energy configurations much more often.
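As a sketch of how this energy might be computed (my code, with assumed conventions the comment leaves open: each bar is a pair of grid endpoints, A is taken as the bounding-box area in grid cells, and S counts the bars that share a segment with another bar):

```python
from collections import Counter

def energy(bars):
    """E = (A - 2)**2 + S**2 for a configuration of unit bars.

    Assumed conventions (mine, not spelled out in the comment):
    - each bar is an endpoint pair ((x1, y1), (x2, y2)) on the grid;
    - A is approximated by the bounding-box area in grid cells;
    - S counts the bars that share a segment with another bar.
    """
    xs = [x for bar in bars for x, _ in bar]
    ys = [y for bar in bars for _, y in bar]
    A = (max(xs) - min(xs)) * (max(ys) - min(ys))

    counts = Counter(tuple(sorted(bar)) for bar in bars)
    S = sum(c for c in counts.values() if c > 1)

    return (A - 2) ** 2 + S ** 2

# The vertical "8": seven bars spanning a 1x2 block of cells, so A = 2, S = 0.
eight = [((0, 0), (1, 0)), ((0, 1), (1, 1)), ((0, 2), (1, 2)),
         ((0, 0), (0, 1)), ((0, 1), (0, 2)),
         ((1, 0), (1, 1)), ((1, 1), (1, 2))]
print(energy(eight))  # 0
```

Any "8", wherever it sits on the grid, then gets the same minimum energy E = 0, which is the translation invariance being asked for.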

I am happy to admit this change to the model makes it more complex. But it does suggest that renaming your "entropy" function as an "energy" function might make a closer connection with statistical mechanics.

@Andy Hall

That's fun, clicking away! – especially waiting for that last piece to "stick" for question 2. On ten runs from the initial position, it took an average of 35s to get to maximum "entropy", twice my estimate of ~16s – but not too different from Jackson Walter's estimate of ~27s. The true average is likely to be even larger.

My upper bound of ~125s for the average time to get back in the sticky case looks OK though (the actual average seems closer to 80-90s) – probably because it takes the statistics of the dynamics into account better (rather than assuming distinguishable configurations are all equally likely).

Good day, people. I also built a simulation for the puzzle:

https://gitlab.com/BridLeiva/physics-of-time

It's written in Python with Pygame, and I'm willing to expand its functionality further if there's enough interest.

Greetings.