The “Australia day” category error

Australia’s national holiday commemorates not some heroic act, but the arrival of settler colonists who occupied and settled that land, dispossessing the original and rightful inhabitants of the continent. Aboriginal sovereignty was never ceded; no treaty has ever been signed. Historic dispossession and violence, involving frontier wars and genocidal campaigns, decimated the Indigenous nations. There is struggle and heroism here, but mainly in the capacity of Indigenous peoples to resist and to survive.

Suppose I came to your home, invited myself in, made it my home, took your possessions, evicted or kidnapped or infected or murdered your family, and then celebrated the anniversary of my arrival each year — what would be the appropriate response?

And the answer is the same in the excruciating, mind-numbing debate held in Australia each year about whether the national holiday is appropriate.

(To avoid maximum excruciation, let us state the obvious. Clearly this analogy is not literal; no individual living today bears direct moral culpability for tragedies which unfolded in historical time. But it is precisely the symbolism that matters; national commemorations are pure symbolism, by design.)

The question in this mind-numbing debate may be an easy one, but even to ask it — of non-Indigenous Australians — contains a category error.

If I took over your home and then held a celebration there each year, it is not for me to say whether that celebration is appropriate. It is for you to say. I may well say it is not appropriate, but even if I think it is, your view counts for more; you have suffered the injustice. The correct answer is not just “no”, but also “it’s not for me to say”.

And so, to answer the question of the appropriateness of “Australia day”, the answers of Indigenous people are the most important. Everybody is entitled to their opinion, but an opinion on the question which does not take into account the views of Indigenous people cannot be taken seriously.

Views of Indigenous Australians can easily be found. The broadest data I’m aware of are poll results from 2017, a survey of 1,156 Indigenous Australians about “Australia day”. (If you know a better or more recent poll I would be happy to update.) It found that:

  • 54% of Indigenous Australians were in favour of a change of date. This may suggest that only a slim majority are against the event, but further results make it clear that the other 46% are far from being uniformly enthusiastic. For instance:
  • The survey asked participants to associate three words with Australia day. The most chosen words by Indigenous Australians were “invasion”, “survival” and “murder”.
  • A majority of Indigenous Australians said that the name “Australia day” should change.
  • 23% of Indigenous participants felt positive about Australia day, 30% had mixed feelings, and 31% had negative feelings.

Despite the above poll results, in January 2018 the Indigenous Affairs minister (who is not Indigenous) claimed that “no Indigenous Australian has told him the date of Australia Day should be changed other than a single government adviser”. This says more about a politician being out of touch than it does about the distribution of opinion among Indigenous Australians.

In contrast, Jack Latimore, editor of IndigenousX, the prominent online platform for Indigenous voices, comes to a rather different conclusion.
Based on his extensive experience and engagement with Indigenous Australians from across the social and political spectrum, his conclusion is worth repeating:

When it comes to the subject of 26 January, the overwhelming sentiment among First Nations people is an uneasy blend of melancholy approaching outright grief, of profound despair, of opposition and antipathy, and always of staunch defiance.

The day and date is steeped in the blood of violent dispossession, of attempted genocide, of enduring trauma. And there is a shared understanding that there has been no conclusion of the white colonial project when it comes to the commonwealth’s approach to Indigenous people. We need only express our sentiments regarding any issue that affects us to be quickly reminded of the contempt in which our continued presence and rising voices are held.

Nor is our sentiment in regards to 26 January a recent phenomenon. I have witnessed it throughout my life in varied intensities. Evidence of it is even present in the recorded histories of White Australia.

Indeed, the long history of Indigenous protest against a January 26 celebration goes back at least to boycotts in 1888, and numerous actions on the 1938 sesquicentenary.

Returning to the present, numerous community leaders and representative bodies have also given their views, many of which are available online. Below are links to some such views; of course, plenty more are easily found.

Changing the date is an obvious, minimal, easy next step on the road to justice for Indigenous Australia. At the very least, maintaining the celebration in its current form is untenable. A minimal step towards respect for Indigenous Australia is to stop dancing on their ancestors’ graves.

Nor is it particularly opposed by the general Australian public. According to a December 2017 poll, most Australians are ignorant of the history of Australia Day, can’t guess what historical event happened on that day, and don’t really mind on what date it is celebrated. Half also think that the national holiday should not be held on a date offensive to Indigenous Australians (even though a plurality wrongly believes that January 26 is not offensive to Indigenous Australians).

As of a January 2017 poll, only 15% of Australians wanted to change the date. That number may well have increased by now, with the momentum of the movement to change the date.

And the survey apparently did not have “it’s not for me to say” as an option for non-Indigenous respondents — reinforcing the standard, annual category error.

I don’t believe in any patriotic holidays. But a patriotic holiday on such a terrible date needs to be moved, rebuilt, or abolished.

Topological entropy: information in the limit of perfect eyesight

Entropy is a notoriously tricky subject. There is a famous anecdote of John von Neumann telling Claude Shannon, the father of information theory, to use the word “entropy” for the concept he had just invented, because “nobody knows what entropy really is, so in a debate you will always have the advantage”.

Entropy means many different things in different contexts, but there is a wonderful notion of entropy which is purely topological. It only requires a space, and a map on it. It is independent of geometry, or any other arbitrary features — it is a purely intrinsic concept. This notion, not surprisingly, is known as topological entropy.

There are a few equivalent definitions; we’ll just discuss one, which is not the most general. As we’ll see, it can be described as the rate of information you gain about the space by applying the function, when you have poor eyesight — in the limit where your eyesight becomes perfect.

Let \(X\) be a metric space. It could be a surface, it could be a manifold, it could be a Riemannian manifold. Just some space with an idea of distance on it. We’ll write \(d(x,y)\) for the distance between \(x\) and \(y\). So, for instance, \(d(x,x) = 0\); the distance from a point to itself is zero. Additionally, \(d(x,y) = d(y,x)\); the distance from \(x\) to \(y\) is the same as the distance from \(y\) to \(x\); the triangle inequality applies as well. And if \(x \neq y\) then \(d(x,y) > 0\); to get from one point to a different point you have to travel over more than zero distance!

We assume \(X\) is compact, so roughly speaking, it contains all its limit points, it doesn’t go off to infinity, and its volume (if it has a volume) is finite.

Now, we will think of \(X\) as a space we are looking at, but we can’t see precisely. We have myopia. Our eyes are not that good, and we can only tell if two points are different if they are sufficiently far apart. We can only resolve points which have a certain degree of separation. Let this resolution be \(\varepsilon\). So if two points \(x,y\) are distance less than \(\varepsilon \) apart, then our eyes can’t tell them apart.

Rather than thinking of this situation as poor vision, you can alternatively suppose that \(X\) is quantum mechanical: there is uncertainty in the position of points, so if \(x\) and \(y\) are sufficiently close, your measurement can’t be guaranteed to distinguish between them. Only when \(x\) and \(y\) are sufficiently far apart can your measurement definitely tell them apart.

We suppose that we have a function \(f \colon X \rightarrow X\). So \(f\) sends points of \(X\) to points of \(X\). We assume \(f\) is continuous, but nothing more. So, roughly, if \(x\) and \(y\) are close then \(f(x)\) and \(f(y)\) are close. (Making that rough statement precise is what the beginning of analysis is about.) We do not assume that \(f\) is injective; it could send many points to the same point. Nor do we assume \(f\) is surjective; it might send all the points of \(X\) to a small region of \(X\). All we know about \(f\) is that it jumbles up the points of \(X\), moving them around, in a continuous fashion.

We are going to define the topological entropy of \(f\), as a measure of the rate of information we can get out of \(f\), under the constraints of our poor eyesight (or our quantum uncertainty). The topological entropy of \(f\) is just a real number associated to \(f\), denoted \(h_{top}(f)\). In fact it’s a non-negative number. It could be as low as zero, and it can be infinite; and it can be any real number in between.

We ask: what is the maximum number of points can we distinguish, despite our poor eyesight / quantum uncertainty? If the answer is \(N\), then there exist \(N\) points \(x_1, \ldots, x_N\) in \(X\), such that any two of them are separated by a distance of at least \(\varepsilon\). In other words, for any two points \(x_i, x_j\) (with \(i \neq j\)) among these \(N\) points, we have \(d(x_i, x_j) \geq \varepsilon\). And if the answer is \(N\), then this is the maximum number; so there do not exist \(N+1\) points which are all separated by a distance of at least \(\varepsilon\).

Call this number \(N(\varepsilon)\). So \(N(\varepsilon)\) is the maximum number of points of \(X\) our poor eyes can tell apart.

(Note that the number of points you can distinguish is necessarily finite, since they all lie in the compact space \(X\). There’s no way your shoddy eyesight can tell apart infinitely many points in a space of finite volume! So \(N(\varepsilon)\) is always finite.)
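To make this concrete, here is a minimal computational sketch. It assumes we approximate the circle of circumference 1 by a finite grid of sample points, with arc-length distance; the grid size and \(\varepsilon\) are arbitrary illustration choices. A greedy pass builds an \(\varepsilon\)-separated set, which is maximal in the sense that no further point can be added, and hence gives a lower bound for \(N(\varepsilon)\).

```python
def circle_dist(x, y):
    """Arc-length distance on the circle of circumference 1."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def separated_set(points, dist, eps):
    """Greedily build an eps-separated subset of `points`: every pair
    of chosen points is at distance >= eps.  The result is maximal
    (no further point can be added), hence a lower bound for N(eps)."""
    chosen = []
    for p in points:
        if all(dist(p, q) >= eps for q in chosen):
            chosen.append(p)
    return chosen

# A grid of 1024 sample points on the circle; our eyesight resolves eps = 1/8.
grid = [i / 1024 for i in range(1024)]
S = separated_set(grid, circle_dist, 0.125)
print(len(S))  # 8: eight points pairwise at least 1/8 apart fit on the circle
```

Here the greedy construction happens to achieve the true maximum \(N(1/8) = 8\): eight points spaced exactly \(1/8\) apart fit, while any nine points on the circle would force two of them within arc distance less than \(1/8\).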

Clearly, if our eyesight deteriorates, then we see less, and we can distinguish fewer points. Similarly, if our eyes improve, then we see more, so we can distinguish more points. Eyesight deterioration means \(\varepsilon\) increases: we can only distinguish points if they are further apart. Similarly, eyesight improvement means \(\varepsilon\) decreases: we can tell apart points that are closer together.

Therefore, \(N(\varepsilon)\) is a decreasing function of \(\varepsilon\). As \(\varepsilon\) increases, our eyesight deteriorates, and we can distinguish fewer points.

Now, we haven’t yet used the function \(f\). Time to bring it into the picture.

So far, we’ve thought of our eyesight as being limited by space — by the spatial resolution it can distinguish. But our eyesight also applies over time.

We can think of the function \(f\) as describing a “time step”. After each second, say, each point \(x\) of \(X\) moves to \(f(x)\). So a point \(x\) moves to \(f(x)\) after 1 second, to \(f(f(x))\) after 2 seconds, to \(f(f(f(x)))\) after 3 seconds, and so on. In other words, we iterate the function \(f\). If \(f\) is applied \(n\) times to \(x\), we denote this by \(f^{(n)}(x)\). So, for instance, \(f^{(3)}(x) = f(f(f(x)))\).
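In code, iterating is just repeated application; a tiny sketch (the doubling map below is a standard example chosen for illustration, not anything assumed so far):

```python
def iterate(f, x, n):
    """Return f^(n)(x): the function f applied to x, n times."""
    for _ in range(n):
        x = f(x)
    return x

# Example: the doubling map x -> 2x (mod 1) on the circle [0, 1).
double = lambda x: (2.0 * x) % 1.0
print(iterate(double, 0.1, 3))  # f(f(f(0.1))): 0.1 -> 0.2 -> 0.4 -> 0.8
```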

The idea is that, if you stare at two moving points for long enough, you might not be able to distinguish them at first, but eventually you may be able to. If they move apart at some point, then you may be able to distinguish them.

So while your eyes are encumbered by space, they are assisted by time. Your shoddy eyes have a finite spatial resolution they can distinguish, but over time points may move apart enough for you to resolve them.

(You can also think about this in a “quantum” way. The uncertainty principle says that uncertainties in space and time are complementary. If you look over a longer time period, you allow a greater uncertainty in time, which allows for smaller uncertainty in position. But from now on I’ll stick to my non-quantum myopia analogy.)

We can then ask a similar question: what is the maximum number of points we can distinguish, despite our myopia, while viewing the system for \(T\) seconds? If the answer is \(N\), then there exist \(N\) points \(x_1, \ldots, x_N\) in \(X\), such that at some point over \(T\) seconds, i.e. \(T\) iterations of the function \(f\), any two of them become separated by a distance of at least \(\varepsilon\). In other words, for any two points \(x_i, x_j\) (with \(i \neq j\)) among these \(N\) points, there exists some time \(t\), where \(0 \leq t \leq T\), such that \(d(f^{(t)}(x_i), f^{(t)}(x_j)) \geq \varepsilon\). And if the answer is \(N\), then this is again the maximal number, so there do not exist \(N+1\) points which all become separated at some instant over \(T\) seconds.

Call this number \(N(f, \varepsilon, T)\). So \(N(f, \varepsilon, T)\) is the maximum number of points of \(X\) our decrepit eyes can distinguish over \(T\) seconds, i.e. \(T\) iterations of the function \(f\).

Now if we allow ourselves more time, then we have a better chance to see points separating. As long as there is one instant of time at which two points separate, we can distinguish them. So as \(T\) increases, we can distinguish more points. In other words, \(N(f, \varepsilon, T)\) is an increasing function of \(T\).

And by our previous argument about \(\varepsilon\), \(N(f, \varepsilon, T)\) is a decreasing function of \(\varepsilon\).

So we’ve deduced that the number of points we can distinguish over time, \(N(f, \varepsilon, T)\), is a decreasing function of \(\varepsilon\), and an increasing function of \(T\).

We can think of the number \(N(f, \varepsilon, T)\) as an amount of information: the number of points we can tell apart is surely some interesting data!

But rather than think about a single instant in time, we want to think of the rate of information we obtain, as time passes. How much more information do we get each time we iterate \(f\)?

As we iterate \(f\), and we look at our space \(X\) over a longer time interval, we know that we can distinguish more points: \(N(f, \varepsilon, T)\) is an increasing function of \(T\). But how fast is it increasing?

To pick one possibility out of thin air, it might be the case that every time we iterate \(f\), i.e. when we increase \(T\) by \(1\), we can distinguish twice as many points. In that case, \(N(f, \varepsilon, T)\) doubles every time we increment \(T\) by 1, and we will have something like \(N(f, \varepsilon, T) = 2^T\). In this case, \(N\) is increasing exponentially, and the (exponential) growth rate is given by the base 2.

(Note that doubling the number of points you can distinguish is just like having 1 extra bit of information: with 3 bits you can describe \(2^3 = 8\) different things, but with 4 bits you can describe \(2^4 = 16\) things — twice as many!)

Similarly, to pick another possibility out of thin air, if it were the case that \(N(f, \varepsilon, T)\) tripled every time we incremented \(T\) by \(1\), then we would have something like \(N(f, \varepsilon, T) = 3^T\), and the growth rate would be 3.

But in general, \(N(f, \varepsilon, T)\) will not increase in such a simple way. However, there is a standard way to describe the growth rate: look at the logarithm of \(N(f, \varepsilon, T)\), and divide by \(T\). For instance, if \(N(f, \varepsilon, T) \sim 2^T\), then we have \(\frac{1}{T} \log N(f, \varepsilon, T) \sim \log 2\) (which is exactly 1, if we take logarithms to base 2). And then see what happens as \(T\) becomes larger and larger. As \(T\) becomes very large, you’ll get an asymptotic rate of information gain from each iteration of \(f\).

(In describing a logarithm, we should technically specify what the base of the logarithm is. It could be anything; I don’t care. Pick your favourite base. Since we’re talking about information, I’d pick base 2.)

This leads us to think that we should consider the limit
\[
\lim_{T \rightarrow \infty} \frac{1}{T} \log N (f, \varepsilon, T).
\]
This is a great idea, except that if \(N (f, \varepsilon, T)\) grows in an irregular fashion, this limit might not exist! But that’s OK, there’s a standard analysis trick to get around these kinds of situations. Rather than taking a limit, we’ll take a lim inf, which always exists.
\[
\liminf_{T \rightarrow \infty} \frac{1}{T} \log N (f, \varepsilon, T).
\]

(The astute reader might ask, why lim inf and not lim sup? We could actually use either: they both give the same result. In our analogy, we might want to know the rate of information we’re guaranteed to get out of \(f\), so we’ll take the lower bound.)

And this is almost the definition of topological entropy! By taking a limit (or rather, a lim inf), we have eliminated the dependence on \(T\). But this limit still depends on \(\varepsilon\), the resolution of our eyesight.

Although our eyesight is shoddy, mathematics is not! So in fact, to obtain the ideal rate of information gain, we will take a limit as our eyesight becomes perfect! That is, we take a limit as \(\varepsilon\) approaches zero.

And this is the definition of the topological entropy of \(f\):
\[
h_{top}(f) = \lim_{\varepsilon \rightarrow 0} \liminf_{T \rightarrow \infty} \frac{1}{T} \log N(f, \varepsilon, T).
\]
So the topological entropy is, as we said in the beginning, the asymptotic rate of information we gain in our ability to distinguish points in \(X\) as we iterate \(f\), in the limit of perfect eyesight!
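To see the definition in action, here is a rough numerical sketch for the doubling map \(x \mapsto 2x \bmod 1\) on the circle, whose topological entropy is known to be \(\log 2\) (one bit per iteration, in base 2). We greedily count \((T, \varepsilon)\)-separated sets on a finite grid for two values of \(T\), then take a difference of logarithms, which cancels the \(\varepsilon\)-dependent constant factor. The grid size, \(\varepsilon\), and the two values of \(T\) are arbitrary illustration parameters, not part of the definition.

```python
import math

def circle_dist(x, y):
    """Arc-length distance on the circle of circumference 1."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def doubling(x):
    """The doubling map x -> 2x (mod 1); its topological entropy is log 2."""
    return (2.0 * x) % 1.0

def count_separated(f, dist, points, eps, T):
    """Greedily count a (T, eps)-separated subset of `points`: any two
    chosen points are, at some time 0 <= t <= T, at distance >= eps."""
    def dyn_dist(x, y):
        # d_T(x, y) = max over 0 <= t <= T of dist(f^t(x), f^t(y))
        best = 0.0
        for _ in range(T + 1):
            best = max(best, dist(x, y))
            x, y = f(x), f(y)
        return best
    chosen = []
    for p in points:
        if all(dyn_dist(p, q) >= eps for q in chosen):
            chosen.append(p)
    return len(chosen)

# Illustration parameters (arbitrary): a dyadic grid, eyesight resolution 1/4.
grid = [i / 2048 for i in range(2048)]
eps = 0.25

# Compare N(f, eps, T) at T = 3 and T = 6; the difference of logs
# estimates the per-iteration growth rate.
n1 = count_separated(doubling, circle_dist, grid, eps, 3)
n2 = count_separated(doubling, circle_dist, grid, eps, 6)
rate = (math.log2(n2) - math.log2(n1)) / 3
print(n1, n2, rate)  # growth rate close to log2(2) = 1 bit per iteration
```

Each extra iteration of the doubling map lets our blurry eyes distinguish roughly twice as many points, so the estimated rate comes out at about 1 bit per iteration, matching \(h_{top} = \log 2\).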

As it turns out, even though we heavily relied on distances in \(X\) throughout this definition, \(h_{top}(f)\) does not depend on our particular notion of distance! If we replace our metric, or distance function \(d(x,y)\), with any other metric inducing the same topology on \(X\), we will obtain the same result for \(h_{top}\). So the topological entropy really is topological — it depends only on the topology of \(X\), not on any particular notion of distance.

This is just one of several ways to define topological entropy. There are many others, just as wonderful and surprising, and they only scratch the surface of the subject.


Abstract algebra nursery rhyme

In the spirit of hilariously advanced baby books like Chris Ferrie’s Quantum Physics for Babies, I have taken to incorporating absurdly sophisticated concepts into nursery rhymes.

To the tune of the ABC song (or, equivalently, Twinkle Twinkle Little Star):

The axioms of a group go 1, 2, 3
Identity, inverse, associativity!
The identity times any element g is g,
Inverse of g times g is identity,
Associativity says ab times c
is equal to a times bc.

The last resort of scoundrels

Samuel Johnson said it was “the last refuge of a scoundrel”; Emma Goldman, a menace to liberty. Leo Tolstoy said it “as a feeling is bad and harmful, and as a doctrine is stupid”. Patriotism, at least in its usual sense of love of one’s country over others, veneration of the virtue of its people over others, and adoration of its flag, is awful, irrational nonsense.

How on earth one can deduce moral values, or even a positive emotional response, from a geographic entity — indeed, such powerful emotions as to move men to war (yes, usually men) — has always eluded me.

It may be that there are various administrative reasons to divide a geographical area (like the earth, or a continent) into official or legal sub-regions (like countries, or states).

More importantly, it may be that, for one born in a land oppressed by a colonist, an occupier, or other oppressor, the natural solidarity among those oppressed peoples in their legitimate resistance may be expressed in the language of patriotism.

And it may be that there can be good, even uniquely good, things about a nation’s culture, and that it is worth recalling them occasionally — though there will equally be bad, even uniquely bad, aspects too. One must never forget that people everywhere are roughly equally good and equally bad.

It may also be that countries may have sporting teams, or the like, and it can be fun to barrack for them.

Beyond that, there is nothing positive to say about patriotism.

Even if a country is physically beautiful, others are too. Even if a country’s culture or people are wonderful, others are too. There are wonderful people and wonderful ideas everywhere, just as there are horrible people everywhere. Venerating only those nearby, to the exclusion of others, is insular, narcissistic, and leads naturally to racism, chauvinism, and xenophobia.

Even if the highly dubious conceit of orthodox patriotism is true for a country — that this nation is great and to be preferred over others, despite all the other ones believing the same — it does not follow that one ought to venerate this nation: if one wants to venerate something, one should venerate good things and good people, whether here, there or anywhere.

(Incredibly, orthodox patriotism means that vast numbers of people in every land can believe precisely this, despite those elsewhere thinking the same. They cannot all be right, but they can all be wrong — living “in a gross and harmful delusion”. It is the same with all religions claiming to be the one true religion, of course. It discloses something deep, and deeply worrisome, about the human condition, that vast numbers of people are capable of this conceit.)

What matters are universal moral values, equity, justice, freedom, and so on; not the country in which they are expressed. One’s specific birthplace or homeland or nation is irrelevant.

This is kindergarten-level morality; except that the corresponding kindergarten situation, of a group of children each boasting they are the best, will be resolved by a game or by a distraction, rather than by oppression, detention archipelagos, or war.

Perhaps the worst aspect of patriotism is in the cultural realm. It creates mythologies, with deep and powerful emotions latent within its manufactured communities. These emotions, fueled also by resentment of outsiders, can be manipulated by regressive political forces to reinforce inequalities, persecute outsiders, and stoke wars.

These mythologies are created when a nation’s history is recounted as virtuous, dramatic and heroic. But it is the same with other nations; and if retelling the story of one nation excludes other peoples and nations (or worse, disparages or invokes hatred of them), then it leads in the direction of, at best, insularity and stagnation, and at worst, militarism, oppression and war.

Then there is Australia.

Here, the magnitude of the artifice required to tell the nation’s history as a virtuous story is itself heroic. The result is an increasingly viciously enforced cultural orthodoxy, together with a crushing cultural cringe.

An island continent, home to hundreds of Indigenous nations, until colonised by an imperial power to create an antipodean jail; the original inhabitants and rightful owners dispossessed by the accumulation of property and capital and microbes, by genocidal policy, and by over a century of smouldering frontier war; no galvanizing wars fought for independence, only complicity in the motherland’s imperial ambitions, and a standard role in humanity’s propensity for worldwide violence; with all the bravery, heroism, obedience, murder and atrocity that entails. The overall arc of post-settlement history must be twisted beyond recognition to confect an orthodox patriotic mythology.

There are plenty of heroic Australians, to be sure; just as there are plenty of villains, and everything in between. And there are plenty of legitimate sources of pride in that nation’s achievements, just as there are plenty of horrific sources of shame.

Nothing more and nothing less; special in some ways and not in others; which is precisely the negation of every orthodox patriotic myth.

Limitless as that space too narrow for its inspirations

On 22 February, 1877, James Joseph Sylvester gave an “Address on Commemoration day at Johns Hopkins University”.

Sylvester, the very excellent English mathematician, worked in areas of what we would today call algebra, number theory, and combinatorics. He is known for his algebraic work in invariant theory; he is known for his work in combinatorics, such as Sylvester’s Problem in discrete geometry; and for much else. He invented several terms which are commonplace in mathematics today — “matrix”, “graph” (in the sense of graph theory) and “discriminant”. He was also well known for his love of poetry, and indeed his poetic style. (He in fact published a book, The Laws of Verse, attempting to reduce “versification” to a set of axioms.)

I came across this address of Sylvester, not through mathematical investigations or in the references of a mathematical book, but rather in the footnotes of the book “Awakenings”, in which the late neurologist Oliver Sacks discusses, in affectionate and literary detail, the case histories of a number of survivors of the 1920s encephalitis lethargica (“sleeping sickness”) epidemic — an interesting and mysterious event in itself — as those patients are treated in the 1960s with the then-new drug L-DOPA and experience wondrous “awakenings”, often after decades of catatonia, although often followed by severe tribulations. (These awakenings were the subject of the 1990 Oscar-nominated movie of the same name.) These tribulations, in each patient, form an odyssey through the depths of human ontology, in which the effects of personality, character, physiology, environment, and social context are all present and deeply intertwined.

Sacks comes to the conclusion that a reductionist approach to medicine, focusing on the cellular and the chemical, is wholly deficient:

What we do see, first and last, is the utter inadequacy of mechanical medicine, the utter inadequacy of a mechanical world-view. These patients are living disproofs of mechanical thinking, as they are living exemplars of biological thinking. Expressed in their sickness, their health, their reactions, is the living imagination of Nature itself, the imagination we must match in our picturing of Nature. They show us that Nature is everywhere real and alive and that our thinking about Nature must be real and alive. They remind us that we are over-developed in mechanical awareness; and that it is this, above all, that we need to regain, not only in medicine, but in all science.

Indeed, Sacks quotes from W H Auden’s “The Art of Healing”:

‘Healing,’
Papa would tell me,
‘is not a science,
but the intuitive art
of wooing Nature.

In an accompanying footnote, Sacks notes that mathematical thinking is real and alive, in just the same way. He quotes the aforementioned address of Sylvester.

Mathematics is not a book confined within a cover and bound between brazen clasps, whose contents it needs only patience to ransack; it is not a mine, whose treasures may take long to reduce into possession, but which fill only a limited number of veins and lodes; it is not a soil, whose fertility can be exhausted by the yield of successive harvests; it is not a continent or an ocean, whose area can be mapped out and its contour defined: it is limitless as that space which it finds too narrow for its aspirations; its possibilities are as infinite as the worlds which are forever crowding in and multiplying upon the astronomer’s gaze; it is as incapable of being restricted within assigned boundaries or being reduced to definitions of permanent validity, as the consciousness, the life, which seems to slumber in each monad, in every atom of matter, in each leaf and bud and cell, and is forever ready to burst forth into new forms of vegetable and animal existence.

Sylvester is right, and if anything his argument is not forceful enough. Mathematics has always been limitless — and even more limitless than the seemingly (to Sylvester, at least) infinite possibilities of astronomy and biology — for, unlike the experimental or observational sciences, it requires no substrate in reality beyond the imagination of those who think it. Liberated from the necessity to study only this world, mathematics studies all the worlds it can imagine, which include our own but go far beyond it. (It is perhaps surprising, and even “unreasonable”, as Wigner argued, that we can count our own world as among those which are mathematical; but it is not surprising that its worlds transcend ours.)

The progress of science has displayed, in an absolute sense, how mathematics outstrips the limitlessness of other sciences.

However many may be the worlds of the astronomer — now teeming also with exoplanets and gravitational waves — they are still finite; the observable universe has a finite radius.

Sylvester’s panpsychism (everything has consciousness) is now out of fashion, but seems focused on biology — and we now know that biological life is constrained by genetics, and at the molecular level by DNA and related biochemistry. Mathematics knows no such constraint.

Taking panpsychism more generally, there is an argument — and a strong one, in my view — that understanding consciousness will eventually require a radical revision of our understanding of physics. But even then, I very much doubt any such radical revision would completely transcend mathematics — and I very much doubt that mathematics would not encompass infinitely more.

It is worth noting, though, that mathematics is, in a certain sense, reductionism par excellence. Even accepting what we know about incompleteness theorems and the like, mathematics, theoretically at least, can be reduced to sets of axioms and logical arguments, in the end consisting only of formal logic, modus ponens and the like. That is not how mathematicians do mathematics in practice, but that is the orthodox view on what mathematics formally is. Even the standard theorems that mathematics “knows no bounds” — the Gödel incompleteness theorems, the Cantor diagonalisation argument, the set-theoretic paradoxes like Russell’s, for instance — can themselves be expressed, reductionistically, in this formal way.

All the infinite possibilities, the unboundedness, of mathematics, then, can be expressed in a very finite, very discrete, very reductionistic way. This is not surprising — even with finitely many letters one can construct an infinity of sentences, one can burst all brazen clasps, one can empty all veins and lodes, one can exhaust all soils, there is no end to the harvest, however dizzying and rarefied the altitude at which it is sown.

And as for definitions of permanent validity? At least in terms of the experience of learning, doing and discovering mathematics, I cannot go past Ada Lovelace’s definition of the “poetical science” as “the language of the unseen relations between things”.

There is much else of interest — and not just historical interest — in Sylvester’s address. Mathematics impedes public speaking; university study and research ought to avoid monetary reward and public recognition; students should avoid “disorder or levity”; all researchers should simultaneously engage in teaching; anecdotes of arithmetic in the French revolution; every science improves as it becomes more mathematical; and the taste for mathematics is much broader than one might think. So argues Sylvester, poet, mathematician; perhaps I will return to these arguments one day.

The Doors of Crime Perception

Crime is uniquely susceptible to the manipulation of perceptions.

It is common, it is bad, it is fascinating.

A wide spectrum of this common, bad, fascinating activity exists, and the fixation of fascinated attention on certain narrow portions of this spectrum serves numerous powerful political interests. Those numerous, already-aligned, authoritarian political interests — tabloid media, conservative politicians — are only too happy to indulge the public’s fascination. No similar political interest is usually served by attempting to understand other portions of the spectrum. Attempting to understand the spectrum as a whole, or the overall picture and causes of crime, might serve the purpose of building a better society, but that purpose is one which, all entrenched political powers agree, must remain unthinkable.

Which types of crime are they, on which power so fixates attention? Preferably those which are sensational, preferably involving violence and fear, preferably with perpetrators who are suitably villainous and “not like us”, where “we” means the “good folk” who are normalised within society. Powerless, marginalised groups form perfect villains: immigrants, ethnic minorities, racial minorities, Indigenous people, and in general, “others”.

In Australia at present, that means asylum seekers and refugees, it means African Australians, it means Aboriginal Australians.

Accordingly, the fixated attention of society on this narrow portion of crime — and its villains — blows it out of all proportion. Perceptions of crime in society can warp radically, tending towards fear and paranoia of the fixated type of crime, and the fixated villains — and generalising to a fear of society at large.

The propaganda power of media campaigns, their political protagonists, and their guerrilla online counterparts is substantial. The far right delights in it.

A fearful populace is one that is easier to control. It is one which will more easily submit to existing oppression as justified or necessary, and accept further devolution towards a surveillance or police state. Fearful people will tend to look out only for themselves, diminishing the bonds of social solidarity, and furthering capitalist atomisation. And as the public holds a paranoid, distorted idea of reality, the desire to understand society, and in particular the root causes of crime, diminishes, or becomes unthinkable. Hysterical overreaction to the villains is the urgent goal, anything else is wasting time against this menace. The already marginalised will be oppressed further.

* * *

What is the situation in Victoria?

Crime statistics are freely available in Victoria.

What do they say?

(Let us put aside broader questions, such as whether existing laws are good laws, whether the criminal justice system is a good one, what better systems might exist, and so on.)

We can, for the moment, put aside subtle questions of methodology. (Do people report more crimes now, especially domestic violence? Should we refer to the number of criminal incidents, recorded offences, or offenders?) In any case, the statistics tell a fairly clear story.

To a first approximation, in Victoria, crime rates have decreased since 2016. They were roughly level from 2009 to 2015, at a rate of just under 6,000 incidents per 100,000 population, with a jump in 2016 to over 6,600. The rate has since decreased, and the current crime rate is similar to the rate of 2009-15. This crime rate is roughly similar to other states in Australia.

Some categories of crime, however, have not decreased from 2016-18. Assaults have remained steady at around 610 incidents per 100,000 population, and sexual offences have increased from about 110 to 132 incidents per 100,000 population. On the other hand, theft and burglary have decreased dramatically (from about 2,500 to 2,100, and from about 840 to 620 incidents per 100,000 population, respectively).
(More detail can be found from the Age here or in the statistics themselves.)

These numbers are too high. They mean thousands of sexual offences, tens of thousands of assaults and burglaries, and hundreds of thousands of thefts, happen each year, in Victoria. Each such crime is potentially a source of outrage.
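As a rough arithmetic check on those orders of magnitude — a sketch only, assuming a Victorian population of about 6.4 million, a figure not given above — the per-100,000 rates convert to approximate annual counts as follows:

```python
# Rough conversion of per-100,000 crime rates to approximate annual counts,
# assuming a Victorian population of about 6.4 million (an assumption).
POPULATION = 6_400_000

rates_per_100k = {
    "sexual offences": 132,
    "assaults": 610,
    "burglaries": 620,
    "thefts": 2100,
}

for category, rate in rates_per_100k.items():
    count = rate * POPULATION / 100_000
    print(f"{category}: roughly {count:,.0f} per year")
```

On these assumptions, thefts come to roughly 134,000 per year, assaults and burglaries to roughly 39,000–40,000 each, and sexual offences to roughly 8,400 — consistent with the orders of magnitude above.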

Society ought to work so that these numbers, in the long run, tend to zero. It is not at all clear that more draconian laws or policing will help that goal. It requires addressing the root causes, which include, among others, poverty, misogyny, racism, authoritarianism, capitalism, and a culture which glorifies greed and violence.

But nonetheless, the point about perception remains. If one felt, despite the continual rate of ongoing crime, that Victoria was a generally safe place to live in 2015, and one is consistent, then (putting aside local variations) one must feel the same at the beginning of 2019.

Indeed, Melbourne ranked in the top 10 safest cities in the world, in a 2017 Economist study.

If one feels that “African gangs” are a menace to society, as right-wing politicians and tabloid media continue to claim, despite the protestations even of the police to the contrary, then one is living in an alternate reality — a reality that at least has provided some social media entertainment, but whose racism is profoundly damaging to African communities in Melbourne.

The algebra and geometry of contact categories, Melbourne July 2018

On Monday July 23 2018 I gave a talk in the Geometry and Topology seminar at the University of Melbourne.

The slides from the talk are available here.

Title:

The algebra and geometry of contact categories

Abstract:

Contact categories, introduced by Ko Honda, are a type of cobordism category related to 3-dimensional contact geometry. Geometrically, they encode contact structures in an elementary combinatorial way. Algebraically, they are related to triangulated categories, A-infinity algebras, Floer homology, and other wholesome fun. In this talk I’ll tell you something about them and report on some recent developments. No knowledge of contact geometry or topology will be assumed.


The Brain makes Contact with Contact Geometry

It’s always nice, intellectually, when two apparently unrelated areas collide.

I had an experience of this sort recently with an area of mathematics — one very familiar to me — and an ostensibly completely distinct area of science.

On the one hand, contact geometry — a field of pure mathematics, pure geometry.

And on the other hand, the brain and its functioning. More particularly, the visual cortex, and how it processes incoming signals from the eyes.

Now, contact geometry has lots of applications: arguably it goes back to Huygens’ work on optics. It is closely related to thermodynamics. It is the odd-dimensional sibling of symplectic geometry, which is related to classical mechanics and almost every part of physics.

But applications to neurophysiology? Now that’s new.

Well, it’s only new to me. It’s been in the scientific literature for some time. It goes back at least to a paper from 1989:

And the discussion below is largely based on this article:

What’s the connection?

Contact geometry is the study of contact structures. And a contact structure on a 3-dimensional space \(M\) consists of a plane at each point satisfying some conditions. That is, at each point in the space, we have a plane sitting there. But not just any plane at each point. The planes have to vary smoothly from point to point — having such smoothly varying planes forms a (smooth) plane field. But moreover, the plane field, which we can call \(\xi\), is required to be non-integrable.

There are various ways to explain non-integrability. To “integrate” a 2-plane field is to find a smooth surface \(S\) in space so that, at every point of \(S\), the tangent plane to \(S\) is given by the plane of \(\xi\) there. At every point \(p\) of the 2-dimensional surface \(S\), the tangent plane is a 2-dimensional plane, which we write as \(T_p S\). If we write \(\xi_p\) for the plane of \(\xi\) at the point \(p\), then the integrability condition can be written as \(\xi_p = T_p S\).

Well that’s what integrability means (roughly) — \(\xi\) is integrable if you can always find a surface tangent to \(\xi\) in this way.

But a contact structure is just the opposite: you can never find a surface tangent to it in this way! The planes of the plane field \(\xi\) somehow twist and turn so much that you can’t ever find a surface tangent to it. You can always find a surface tangent to \(\xi\) at a single point, and you might even be able to find a surface which is tangent to \(\xi\) at some of its points (perhaps even along a curve on \(S\)), but you’ll never be able to find a surface which is tangent to \(\xi\) at all its points.

(If you’re familiar with differential forms, then the plane field \(\xi\) can be described (locally, at least) as the kernel of a 1-form, \(\xi = \ker (\alpha)\), and then the non-integrability condition is that \(\alpha \wedge d\alpha\) is a volume form. If you’re not familiar with differential forms, don’t worry.)
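For those who do want to see this condition in action, here is a sketch of a symbolic check in Python, using sympy and the classical identity that for \(\alpha = P\,dx + Q\,dy + R\,dz\), one has \(\alpha \wedge d\alpha = (\vec{A} \cdot \nabla \times \vec{A})\, dx \wedge dy \wedge dz\), where \(\vec{A} = (P,Q,R)\). (The form \(\alpha = dz - y\,dx\) used here is a usual choice of standard contact form; that choice is mine, for illustration.)

```python
# Symbolic check of non-integrability: for alpha = P dx + Q dy + R dz,
# alpha ^ d(alpha) = (A . curl A) dx ^ dy ^ dz, where A = (P, Q, R).
# The plane field ker(alpha) is a contact structure precisely when this
# coefficient is nonzero at every point.
import sympy as sp

x, y, z = sp.symbols('x y z')

def alpha_wedge_dalpha(P, Q, R):
    """Coefficient of dx^dy^dz in alpha ^ d(alpha), i.e. A . curl A."""
    curl = (sp.diff(R, y) - sp.diff(Q, z),
            sp.diff(P, z) - sp.diff(R, x),
            sp.diff(Q, x) - sp.diff(P, y))
    return sp.simplify(P*curl[0] + Q*curl[1] + R*curl[2])

# alpha = dz - y dx, so (P, Q, R) = (-y, 0, 1)
print(alpha_wedge_dalpha(-y, 0, 1))   # 1: nonzero everywhere, a contact form

# By contrast, alpha = dz is integrable: its planes are tangent to the
# horizontal surfaces z = constant.
print(alpha_wedge_dalpha(0, 0, 1))    # 0: integrable
```

So \(dz - y\,dx\) defines a contact structure, while the kernel planes of \(dz\) integrate to the surfaces \(z = \text{const}\).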

Contact structures can be hard to visualise. Here is a picture of one contact structure on 3-dimensional space:

The standard contact structure on \(\mathbb{R}^3\). Public domain, via Wikipedia.

You’ll note that, if you consider going from left to right in this picture in a straight line, you can actually stay tangent to the contact planes. A curve like this is called a Legendrian curve. Let’s call the curve/line \(C\). But the planes twist around \(C\) as you travel along \(C\). This is a characteristic property of contact structures (and in fact, with a few extra technicalities, can be made into an equivalent characterisation).

Another example of a contact structure is a projectivised tangent bundle. Let’s say what this means. (Actually we’ll only consider one such contact structure: on the projectivised tangent bundle of a plane.)

Consider a 2-dimensional plane; let’s call it \(P\). Let’s even be concrete and call it the \(xy\)-plane, complete with coordinates. So all the points on \(P\) can be written as \((x,y)\).

Now, lay \(P\) flat on the ground, in 3-dimensional space. (More precisely, embed it into \(\mathbb{R}^3\).) We would usually denote points in 3-dimensional space by \((x,y,z)\), but I want to suggestively call the third coordinate \(\theta\), because it will denote an angle. In any case, the points of \(P\) now lie horizontally along \(\theta = 0\); so they lie at the points \((x,y,0)\) in 3-dimensional space.

Now in 3-dimensional space, through every point of \(P\) there is a vertical line. For instance, through the point \((1,2,0)\) of \(P\) is a line, and the points on this line are all the points of the form \((1,2,\theta)\).

And now the “projective” part of the situation comes in. Pick a point on the plane \(P\): let’s say \((1,2,0)\) again. Now consider lines on \(P\) through this point. There are many such lines; in fact, infinitely many. But we can specify a line by specifying its direction. And that direction can be specified by an angle \(\theta\). We could have various conventions to measure the angle \(\theta\), but let’s do it in the standard way: \(\theta\) is the angle (measured anticlockwise) from the positive \(x\)-direction, round to the line.

Now at each point \(p = (x,y, \theta)\) in 3-dimensional space, we’ll define a plane \(\xi_p\) as follows. The plane \(\xi_p\) contains the vertical line (i.e. in the \(\theta\) direction) through \(p\); and it also contains a horizontal line through \(p\) in the direction given by the angle \(\theta\). The result is as shown below.

Image by Patrick Massot.

Starting from \(p\) (and the plane there), if you move vertically upward you get to other points of the form \(p' = (x,y,\theta')\), with the same \(x,y\) coordinates but different \(\theta\) coordinates. The plane at \(p'\) still contains a vertical line, but the horizontal line has rotated from angle \(\theta\) to angle \(\theta'\). Thus, as you move upwards along a vertical curve, the planes spin around the vertical curve — just as shown in the animation.

It’s a contact structure. Indeed, you can even, if you want, identify the point \((x,y,\theta)\) with the line through \((x,y)\) in the plane \(P\) with direction given by \(\theta\). In this way, the points in 3-dimensional space correspond to the lines in the plane through various points, and this is the thing referred to as the “projectivised tangent bundle”. (Strictly speaking though, a line at angle \(\theta\) and a line at angle \(\theta + \pi\) point in the same direction, so we should identify points \((x,y,\theta) \sim (x,y,\theta+\pi)\).)
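One can verify the contact condition symbolically, as a sketch, using the same \(\vec{A} \cdot \nabla \times \vec{A}\) criterion for \(\alpha \wedge d\alpha\), with \(\theta\) playing the role of the third coordinate. The planes at \((x,y,\theta)\) form the kernel of \(\alpha = \sin\theta\,dx - \cos\theta\,dy\) — a description I am supplying here, consistent with the construction above:

```python
# The plane at (x, y, theta) is spanned by the vertical direction d/dtheta and
# the horizontal direction cos(theta) d/dx + sin(theta) d/dy; it is the kernel
# of alpha = sin(theta) dx - cos(theta) dy.  For alpha = P dx + Q dy + R dtheta,
# alpha ^ d(alpha) = (A . curl A) dx ^ dy ^ dtheta; nowhere-zero means contact.
import sympy as sp

x, y, theta = sp.symbols('x y theta')

P, Q, R = sp.sin(theta), -sp.cos(theta), 0
curl = (sp.diff(R, y) - sp.diff(Q, theta),
        sp.diff(P, theta) - sp.diff(R, x),
        sp.diff(Q, x) - sp.diff(P, y))
coeff = sp.simplify(P*curl[0] + Q*curl[1] + R*curl[2])
print(coeff)   # -1: nonzero at every point, so this is a contact structure
```

The coefficient is \(-1\), nonzero at every point, so the plane field is nowhere integrable.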

What does this have to do with the brain?

Well I’m no neurophysiologist, but the claim is that the neurons in the visual cortex can be regarded functionally as exactly this kind of contact structure. This is not to say that the neurons are planes, or spin around quite like the picture above. But it is to say that neurons in some ways, functionally, behave like this contact structure.

When you look at an image, the photoreceptors in your eye send signals into your brain. These signals are processed, at a low level, in your visual cortex. They are then processed at a higher level, extracting features, objects and eventually reaching the level of consciousness as the unified visual field which is part of ordinary human experience. However, here we are only interested in the lower-level processing, which extracts basic information from the image projected on the retina. This low-level processing extracts features like which areas of the visual field are light and dark, the shapes of light and dark areas, and importantly for us here, the orientation of any lines or curves that we see.

The particular area of interest in the visual cortex seems to be an area called “V1”. This area of the brain contains many structures. It contains several “horizontal” layers 1-6, each divided into sublayers; the most important is apparently the sublayer 4C. We’ll call this the “cortical layer”, as it’s the one important for our purposes.

Now it turns out that different points on this cortical layer relate to different points on the retina. Each point in your visual field gets projected to a different point on your retina, which (roughly speaking) connects to a different point in the cortical layer. The map from the retina to the cortical layer is called a retinotopy. In fact, beautifully, this map from the retina (which is a surface at the back of your eye) to the cortical layer (which is a surface in your brain) appears to preserve angles (but not lengths). In other words, the retinotopy is a conformal map.
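A commonly used mathematical model for such an angle-preserving retinotopy — an assumption for illustration, not a claim made above — is the complex logarithm (the “log-polar” map), which is conformal away from the origin. A quick numerical sketch of angle preservation:

```python
# Numerical check that the complex logarithm -- a common model of an
# angle-preserving retinotopic map -- is conformal away from the origin.
# We map two tiny perpendicular displacements at a point z0 and compare
# the angle between their images.
import cmath

def angle_between(u, v):
    """Unsigned angle between two nonzero complex numbers viewed as vectors."""
    return abs(cmath.phase(v / u))

z0 = 1.0 + 2.0j        # a hypothetical point, away from the origin
h = 1e-6
u = cmath.log(z0 + h) - cmath.log(z0)        # image of a step in the x-direction
v = cmath.log(z0 + h*1j) - cmath.log(z0)     # image of a step in the y-direction

print(angle_between(u, v))   # close to pi/2: right angles are preserved
```

The two perpendicular steps remain (very nearly) perpendicular after mapping, reflecting the fact that the derivative \(1/z_0\) is nonzero, so the map rotates and scales all directions at \(z_0\) uniformly.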

Even better, the cells of the cortex are organised into structures called columns and hypercolumns. Along each hypercolumn, the cells detect curves which point in the same orientation. So there are not only cells which are specialised to detect images arriving at particular points on your retina; there are also cells which are specialised to detect a curve at a particular point, in a particular orientation.

Functionally, then, the visual cortex behaves like a contact structure. The neurons aren’t arranged in a contact structure, but they behave like one. And this means that various processes in low-level visual processing can be understood in terms of contact geometry.

In particular, the “association field” can be understood in terms of contact geometry, as perhaps also can certain hallucinations — including those seen under the influence of psychedelics like LSD.

Well, it’s definitely the most psychedelic application of contact geometry I’ve seen.

Some further references: