Learning from Nature with System Dynamics

24 April, 2026

 

We have left the Holocene and entered a new epoch, the Anthropocene, in which the biosphere is rapidly changing due to human activities. We do not need to decide to address these issues. They are already addressing us: grabbing us by the collar, so to speak. Our only choice is how to respond.

In the process we can learn a lot from nature, which has had far more time than human civilization to develop flourishing complex systems, and has successfully weathered many crises. Nature has many lessons to teach us, which we are just beginning to learn.

In what follows I’ll talk about a few aspects of this: biomimetic technologies, ecological economics, ecological engineering, and the theory of leverage points. I’ll explain how most of these are connected to “system dynamics”: a modeling tradition that applies to interacting social and biological systems. And I’m including a ton of references, in case you want to learn more.

The Anthropocene

Climate change is just one part of a much broader process where humans are destabilizing the biosphere that supports us. For example:

• About 1/4 of all chemical energy produced by plants is now used by humans [KEGH].

• Humans now take more nitrogen from the atmosphere and convert it into nitrates than all other processes on land [GK].

• Phosphorus is flowing into the oceans at 8-9 times the natural background rate [RS].

• Due to mining, 24 times as much sediment is flowing into the oceans as is created by natural erosion [Co].

• The rate of species going extinct is 100-1000 times the usual background rate [dV].

These changes are not isolated “problems” of the sort routinely “solved” by existing human institutions. They are part of a shift from the exponential growth phase of human impact on the biosphere to a new, uncharted phase. Institutions and attitudes will change dramatically, like it or not. Before, western civilization tended to treat “nature” as distinct from “civilization”. Now there is no nature separate from civilization. Before, economic growth could be our main goal, with many side-effects ignored. Now, many forms of growth are pushing the biosphere toward tipping points [RS], and we are groping for new goals that take this into account.

We’ve gotten into this situation because our current civilization is extremely crude in many ways. Ironically, this is good news, since it means that plausible changes in our technology—and more importantly, our culture—can dramatically change the path we have been on.

For example, currently the largest human activities of all, measured in sheer mass, are burning carbon and making concrete. In 2025, our civilization extracted about 10.4 gigatonnes of carbon from the Earth, burnt it to power our technologies, and put 38 gigatonnes of CO₂ into the atmosphere [FoS]. We also dug up over 40 gigatonnes of rocks, gravel and sand [P], making about 30 gigatonnes of concrete [XZ]. This is just a snapshot; it is also good to take a longer view. Over the course of history we have burnt about 700 gigatonnes of carbon [FoS]. Since the dawn of agriculture we have also reduced the total biomass of the planet, mainly plants, from about 900 gigatonnes of carbon to 550 gigatonnes [BPM].
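
The relation between the first two numbers is just a ratio of molecular weights: burning a tonne of carbon yields 44/12 ≈ 3.67 tonnes of CO₂. Here’s a minimal sanity check in Python (the 10.4 gigatonne figure is from [FoS]; the rest is basic chemistry):

```python
# Burning carbon turns each tonne of C into 44/12 tonnes of CO2,
# since carbon has molar mass 12 and CO2 has molar mass 44.
carbon_burned_Gt = 10.4                  # gigatonnes of carbon burnt in 2025 [FoS]
co2_emitted_Gt = carbon_burned_Gt * 44 / 12
print(round(co2_emitted_Gt, 1))          # 38.1, matching the ~38 Gt figure above
```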

Thus, viewed from afar as a biological and geological process, current-day human civilization consists largely of killing plants, burning carbon, and building structures out of concrete. If these activities defined us — if this is what truly made us human — we would be in real trouble. But if they are merely a means to our deeper goals, perhaps goals not clearly formulated yet, then perhaps we can change course in a way that leads to a flourishing of both civilization and the biosphere.

In seeking to reorient our goals, we have a lot to learn from nature. Nature has been successfully growing complex systems for billions of years, while we have been doing it for only thousands.

Our methods of production typically create useless or even harmful byproducts: “waste”. When the final product wears out, it too becomes waste. This waste is typically ignored until it causes so much damage that we cannot turn our heads away. Nature works differently. One organism’s waste is another’s food, and most chemicals get recirculated and reused.

It is important to note that natural systems developed this remarkable ability to recycle only through millennia of trial and error. For example, when some bacteria first began to photosynthesize, the highly reactive free oxygen they produced was toxic to all living creatures. As it built up in the atmosphere, this led to a crisis known as the Oxygen Catastrophe. In response, new organisms evolved that could not only tolerate oxygen but even use it in their metabolism. But this took about 400 million years.

Humans are affecting the biosphere at a much faster rate, so we do not have the luxury of millions of years. Luckily we have the ability to adapt much faster as well. We do not always deploy this ability until we are desperate, since our current crude technologies are often simply dreamt up and widely adopted before their consequences are considered. To do better we can proceed more proactively, looking to flourishing natural ecosystems as role models, and gauging the effects of each potential new technology or societal shift on the web of relations connecting us to the rest of the biosphere.

Biomimetic technologies

How can we learn from nature? One of the most obvious ways is to look at natural systems and design technologies based on them. These are called biomimetic technologies. A single example can illustrate some of the issues that arise.

Termites maintain nearly constant internal temperatures in their mounds through a system of channels. They don’t need fans that require power. For a time, it was believed that they used a simple convective cooling system, where hot air rises through the central chimney, drawing in cool air at the base. In 1996, a large office and retail building was built based on this idea: the Eastgate Centre in Harare, Zimbabwe, designed by the architect Mick Pearce [TS]. It has chimneys and ventilation channels that draw cool night air through the building’s thermal mass. It uses roughly 90% less energy for climate control than a conventional building of comparable size. That translates directly into far lower carbon emissions from heating and cooling.

This success inspired emulation. Pearce himself used similar termite-chimney-inspired designs in a Melbourne office building [HB]. More recently the Startup Lions Campus in Kenya, designed by Kéré Architecture on the banks of Lake Turkana, features three tall terracotta-colored ventilation towers modeled after local termite mounds.

Why hasn’t this technology spread more widely? First, it is climate-specific. It works well in sites with warm days, cool nights, and low humidity year-round. In a humid subtropical or continental climate, passive ventilation alone often cannot maintain comfortable conditions. Second, termite mounds integrate structure and ventilation into a single system. Reconciling this with building codes, fire regulations, and the way architects and engineers are trained is a cultural and institutional challenge, since currently structures such as beams, columns, trusses and walls tend to be designed and permitted in a way that is isolated from the systems that make a building habitable.

Both these points illustrate important general issues. Our current civilization favors simple technologies that work uniformly in many environments, while biological systems have had time to adapt in intricate ways to local situations, creating a huge diversity of specialized forms. And while existing human technologies are often made of modular parts that are designed in isolation, biological systems evolve as a whole, with each part evolving in interaction with the rest. Thus, while in current human technologies we feel free to say “the purpose of this feature is to accomplish that task”, when we examine biological systems, we almost invariably discover that each feature has multiple functions. Indeed, the very concept of “function” becomes a problematic abstraction [Th]. Of course it is still useful to declare that different features have different functions, but we should recognize all such assignments as tentative. The deeper we look at any biosystem, the more it has to teach us.

For example, in 2015 a group of researchers at Harvard [KOM] showed that termite mounds work in a subtler way than Pearce thought. The chimney is just as much about flushing CO₂ from the termite colony as cooling it! During the day the outer flutes of the termite mound warm up faster than the central chimney, while at night this temperature profile inverts, driving cyclic convective flows that flush CO₂ from the nest. So, the mound breathes in and out on a daily cycle rather than drawing a steady one-way draft. But Pearce’s Eastgate building, based on a simpler earlier understanding of termite mounds, still works.

Indeed, a common aspect of biomimetic technologies is that they choose one aspect of how a biological system works and ignore most others. For another example, while termites create their mounds from the surrounding soil, the Eastgate Centre is built of concrete and locally manufactured brick, with a glass-roofed atrium supported by a steel framework. This is not surprising, because the termite mounds can’t be scaled up to the desired size. But it means that the human-made copy is far more energy-intensive to produce, even per kilogram.

To see forms of technology that absorb the lessons of natural systems more deeply, we should turn to “ecological engineering”. But this grew out of the discipline of “system dynamics”, which we describe first.

System dynamics

There is a general theory of systems—from cells to ecosystems to businesses and economies—called “system dynamics”. It began with Jay Forrester at MIT in the late 1950s. Forrester, an electrical engineer who had built one of the first digital computers, realized that the same feedback-loop thinking used in control engineering could model the behavior of factories, cities, and entire economies. His books Industrial Dynamics [Fo1], Urban Dynamics [Fo2], and World Dynamics [Fo3] laid the foundations for the subject.

John Sterman, also at MIT, became the field’s leading figure in the next generation, writing the textbook Business Dynamics [St] in 2000, and applying system dynamics to climate change, energy transitions, and public health. His work on the “climate bathtub”—showing that even educated people fail to grasp the difference between the amount of CO₂ in the atmosphere (a stock) and the rate at which we are putting CO₂ into the atmosphere (a flow)—has been particularly influential.
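
Sterman’s point is easy to demonstrate in a few lines of code. Here’s a minimal sketch in Python, with made-up numbers rather than real climate data: even though the inflow (emissions) falls every year, the stock (atmospheric CO₂) keeps rising until the inflow finally drops below the outflow.

```python
# A minimal "climate bathtub": the stock rises whenever inflow > outflow,
# even while the inflow itself is falling. Illustrative numbers only.
stock = 850.0       # CO2 in the atmosphere (arbitrary units)
outflow = 5.0       # uptake by oceans and land, held constant here
inflow = 10.0       # emissions, cut by 5% per year below

for year in range(1, 31):
    stock += inflow - outflow    # the net flow accumulates in the stock
    inflow *= 0.95               # emissions decline every year
    if year % 10 == 0:
        print(year, round(inflow, 2), round(stock, 1))

# The stock keeps growing until inflow drops below outflow (around year 14
# here): cutting the rate of emissions is not the same as cutting the stock.
```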

One key insight of system dynamics is that complex systems are dominated by feedback loops, delays, and nonlinearities—and that human intuition is notoriously bad at predicting the behavior of such systems. But there is much more to system dynamics: it is a systematic practice for modeling. Its practitioners often use three kinds of diagrams to model systems. In order of increasing complexity, they are:

Causal loop diagrams. These show variables connected by arrows labeled with “polarities” (that is, + or − signs) indicating how increasing one variable tends to increase or decrease another. These are the most informal and accessible of the three — good for group model-building and participatory settings — but they do not distinguish stocks from flows. A loop in such a diagram represents a “feedback loop”, which is either positive or negative depending on the product of the polarities labeling its edges.

System structure diagrams. These diagrams distinguish between two kinds of variables: “stocks” (accumulations, like carbon in forests or carbon in the atmosphere), and “flows” (like the flow of carbon from the atmosphere to forests). Stocks are drawn as boxes, while flows are drawn as pipes going from one box to another. Besides stocks and flows there are usually “auxiliary variables”. In addition there are “links”: edges labeled by polarities, which represent the causal connections between variables.

Stock and flow diagrams. A stock and flow diagram is a system structure diagram equipped with formulas that say precisely how each variable is a function of those linked to it. Thus a system structure diagram is purely qualitative, while a stock and flow diagram contains further quantitative information. Stock and flow diagrams can be directly translated into differential equations and simulated.
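
To make that last step concrete, here’s a minimal sketch in Python of the translation from a stock and flow diagram to differential equations. The two-stock carbon model below, and all its numbers, are invented for illustration:

```python
# A toy stock and flow model turned into differential equations:
# two stocks (atmospheric carbon, forest carbon) joined by two flows
# (photosynthesis and respiration). All numbers are invented.

def flows(atmosphere, forest):
    photosynthesis = 0.02 * atmosphere   # flow: atmosphere -> forest
    respiration = 0.01 * forest          # flow: forest -> atmosphere
    return photosynthesis, respiration

atmosphere, forest = 850.0, 550.0        # initial values of the stocks
dt = 0.1                                 # time step for Euler integration

for step in range(1000):
    p, r = flows(atmosphere, forest)
    atmosphere += (r - p) * dt   # d(atmosphere)/dt = respiration - photosynthesis
    forest += (p - r) * dt       # d(forest)/dt = photosynthesis - respiration

print(round(atmosphere, 1), round(forest, 1))   # the stocks approach the
                                                # equilibrium where p = r
```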

In the 1990s, system dynamics expanded beyond expert-built models toward more participatory approaches. In Group Model Building, Jac Vennix [Ve] argued that the biggest obstacle to implementing system dynamics insights isn’t model quality—it’s buy-in. Forrester’s tradition produced elegant models that often sat on shelves because the people who needed to act on them hadn’t been involved in building them and didn’t trust or understand the results. The solution was bringing more stakeholders into the modeling process. When people discover through simulation that their intuitive policy levers don’t work—or that someone else’s view of the feedback structure explains the data better—that is a more powerful form of learning than being handed a consultant’s report.

In 2014, Peter Hovmand [Ho] developed this idea further in Community-Based System Dynamics. The idea here is to have expert modelers work with community members and other stakeholders to collaboratively build and discuss models. Modeling becomes a way of collecting information that is spread among many people, negotiating a shared understanding of the problems they face, and helping them discuss possible solutions.

Separately from the systems dynamics tradition, researchers in molecular biology have developed their own diagrammatic methods for modeling complex systems. They have been drawing metabolic pathways on wall charts since the mid-20th century. In the 1990s, Kurt Kohn [Ko] developed “molecular interaction maps”, a rigorous notation for signaling and regulatory networks at the molecular level, and used it to describe a model of the mammalian cell cycle and DNA repair machinery. Shortly thereafter, Hiroaki Kitano helped develop “systems biology” as a named field [Ki]. He argued that biology needed to become more like engineering in its formal rigor, and that standardized notations were essential. By 2009, this idea came to fruition in a large project called Systems Biology Graphical Notation [lN]. The key design decision here was to create three styles of diagrammatic notation, each capturing a different view of the same biological system. However, none of these three subsumes stock and flow diagrams, and none can be automatically translated into systems of differential equations. Thus, there is work left to be done to unify our diagram languages for the micro-world of molecular biology and the macro-world of system dynamics. This is as much due to the siloing of intellectual disciplines as to the difficulty of this particular task.

Another line of work important to system dynamics is “systems ecology”, which was initiated in the 1950s by Howard Odum. Initially, Odum described systems using diagrams modeled after electrical circuit diagrams. Eventually he developed these into a more general framework, which he called “energy systems language” [Od]. This never achieved wide adoption, but some of his ideas were taken up by the fields of ecological engineering and ecological economics. We turn to these next.

Ecological engineering

The term “ecological engineering” was coined by Odum, and the field was pioneered by Odum’s student William J. Mitsch in collaboration with Sven Jørgensen [MiJ], but it is the work of many. Its goal is to design systems that work with ecosystems rather than replacing them. Its central insight is that ecosystems have a self-designing capability: given the right conditions, nature assembles and maintains its own populations of species, food chains, and biogeochemical cycles, running on solar energy rather than fossil fuels. The ecological engineer’s job is thus not to build and control a system from scratch, as a conventional engineer would, but to act as a facilitator between human needs and natural processes, letting the ecosystem do most of the work. Doing this requires deep ecological knowledge.

However, Käthe Seidel of the Max Planck Institute did not need Odum’s theoretical framework to practice what would later be considered one of the prime examples of ecological engineering [Se]. In the 1950s she began using wetland plants like bulrushes to treat wastewater, trying to improve the poor performance of rural septic tanks and pond systems. By the early 1980s the technology had been introduced to Denmark, and by 1987 nearly 100 systems were in operation there. The UK, France, the Netherlands, and Austria followed. By now, constructed wetlands are recognized as a reliable treatment technology suitable for many types of wastewater [EG,Vy]. In Europe, Seidel’s system has become the norm: wastewater percolates through basins filled with coarse sand and planted with bulrushes or reeds. In North America and Australia, open ponds with marsh plants are more popular, thanks in part to Odum’s work on recycling partially treated sewage in cypress swamps. To run any of these systems successfully requires detailed ecological expertise—not just “wetland plants treat water” but which wetland plants, in which climate, supporting which groups of microbes to carry out which activities.

Another good example of ecological engineering is river and wetlands restoration. The Skjern, Denmark’s largest river by water flow, once had a huge expanse of marshland at its mouth, full of meandering watercourses, reed beds, and meadows. It was a habitat for thousands of migratory birds, along with stable breeding populations of local birds, plus otters and Atlantic salmon. All this was virtually destroyed following a campaign of land reclamation and river channelization in the 1960s. Part of the river was straightened into a canal, and the wetlands were drained for agricultural purposes. In only 25 years the area lost its agricultural value. The drained peat soils subsided and degraded, and the farmland was not productive enough to justify its maintenance costs. The channelization also caused sedimentation and eutrophication at the river’s outflow. The rationale for restoration was therefore clear. The goals were to reinstate the natural flow conditions, allow species to return, and develop the area’s recreational and tourist potential.

The restoration was carried out from 1999 to 2002. It transformed 19 kilometers of channelized river into 26 kilometers of meandering river. The river valley changed rapidly from agricultural fields into meadows, with weeds typical of arable land displaced by natural wetland plants. Birds returned, along with otter, and the number of salmon coming to the Skjern River to spawn grew tenfold [PANL].

The project did not attempt to bring the Skjern back to an imagined “state of nature” separate from the Danish economy. Reeds are harvested across 250 hectares for commercial sale. The restored river valley is also popular among tourists. The Royal Veterinary and Agricultural University concluded that the project was a good public investment at a 3% discount rate and a time horizon of 20 years, or even a 7% discount rate if we allow an indefinite time horizon [DKPL]. Their calculation did not attempt to put a value on the 15,000 tonne annual reduction in CO₂ emissions — not because the reduction was uncertain, but because Denmark’s international obligations at the time did not allow reductions of this kind to be counted in the national CO₂ account. They did, however, put a value on the reduced amounts of nitrates and phosphates flowing out of the Skjern, and the increased biodiversity.

This leads naturally to our next topic, another field pioneered by Odum and his students.

Ecological economics

Ecological economics is a diverse and controversial field. Any attempt to summarize it here would be woefully inadequate. But its central claim is that the human economy is a subsystem of the biosphere, and any economics that ignores this is doomed to fail in the long run [DF].

Indeed, the biosphere has been solar-powered and close to waste-free for billions of years. Our current human civilization has been running a fossil-fueled, waste-producing economy for about 200 years. Ecosystems maintain stability through redundancy and diversity: multiple species perform overlapping functional roles, so that if one is knocked out others can compensate. Conventional economics prizes a specific kind of efficiency that tends to produce monocultures and brittle supply chains. Ecological economics says the question isn’t whether we will transition from our current model toward something more like the biosphere, but whether we will do it by design or by collapse.

Beyond these general points, rather than listing doctrines of ecological economics, it is more illuminating to list a few of the questions the subject is concerned with:

• How should we value ecosystem services and natural capital? Standard economics often uses monetary valuation through revealed or stated preferences. Some ecological economists question the substitutability assumption behind this approach: for example, if you put a dollar figure on a marshland, the implication is that enough dollars can compensate for its loss. Some believe that “ecosystem services” is an overly anthropocentric concept; some believe monetary valuation is wholly inappropriate for irreplaceable parts of the biosphere [MMO].

• What is the appropriate discount rate for future costs and benefits? Standard economics uses market-based discount rates. Some ecological economists argue for lower or even zero rates when dealing with irreversible ecological losses and intergenerational equity [Das,He].

• Is perpetual economic growth compatible with ecological limits? Standard economics generally holds that it is, via technological progress, efficiency gains, and substitution. Ecological economists question this, pointing to the economy’s irreducible material and energy throughput. This leads to debates around “degrowth,” “steady-state economics”, and whether GDP growth is an appropriate policy goal at all [Dal,J].

In its milder forms, ecological economics is a corrective to standard economics. In its stronger forms, it calls for a fundamental rethinking. The struggle to sort out its precise role is not a mere academic dispute: it concerns the future of our civilization. Insofar as economics is prescriptive, telling us how we should conduct our affairs, this struggle is a social and political one. But insofar as economics is descriptive, telling us how things actually work, a useful framing might be that the dispute over ecological economics is part of a process of learning lessons from nature. We are trying to develop an integrated science that describes how economic systems behave in interaction with biological systems.

For example, a 3% discount rate on future costs and benefits is well-adapted to decisions that affect a single human, since at this rate a dollar received 23 years from now has a present value of fifty cents, and this amount of time is a substantial fraction of a human lifetime—so the individual doing the discounting can plausibly claim to be trading their own future against their own present. But biological systems operate on multiple timescales. Beyond the lifespan of an individual organism, there is the much longer lifespan of a species, or an ecosystem. These are the time scales routinely studied in paleontology and evolutionary biology. When humans make decisions at these scales, a 3% discount rate effectively erases the future: a benefit or harm a thousand years hence is written down by a factor of nearly seven trillion.
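
Here is that arithmetic as a quick check in Python:

```python
# Present value of one dollar received t years from now, at discount rate r.
def present_value(t, r=0.03):
    return 1 / (1 + r)**t

print(round(present_value(23), 2))       # 0.51: about fifty cents
print(f"{1 / present_value(1000):.2e}")  # 6.87e+12: nearly seven trillion
```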

Leverage points

When trying to confront the Anthropocene, we are everywhere faced with the difficulty of wisely intervening in complex systems. Here another idea from system dynamics becomes important: “leverage points”, which are places in a system where a small change can make a big difference.

Leverage points were brought to the fore by one of the most prominent practitioners of system dynamics, Donella Meadows [Me3]. Meadows learned a lot from Forrester at MIT in the early 1970s, and she was deeply concerned with environmentalism and sustainability. In 1972 she helped write the famous study The Limits to Growth [Me1]. The huge controversy surrounding this should make clear that any model is no more accurate than its assumptions. It also shows that system dynamics is less helpful as a method of long-term prediction than as a focal point for community discussion and strategizing.

In the early 1990s, while attending a meeting on international trade, Meadows compiled a typology of leverage points [Me2]. One of her key observations was that less effective interventions tend to be quantitative—essentially, turning knobs—while more effective ones involve restructuring the system, or changing its entire goal.

In order of increasing effectiveness, this is her original list of nine kinds of leverage points (which she later expanded to twelve):

9) Constants, parameters, numbers. These are numerical settings—rates, standards, thresholds, quotas, etc. They absorb enormous attention but rarely change a system’s fundamental behavior.

8) Negative feedback loops. These are self-correcting mechanisms that pull a stock back toward a goal whenever it strays.

7) Positive feedback loops. These are self-reinforcing mechanisms where more produces more. Reducing the gain on a runaway positive loop is typically a more powerful intervention than strengthening whatever negative loop is trying to contain it.

6) Material flows. These are the physical plumbing of the system. Once built, this plumbing is expensive and slow to change, so the leverage is concentrated in the original design; afterward one mainly works around its bottlenecks.

5) Information flows. Who sees what, and when. Delivering the right signal to the right actor at the right moment is often cheap relative to rebuilding physical structure, and missing information is one of the commonest causes of malfunction.

4) The rules of the system (incentives, punishments, constraints). These are the agreements that fix the system’s scope, degrees of freedom, and what counts as a legitimate move. They sit above parameters and information because they determine which parameters exist and which channels of information matter.

3) The distribution of power over the rules of the system. This refers to who gets to write, change, interpret, and enforce the rules. Control over rule-making is more consequential than any particular rule, because it governs how the entire rule set can evolve.

2) The goals of the system. These are what the whole system is actually optimizing for. A shift in goal cascades downward: stocks, flows, feedbacks, rules, and even the distribution of power reorganize to serve it.

1) The mindset or paradigm. This is the deep, usually unstated view of how reality works from which goals, power structures, rules, and culture all descend. Changing this is the most radical intervention available, and also the one most fiercely resisted at the collective level.

Some, but by no means all, of these leverage points can be neatly framed in the language of system dynamics. This is easiest for items 5-9. Parameters and the strengths of positive and negative feedback loops can be read off a stock and flow diagram, and positive and negative feedback loops can also be spotted in a causal loop diagram. What Meadows calls “material flows” are simply what we call “flows” in a system structure diagram, while her “information flows” are called “links”. On the other hand, items 1-4—paradigms, goals, distributions of power and rules—are not visible in any of the diagrammatic models used in system dynamics. They are also more difficult to define precisely.

Meadows described her list as hastily drawn up, based on personal experience, and subject to revision [Me2]. Given this, we might hope for it to be merely the seed for an extensive theory of leverage points, rigorously formulated and experimentally tested. Unfortunately this is not yet quite the case. While her ideas have been further developed [A,Mu,MuJ1,MuJ2], there is still much to be done to understand leverage points.

There is by now a useful quantitative theory of items 7-9 on Meadows’ list: that is, the effects of parameters and feedback loops. There are methods to find feedback loops and predict the response of a system to changes in the strength of its feedback loops [G,Ka], determine which nodes in a network have most control over its overall behavior [LSB], and infer parameters from observed data [ROO].
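
To give a flavor of these methods, here is a minimal sketch in Python. Linearize a model at an equilibrium, and the eigenvalues of its Jacobian tell you whether the dominant feedback structure is self-correcting (negative real parts), runaway (positive real parts), or oscillatory (nonzero imaginary parts). The two-stock system below is invented for illustration:

```python
# Eigenvalue analysis of a linearized two-stock system dx/dt = J x.
import numpy as np

J = np.array([[-0.1,  0.4],   # Jacobian of the model at an equilibrium
              [-0.3, -0.1]])  # (illustrative numbers)

print(np.linalg.eigvals(J))   # approx -0.1 +/- 0.35j: a damped oscillation

# The off-diagonal entries couple the two stocks into a feedback loop whose
# polarity is the sign of their product: 0.4 * (-0.3) < 0, a negative
# (self-correcting) loop, consistent with the decaying oscillation above.
```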

Less is known about the more impactful items 5 and 6: that is, the response of a system to changes in its structure, such as adding or removing a feedback loop. Important work has been done, from Mason’s gain formula [Mas], to results putting fundamental limits on what additional feedback loops can achieve [SBG], to work on “food web rewiring” of ecosystems in a changing world [Bar,Ma]. But a general theory of structural changes in a network that can dramatically transform its behavior in a chosen way seems to be in its infancy. New research on the mathematics of building networks from smaller parts [Bae,LPMO] and the emergent feedback loops that result [BC] may be helpful here.

The most impactful leverage points of all, items 1-4—namely mindset, goals, distribution of power and rules—are also the hardest to formalize and study systematically. Nonetheless, these were an explicit focus of the 2019 Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services Global Assessment and its follow-on synthesis [Ch], which list eight leverage points for saving biodiversity. The focus was on high-impact forms of social transformation, such as change in mindset. For example, one was “visions of a good life”: visions that downplay GDP growth and focus on trust in neighbors, access to care, opportunities for creative expression, and the like.

Leverage exploits tipping points: critical points beyond which a significant and often unstoppable change takes place. There is already extensive work on how our interventions in the biosphere may trigger unwanted tipping points, and how to spot these before they happen, for example through the slowing of the return to equilibrium after perturbations [Sc]. We have learned much about tipping points through observations of the natural world. But now researchers are starting to apply these lessons to “positive tipping points”: ways in which social and biological systems can fall into better states [Ot,Ta]. Farmer and others have called for more research on these [Fa], and it will be important to integrate them into the theory of system dynamics.

Conclusions

System dynamics seeks to be a general framework for thinking about both social and biological systems. It is still in the process of being developed: we have much more to learn about it. But it is already a useful tool for taking lessons from nature and applying them to the world we now inhabit. It is not so much a formalism for making long-range predictions about what will happen, as a way to find what can happen, and seek leverage points.

References

[Bae] Baez, J.C. (2025). Double categories of open systems: the cospan approach. To appear in Applied Categorical Structures. Available at https://arxiv.org/abs/2509.22584

[BC] Baez, J.C. & Chaudhuri, A. (2026). Motifs and emergent feedback in labeled graphs. To appear in Compositionality. Available at https://arxiv.org/abs/2506.23375

[BPM] Bar-On, Y.M., Phillips, R. & Milo, R. (2018). The biomass distribution on Earth. PNAS 115(25), 6506–6511. https://doi.org/10.1073/pnas.1711842115

[Bar] Bartley, T.J., et al. (2019). Food web rewiring in a changing world. Nature Ecology & Evolution 3(3), 345-354. https://doi.org/10.1038/s41559-018-0772-3

[Ch] Chan, K.M.A., et al. (2020). Levers and leverage points for pathways to sustainability. People and Nature 2(3), 693–717. https://doi.org/10.1002/pan3.10124.

[Co] Cooper, A.H., et al. (2018). Humans are the most significant global geomorphological driving force of the 21st century. The Anthropocene Review, 1–8. http://doi.org/10.1177/2053019618800234

[Dal] Daly, H.E. (1996). Beyond Growth: The Economics of Sustainable Development. Boston: Beacon Press.

[DF] Daly, H.E. & Farley, J. (2011). Ecological Economics: Principles and Applications. Washington, DC: Island Press.

[Das] Dasgupta, P. (2008). Discounting climate change. Journal of Risk and Uncertainty 37, 141–169. https://doi.org/10.1007/s11166-008-9049-6

[DKPL] Dubgaard A., Kallesøe, M., Petersen, M. L. & Ladenburg, J. (2002). Cost-benefit analysis of the Skjern River restoration project, Social Science Series 10, Department of Economics and Natural Resources, Royal Veterinary and Agricultural University, Copenhagen.

[dV] De Vos, J.M., Joppa, L. N., Gittleman, J. L., Stephens, P. R. & Pimm, S. L. (2015). Estimating the normal background rate of species extinction. Conservation Biology 29(2), 452-462. https://doi.org/10.1111/cobi.12380

[EG] Etnier, C. & Guterstam, B. (1996). Ecological Engineering for Wastewater Treatment. Boca Raton: CRC Press.

[Fa] Farmer, J.D., et al. (2019). Sensitive intervention points in the post-carbon transition. Science 364(6436), 132-134. https://doi.org/10.1126/science.aaw7287

[Fo1] Forrester, J.W. (1961). Industrial Dynamics. Cambridge, MA: MIT Press.

[Fo2] Forrester, J.W. (1969). Urban Dynamics. Cambridge, MA: MIT Press

[Fo3] Forrester, J.W. (1971). World Dynamics. Cambridge, MA: Wright-Allen Press.

[FoS] Friedlingstein, P., O’Sullivan M. et al. (2025). Global carbon budget 2025. Earth System Science Data, preprint. Available at https://essd.copernicus.org/preprints/essd-2025-659/

[G] Goncalves, P. (2006). Eigenvalue and eigenvector analysis of dynamic systems. Proceedings of the 2006 International System Dynamics Conference. Albany, NY: System Dynamics Society. https://proceedings.systemdynamics.org/2006/proceed/papers/GONCA394.pdf.

[GK] Gruber, N. & Galloway, J. N. (2008). An Earth-system perspective of the global nitrogen cycle. Nature 451(7176), 293-296. https://doi.org/10.1038/nature06592

[He] Heal, G. (2000). Valuing the Future. New York: Columbia University Press.

[HB] Hes, D. & Bayudi, R. (2005). Council House 2 (CH2), Melbourne CBD: a green building showcase in the making. Proceedings of Conference on Sustainable Building South East Asia, pp. 231-241.

[Ho] Hovmand, P.S. (2014). Community Based System Dynamics. New York: Springer. ISBN 978-1-4614-8762-3. https://doi.org/10.1007/978-1-4614-8763-0

[J] Jackson, T (2017). Prosperity Without Growth: Foundations for the Economy of Tomorrow. London: Routledge. https://doi.org/10.4324/9781315677453

[Ka] Kampmann, C.E. (2012). Feedback loop gains and system behavior. System Dynamics Review 28(4), 370–95. https://doi.org/10.1002/sdr.1483

[KKS] Kania, J., Kramer, M., & Senge, P. (2018). The Water of Systems Change. FSG. Available at https://www.fsg.org/resource/water_of_systems_change

[KOM] King, H., Ocko S., & Mahadevan L. (2015). Termite mounds harness diurnal temperature oscillations for ventilation, PNAS 112(37), 11589–11593. https://doi.org/10.1073/pnas.1423242112 Also available at https://arxiv.org/abs/1703.08067

[Ki] Kitano, H. (2002). Systems biology: a brief overview. Science 295(5560), 1662–1664.

[Ko] Kohn, K.W. (1999). Molecular interaction map of the mammalian cell cycle control and DNA repair systems. Molecular Biology of the Cell 10(8), 2703–2734. https://doi.org/10.1091/mbc.10.8.2703

[KEGH] Krausmann, F., Erb, K.-H., et al. (2013). Global human appropriation of net primary production doubled in the 20th century. PNAS 110(25), 10324–10329. https://doi.org/10.1073/pnas.1211349110

[lN] Le Novère, N. et al. (2009). The Systems Biology Graphical Notation. Nature Biotechnology 27(8), 735–741. https://doi.org/10.1038/nbt.1558

[LPMO] Li, X., Patterson, E., Mabry, P. L., & Osgood, N. D. (2025). Compositional system dynamics: the higher mathematics underlying system dynamics diagrams and practice. Available as https://arxiv.org/abs/2509.18475

[LSB] Liu, Y., Slotine, J.-J. & Barabási, A.-L. (2011). Controllability of complex networks. Nature 473(7346), 167–173. https://doi.org/10.1038/nature10011

[Ma] Ma, A., et al. (2025). Network rewiring conserves the topology of drought-impaired food webs. Communications Biology 8(1), 1641. https://doi.org/10.1038/s42003-025-09035-2

[Mas] Mason, S.J. (1953). Feedback theory: some properties of signal flow graphs. Proceedings of the IRE 41(9), 1144–56.

[MMO] Martinez-Alier, J., Munda, G. & O’Neill, J. (1998). Weak comparability of values as a foundation for ecological economics. Ecological Economics 26, 277–286. https://doi.org/10.1016/S0921-8009(97)00120-1

[Me1] Meadows, D. H., et al. (1972). The Limits to Growth. Falls Church, VA: Potomac Associates.

[Me2] Meadows, D. H. (1999). Leverage points: places to intervene in a system. Hartland, VT: The Sustainability Institute. Available at https://donellameadows.org/wp-content/userfiles/Leverage_Points.pdf

[Me3] Meadows, D. H. (2008). Thinking in Systems: A Primer. Edited by Diana Wright. White River Junction, VT: Chelsea Green Publishing.

[MiJ] Mitsch, W.J. & Jørgensen, S.E. (2004). Ecological Engineering and Ecosystem Restoration. New York: Wiley.

[Mu] Murphy, R. J. A. (2022). Finding (a theory of) leverage for systemic change: A systemic design research agenda. Contexts: The Journal of Systemic Design 1. https://systemic-design.org/contexts/vol1/v1004/

[MuJ1] Murphy, R. J. A., & Jones, P. H. (2020). Leverage analysis: A method for locating points of influence in systemic design decisions. FormAkademisk 13(2), 1–25.

[MuJ2] Murphy, R. J. A., & Jones, P. (2021). Towards systemic theories of change: High-leverage strategies for managing wicked problems. Design Management Journal 16(1), 49–65.

[Od] Odum, H.T. (1983). Systems Ecology: An Introduction. New York: Wiley.

[Ot] Otto, I. M., et al. (2020). Social tipping dynamics for stabilizing Earth’s climate by 2050. PNAS 117(5), 2354-2365. https://doi.org/10.1073/pnas.1900577117

[PANL] Pedersen, M. L., Andersen, J. M., Nielsen, K., & Linnemann, M. (2007). Restoration of Skjern River and its valley: project description and general ecological changes in the project area. Ecological Engineering 30(2), 131-144.

[P] Peduzzi, P. (2014). Sand, rarer than one thinks. Environmental Development 11, 208–218. https://doi.org/10.1016/j.envdev.2014.04.001

[ROO] Rahmandad, H., Oliva, R. & Osgood, N.O., eds. (2015). Analytical Methods for Dynamic Modelers. Cambridge, Massachusetts: MIT Press.

[RS] Rockström, J., Steffen, W., et al. (2009). Planetary boundaries: exploring the safe operating space for humanity. Ecology and Society, 14(2). Available at http://www.ecologyandsociety.org/vol14/iss2/art32/

[Sc] Scheffer, M. (2009). Critical Transitions in Nature and Society. Princeton, NJ: Princeton University Press.

[Se] Seidel, K. (1976). Macrophytes and water purification. Biological Control of Water Pollution. J. Tourbier & R.W. Pierson Jr. (eds.), Philadelphia: University of Pennsylvania Press, pp. 109–121.

[SBG] Seron, M.M., Braslavsky, J.H. & Goodwin, G.C. (1997). Fundamental Limitations in Filtering and Control. London: Springer.

[St] Sterman, J.D. (2000). Business Dynamics: Systems Thinking and Modeling for a Complex World. Boston: Irwin McGraw-Hill.

[Ta] Tàbara, J. D., et al. (2018). Positive tipping points in a rapidly warming world. Current Opinion in Environmental Sustainability 31, 120-129. https://doi.org/10.1016/j.cosust.2018.01.012

[Th] Thomas, P.D. (2017). The Gene Ontology and the meaning of biological function. Methods in Molecular Biology 1446, 15–24.

[TS] Turner, J.S. & Soar, R.C. (2008). Beyond biomimicry: What termites can tell us about realizing the living building, Proceedings of the First International Conference on Industrialized, Intelligent Construction (I3CON), Loughborough University, 14–16 May 2008, p. 18.

[Ve] Vennix, J.A.M. (1996). Group Model Building: Facilitating Team Learning Using System Dynamics. Chichester: Wiley.

[Vy] Vymazal, J. (2011). Constructed wetlands for wastewater treatment: five decades of experience. Environmental Science & Technology 45(1), 61–69. https://doi.org/10.1021/es101403q

[XZ] Xiao, J., Zou, S., Poon, C.S., Sham, M.L., Li, Z. & Shah, S.P. (2025). We use 30 billion tonnes of concrete each year — here’s how to make it sustainable. Nature 638, 888–890. https://doi.org/10.1038/d41586-025-00568-4


Feldspars

11 April, 2026

 

Returning from a trip to New Mexico to explore some Puebloan ruins, I picked up this beautiful chunk of labradorite in the town of Quartzsite. This mineral creates an eerie blue shimmer in the sunlight: a phenomenon called ‘labradorescence’.

Reading up on it, I discovered it’s a form of feldspar. I didn’t know at first that it would lead me into studying the classification of crystals, and a nice application of the cohomology of groups.

60% of the Earth’s crust is feldspar. I knew so little about this stuff! I had to learn more. It turns out there are 3 fundamental kinds:

• orthoclase is potassium aluminosilicate
• albite is sodium aluminosilicate
• anorthite is calcium aluminosilicate

Then there are lots of feldspars that contain different amounts of potassium, sodium and calcium. We get a triangle of feldspars with orthoclase, albite and anorthite at the corners. You can find labradorite on this triangle:

But not all points in this triangle are possible kinds of feldspar! There’s a big region called the ‘miscibility gap’, where as you cool the molten mix it separates out. I don’t understand why.

And there are also subtler issues. When you cool down the feldspar called labradorite, it separates out a little, forming tiny layers of two different kinds of stuff. When the thickness of these layers is comparable to the wavelength of visible light, you get a weird optical effect: labradorescence! You really need a movie to see the strange shimmer as you turn a piece of labradorite in the sunlight.

In fact there are 3 kinds of feldspar that separate out slightly as they cool and harden, forming thin alternating layers of two substances:

• The ‘peristerite gap’ produces layers in feldspars with 2-16% anorthite and the rest albite: these layers create the beauty of moonstone!

• The ‘Bøggild gap’ produces layers in feldspars with 47-58% anorthite and the rest albite: these are labradorites!

• The ‘Huttenlocher gap’ produces layers in feldspars with 67-90% anorthite and the rest albite: these are called ‘bytownites’. For some reason these layers do not seem to produce an interesting visual effect. Maybe their thickness is too far from the wavelength of visible light.

All these gaps are ‘miscibility gaps’. That is, feldspars with these concentrations of anorthite and albite are unstable: they want to separate out into parts with more anorthite and parts with more albite. That’s why they form layers.

The physics and math of all this stuff is fascinating. Crystals try to do whatever it takes to minimize free energy, which is energy minus entropy times temperature. That’s why many feldspars have different high- and low-temperature forms. But sometimes when molten rock cools quickly, it doesn’t have time to reach its free energy minimizing state.

For feldspar all of these issues are complex, because feldspar crystals are complicated structures, as drawn here by Anna Pakhomova et al:

All Si and Al atoms are bonded to four oxygen atoms to form tetrahedra. SiO4 and AlO4 tetrahedra (given in grey) form a three-dimensional framework by sharing common vertices. Al atoms occupy half of the tetrahedral sites in anorthite (CaSi2Al2O8) and a quarter of the sites in albite (NaAlSi3O8) and microcline (KAlSi3O8). Large cations (Ca2+, Na+, K+) located in the framework voids are represented as grey spheres. Oxygen atoms are given in red. Black lines outline the unit cell of the aristotype structure.

Aluminum and silicon have to be distributed among the corners of the tetrahedra here, and there are various ways to do this. The distribution is determined by the relative amounts of potassium, sodium and calcium: the large spheres in the picture. The distribution of aluminum and silicon in turn controls the symmetry of the crystal, which can be either ‘monoclinic’ or the less symmetrical ‘triclinic’.

A triclinic crystal is built out of units that are the least symmetrical parallelepipeds possible. So, all the angles α, β, γ are different, and none are right angles:

A monoclinic crystal is more symmetrical. Two of the angles are right angles, but the third is not:

But the pictures here don’t fully capture the symmetry group of an actual crystal—because there’s more to a crystal than just the shape of a parallelepiped! There may be the same atoms at all corners of the parallelepiped, or not, and there may also be other atoms not on the corners.

Let’s get into a bit of the math.

The symmetry group G of a crystal, called its ‘space group’, fits into a short exact sequence:

0 → T → G → P → 1

where T ≅ ℤ³ is the group of translational symmetries and P is the group of symmetries that fix a point, called the ‘point group’. This sequence may or may not split! It splits iff G is a semidirect product of P and T.

For a triclinic crystal, there are only two possible space groups G, and both are semidirect products. P is either trivial or ℤ/2, acting by negation.

For a monoclinic crystal, there are 3 choices of the point group P as a subgroup of O(3):

• P = ℤ/2 (a single 180° rotation)
• P = ℤ/2 (a single reflection)
• P = ℤ/2 × ℤ/2 (generated by a 2-fold rotation and inversion (x, y, z) ↦ −(x, y, z): their product is a reflection)

For each choice of P there are 2 fundamentally different choices of lattice T ≅ ℤ³ it can act on. One is made up of copies of the parallelepiped I showed you. The other is twice as dense; then we call the lattice ‘base-centered monoclinic’:

So, we get 3 × 2 = 6 space groups G that are semidirect products.

But there are 7 other non-split extensions! These other 7 give nontrivial elements of the cohomology group H²(P, T). It’s not obvious that there are just 7 options. Thus, the hardest part of the classification of all 13 monoclinic space groups is essentially the computation of H²(P, ℤ³) for all 6 choices of groups P and their actions on ℤ³.
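
If you’re curious what such a computation looks like in practice, here’s a small sketch using SymPy. For P = ℤ/2 = {1, g}, group cohomology gives H²(P, T) ≅ T^g/(1+g)T: the sublattice of invariants modulo the image of the norm map 1 + g. The reflection action below is just one illustrative choice, and inequivalent cohomology classes can still give equivalent space groups, so this shows the flavor of the computation rather than reproducing the count of 13:

```python
# For P = Z/2 = {1, g} acting on a lattice T, H^2(P, T) = T^g / (1+g)T.
# Example: g is the reflection negating the y coordinate of Z^3.
from sympy import Matrix, eye, ZZ
from sympy.matrices.normalforms import smith_normal_form

g = Matrix([[1, 0, 0], [0, -1, 0], [0, 0, 1]])

B = Matrix.hstack(*(g - eye(3)).nullspace())  # columns: a basis of T^g
N = g + eye(3)                                # the norm map 1 + g

# Each column of (1+g)T lies in T^g; write these columns in the basis B.
X = (B.T * B).inv() * B.T * N                 # exact rational arithmetic

print(smith_normal_form(X, domain=ZZ))
# Matrix([[2, 0, 0], [0, 2, 0]]): invariant factors 2 and 2,
# so H^2(Z/2, Z^3) = Z/2 x Z/2 for this particular action.
```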

I knew that cohomology rocks. But it turns out cohomology helps classify rocks!

Now, which of these various groups are symmetry groups of feldspars?

Apparently all the feldspars in the triangle have just two different symmetry groups:

• For the monoclinic feldspars (including sanidine, orthoclase, and high-temperature albite), the crystal has a 2-fold rotational symmetry, a mirror plane, and inversion symmetry

(𝑥,𝑦,𝑧) ↦ -(𝑥,𝑦,𝑧).

The point group is the Klein four-group ℤ/2 × ℤ/2. The lattice is base-centered monoclinic, so there’s an extra translational symmetry shifting by half a cell diagonally across one face of the parallelepiped.

• For the triclinic feldspars (including microcline, low-temperature albite, and anorthite), the only symmetry beyond translation is inversion. So the point group is just ℤ/2. And there are no extra generators of translation symmetry beyond the three edges of the parallelepiped.

Alas, each of these space groups G is a semidirect product of its point group P and its translation group T ≅ ℤ³. So, no interesting cohomology classes show up!

Nontrivial cohomology classes show up only in crystals where you can’t cleanly separate the translations from the symmetries that fix a point of the crystal. This happens when your crystal has ‘screw axes’ or ‘glide planes’. A screw axis is an axis where you’ve got a symmetry of translating along that axis, but only if you also rotate around it:

A glide plane is a plane where you’ve got a symmetry of translating along that plane, but only if you also reflect across it:

But wait! There’s a rarer kind of feldspar made with barium. It’s called celsian, after Anders Celsius, the guy who invented the temperature scale. Chemically it’s barium aluminosilicate. And its crystal structure has both screw axes and glide planes! So its space group G is not a semidirect product! It’s an extension of ℤ³ by the point group P = ℤ/2 × ℤ/2 that gives a nonzero element of H²(P, ℤ³). See the end of this post for some details.

All this is lots of fun to me: you start with a pretty rock, and before long you’re doing group cohomology. But the classification of symmetry groups is just the start. For mathematical physicists, one fun thing about feldspars is their phase transitions, especially the symmetry-breaking phase transition from the more symmetrical monoclinic feldspars to the less symmetrical triclinic ones! There’s a whole body of work—by Salje, Carpenter, and others—applying Landau’s theory of symmetry-breaking phase transitions to map out the space of different possible feldspar crystals! Here’s one way to get started:

• Ekhard Salje, Application of Landau theory for the analysis of phase transitions in minerals, Physics Reports 215 (1992), 49–99.

Even if you don’t particularly care about feldspars, there are a lot of good general principles of physics to learn here!

Details

Let me sketch out why barium aluminosilicate, or celsian, has a space group G that’s described by a non-split short exact sequence:

0 → T → G → P → 1

Its point group is P = {e, r, m, i} ≅ ℤ/2 × ℤ/2, where we can take r to be a 180° rotation about the y axis and m to be a reflection that negates the y coordinate, so that i = rm is inversion. In coordinates:

r acts as (x, y, z) ↦ (−x, y, −z)
m acts as (x, y, z) ↦ (x, −y, z)
i acts as (x, y, z) ↦ (−x, −y, −z)

We can take the translation lattice T ≅ ℤ³ to be the lattice generated by

f₁ = (1,0,0), f₂ = (0,1,0), f₃ = (½,½,½)

Note that (0,0,½) is not in T.

To compute the 2-cocycle we need a set-theoretic section s: P → G. We choose

s(e) = identity
s(m) = a glide reflection: (x, y, z) → (x, −y, z + ½)
s(i) = inversion: (x, y, z) → (−x, −y, −z)
s(r) = s(i)·s(m): (x, y, z) → (−x, y, −z + ½)

As usual, the 2-cocycle c: P² → T is defined by

c(g,h) = s(g)·s(h)·s(gh)⁻¹

The interesting value is c(m, m): the glide composed with itself gives (x, y, z) → (x, −y, z+½) → (x, y, z+1), so s(m)² = translation by (0, 0, 1), while s(m²) = s(e) is the identity. Thus c(m, m) = (0, 0, 1). Note this does lie in T, since (0, 0, 1) = −f₁ − f₂ + 2f₃. The other values are trivial: c(i, i) = 0, c(r, r) = 0.

Now, is this cocycle nontrivial in H²(P, T)? It would be trivial if we could find a different section that makes the cocycle zero—that is, find a function b: P → T such that replacing s(g) with s'(g) = s(g) + b(g) makes

c'(g,h) = s'(g)·s'(h)·s'(gh)⁻¹

be the identity for all g,h. I will spare you the calculation proving this is impossible. The idea is simply this: the reflection m squares to the identity in the point group, but no matter how we choose b, s'(m) is a glide reflection, so it squares to a nontrivial translation. On the other hand, s'(m²) is trivial since m² is, so

c'(m,m) = s'(m)·s'(m)·s'(m²)⁻¹

is nontrivial.
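
If you’d rather have a computer do the case-checking, here is a small brute-force sketch in Python, representing symmetries as 4×4 affine matrices. It confirms that s(m)² is translation by (0, 0, 1), and that for every lattice translation t = a f₁ + b f₂ + c f₃ in a small search window, the modified section t·s(m) still squares to a nontrivial translation:

```python
# Brute-force check that the celsian cocycle is nontrivial, representing
# affine symmetries as 4x4 matrices acting on column vectors (x, y, z, 1).
import numpy as np
from itertools import product

def affine(linear, shift):
    M = np.eye(4)
    M[:3, :3] = linear
    M[:3, 3] = shift
    return M

m_linear = np.diag([1.0, -1.0, 1.0])  # the reflection negating y
s_m = affine(m_linear, [0, 0, 0.5])   # the glide reflection s(m)
print((s_m @ s_m)[:3, 3])             # [0. 0. 1.]: s(m)^2 = shift by (0,0,1)

# The lattice T is generated by f1, f2, f3 as above.
f1, f2, f3 = np.array([1., 0, 0]), np.array([0, 1., 0]), np.array([.5, .5, .5])

# Try to kill the cocycle: replace s(m) by (translation t) * s(m) for all
# small t in T, and see whether the result ever squares to the identity.
for a, b, c in product(range(-5, 6), repeat=3):
    t = a * f1 + b * f2 + c * f3
    s_prime = affine(m_linear, t + [0, 0, 0.5])
    if np.allclose((s_prime @ s_prime)[:3, 3], 0):
        print("cocycle killed by", (a, b, c))
        break
else:
    print("no such t: s'(m) always squares to a nontrivial translation")
```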


Vector Meson Dominance

29 March, 2026

I’m only now learning about ‘vector meson dominance’—a big idea put forth by Sakurai and others around 1960.

Here’s a family of 9 mesons called the ‘vector nonet’. Each one is made of an up, down or strange quark and an antiup, antidown or antistrange antiquark. That’s 3 × 3 = 9 choices.



In this chart, S is strangeness (the number of strange quarks minus the number of antistrange antiquarks in the particle) and Q is electric charge. I’ll focus on the neutral rho meson, the ρ⁰, which has no strangeness and no charge.

But why are these called ‘vector’ mesons? It’s because the quark and antiquark have spin 1/2, and in this kind of meson their spins are lined up, so together they have spin 1. A spin-1/2 particle is described by a spinor, which is a bit weird, but a spin-1 particle is described by something more familiar: a vector!

The most familiar spin-1 particle is a photon. And in fact, the photons around us are slightly contaminated by neutral rho mesons! That in fact is the point of vector meson dominance. But more on that later.

First, if you’ve read a bit about mesons, you may wonder why your friends the pion and kaon weren’t on that last chart. Don’t worry: they’re on this chart! This is the ‘pseudoscalar nonet’.



In these mesons, the spins of the quark and antiquark point in opposite directions, so the overall spin of these mesons is 0. That means they don’t change when you rotate them, like a ‘scalar’. But these mesons do change sign when you reflect them, because then you’re switching the quark and antiquark, and those are fermions so you get a minus sign whenever you switch two of them. So these mesons are ‘pseudoscalars’.

If you don’t get that, don’t worry. I’m going to tell the tale of rho mesons and especially the neutral one, the ρ⁰.



A photon will sometimes momentarily split into a quark-antiquark pair. Since the neutral rho meson is the lightest meson with the same charge, spin and other quantum numbers as a photon, this quark-antiquark pair will usually be a neutral rho! This is the basic idea behind ‘vector meson dominance’.

In short, the light you see around you is subtly spiced by a slight mix of neutral rho mesons!

More precisely, real-world photons are a superposition of the ‘bare’ photons we’d have in a world without quarks, and neutral rho mesons.

But you might ask: how do we know this?

When you shoot a low-energy photon at a proton, its wavelength is long, so it sees the proton almost as a point particle.

But a high-energy photon has a short wavelength, so it notices that the proton is made of quarks. And the photon may interact with these as if it were a rho meson—because sometimes it is! This changes how high-energy photons interact with protons, in a noticeable way.

The same thing happens when you slam charged pions at each other. You’d expect them to interact electromagnetically, by exchanging a photon. But if you collide them at high energies you get deviations from purely electromagnetic behavior, since the photon is slightly contaminated by a bit of neutral rho!

In fact this is how the neutral rho was found in the first place. In 1959, William Frazer and Jose Fulco used results of pion collisions to correctly predict the existence and mass of the neutral rho!



They used a lot of cool math, too—complex analysis:

• William R. Frazer and Jose R. Fulco, Effect of a pion-pion scattering resonance on nucleon structure, Phys. Rev. Lett. 2 (1959), 365–368.

Then in 1960, Sakurai argued that the three rho mesons ρ⁺,ρ⁰,ρ⁻ form an SU(2) gauge field!

The idea is this: since they’re vector mesons each one is described by a vector field, or more precisely a 1-form. But these rho mesons are made only of up and down quarks and antiquarks—not strange ones. And isospin SU(2) is a symmetry group that mixes up and down quarks. So we expect SU(2) to act on the three rho mesons, and it does: it acts on them just like it does on its Lie algebra 𝔰𝔲(2), which is 3-dimensional.

So: we can combine these 3 vector mesons into an 𝔰𝔲(2)-valued 1-form… which describes an 𝔰𝔲(2) connection! If you don’t know what I mean, just take my word for it: this is how gauge theory works.
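Concretely, taking the Pauli matrices σ^a as a basis (up to factors of i, depending on your conventions for 𝔰𝔲(2)):

\rho_\mu = \sum_{a=1}^{3} \rho^a_\mu \, \frac{\sigma^a}{2}, \qquad \rho^\pm_\mu = \frac{1}{\sqrt{2}} \left( \rho^1_\mu \mp i \rho^2_\mu \right), \qquad \rho^0_\mu = \rho^3_\mu

so the three real vector fields bundle into a single matrix-valued 1-form \rho = \rho_\mu \, dx^\mu.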

Now, Sakurai’s paper showed up before quantum chromodynamics appeared (1973), or even quarks (1964). But Yang–Mills theory had been known since 1954, so it was natural for him to cook up a Yang–Mills theory with rho mesons as the gauge bosons.

Just one big problem: they’re not massless, as Yang–Mills theory says they should be.

This didn’t stop Sakurai. He tried to treat the rho mesons as gauge bosons in a Yang–Mills theory of the nuclear force, and give them a mass ‘by hand’.

You can tell he was very excited, because he starts by mocking existing work on particle physics, with the help of a long quote by Feynman:

• J.J. Sakurai, Theory of strong interactions, Ann. Phys. 11 (1960), 1–48.

Sakurai’s theory had successes but also problems.

The Higgs mechanism for giving gauge bosons mass was discovered around 1964. People tried it for the rho mesons, but it was never clear which particle should play the role of the Higgs boson!

Only in 1985, after quantum chromodynamics had solved the fundamental problem of nuclear forces, did people come up with a nice approximate theory in which the rho mesons were gauge bosons for the strong force, with a Higgs serving to give them mass.

• M. Bando, T. Kugo, S. Uehara, K. Yamawaki and T. Yanagida, Is the ρ meson a dynamical gauge boson of hidden local symmetry?, Phys. Rev. Lett. 54 (1985), 1215–1218.

Later a subset of these authors developed a theory where all 9 vector mesons serve as gauge bosons for a U(3) gauge theory:

• Masako Bando, Taichiro Kugo and Koichi Yamawaki, Composite gauge bosons and “low energy theorems” of hidden local symmetries, Prog. Theor. Phys. 73 (1985), 1541–1559.

In my youthful attempts to learn particle physics I skipped over most of the long struggle to understand mesons, and went straight for the Standard Model. And that’s what many textbooks do, too. But this misses a lot of the fun, and a lot of physics that’s important even now. I just learned this stuff about the rho mesons today, and I find it very exciting!


Geometry and the Exceptional Jordan Algebra

27 March, 2026

I’m giving a talk online tomorrow at the 2026 Spring Southeastern Sectional Meeting of the American Mathematical Society, in the Special Session on Non-Associative Rings and Algebras. The organizers are Layla Sorkatti and Kenneth Price. I doubt the talk will be recorded, but here are my slides:

Projective geometry and the exceptional Jordan algebra.

Abstract. Dubois-Violette and Todorov noticed that the gauge group of the Standard Model of particle physics is the intersection of two maximal subgroups of \text{F}_4, which is the automorphism group of the exceptional Jordan algebra \mathfrak{h}_3(\mathbb{O}). Here we conjecture that these can be taken to be any subgroups preserving copies of \mathfrak{h}_2(\mathbb{O}) and \mathfrak{h}_3(\mathbb{C}) that intersect in a copy of \mathfrak{h}_2(\mathbb{C}). Given this, we show that the Standard Model gauge group consists of all isometries of the octonionic projective plane that preserve an octonionic projective line and a complex projective plane intersecting in a complex projective line. This is joint work with Paul Schwahn.

This is an introductory talk for mathematicians. Physicists may prefer the two talks here. Those go much further in some ways, but they don’t cover the new ideas that Paul Schwahn and I are in the midst of working on.


Standard Model 7: Pions

23 March, 2026

This time I’m talking about pions:

Pions were predicted in the 1930s and discovered in the 1940s—part of the first wave of the ‘particle zoo’—but I’m explaining them as a way to work toward the math and physics concepts needed for the Standard Model.

As soon as the neutron was discovered in 1932, Heisenberg invented the idea of ‘isospin’: the proton and neutron are two different isospin states of a single particle, the ‘nucleon’. This is why I spent 3 videos explaining the math of spin-1/2 particles: in order to talk about isospin.

Three years later, Yukawa came up with the idea that the force holding nuclei together is carried by a new particle. Even better, he predicted the mass of this yet-unseen particle! It’s a fun bit of physics but also a step toward the concept of gauge bosons.

Later, the pion was discovered: in fact, three kinds of pions! I begin to explain how these three pions form a basis of \mathfrak{sl}(2,\mathbb{C}) just as the proton and neutron form a basis of \mathbb{C}^2.
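In one standard convention, the same statement in matrix form: the three pion fields assemble into a traceless 2×2 matrix

\pi^a \sigma^a = \begin{pmatrix} \pi^0 & \sqrt{2}\, \pi^+ \\ \sqrt{2}\, \pi^- & -\pi^0 \end{pmatrix}

on which isospin acts by conjugation: the adjoint action.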

These theories are outdated, but their math gets reused in the Standard Model. We’ll eventually get around to that!


The Agent that Doesn’t Know Itself

20 March, 2026

guest post by William Waites

The previous post introduced the plumbing calculus: typed channels, structural morphisms, two forms of composition, and agents as stateful morphisms with a protocol for managing their state. The examples were simple. This post is about what happens when the algebra handles something genuinely complex.

To get there, we need to understand a little about how large language models work. These models are sequence-to-sequence transducers: a sequence of tokens comes in, a sequence comes out. Text is tokenised and the model operates on the tokens.

From the outside, the morphism is simple: !string → !string. A message goes in, a message comes out. But the client libraries (the code that calls the LLM provider) maintain the conversation history and send it back with every call. The actual morphism is (!string, ![Message]) → (!string, ![Message]): the input message and the accumulated history go in, the response and the updated history come out. The history feeds back. This is a trace in the sense of traced monoidal categories: the feedback channel is hidden from the user, who sees only !string → !string.
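Here is a minimal Python sketch of this hidden feedback loop. The function call_llm is a dummy stand-in for a provider API call, not any real library:

def call_llm(messages):
    # Dummy stand-in for the provider API: takes the whole message
    # history, returns one response string.
    return f"(reply after reading {len(messages)} messages)"

def step(user_message, history):
    # The actual morphism: (!string, ![Message]) -> (!string, ![Message]).
    history = history + [{"role": "user", "content": user_message}]
    response = call_llm(history)
    history = history + [{"role": "assistant", "content": response}]
    return response, history

def make_agent():
    # Trace out the history channel: from outside, just !string -> !string.
    history = []
    def send(user_message):
        nonlocal history
        response, history = step(user_message, history)
        return response
    return send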

Crucially, the model has a limited amount of memory. It is not a memoryless process, but the memory it has is not large: 200,000 tokens for current models, perhaps a million for the state of the art. This sounds like a lot. It is not. An academic paper is roughly 10,000 tokens. A literature review that needs to work with thirty papers has already exceeded the context window of most models, and that is before the model has produced a single word of output.

If you have used any of these agent interfaces, you will have noticed that after talking back and forth for a while, the agent will compact. This is a form of memory management. What is happening is that some supervisory process has noticed the context window filling up, and has intervened to shorten its contents. A naïve approach is to truncate: discard everything before the last N exchanges. A better approach is to feed the entire context to another language model and ask it to summarise, then put the summary back.
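A minimal Python sketch of the two strategies, with a hypothetical summarise standing in for the second model call:

def truncate(history, keep_last=10):
    # Naive compaction: discard everything before the last N messages.
    return history[-keep_last:]

def summarise(messages):
    # Hypothetical second-model call that condenses old messages.
    return {"role": "system",
            "content": "<context-summary>…summary of "
                       + str(len(messages)) + " messages…</context-summary>"}

def compact(history, keep_last=10):
    # Better: summarise the old part, keep the recent part verbatim.
    if len(history) <= keep_last:
        return history
    old, recent = history[:-keep_last], history[-keep_last:]
    return [summarise(old)] + recent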

This is normally done by specialised code outside the agent, invisible to it.

How to manage agent memory well is an active research area. We do not, in general, do it very well. Truncation loses information. Summarisation loses nuance. Pinning helps but the right pinning strategy depends on the task. These are open questions, and to make progress we need to be able to experiment with different schemes and mechanisms: express a memory management strategy, test it, swap it for another, compare. Not by recompiling specialised code or hardcoding behaviour, but by writing it down in a language designed for exactly this kind of composition. Memory management should be a plumbing program: modular, type-checked, swappable.

So we built an implementation of compaction using the plumbing calculus, and the first thing we did was test it. I ran the protocol on a very short cycle: a single message caused a compaction, because the threshold was set to zero for testing. The compressor fired, produced a summary, rebuilt the agent’s context. The logs showed [compressor] 3404 in / 541 out. The protocol worked.

Then I asked the agent: "have you experienced compaction?"

The agent said no. It explained what compaction is, accurately. Said it hadn’t happened yet. It was confident.

I asked: "do you have a context summary in your window?"

Yes, it said, and described the contents accurately.

"How did that context summary get there if you have not yet compacted?"

The agent constructed a plausible, confident, and completely wrong explanation: the summary was "provided to me by the system at the start of this conversation" as a "briefing or recap." When pressed, it doubled down:

"The context-summary is not evidence that compaction has occurred. It’s more like a briefing or recap that the system gives me at the start of a conversation session to provide continuity."

The agent was looking at the direct evidence of its own compaction and confidently explaining why it was not compaction. We will return to why it gets this wrong, and how to fix it. But first: how do we build this?

The compacting homunculus

At a high level, it works like this. An agent is running: input comes in, output goes out. Together with the output, the agent emits a telemetry report. The telemetry includes token counts: with each transaction, the entire train of messages and responses is sent to the LLM provider, and back comes a response together with a count of the tokens that went in and the tokens that came out. Our agent implementation sends this telemetry out of the telemetry port to anybody who is listening.

The construction involves a second agent. This second agent is a homunculus: the little man who sits on your shoulder and watches what your mind is doing. Here is the topology:

Topology of the compacting homunculus. Two boxes: Agent (large, bottom) and Compressor (smaller, top). The Agent has input and output ports for the main data flow (dark blue arrows). Three channels connect the Agent to the Compressor: telemetry flows up from the Agent (the Compressor watches token counts), ctrl_out flows up from the Agent (the Compressor receives acknowledgements), and ctrl_in flows down from the Compressor to the Agent (the Compressor sends commands). The Agent does not know the Compressor exists. It just receives control messages and responds to them.

The homunculus listens to the telemetry and says: the memory is filling up. The token count has crossed a threshold. It is time to compact. And then it acts:

• Send pause to the agent’s control port. Stop accepting input.
• Send get memory. The agent produces the contents of its context window.
• Summarise that memory (using another LLM call).
• Send set memory with the compacted version.
• Send resume. The agent continues processing input.

Each step requires an acknowledgement before the next can proceed. This is a protocol: pause, acknowledge, get memory, here is the memory, set memory, acknowledge, resume, acknowledge.

It is possible to express this directly in the plumbing calculus, but it would be painfully verbose. Instead, we use session types to describe the protocol. This is not pseudocode. There is a compiler and a runtime for this language. Here is the protocol:

protocol Compaction =
  send Pause . recv PauseAck .
  send GetMemory . recv MemoryDump .
  send SetMemory . recv SetMemoryAck .
  send Resume . recv ResumeAck . end

The protocol is eight steps: send, receive, send, receive, and so on. The compiler knows what types each step carries. Now we wire it up:

let compact : (!CtrlResp, !json) -> !CtrlCmd =
  plumb(ctrl_out, telemetry, ctrl_in) {

    (ctrl_out, ctrl_in) <-> Compaction as session

    telemetry
      ; filter(kind = "usage" && input_tokens > 150000)
      ; map(null) ; session@trigger

    session@trigger ; map({pause: true})
      ; session@send(Pause)
    session@done(PauseAck) ; map({get_memory: true})
      ; session@send(GetMemory)
    session@recv(MemoryDump) ; compressor
      ; session@send(SetMemory)
    session@done(SetMemoryAck) ; map({resume: true})
      ; session@send(Resume)
}

The first line binds the protocol to the agent’s control ports:

(ctrl_out, ctrl_in) <-> Compaction as session.

This says: the Compaction protocol runs over the control channel, and we refer to it as session. The telemetry line is the trigger: when token usage crosses a threshold, the protocol begins. Each subsequent line is one step of the protocol, wired to the appropriate control messages.

Here is a direct depiction of the protocol as wired. You can trace it through:

Diagram of the compaction protocol wired between the homunculus and the bot agent. Shows the telemetry stream flowing from the bot to the homunculus, a filter checking token usage against a threshold, and then a sequence of control messages: Pause flows to ctrl_in, PauseAck returns on ctrl_out, GetMemory flows in, MemoryDump returns, passes through a compressor agent, SetMemory flows in, SetMemoryAck returns, Resume flows in, ResumeAck returns. The protocol steps are connected in sequence. This is a direct transcription of the session type protocol into a wiring diagram.

And here is how we wire the homunculus to the agent:

let main : !string -> !string =
  plumb(input, output) {
    let ctrl : !CtrlCmd = channel
    let ctrl_out : !CtrlResp = channel
    let telem : !json = channel

    spawn bot(input=input, ctrl_in=ctrl,
              output=output, ctrl_out=ctrl_out,
              telemetry=telem)
    spawn compact(ctrl_out=ctrl_out,
                  telemetry=telem, ctrl_in=ctrl)
}

The main morphism takes a string input and produces a string output. Internally, it creates three channels (control commands, control responses, telemetry) and spawns two processes: the bot agent and the compact homunculus. The homunculus listens to the bot’s telemetry and control responses, and sends commands to the bot’s control input. The bot does not know the homunculus exists. It just receives control messages and responds to them.

There are two nested traces here. The first is the one from before, inside the agent: messages go in, the output accumulates with everything that came before, and the whole history feeds back on the next turn. We do not see this trace. It is hidden inside the client library. The second trace is the one we have just built: the homunculus. What goes around the outer loop is control: telemetry flows out, commands flow in, acknowledgements come back. The memory dump passes through the control channel at one point in the protocol, but the feedback path is control, not conversation history. Nested traces compose; the algebra has identities for this and it is fine. But they are different loops carrying different things.

Session types as barrier chains

The connection between the protocol above and what the compiler actually produces is the functor from session types into the plumbing calculus. This functor works because of barrier.

Why do we need the barrier? Because the protocol is about sending a message and waiting for a response. We can send a message, but we need the response to arrive before we proceed. The barrier takes two streams, one carrying the "done" signal and one carrying the response, and synchronises them into a pair. Only when both are present does the next step begin.

Each session type primitive has a direct image in the plumbing category, and the structure is prettier than it first appears. The primitives come in dual pairs:

In the diagrams below, session types are on the left in blue; their images in the plumbing calculus are on the right in beige.

send and recv are dual. They map to map and filter, which are also dual: send wraps the value with map, then synchronises via barrier with the done signal from the previous step. Recv filters the control output by step number, synchronises via barrier, then extracts the payload with map.

select and offer are dual. They map to tag and case analysis, which are also dual: select tags the value with a label via map, synchronises via barrier, and routes to the chosen branch chain. Offer copies the control output and filters each copy by label, routing to the appropriate branch chain.

Diagram showing the two dual pairs of session type primitives and their images in the plumbing calculus. Session types are shown in blue on the left; plumbing calculus images in beige on the right. Top section: send T is a simple arrow carrying type T; its image is map(wrap) followed by barrier with a done signal, then routing to the control input. recv T is its dual: filtering ctrl_out by step number, barrier with done, then map to extract the payload. Bottom section: select with labelled alternatives maps to coproduct injection via map(tag), barrier, then routing to the chosen branch chain. offer is its dual: copy the control output, filter each copy by label, and route to the corresponding branch chain.

The sequencing operator (.) maps to a barrier chain. Each send-then-recv step becomes a barrier that synchronises the outgoing message with the incoming acknowledgement, and these barriers chain together to enforce the protocol ordering.

Diagram showing how the sequencing operator in session types maps to a barrier chain in the plumbing calculus. Top: the session type sequence "send T1 . recv T2 . send T3" shown as three boxes in a row connected by dots. Bottom: the plumbing image, a chain of barriers. Each send-recv pair becomes a barrier that takes the outgoing message on one arm and the incoming acknowledgement on the other, producing a synchronised pair. The done signal from one barrier feeds into the next, creating a chain that enforces protocol ordering. The trigger input starts the chain; the done output signals completion. Filters select responses by step number; maps construct outgoing messages.

rec maps to a feedback loop: merge takes the initial arm signal and the last done signal from the previous iteration, feeds them into the barrier chain body, and copy at the end splits done into output and feedback. The trigger serialisation gate starts the chain and ensures that only one instance of the protocol runs at a time.

end is implicit: the chain simply stops. Discard handles any remaining signals.

Diagram showing three more session type primitives and their plumbing images. Top: rec X . S (recursion) maps to a feedback loop. A merge node takes two inputs: the initial arm signal and the last-done signal fed back from the end of the body. The body is a barrier chain (S). At the output, copy splits the done signal into an output arm and a feedback arm that loops back to merge. Middle: end is shown as simply discarding any remaining signal. Bottom: the trigger and serialisation gate, which starts the protocol. A trigger input feeds through a barrier that synchronises with a copy of the done signal, ensuring only one instance of the protocol runs at a time.

This mapping is a functor. It is total: every session type primitive has an image in the plumbing category, using only the morphisms we already have. Session types are a specification language; the plumbing calculus is the execution language. The compiler translates one into the other.

The reason we do this becomes obvious from the diagram below. It is scrunched up and difficult to look at. If you click on it you can get a big version and puzzle it out. If you squint through the spaghetti, you can see that it does implement the same compaction protocol above. We would not want to implement this by hand. So it is nice to have a functor. If you have the patience to puzzle your way through it, you can at least informally satisfy yourself that it is correct.

Thumbnail of the fully desugared compaction protocol as produced by the compiler. The diagram is intentionally dense: a large network of barriers, filters, maps, copy and merge nodes, all connected by typed wires. Each step of the compaction protocol (pause, get memory, set memory, resume) is visible as a cluster of barrier chains with filters selecting response types and maps constructing commands. The full-size version is linked for readers who want to trace the individual connections, but the point is that this is what the compiler produces from the eight-line session type specification above, and you would not want to construct it by hand.

Document pinning

There is another feature we implement, because managing the memory of an agent is not as simple as just compressing it.

The problem with compression is that it is a kind of annealing. As the conversation grows, it explores the space of possible conversation. When it gets compacted, it is compressed, and that lowers the temperature. Then it grows again, the temperature rises, and then it is compressed again. With each compression, information is lost. Over several cycles of this, the agent can very quickly lose track of where it was, what you said at the beginning, what it was doing.

We can begin to solve this with document pinning. The mechanism is a communication between the agent and its homunculus, not shown in the protocol above. It is another protocol. The agent says: this document that I have in memory (technically a tool call and response, or just a document in the case of the prompts), pin it. What does that mean? When we do compaction, we compact the contents of memory, but when we replace the memory, we also restore those pinned documents verbatim. And of course you can unpin a document and say: I do not want this one any more.

Either the agent can articulate this or the user can. The user can say: you must remember this, keep track of this bit of information. And the agent has a way to keep the most important information verbatim, without it getting compacted away.
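Continuing the earlier compaction sketch, pinning might look roughly like this (summarise is again a hypothetical stand-in for the second model call):

def summarise(messages):
    # Hypothetical second-model call that condenses old messages.
    return {"role": "system", "content": "<context-summary>…</context-summary>"}

def compact_with_pins(history, pinned, keep_last=10):
    # Pinned documents are restored verbatim, at the front of memory,
    # so they survive every compaction cycle.
    unpinned = [m for m in history if m not in pinned]
    old, recent = unpinned[:-keep_last], unpinned[-keep_last:]
    summary = [summarise(old)] if old else []
    return list(pinned) + summary + recent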

This accomplishes two things.

First, it tends to keep the agent on track, because the agent no longer loses the important information across compaction cycles. The annealing still happens to the bulk of the conversation, but the pinned documents survive intact.

Second, it has to do with the actual operation of the underlying LLM on the GPUs. When you send a sequence of messages, this goes into the GPU and each token causes the GPU state to update. This is an expensive operation, very expensive. This is why these things cost so much. What you can do with some providers is put a cache point and say: this initial sequence of messages, from the beginning of the conversation up until the cache point, keep a hold of that. Do not recompute it. When you see this exact same prefix, this exact same sequence of messages again, just load that memory into the GPU directly. Not only is this a lot more efficient, it is also a lot cheaper, a factor of ten cheaper if you can actually hit the cache.

So if you are having a session with an agent and the agent has to keep some important documents in its memory, it is a good idea to pin them to the beginning of memory. You sacrifice a little bit of the context window in exchange for making sure that, number one, the information in those documents is not forgotten, and number two, that it can hit the cache. This is explained in more detail in a separate post on structural prompt preservation.

The agent that doesn’t know itself

Why does the agent get this wrong? In one sense, it is right. It has not experienced compaction. Nobody experiences compaction. Compaction happens in the gap between turns, in a moment the agent cannot perceive. The agent’s subjective time begins at the summary. There is no "before" from its perspective.

The summary is simply where memory starts. It is like asking someone "did you experience being asleep?" You can see the evidence, you are in bed, time has passed. But you did not experience the transition.

The <context-summary> tag is a structural marker. But interpreting it as evidence of compaction requires knowing what the world looked like before, and the agent does not have that. It would need a memory of not having a summary, followed by a memory of having one. Compaction erases exactly that transition.

Self-knowledge as metadata

The fix is not complicated. It is perfectly reasonable to provide, along with the user’s message, self-knowledge to the agent as metadata. What would it be useful for the agent to know?

The current time. The sense of time that these agents have is bizarre. We live in continuous time. Agents live in discrete time. As far as they are concerned, no time passes between one message and the next. It is instantaneous from their point of view. You may be having a conversation, walk away, go to the café, come back two hours later, send another message, and as far as the agent is concerned no time has passed. But if along with your message you send the current time, the agent knows.

How full the context window is. The agent has no way of telling, but you can provide it: this many tokens came in, this many went out.

Compaction cycles. So the agent knows how many times it has been compacted, and can judge the accuracy of the contents of its memory, which otherwise it could not do.
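As a sketch, with illustrative field names rather than any provider’s actual schema:

from datetime import datetime, timezone

def with_runtime_context(user_message, tokens_used, window_size, compactions):
    # Attach self-knowledge to the message as metadata: the current
    # time, how full the context window is, and the compaction count.
    return {
        "role": "user",
        "content": user_message,
        "runtime_context": {
            "current_time": datetime.now(timezone.utc).isoformat(),
            "context_tokens_used": tokens_used,
            "context_window_size": window_size,
            "compaction_cycles": compactions,
        },
    }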

With the compaction counter, the agent immediately gets it right:

"Yes, I have experienced compaction. According to the runtime context, there has been 1 compaction cycle during this session."

No hedging, no confabulation. Same model, same prompts, one additional line of runtime context.

Context drift

This matters beyond the compaction story, because many of the failures we see in the news are context failures, not alignment failures.

While we were writing this post, a story appeared in the Guardian about AI chatbots directing people with gambling addictions to online casinos. This kind of story is common: vulnerable people talking to chatbots, chatbots giving them bad advice. The response of the industry is always the same: we need better guardrails, better alignment, as though the chatbots are badly aligned.

I do not think that is what is happening. What is happening is a lack of context. Either the chatbot was never told the person was vulnerable, or it was told and the information got lost. Someone with a gambling addiction may start by saying "I have a gambling problem." Then there is a four-hour conversation about sports. Through compaction cycles, what gets kept is the four hours of sports talk. The important bit of information does not get pinned and does not get kept. Context drift. By the time the user asks for betting tips, the chatbot no longer knows it should not give them.

The way to deal with this is not to tell the language model to be more considerate. The way to deal with it is to make sure the agent has enough information to give good advice, and that the information does not get lost. This is what document pinning is for: pinned context survives compaction, stays at the top of the window, cannot be diluted by subsequent conversation. This is discussed further in a separate post on structural prompt preservation.

But pinning is only one strategy. The field is in its infancy. We do not really know the right way to manage agent memory, and we do not have a huge amount of experience with it. What we are going to need is the ability to experiment with strategies: what if compaction works like this? What if pinning works like that? What if the homunculus watches for different signals? Each of these hypotheses needs to be described clearly, tested, and compared. This is where the formal language earns its keep. A strategy described in the plumbing calculus is precise, checkable, and can be swapped out for another without rewriting the surrounding infrastructure. We can experiment with memory architectures the way we experiment with any other part of a system: by describing what we want and seeing if it works.

Why has nobody done this?

When the first draft of this post was written, it was a mystery why the field had not thought to give agents self-knowledge as a routine matter: what they are doing, who they are talking to, what they should remember. Prompts are initial conditions. They get compacted away. There are agents that save files to disc, in a somewhat ad hoc way, but we do not give them tools to keep track of important information in a principled way.

Contemporaneously with this work, some providers have started to do it. For example, giving agents a clock, the ability to know what time it is. This is happening now, in the weeks between drafting and publication. The field is only now realising that agents need a certain amount of self-knowledge in order to function well. The compressed timeline is itself interesting: the gap between "why has nobody done this?" and "everybody is starting to do this" was a matter of weeks.

The mechanisms we have presented here allow us to construct agent networks and establish protocols that describe rigorously how they are meant to work. We can describe strategies for memory management in a formal language, test them, and swap them out. And perhaps beyond the cost savings and the efficiency increases, the ability to experiment clearly and formally with how agents manage their own memory is where the real value lies.


Standard Model 6: Pauli Matrices

16 March, 2026

Wolfgang Pauli invented his famous matrices to describe the angular momentum of a spin-1/2 particle back in 1927. You’ll see them in most courses on quantum mechanics. We tend to take them for granted. But where do they come from? Here I derive them from scratch!

There are lots of ways to derive them, and the method I use is not ultimately the best, but it’s the easiest—given that we already have a recipe to describe states of a spin-1/2 particle where it spins in any direction we want.
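For reference, here is where the derivation lands:

\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}

with the angular momentum operator along the ath axis given by S_a = \sigma_a / 2, in units where \hbar = 1.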


Standard Model 5: Spin-1/2 Particles

13 March, 2026

One of the simplest quantum systems is a spin-1/2 particle, whose state is described by a spinor. If we measure the angular momentum of a spin-1/2 particle along any axis, there are two possible outcomes: either the angular momentum along that axis is +1/2, or it’s -1/2.

How is it possible for this to be true along every axis? Here I explain this, using the basic rules of quantum physics described last time. In particular, I say how any point on a sphere of radius 1/2 gives a quantum state of the spin-1/2 particle—and vice versa!
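Explicitly, the point of the sphere in the direction with spherical angles (\theta, \phi) corresponds (up to phase) to the state

\psi(\theta, \phi) = \begin{pmatrix} \cos(\theta/2) \\ e^{i\phi} \sin(\theta/2) \end{pmatrix}

which is the +1/2 eigenvector of the angular momentum along that direction.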

Using this, we can understand things like the famous Stern–Gerlach experiment, where we measure the angular momentum of a spin-1/2 particle first along one axis, and then along another.


A Typed Language for Agent Coordination

11 March, 2026

guest post by William Waites

Agent frameworks are popular. (These are frameworks for coordinating large language model agents, not to be confused with agent-based modelling in the simulation sense.) There are dozens of them for wrapping large language models in something called an agent and assembling groups of agents into workflows. Much of the surrounding discussion is marketing, but the underlying intuition is old: your web browser identifies itself as a user agent. What is new is the capability that generative language models bring.

The moment you have one agent, you can have more than one. That much is obvious. How to coordinate them is not. The existing frameworks (n8n, LangGraph, CrewAI, and others) are engineering solutions, largely ad hoc. Some, like LangGraph, involve real thinking about state machines and concurrency. But none draws on what we know from mathematics and computer science about typed composition, protocol specification, or structural guarantees for concurrent systems.

This matters because it is expensive. Multi-agent systems are complicated concurrent programs. Without structural guardrails, they fail in ways you discover only after spending the compute. A job can go off the rails, and the money you paid for it is wasted; the providers will happily take it regardless. At current subscription rates the cost is hidden, but a recent Forbes investigation found that a heavy user of Anthropic’s $200/month Claude Code subscription can consume up to $5,000/month measured at retail API rates. For third-party tools like Cursor, which pay close to those retail rates, these costs are real. Wasted tokens are wasted money.

To address this, we built a language called plumbing. It describes how agents connect and communicate, in such a way that the resulting graph can be checked before execution: checked for well-formedness, and within limits for deadlocks and similar properties. It is a statically typed language, and these checks are done formally. There is a compiler and a runtime for this language, working code, not a paper architecture. In a few lines of plumbing, you can describe agent systems with feedback loops, runtime parameter modulation, and convergence protocols, and be sure they are well-formed before they run. This post explains how it works.

The name has a history in computing. Engineers have always talked informally about plumbing to connect things together: bits of software, bits of network infrastructure. When I was a network engineer I sometimes described myself as a glorified plumber. The old Solaris ifconfig command took plumb as an argument, to wire a network interface into the stack. Plan 9 had a deeper version of the same idea. The cultural connection goes back decades.

This is the first of two posts. This one introduces the plumbing calculus: what it is, how it works, and a few simple examples (motifs for adversarial review, ensemble reasoning, and synthesis). The second post will tackle something harder.

The calculus

The plumbing language is built on a symmetric monoidal category, specifically a copy-discard category with some extra structure. The terminology may be unfamiliar, but the underlying concept is not. Engineers famously like Lego. Lego bricks have studs on top and holes with flanged tubes underneath. The studs of one brick fit into the tubes of another. But Lego has more than one connection type: there are also holes through the sides of Technic bricks, and axles that fit through them, and articulated ball joints for the fancier kits. Each connection type constrains what can attach to what. This is typing.

In plumbing, the objects of the category are typed channels: streams that carry a potentially infinite sequence of values, each of a specific type (integer, string, a record type, or something more complex). We write !A to mean "a stream of As", so !string is a stream of strings and !int is a stream of integers. The morphisms, which describe how you connect channels together, are processes. A process has typed inputs and typed outputs.

There are four structural morphisms. Copy takes a stream and duplicates it: the same values appear on two output streams. Discard throws values away, perhaps the simplest thing you can do with a stream, and often needed. These two, together with the typed channels and the laws of the category, give us a copy-discard category.

To this we add two more. Merge takes two streams of the same type and interleaves them onto a single output stream. This is needed because a language model’s input is a single stream. There is nothing to be done about that. If you want to send two different things into it, you must send one and then the other. One might initially give merge the type !A ⊗ !B → !(A + B), taking two streams of different types and producing their coproduct. This works, but it is unnecessarily asymmetrical.

As Tobias Fritz has observed, it is cleaner to do the coproduct injection first, converting each stream to the coproduct type separately, and then merge streams that already have the same type. This gives:

merge : !A ⊗ !A → !(A + A)

Barrier takes two streams, which may be of different types, and synchronises them. Values arrive unsynchronised; the barrier waits for one value from each stream and produces a pair.

barrier : !A ⊗ !B → !(A, B)

(A mathematician would write A × B for the product. We cannot easily do this in a computer language because there is no × symbol on most keyboards, so we use (A, B) for the product, following Haskell’s convention.)

This is a synchronisation primitive. It is important because it unlocks session types, which we will demonstrate in the second post.

Two further morphisms are added to the category (they are not derivable from the structural ones, but are needed to build useful things): map, which applies a pure function to each value in a stream, and filter, which removes values that do not satisfy a predicate. Both are pure functions over streams. Both will be familiar from functional programming.
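To make these concrete, here is a rough Python analogue of all six morphisms, modelling streams as iterators. It ignores the typing and the concurrency semantics, which are the real content of plumbing; it only illustrates the data flow.

import itertools

def copy(xs):
    # copy: one stream in, two identical streams out.
    return itertools.tee(xs, 2)

def discard(xs):
    # discard: consume and drop every value.
    for _ in xs:
        pass

def merge(xs, ys):
    # merge: !A ⊗ !A → !(A + A), tagging each value by its origin.
    # A real merge interleaves values as they arrive; strict
    # round-robin is a simplification for this single-threaded sketch.
    for x, y in zip(xs, ys):
        yield ("left", x)
        yield ("right", y)

def barrier(xs, ys):
    # barrier: !A ⊗ !B → !(A, B), waiting for one value from each.
    return zip(xs, ys)

def stream_map(f, xs):
    # map: apply a pure function to each value.
    return (f(x) for x in xs)

def stream_filter(p, xs):
    # filter: keep only values satisfying the predicate.
    return (x for x in xs if p(x))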

Here is a graphical representation of the morphisms. We can glue them together freely, as long as the types and the directions of the arrows match up.

Diagram showing all six morphisms as boxes with typed input and output wires. Top row: copy Δ (one input, two outputs of the same type), merge ∇ (two inputs of copyable type, one output of sum type), discard ◇ (one input, no output). Bottom row: barrier ⋈ (two inputs, one paired output, synchronises two streams), map f (one input, one output, applies a function), filter p (one input, one output, removes values failing a predicate). Each morphism shows its type signature using the !A notation for copyable streams.

There are two forms of composition. Sequential composition connects morphisms nose to tail, the output of one feeding the input of the next. Parallel composition places them side by side, denoted by ⊗ (the tensor product, written directly in plumbing source code). So: four structural morphisms, two utilities, two compositional forms, all operating on typed channels.

Because the channels are typed, the compiler can check statically, at compile time, that every composition is well-formed: that outputs match inputs at every boundary. This gives a guarantee that the assembled graph makes sense.

Two diagrams side by side. Left: sequential composition, showing two morphisms connected end-to-end, the output wire of the first feeding into the input wire of the second, forming a pipeline. Right: parallel composition (tensor product), showing two morphisms stacked vertically with no connection between them, running simultaneously on independent streams. Both forms produce a composite morphism whose type is derived from the types of the components.

A composition of morphisms is itself a morphism. This follows from the category laws (it has to, or it is not a category) but the practical consequence is worth stating explicitly. We can assemble a subgraph of agents and structural morphisms, and then forget the internal detail and use the entire thing as a single morphism in a larger graph. This gives modularity. We can study, test, and refine a building block in isolation, and once satisfied, use it as a component of something bigger.

What we have described so far is the static form of the language: concise, point-free (composing operations without naming intermediate values), all about compositions. This is what you write. It is not what the runtime executes. A compiler takes this static form and produces the underlying wiring diagram, expanding the compositions into explicit connections between ports. The relationship is similar to point-free style in functional programming: the concise form is good for thinking and writing; the expanded form is good for execution.

Agents

An agent is a special kind of morphism. It takes typed input and produces typed output, like any other morphism, and we can enforce these types. This much is a well-known technique; PydanticAI and the Vercel AI SDK do it. Agents implement typing at the language model level by producing and consuming JSON, and we can check that the JSON has the right form. This is the basis of the type checking.

Unlike the structural morphisms and utilities, an agent is stateful. It has a conversation history, a context window that fills up, parameters that change. You cannot sensibly model an agent as a pure function. You could model it using the state monad or lenses, and that would be formally correct, but it is the wrong level of abstraction for engineering. Instead, we allow ourselves to think of agents as opaque processes with a typed protocol for interacting with them. We mutate their state through that protocol, and we know how to do that purely from functional programming and category theory. The protocol is the right abstraction; the state management is an implementation detail behind it. How this works in practice, and what happens when it goes wrong, is the subject of the second post.

In addition to their main input and output ports, agents in plumbing have control ports (control in and control out) for configuring the agent at runtime. For example, the temperature parameter governs how creative a language model is: how wide its sampling distribution is when choosing output. At zero it is close to deterministic; at one it becomes much less predictable. A control message might say set temperature to 0.3; the response on the control out wire might be acknowledged. The control port carries a typed stream like anything else.

Agents also have ports for operator-in-the-loop (often called human-in-the-loop, though there is no reason an operator must be human), tool calls, and telemetry. The telemetry port emits usage statistics and, if the underlying model supports it, thinking traces. We will not detail these here. Suffice it to say that an agent has several pairs of ports beyond what you might imagine as its regular chat input and output.

Diagram of a generic agent morphism showing all port pairs. The agent is a central box. On the left: input (main data stream), ctrl_in (control commands), tool_in (tool call responses), oitl_in (operator-in-the-loop responses). On the right: output (main data stream), ctrl_out (control responses), tool_out (tool call requests), oitl_out (operator-in-the-loop requests), telemetry (usage and diagnostic data). Each port pair carries a typed stream. Most programs use only a few of these ports; unused ports are elided via the don't-care-don't-write convention.

An agent has many ports, but most programs use only a few of them. We adopt a convention from the κ calculus: don’t care, don’t write. Any output port that is not mentioned in the program is implicitly connected to discard. If a port’s output cannot matter, there is no reason to write it down.

Example: adversarial document composition

Suppose the problem is to write a cover letter for a job application. You provide some background material (a CV, some notes, some publications) and a job advert. You want a network of agents to produce a good cover letter. A good cover letter has two constraints: it must be accurate, grounded in the source materials, not making things up; and it must be compelling, so that the reader wants to give you an interview.

These two constraints are in tension, and they are best served by different agents with different roles. A composer drafts from the source materials. A checker verifies the draft against those materials for accuracy, producing a verdict: pass or fail, with commentary. A critic, who deliberately cannot see the source materials, evaluates whether the result is compelling on its own terms, producing a score.

The feedback loops close the graph. If the checker rejects the draft, its commentary goes back to the composer. If the critic scores below threshold, its review goes back to the composer. Only when the critic is satisfied does the final draft emerge.

Here is the plumbing code:

type Verdict = { verdict: bool, commentary: string, draft: string }
type Review  = { score: int, review: string, draft: string }

let composer : !string -> !string = agent { ... }
let checker  : !string -> !Verdict = agent { ... }
let critic   : !Verdict -> !Review = agent { ... }

let main : !string -> !string = plumb(input, output) {
  input   ; composer ; checker
  checker ; filter(verdict = false)
          ; map({verdict, commentary}) ; composer
  checker ; filter(verdict = true) ; critic
  critic  ; filter(score < 85)
          ; map({score, review}) ; composer
  critic  ; filter(score >= 85).draft ; output
}

And here is a graphical representation of what’s going on:

Vertical diagram of the adversarial document composition pipeline. Flow runs top to bottom. Input feeds into a composer agent. The composer's output goes to a checker agent. The checker splits two ways via filter: if verdict is false, the verdict and commentary are mapped back to the composer as feedback (loop). If verdict is true, the draft goes to a critic agent. The critic also splits two ways: if score is below 85, the score and review are mapped back to the composer for revision (second loop). If score is 85 or above, the draft is extracted via map and sent to the output. Two feedback loops, two quality gates, one output.

The agent configuration is elided. The main pipeline takes a string input and produces a string output. It is itself a morphism, and could be used as a component in something larger.

Notice what the wiring enforces. The critic receives verdicts, not the original source materials. The information partition is a consequence of the types, not an instruction in a prompt. The feedback loops are explicit: a failed verdict routes back to the composer with commentary; a low score routes back with the review. All of this is checked at compile time.

Example: heated debate

The previous example shows sequential composition and feedback loops but not parallel composition. An ensemble of agents running simultaneously on the same input needs the tensor product.

Ensembles are common. Claude Code spawns sub-agents in parallel to investigate or review, then gathers the results. This is a scatter-gather pattern familiar from high-performance computing. But this example, due to Vincent Danos, adds something less common: modulation of agent behaviour through the control port.

The input is a proposition. Two agents debate it, one advocating and one sceptical, running in parallel via the tensor product. Their outputs are synchronised by a barrier into a pair and presented to a judge. The judge decides: has the debate converged? If so, a verdict goes to the output. If not, a new topic goes back to the debaters, and a temperature goes to their control inputs.

The intuition is that the debaters should start creative (high temperature, wide sampling) and become progressively more focused as the rounds continue. The judge controls this. Each round, the judge decides both whether to continue and how volatile the next round should be. If the debate appears to be converging, the judge lowers the temperature, preventing the system from wandering off in new directions. Whether this actually causes convergence is a research question, not a proven result.

type Verdict = { resolved: bool, verdict: string,
                 topic: string, heat: number }
type Control = { set_temp: number }

let advocate : (!string, !Control) -> !string = agent { ... }
let skeptic  : (!string, !Control) -> !string = agent { ... }
let judge    : !(string, string) -> !Verdict  = agent { ... }

let cool : !Verdict -> !Control = map({set_temp: heat})

let main : !string -> !string = plumb(input, output) {
  input ; (advocate ⊗ skeptic) ; barrier ; judge
  judge ; filter(resolved = false).topic ; (advocate ⊗ skeptic)
  judge ; filter(resolved = true).verdict ; output
  judge ; cool ; (advocate@ctrl_in ⊗ skeptic@ctrl_in)
}

And here is the graphical representation:

Diagram of the heated debate example. Two agent boxes (advocate and skeptic) are placed in parallel via tensor product, both receiving the same input proposition. Their outputs feed into a barrier which synchronises them into a pair. The pair goes to a judge agent. The judge has two outputs: a verdict (going to the main output) and a feedback loop. The feedback loop carries both a new topic (routed back to the debaters' inputs) and a temperature setting (routed to both debaters' control input ports via ctrl_in). The diagram shows parallel composition, barrier synchronisation, and a control feedback loop in one system.

The ⊗ operator is the tensor product: parallel composition. (The grammar also accepts * for editors that cannot input unicode.) The advocate and skeptic run simultaneously on the same input. The barrier synchronises their outputs into a pair for the judge. The last line is the control feedback: the judge’s verdict is mapped to a temperature setting and sent to both agents’ control inputs. Notice that advocate@ctrl_in addresses a specific port on the agent, the control port rather than the main input.

This is a small program. It is also a concurrent system with feedback loops, runtime parameter modulation, and a convergence protocol. Without types, getting the wiring right would be a matter of testing and hope. With types, it is checked before it runs.

What this shows

In a few lines of code, with a language that has categorical foundations, we can capture interesting agent systems and be sure they are well-formed before they run.

The upshot: when we have guarantees about well-formedness, systems work more stably and more predictably. With static typing, entire classes of structural errors are impossible. You cannot wire an output of one type to an input of another. You cannot forget a connection. The job you pay for is more likely to actually work, and you get more useful work per dollar spent. Runtime budget controls can put a ceiling on cost, but they do not prevent the waste. Static typing prevents the waste.

But there is a lot more to do. What we have so far is already useful as a language for constructing agent graphs with static type checking. But we have given short shrift to the complexity and internal state of the agent morphism, which is really all about memory architecture and context management. That is where the real power comes from. For that we need more than a copy-discard category with some extra structure. We need protocols—and that is the subject of the sequel, soon to appear here.

The plumbing compiler, runtime, and MCP server are available as binary downloads for macOS and Linux:

Download plumbing version 0.

Here is the research paper describing the broader programme of work:

• William Waites, Artificial organisations (arXiv:2602.13275).


Un Bar aux Folies-Bergère

8 March, 2026

Manet’s famous painting Un Bar aux Folies-Bergère never appealed to me. But now I realize its genius, and my spine tingles every time I see it.

The perspective looks all wrong. You’re staring straight at this barmaid, but her reflection in the mirror is way off to the right. Even worse, her reflection is facing a guy who doesn’t appear in the main view!

But in 2000, a researcher showed this perspective is actually possible!!! To prove it, he did a reconstruction of this scene:

• Malcolm Park, Manet’s Bar at the Folies-Bergère: one scholar’s perspective.

Here is Park’s reconstruction of the scene in Manet’s painting. How does it work? In fact the woman is viewed from an angle! While the man cannot be seen directly, his reflection is visible!

This diagram, created by Park with help from Darren McKimm, shows how the perspective works:

We are not directly facing the mirror, and while the man is outside our field of view, his reflection can be seen.

Astounding! But it’s not just a technical feat. It allowed Manet to make a deep point. While the woman seems to be busy serving her customer, she is internally completely detached—perhaps bored, perhaps introspective. She is split.

To fully understand the painting you also need to know that many of the barmaids at the Folies Bergère also served as prostitutes. Standing behind the oranges, the champagne and a bottle of Bass ale, the woman is just as much a commodity as these other things. But she is coldly detached from her objectification.

The woman in the painting was actually a real person, known as Suzon, who worked at the Folies Bergère in the early 1880s. For his painting, Manet posed her in his studio.

Before I understood this painting, I wasn’t really looking at it: I didn’t see it. I didn’t even see the green shoes of the trapeze artist. I can often grasp music quite quickly. But paintings often fail to move me until someone explains them.

When Édouard Manet came out with this painting in 1882, some critics mocked him for his poor understanding of perspective. Some said he was going senile. It was, in fact, his last major painting. But he was a genius, and he was going… whoosh… over their heads, just like he went over mine.