Hilbert’s Sixth Problem and the Stabilization of Learning

The Ghost in the Machine: Hilbert’s Unfinished Quest

As we close out 2025, it is worth reflecting on how long-standing foundational questions have begun to resurface across disciplines. We are currently witnessing the resolution of a challenge issued over a century ago, one that clarifies not just the nature of matter, but the future of mind.

When David Hilbert posed his Sixth Problem in 1900, he was not asking for better equations. He was asking for foundations. Specifically, he asked whether physics could be axiomatized in the same way mathematics had been. At the time, this sounded like a question about mechanics, probability, and statistical physics. In hindsight, it was something more subtle. Hilbert was asking whether the messy, empirical laws of the physical world could be derived from deeper organizing principles rather than discovered piecemeal. More than a century later, we are confronting the same question again, but in a different domain. Not physics, but intelligence.

Modern machine learning has achieved astonishing empirical success. Large language models generate text, code, proofs, and plans that appear intelligent. Yet we still lack a unifying theory that explains why these systems work, when they fail, and why new capabilities emerge suddenly rather than gradually. This tension mirrors the state of physics before Hilbert. We have powerful tools and convincing results, but no foundational framework that explains how the pieces fit together.

Two dominant responses have emerged. One focuses on structure. The other focuses on learning dynamics. Understanding why both matter, and why neither is sufficient alone, is the key to seeing why Hilbert’s Sixth Problem has quietly resurfaced in Artificial Intelligence (AI).

Grammar of Intelligence: Category Theory and Structure

Category Theory has recently been proposed as a unifying structural framework for deep learning. Its appeal is obvious. Category Theory is the mathematics of composition. It describes how objects relate, how transformations preserve structure, and when systems can be composed without contradiction. In deep learning, this has led to powerful insights. Geometric deep learning showed that when models are built to respect symmetries, such as translation or permutation invariance, generalization improves dramatically. These architectures succeed not because they are larger, but because they are constrained.

From a categorical perspective, this makes sense. Equivariance and invariance are not tricks. They are structure preserving maps. Weight sharing, recursion, and compositionality can all be expressed cleanly once computation is framed as morphisms between structured spaces. In this view, neural networks are not arbitrary function approximators but homomorphisms that preserve the algebra of the domain they operate in. Geometric deep learning then appears not as an isolated breakthrough, but as a special case of a more general principle.
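To make the invariance point concrete, here is a minimal NumPy sketch in the spirit of set-based (Deep Sets style) architectures. The weights and pooling choice are illustrative assumptions, not any particular published model: a per-element map followed by sum pooling is permutation invariant by construction, so the symmetry holds before any training happens.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy set model: a per-element map (phi), a permutation-invariant sum,
# then a readout (rho) on the pooled vector. Weights are arbitrary.
W_phi = rng.normal(size=(3, 8))
W_rho = rng.normal(size=(8, 1))

def model(x_set):
    # x_set: (n_elements, 3) -- an unordered set of feature vectors
    h = np.tanh(x_set @ W_phi)   # structure-preserving map applied per element
    pooled = h.sum(axis=0)       # sum pooling: order of elements cannot matter
    return float(np.tanh(pooled @ W_rho))

x = rng.normal(size=(5, 3))
perm = rng.permutation(5)

# Permuting the input set leaves the output unchanged: the invariance
# is a property of the architecture, not something the model learned.
assert np.isclose(model(x), model(x[perm]))
```

The constraint is the point: the hypothesis space simply contains no function that depends on element order, which is what "built to respect symmetries" means operationally.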

However, structure alone does not explain intelligence. It explains correctness. It explains when transformations are valid. It explains why certain architectures generalize better than others. What it does not explain is emergence. It does not explain why symbolic behavior appears suddenly, why learning stabilizes, or why models begin to reason only after long periods of apparently shallow pattern fitting. It also does not explain why the same model oscillates between brilliance and hallucination, or why tools and prompts eventually plateau. This is where learning dynamics enter the picture. Category Theory tells us what transformations are valid, but it does not tell us which transformations a learning system will converge toward.

From Interpolation to Internalization: Learning Stabilizes

In the first of the two prior LatentSpin blogs, symbolic systems were framed not as something injected into neural networks but as something that emerges from continuity once learning becomes active and constrained. The central claim was simple but nontrivial. Symbolic behavior does not arise from scale alone. It arises when a learning system internalizes invariants and stops needing to adapt. Symbols are not learned representations. They are stabilized ones.

This perspective reframes the failure modes of large language models. Arithmetic failures, carry errors, and brittle reasoning are not mysteries. They are symptoms of systems that interpolate but do not internalize. Carry is not information stored in a state. It is information stored in a transition. Without a mechanism to preserve and stabilize that transition, no amount of pattern density will yield reliable computation. External tools can mask this temporarily, but they do not resolve it. They move the computation outside the learning system rather than changing how learning settles inside it.
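The "carry lives in a transition, not a state" claim can be made concrete with a few lines of Python. This is an illustration of the argument, not the blogs' formalism: long addition is a tiny state machine whose only inter-step memory is the carry bit, and reliability comes entirely from preserving that one transition rule across positions.

```python
# Long addition as a state machine: the only state passed between digit
# positions is the carry. The "knowledge" of addition lives in this
# transition rule, not in any stored pattern over whole numbers.
def add_digits(a: str, b: str) -> str:
    a, b = a.zfill(len(b)), b.zfill(len(a))  # pad to equal length
    carry, out = 0, []
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        out.append(str(total % 10))   # emitted digit at this position
        carry = total // 10           # the transition: carry propagates forward
    if carry:
        out.append(str(carry))
    return "".join(reversed(out))

# One carry ripples through every position: a single unstable transition
# rule would corrupt arbitrarily many output digits.
assert add_digits("999", "1") == "1000"
```

A system that merely interpolates over seen digit patterns has no pressure to represent this transition exactly, which is why carry errors are a symptom rather than a mystery.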

The second essay approached the same issue from a systems perspective. Emergence was framed not as magic, but as a phase transition. When feedback, capacity, and constraint align, learning pressure collapses. At that point, behavior changes qualitatively. The system stops adapting locally and begins preserving globally. This is when symbols appear, when abstractions harden, and when reasoning becomes possible. Importantly, this transition is irreversible. Once a system has stabilized around an invariant, it no longer explores alternatives freely.

This is the missing piece in purely structural theories. Category Theory can tell us what structures are admissible. It can tell us what must be preserved for correctness. It cannot tell us when preservation replaces learning. It is silent about time, pressure, and convergence. It describes the grammar of computation, not the thermodynamics of learning.

The Conceptual Adaptation Theory (CAT) framework was introduced to address precisely this gap. CAT treats learning as a physical process rather than a static mapping. Systems such as self-improving agents adapt under pressure. That pressure dissipates as invariants are discovered. When the cost of adaptation exceeds the benefit, learning stops locally and structure freezes. Intelligence emerges not because the system has learned everything, but because it has learned what not to change. Once learning pressure collapses, this stabilization is irreversible in the same way entropy defines an arrow of time.

This distinction matters because it reframes the role of structure. Structure is not primary. It is an outcome. Constraints do not merely shape representations. They shape learning trajectories. A symmetry built into an architecture reduces the hypothesis space. But whether that symmetry becomes a symbol depends on whether learning converges around it. Two systems can share the same categorical structure and exhibit radically different behavior depending on how learning pressure is applied and relieved.

Great Unification: Symmetry, Entropy, and Inevitability

Seen this way, Category Theory and CAT are not competing frameworks. They operate at different levels. Category Theory formalizes what must be preserved for computation to be valid. CAT explains why certain preservations emerge and others do not. One describes structure. The other describes stabilization. This brings us back to Hilbert’s Sixth Problem.

Hilbert asked whether the laws of physics could be derived from axioms. What he implicitly assumed was that physical laws were static objects waiting to be formalized. What twentieth century physics revealed instead is that laws emerge from deeper principles of symmetry, invariance, and entropy. Statistical mechanics did not replace mechanics. It explained why mechanics works at certain scales and fails at others.

We are now at the same juncture with intelligence. Category Theory plays the role of symmetry. It tells us what computations must preserve. CAT plays the role of statistical mechanics. It tells us why systems settle into those computations in the first place.

Without CAT, categorical deep learning risks becoming another elegant but incomplete formalism. It can tell us how to design architectures that respect structure. It cannot tell us why those architectures suddenly begin to reason, or why adding more structure sometimes degrades performance. Without Category Theory, CAT risks becoming ungrounded dynamics. It can explain emergence, but not correctness. Intelligence requires both.

Hilbert’s Sixth Problem was not solved by writing down better equations. It was addressed by recognizing that physical law is an emergent phenomenon governed by constraints and equilibria. In the same way, intelligence will not be unified by better prompts, larger models, or more clever tools. It will be unified by understanding how learning systems stabilize, when adaptation ceases, and why structure becomes invariant.

In that sense, the unification of intelligence is not a question of symbols versus neural networks, or structure versus scale. It is a question of learning versus preservation. Category Theory gives us the language of preservation. CAT gives us the law of learning that makes preservation inevitable.

Year-in-Review: Toward a Foundation of Intelligence

Hilbert’s Sixth Problem was never just about physics equations; it was an invitation to derive the lawful behavior of the world from fundamental axioms. Just as the 2025 solution proved that smooth fluid dynamics emerge inevitably from the "messy" collisions of particles, we are discovering that intelligence is an emergent phenomenon governed by constraints and equilibria.

The unification of AI requires two forces working in tandem: Category Theory to provide the language of structural preservation, and Conceptual Adaptation Theory to provide the learning pressure that makes that preservation inevitable. Intelligence is not found in the scale of the "mess," but in the moment the system stops adapting and begins preserving. We are finally moving past empirical guesswork toward a foundation where reasoning is no longer a miracle, but a mathematical necessity.

Previous

Emergence: From Gradient Descent to Symbols, Reason, Free Will

Next

System Theory and the “Magic” of LLM Emergence