// Thesis · v0.3 · revised May 2026

A routable intelligence layer.

Three observations about scaling, capability discovery, and operational precedent that, taken together, motivate Sonata42's work on a protocol layer for federated inference.


I. Scaling laws make the monolithic path unsustainable.

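The empirical scaling laws describing the cost of training and inference have been well studied for several years. For a dense transformer with N parameters trained on D tokens, training compute is approximately 6ND floating-point operations, and inference costs approximately 2N operations per generated token. Under compute-optimal training, D grows roughly in proportion to N, so training cost rises with roughly the square of parameter count while per-token inference cost rises linearly. These growth rates are not yet at the limit of what can be amortised by improvements in silicon, but they will be within a decade, and the gap between the projected curve and the available energy budget is no longer a matter of debate among practitioners.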
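As a back-of-envelope illustration, the sketch below uses a common approximation from the scaling-law literature cited in the references: about 6 floating-point operations per parameter per training token, and about 2 per parameter per generated token. The ratio of 20 training tokens per parameter is an assumed rule of thumb, and the function names are hypothetical. None of these figures come from Sonata42.

```python
# Back-of-envelope compute estimates. All constants are approximations
# taken from the public scaling-law literature, not measured values.

TOKENS_PER_PARAM = 20.0  # assumed compute-optimal ratio (rule of thumb)


def training_flops(params: float, tokens: float) -> float:
    """Approximate training compute: ~6 FLOPs per parameter per token."""
    return 6.0 * params * tokens


def inference_flops_per_token(params: float) -> float:
    """Approximate forward-pass compute: ~2 FLOPs per parameter per token."""
    return 2.0 * params


def compute_optimal_training_flops(params: float) -> float:
    """Training compute when the token budget scales with model size."""
    return training_flops(params, TOKENS_PER_PARAM * params)


if __name__ == "__main__":
    for n in (1e9, 1e10, 1e11):
        train = compute_optimal_training_flops(n)
        infer = inference_flops_per_token(n)
        print(f"N={n:.0e}  training={train:.2e} FLOPs  inference={infer:.2e} FLOPs/token")
```

Under these assumptions, a tenfold increase in parameter count multiplies training compute by one hundred and per-token inference compute by ten.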

The community's response has been architectural. Mixture-of-experts and similar approaches reduce the activated parameter count per token without reducing the total parameter count of the system. These approaches are real progress. They do not, however, address the layer Sonata42 is concerned with: the layer at which one system finds another system, decides whether to delegate, and routes a query along a path it can verify. That layer is currently a monolithic assumption inside the largest models, and it is the assumption most exposed to the scaling curves above.
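The distinction between activated and total parameters can be made concrete with a small sketch. The `MoEConfig` type and the numbers in the example are hypothetical, and the model is simplified to a single pool of shared parameters plus equally sized experts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEConfig:
    """Simplified mixture-of-experts sizing (hypothetical, illustrative only)."""

    shared_params: float  # parameters used for every token (attention, router, ...)
    expert_params: float  # parameters in a single expert
    num_experts: int      # experts held by the system
    top_k: int            # experts activated per token

    @property
    def total_params(self) -> float:
        """Parameters the system must store and serve."""
        return self.shared_params + self.num_experts * self.expert_params

    @property
    def active_params(self) -> float:
        """Parameters that participate in computing one token."""
        return self.shared_params + self.top_k * self.expert_params


if __name__ == "__main__":
    cfg = MoEConfig(shared_params=2e9, expert_params=1e9, num_experts=64, top_k=2)
    print(f"total={cfg.total_params:.1e}  active per token={cfg.active_params:.1e}")
```

In this made-up configuration the system holds 66 billion parameters but activates 4 billion per token: per-token compute falls, while the question of how a query reaches the right specialist stays inside the model.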

II. Capability discovery is the unsolved problem.

Retrieval-augmented generation, the Model Context Protocol, and the current generation of agentic frameworks have all gestured at the same need: each wants a specialist system, somewhere outside the immediate context window, to be involved in the response to a query. They differ in how they express that need. None of them solves the underlying problem, which is the discovery of capability.

The discovery problem has a known solution space. Networks of cooperating systems have been solving it for thirty years in a different domain: the public Internet. The lesson from that domain is that capability advertisement is its own protocol layer, separate from transport and separate from policy. Sonata42's position is that the same separation will be needed in the inference layer before federated systems can compose at scale.

III. Internet routing is a working precedent.

Border Gateway Protocol, Open Shortest Path First, and Multiprotocol Label Switching together describe a planet-scale system in which no participant holds a global view, and yet in which traffic flows reliably between any two endpoints. The protocols have been incrementally extended, deployed without flag days, and operated continuously for longer than most of the engineers working on contemporary inference frameworks have been writing code. They are not pretty. They are robust.

The contribution Sonata42 is working toward is the translation of this engineering tradition into the inference layer: a capability advertisement format, a route selection algorithm, a session establishment handshake, and a label-switched forwarding plane that lets specialist generative systems federate without any of them holding a global view of the network. The early sketches look more like RFC 4271 than like a model card. That is the right shape for the problem.
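To make the shape of these components concrete, here is a minimal sketch of three of them: an advertisement record, a path-vector style route selection rule, and a label-swapping forwarding walk. It is an illustration under stated assumptions, not Sonata42's design. The names `Advertisement`, `Node`, and `forward`, the capability string format, the cost field, and the preference rule (shortest path, then lowest cost) are all hypothetical. The session handshake and label distribution are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Advertisement:
    capability: str        # hypothetical capability name, e.g. "code.review"
    path: Tuple[str, ...]  # systems traversed, most recent first, origin last
    cost: float            # advertised cost; the unit is an assumption


def preference(adv: Advertisement) -> Tuple[int, float]:
    """Assumed tie-break order: fewer hops first, then lower cost."""
    return (len(adv.path), adv.cost)


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        # Best known route per capability (local view only).
        self.routes: Dict[str, Advertisement] = {}
        # Forwarding plane: incoming label -> (next hop, outgoing label).
        self.labels: Dict[int, Tuple[str, Optional[int]]] = {}

    def receive(self, adv: Advertisement) -> bool:
        """Run route selection; return True if adv became the best route."""
        if self.name in adv.path:
            return False  # loop prevention, analogous to BGP's AS_PATH check
        best = self.routes.get(adv.capability)
        if best is None or preference(adv) < preference(best):
            self.routes[adv.capability] = adv
            return True
        return False

    def readvertise(self, capability: str) -> Advertisement:
        """Pass the best route on to neighbours with this node prepended."""
        best = self.routes[capability]
        return Advertisement(capability, (self.name,) + best.path, best.cost)


def forward(nodes: Dict[str, Node], start: str, label: int) -> List[str]:
    """Follow label-swap entries hop by hop; each node reads only its own table."""
    hops = [start]
    current, current_label = start, label
    while True:
        next_hop, out_label = nodes[current].labels[current_label]
        hops.append(next_hop)
        if out_label is None:  # label popped: next_hop is the endpoint
            return hops
        current, current_label = next_hop, out_label


if __name__ == "__main__":
    a, b = Node("a"), Node("b")
    b.receive(Advertisement("code.review", ("c",), cost=1.0))
    a.receive(b.readvertise("code.review"))
    a.labels[17] = ("b", 42)    # labels installed by hand for the example
    b.labels[42] = ("c", None)
    print(a.routes["code.review"].path, forward({"a": a, "b": b}, "a", 17))
```

Each node decides using only the advertisements it has received and its own label table, which is the property the routing precedent is meant to carry over.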


// References
  1. Kaplan, J. et al. Scaling laws for neural language models. arXiv:2001.08361, 2020.
  2. Hoffmann, J. et al. Training compute-optimal large language models. arXiv:2203.15556, 2022.
  3. Rekhter, Y., Li, T., Hares, S. A Border Gateway Protocol 4 (BGP-4). RFC 4271, IETF, 2006.
  4. Moy, J. OSPF Version 2. RFC 2328, IETF, 1998.
  5. Rosen, E., Viswanathan, A., Callon, R. Multiprotocol Label Switching Architecture. RFC 3031, IETF, 2001.
  6. Anthropic. Model Context Protocol specification. 2024.
  7. Shazeer, N. et al. Outrageously large neural networks: the sparsely-gated mixture-of-experts layer. arXiv:1701.06538, 2017.
  8. Lewis, P. et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. arXiv:2005.11401, 2020.