
The Discovery Oracle: Beyond the Broadcast Cliff

Revant Nandgaonkar
Maintainer of Framework M

In our last post on Transparent UI Macroservices, we talked about the "what": breaking a monolithic frontend into independent, deployable units without breaking the user experience.

But as soon as you move from two services to ten, you hit a new wall: The Discovery Cliff.

How does the shell know where the finance service lives today? How does it find the inventory remote entry on a developer's machine versus a production Kubernetes cluster? And how do we do this without a 400-line configuration file that needs to be updated every time a port changes?

Today, we’re introducing the Discovery Oracle.

The Problem with Shouting

Many federated architectures rely on broadcasts. When a service starts up, it shouts its existence to everyone else on the network. This works for a handful of services, but it creates three major problems at scale:

  1. Network Noise: Every heartbeat and manifest update consumes bandwidth across the entire cluster.
  2. State Bloat: Every service must maintain a full, global map of every resource in the system, even if it only ever communicates with one other service.
  3. The "Stale State" Trap: If a service crashes without saying goodbye, other services keep trying to hit a ghost endpoint until a timeout eventually clears the cache.

The Discovery Oracle: Just-In-Time Federation

Instead of everyone shouting at everyone, we’ve moved to a lazy, query-based mechanism we call The Oracle.

The Gateway (usually hosted by the core service) acts as the central authority. When the frontend Shell needs to render a resource it doesn't recognize—say, sales.Invoice—it doesn't look at a stale local map. It asks the Oracle: "Who owns sales.Invoice, and where do they live right now?"

The JIT Lifecycle

  1. Request: A user navigates to an Invoice. The Shell sees sales.Invoice and checks its local PluginRegistry.
  2. Query: If the owner is unknown, the DiscoveryClient sends a batch query to the Oracle (POST /api/gateway/discovery/query).
  3. Resolution: The Oracle looks up the current manifest in its NATS JetStream KV store and returns the service identity and its remote entry URL.
  4. Hydration: The Shell dynamically imports the remote, registers the owner, and the UI renders—all in a few hundred milliseconds.
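
The four steps above can be sketched roughly as follows. This is a minimal illustration, not Framework M's actual client: the post only names PluginRegistry, DiscoveryClient, and the POST /api/gateway/discovery/query endpoint, so the `OwnerRecord` shape, the `OracleQuery` transport type, and the function names are all assumptions. The transport is injected so the same logic can run against a real HTTP call or a stub.

```typescript
// Hypothetical record shape; the post does not specify the Oracle's wire format.
interface OwnerRecord {
  resource: string;       // e.g. "sales.Invoice"
  service: string;        // identity of the owning service
  remoteEntryUrl: string; // where the Shell can load the remote from
}

// Injected transport. In the Shell this would wrap
// POST /api/gateway/discovery/query; the request/response body is assumed.
type OracleQuery = (resources: string[]) => Promise<OwnerRecord[]>;

// Simplified stand-in for the Shell's local PluginRegistry: resource -> owner.
type PluginRegistry = Map<string, OwnerRecord>;

// Step 1 (Request): find the names the local registry cannot answer.
function unknownResources(registry: PluginRegistry, resources: string[]): string[] {
  return resources.filter((name) => !registry.has(name));
}

// Steps 2-4: one batch query for all unknown names, then register the owners.
async function resolveOwners(
  registry: PluginRegistry,
  query: OracleQuery,
  resources: string[],
): Promise<OwnerRecord[]> {
  const missing = unknownResources(registry, resources);
  if (missing.length > 0) {
    const answers = await query(missing); // Steps 2 and 3: Query and Resolution
    for (const record of answers) {
      registry.set(record.resource, record); // Step 4: register the owner
      // A real Shell would also dynamically import record.remoteEntryUrl here.
    }
  }
  // Return whatever is now known, in the order it was asked for.
  const resolved: OwnerRecord[] = [];
  for (const name of resources) {
    const record = registry.get(name);
    if (record !== undefined) resolved.push(record);
  }
  return resolved;
}
```

On a second navigation to the same resource, the registry already holds the owner, so no query is sent at all.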

Local-to-Prod Parity: Powered by NATS

One of the biggest "cliffs" in microservice development is that local discovery usually looks nothing like production. You end up with complex docker-compose files or manual .env hacks just to get two services to talk to each other.

By using NATS JetStream KV as our discovery backbone, we’ve eliminated this gap.

Whether you are running on your laptop or in a global cluster:

  • Services publish their manifests to a NATS KV bucket on startup.
  • The Oracle (and any service with a watcher) gets a real-time event when a manifest changes.
  • The same code paths, the same JIT queries, and the same resolution logic are used everywhere.
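
To make the publish-and-watch flow concrete, here is a small in-memory sketch. It deliberately does not use the NATS client: `ManifestBucket`, `Manifest`, and `OracleIndex` are hypothetical names that only mimic the put, delete, and watch behaviour described above, and the manifest fields, the `inventory.StockItem` resource, and the remote entry path are assumptions made for illustration.

```typescript
// Hypothetical manifest shape; the real manifest format is not given in the post.
interface Manifest {
  service: string;
  remoteEntryUrl: string;
  resources: string[];
}

type ManifestEvent =
  | { op: "put"; key: string; manifest: Manifest }
  | { op: "delete"; key: string };

type Watcher = (event: ManifestEvent) => void;

// In-memory stand-in for a KV bucket with watchers (NOT the NATS API).
class ManifestBucket {
  private entries = new Map<string, Manifest>();
  private watchers: Watcher[] = [];

  put(key: string, manifest: Manifest): void {
    this.entries.set(key, manifest);
    this.notify({ op: "put", key, manifest });
  }

  delete(key: string): void {
    if (this.entries.delete(key)) this.notify({ op: "delete", key });
  }

  watch(watcher: Watcher): void {
    // Replay current entries so a late watcher catches up, then subscribe.
    this.entries.forEach((manifest, key) => watcher({ op: "put", key, manifest }));
    this.watchers.push(watcher);
  }

  private notify(event: ManifestEvent): void {
    for (const watcher of this.watchers) watcher(event);
  }
}

// The Oracle's view: manifests by key, kept current purely by watch events.
class OracleIndex {
  private manifests = new Map<string, Manifest>();

  apply(event: ManifestEvent): void {
    if (event.op === "put") this.manifests.set(event.key, event.manifest);
    else this.manifests.delete(event.key);
  }

  lookup(resource: string): Manifest | undefined {
    const owners: Manifest[] = [];
    this.manifests.forEach((manifest) => {
      if (manifest.resources.includes(resource)) owners.push(manifest);
    });
    return owners[0];
  }
}

// Wiring: the Oracle watches the bucket; a service publishes on startup.
const bucket = new ManifestBucket();
const oracle = new OracleIndex();
bucket.watch((event) => oracle.apply(event));
bucket.put("inventory", {
  service: "inventory",
  remoteEntryUrl: "http://localhost:8001/remoteEntry.js", // assumed path
  resources: ["inventory.StockItem"],                     // assumed resource
});
```

Because the Oracle's index is driven only by events, removing the service's key is enough for later lookups to stop returning it. The same wiring would apply whether the bucket lives on a laptop or in a cluster.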

For the developer, "Discovery" just works. If you start a service on port 8001, it appears in the Shell. If you stop it, the Shell knows it's gone.

Scaling the Developer Experience

We also realized that fully-qualified names (like accounting.v1.general_ledger.JournalEntry) are a pain to type during development. To solve this, the Discovery Oracle supports Fuzzy Suffix Matching.

You can ask for JournalEntry, and the Oracle resolves it to the one service in the cluster that provides that resource. This keeps your code clean and your developer experience fast, while maintaining strict namespacing under the hood.
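
One way such suffix matching could work is sketched below. This is an assumption about the approach, not the Oracle's actual algorithm: the `matchBySuffix` function, the `SuffixMatch` result type, and the choice to report ambiguity rather than pick a winner are all hypothetical, as is the second `Invoice` owner in the sample data.

```typescript
type SuffixMatch =
  | { kind: "unique"; resource: string }
  | { kind: "ambiguous"; candidates: string[] }
  | { kind: "none" };

// Resolve a short name against fully-qualified names on dot-segment boundaries.
function matchBySuffix(query: string, knownResources: string[]): SuffixMatch {
  // An exact fully-qualified name always wins.
  if (knownResources.includes(query)) return { kind: "unique", resource: query };

  // Require a "." before the suffix so "Entry" does not match "JournalEntry".
  const candidates = knownResources.filter((name) => name.endsWith("." + query));

  const [first, ...rest] = candidates;
  if (first === undefined) return { kind: "none" };
  if (rest.length === 0) return { kind: "unique", resource: first };
  return { kind: "ambiguous", candidates };
}

const known = [
  "accounting.v1.general_ledger.JournalEntry",
  "sales.Invoice",
  "purchasing.Invoice", // hypothetical second owner, to show ambiguity
];

const journal = matchBySuffix("JournalEntry", known); // one candidate
const invoice = matchBySuffix("Invoice", known);      // two candidates
```

Matching on whole dot-separated segments keeps the short form convenient while still respecting namespaces. A name that more than one service provides has to be spelled out further, for example as sales.Invoice.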

Why Macroservices? Independence over Throughput

A common misconception is that we decompose systems strictly for throughput or horizontal scaling. While those are welcome benefits, they aren't our primary driver.

We use Macroservices to achieve Team Independence. By carving the monolith along business domain boundaries (Finance, WMS, Inventory), we allow teams to own their entire stack—from the database schema to the UI components—and deploy them on their own schedule.

For high-throughput needs, one could certainly build traditional "micro" services (like a dedicated book-keeper or stock-keeper), but for the vast majority of business logic, the Macroservice provides the best balance of autonomy and low operational complexity.

The Zero-Cliff Promise

The Discovery Oracle is the "invisible" glue that makes Macroservices viable. By moving from noisy broadcasts to JIT queries, we’ve ensured that the Framework M architecture can scale from a single monolith to a massive federated ecosystem without the operational "tax" usually associated with distributed systems.

The Shell remains Zero-Conf. It doesn't need to be told where the world is; it just knows how to ask the Oracle.


This is part of our series on Scaling Framework M. Join the conversation on the Framework M community forum.