How the network runs

Fan out tasks, route to specialists, merge results.

  • Requests split into subtasks and routed to role‑specific models
  • Models are tuned for concurrent execution and reasoning
  • Distributed MoE enables larger models across many devices
  • Edge‑to‑edge encrypted: nodes can’t read prompts or outputs

The workflow (in plain terms)

1) Plan: Turn a request into a set of subtasks.
2) Fan out: Send subtasks to many nodes.
3) Execute: Idle devices run small models.
4) Verify: Sampling and probes reduce cheating.
5) Merge: Combine results into one output.

Key point: the “speed” comes from doing many subtasks simultaneously, not from making a single model generate tokens faster.
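The five steps above can be sketched in a few lines. Everything here (the function names, the subtask format, the trivial verify check) is illustrative, not Zacro's actual API; the point is only the shape of the loop: plan, fan out, execute concurrently, verify, merge.

```python
# Toy sketch of the plan → fan out → execute → verify → merge workflow.
from concurrent.futures import ThreadPoolExecutor

def plan(request: str) -> list[str]:
    # 1) Plan: split a request into independent subtasks.
    return [f"{request}::part{i}" for i in range(4)]

def run_subtask(subtask: str) -> str:
    # 3) Execute: in the real network this runs on an idle device.
    return subtask.upper()

def verify(result: str) -> bool:
    # 4) Verify: spot-check a result (here, a trivial sanity check).
    return result.isupper()

def merge(results: list[str]) -> str:
    # 5) Merge: combine partial results into one output.
    return " | ".join(results)

def run(request: str) -> str:
    subtasks = plan(request)
    # 2) Fan out: subtasks run concurrently, not one after another.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(run_subtask, subtasks))
    checked = [r for r in results if verify(r)]
    return merge(checked)

print(run("summarize report"))
```

Because `pool.map` dispatches all four subtasks at once, wall-clock time is set by the slowest subtask rather than the sum of all of them, which is where the concurrency speedup comes from.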

Why distributed MoE beats local

More intelligence per device, and shorter wall‑clock time per job.

Frontier‑scale intelligence

Local models are bounded by what your device can load. With distributed MoE, each node runs a specialist and the system activates only the experts needed for each step. This yields capability closer to frontier‑scale MoE models without requiring a single massive GPU.
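The phrase "activates only the experts needed" is the top-k gating idea at the heart of MoE. A minimal sketch, with made-up gate scores (a real gate is a learned layer, and the experts live on different devices):

```python
# Minimal sketch of MoE-style top-k gating: only the k highest-scoring
# experts are activated per step, so total system capacity can exceed
# what any single device can load.
import math

def softmax(scores: list[float]) -> list[float]:
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_experts(gate_scores: dict[str, float], k: int = 2) -> list[str]:
    # Keep only the k experts with the highest gate probability.
    probs = dict(zip(gate_scores, softmax(list(gate_scores.values()))))
    return sorted(probs, key=probs.get, reverse=True)[:k]

# Hypothetical gate scores for one step of a math-heavy request:
scores = {"math": 2.1, "code": 0.3, "writing": -0.5, "planning": 1.7}
print(top_k_experts(scores))  # → ['math', 'planning']
```

Only the selected experts do any work for this step; the rest of the network's capacity sits unused for it, which is why per-step cost stays close to a small model while total capability scales with the number of experts.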

Speed from concurrency

Instead of one model doing a long chain, Zacro fans out subtasks to many devices at once. You get summaries, checks, and drafts in parallel, then a merge step assembles the final answer.

Merge + verify

Multiple independent outputs are compared, reconciled, and scored. This reduces single‑model blind spots and catches errors before results are returned.
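The simplest instance of "compared, reconciled, and scored" is a majority vote across independent outputs. A production verifier would also score and probe individual nodes; this sketch only shows the reconciliation step:

```python
# Toy reconciliation: keep the answer most nodes agree on, and refuse
# to return anything when there is no clear majority.
from collections import Counter

def reconcile(outputs: list[str]) -> str:
    counts = Counter(outputs)
    answer, votes = counts.most_common(1)[0]
    # Require agreement from more than half the nodes before trusting it.
    if votes <= len(outputs) // 2:
        raise ValueError("no majority; escalate for re-check")
    return answer

print(reconcile(["42", "42", "41", "42"]))  # → 42
```

A single model's blind spot shows up as one dissenting output ("41" above) and is outvoted; a disagreement with no majority is escalated rather than returned.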

Bottom line: local models are fast to start but limited in intelligence. Distributed MoE expands capability while preserving speed through parallel execution.

Distributed MoE & routing

Experts across devices, routed automatically for the best outcome.

Expert nodes

Each contributor runs a specialist (math, code, writing, planning). A single job can activate multiple experts, creating a virtual MoE that exceeds the memory limits of any single device.

Concurrency-tuned

Models are fine-tuned for parallel task execution and multi-branch reasoning, not just single-thread chat.

Role specialists

Math reasoning, programming, analytical writing, task planning, summarization, translation, tool calling.

Automatic routing: the scheduler selects the best path and can chain specialists. This creates a distributed MoE that rivals frontier-scale models without requiring one massive GPU.
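Routing and chaining can be pictured as a registry that maps roles to specialists, with each expert's output feeding the next. The registry and handler names below are hypothetical stand-ins, not Zacro's scheduler API:

```python
# Sketch of role-based routing with chained specialists.
def math_expert(task: str) -> str:
    return f"solved({task})"

def code_expert(task: str) -> str:
    return f"compiled({task})"

def summarizer(task: str) -> str:
    return f"summary({task})"

REGISTRY = {"math": math_expert, "code": code_expert, "summarize": summarizer}

def route(role: str, task: str) -> str:
    # The scheduler picks the specialist registered for this role.
    return REGISTRY[role](task)

def chain(task: str, roles: list[str]) -> str:
    # Chain specialists: each expert's output becomes the next one's input.
    for role in roles:
        task = route(role, task)
    return task

print(chain("x**2 = 9", ["math", "summarize"]))
# → summary(solved(x**2 = 9))
```

In the real network each registry entry would resolve to a model on a remote device rather than a local function, but the control flow (select path, chain specialists) is the same.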

Privacy (edge-to-edge encrypted)

Requests are encrypted edge-to-edge so worker nodes only see sealed payloads and execution constraints, not the prompt or the output. Only the requester can decrypt results.

Node-blind by design: payloads stay sealed in transit and at rest, so nodes cannot inspect what comes in or goes out.
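To make "sealed payloads" concrete, here is a classroom one-time-pad illustration: the requester keeps the key, so a worker holding only the sealed bytes cannot read the prompt. A real deployment would use authenticated encryption (an AEAD scheme), not this toy construction:

```python
# Toy illustration of node-blind sealing: XOR with a random key the
# requester never shares. NOT a production cipher; it only shows why
# a node that sees `sealed` learns nothing about the plaintext.
import os

def seal(data: bytes, key: bytes) -> bytes:
    assert len(key) == len(data)
    return bytes(d ^ k for d, k in zip(data, key))

prompt = b"confidential prompt"
key = os.urandom(len(prompt))   # stays on the requester's device
sealed = seal(prompt, key)      # this is all a worker node ever sees

# XOR is its own inverse: only the key holder can recover the text.
assert seal(sealed, key) == prompt
```

The same sealing applies on the way back: the node returns an opaque result blob, and only the requester's key opens it.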