
feat: add distributed mode #9124

Draft
mudler wants to merge 8 commits into master from feat/distributed-mode
Conversation

@mudler mudler commented Mar 23, 2026

Description

The objective of this PR is to make LocalAI scalable horizontally, and delegate processing to remote gRPC LocalAI workers.

Distributed mode enables horizontal scaling of LocalAI across multiple machines, using PostgreSQL for state and the node registry, and NATS for real-time coordination. Unlike P2P mode, distributed mode is designed for production deployments and Kubernetes environments where you need centralized management, health monitoring, and deterministic routing. To enable it, pass --distributed to LocalAI. A Docker Compose file is also provided to quickly start the full stack with a single command.

Note: unlike other ways to run LocalAI, distributed mode requires authentication to be enabled and a PostgreSQL database; SQLite is not supported. This is because the node registry, job store, and other distributed state are stored in PostgreSQL tables.

Architecture:

                    ┌──────────────────┐
                    │   Load Balancer  │
                    └─────────┬────────┘
                              │
              ┌───────────────┼───────────────┐
              │               │               │
       ┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
       │ Frontend #1 │ │ Frontend #2 │ │ Frontend #N │
       │  (LocalAI)  │ │  (LocalAI)  │ │  (LocalAI)  │
       └──────┬──────┘ └──────┬──────┘ └──────┬──────┘
              │               │               │
       ┌──────▼───────────────▼───────────────▼──────┐
       │              PostgreSQL + NATS              │
       │     (node registry, jobs, coordination)     │
       └──────┬───────────────┬───────────────┬──────┘
              │               │               │
       ┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
       │  Worker #1  │ │  Worker #2  │ │  Worker #N  │
       │  (generic)  │ │  (generic)  │ │   (agent)   │
       └─────────────┘ └─────────────┘ └─────────────┘

Frontends are stateless LocalAI instances that receive API requests and route them to worker nodes via the SmartRouter. All frontends share state through PostgreSQL and coordinate via NATS.

Workers are generic processes that self-register with a frontend. They don't have a fixed backend type — the SmartRouter dynamically installs the required backend via NATS backend.install events when a model request arrives.

Scheduling Algorithm

The SmartRouter uses idle-first scheduling:

  1. If the model is already loaded on a node → use it (least in-flight)
  2. If no node has the model → prefer truly idle nodes (zero models loaded, zero in-flight requests), trying to fit the model within each node's reported free VRAM/RAM
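The two steps above can be sketched in Go; the `Node` fields and the `pickNode` helper are illustrative stand-ins for whatever state the SmartRouter actually tracks, not the PR's real types:

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a hypothetical view of per-worker state.
type Node struct {
	ID       string
	Models   []string // models currently loaded
	InFlight int      // requests currently being served
	FreeVRAM uint64   // bytes reported free by the worker
}

// pickNode implements idle-first scheduling:
//  1. among nodes that already have the model loaded, pick least in-flight;
//  2. otherwise prefer truly idle nodes (no models, no in-flight work)
//     whose reported free VRAM can fit the model.
func pickNode(nodes []Node, model string, modelVRAM uint64) *Node {
	var loaded []*Node
	for i := range nodes {
		for _, m := range nodes[i].Models {
			if m == model {
				loaded = append(loaded, &nodes[i])
			}
		}
	}
	if len(loaded) > 0 {
		sort.Slice(loaded, func(a, b int) bool { return loaded[a].InFlight < loaded[b].InFlight })
		return loaded[0]
	}
	var candidate *Node
	for i := range nodes {
		n := &nodes[i]
		if len(n.Models) == 0 && n.InFlight == 0 && n.FreeVRAM >= modelVRAM {
			// prefer the idle node with the most headroom
			if candidate == nil || n.FreeVRAM > candidate.FreeVRAM {
				candidate = n
			}
		}
	}
	return candidate // nil if nothing fits
}

func main() {
	nodes := []Node{
		{ID: "w1", Models: []string{"llama"}, InFlight: 2, FreeVRAM: 8 << 30},
		{ID: "w2", Models: []string{"llama"}, InFlight: 0, FreeVRAM: 8 << 30},
		{ID: "w3", FreeVRAM: 24 << 30},
	}
	fmt.Println(pickNode(nodes, "llama", 4<<30).ID) // w2: loaded, least in-flight
	fmt.Println(pickNode(nodes, "qwen", 4<<30).ID)  // w3: truly idle, fits in VRAM
}
```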

Nodes page:

[Screenshot, 2026-03-24: LocalAI Nodes page]

Notes for Reviewers

TODO:

  • Make sure we also sync to nodes any files referenced inside model options (this is a bit more challenging), as well as mmproj files
  • Re-use the VRAM detection logic to route models more efficiently to nodes that have free VRAM, not only based on capacity
  • Add hints in the UI on how to start workers
  • Backend management in distributed mode (should be able to install/delete backends as well)
  • Model management (if a model is deleted from the frontend, it should be removed from the nodes too)
  • Dynamic auth tokens for nodes (currently the user has to specify a registration token manually, and it has to be the same in the frontend and the workers) -> went with approval/auto-approval mode
  • Distributed inferencing
  • Distributed quantization
  • Distributed fine-tuning
  • Agents in distributed mode
  • Skills in distributed mode
  • MCP jobs in distributed mode
  • Memory in distributed mode

Follow-ups:

  • The S3 implementation is provided as-is and I have not tested it (as it requires another layer of machinery to be introduced here); however, the wiring is already in place, and it consists of merely two files implementing the relevant interfaces, so the impact is really minimal. I'll make sure to mark it as experimental and gather feedback once merged into master.

Signed commits

  • Yes, I signed my commits.


netlify bot commented Mar 23, 2026

Deploy Preview for localai failed.

🔨 Latest commit: 318814a
🔍 Latest deploy log: https://app.netlify.com/projects/localai/deploys/69c5d73ce8a1a1000856d101

@mudler mudler force-pushed the feat/distributed-mode branch 3 times, most recently from 23f3831 to 5aa34de on March 24, 2026 22:53
@mudler mudler changed the title from "feat: add distributed mode (experimental)" to "feat: add distributed mode" on Mar 24, 2026
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler force-pushed the feat/distributed-mode branch from 5aa34de to f3db5fd on March 25, 2026 13:50
mudler added 3 commits March 25, 2026 22:53
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler force-pushed the feat/distributed-mode branch 3 times, most recently from 274cbed to 3bc78b0 on March 26, 2026 08:51
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler force-pushed the feat/distributed-mode branch from 3bc78b0 to 715aace on March 26, 2026 09:44
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler force-pushed the feat/distributed-mode branch 6 times, most recently from 935cf29 to e35dbea on March 27, 2026 00:56
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler force-pushed the feat/distributed-mode branch from e35dbea to b5dafe8 on March 27, 2026 01:00
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>