Local AI inference,
across your entire fleet.

vimin turns a network of machines into a coordinated AI inference cluster. No cloud dependency. Your data stays on your network.

$ pip install "vimin-core[mlx]"
$ vimin-core start-center
# on each inference node:
$ vimin-core start-agent --center http://<center-ip>:8080

How it works

One center node. Many inference nodes. Your data never leaves.

01

Start a center node

Run vimin-core start-center on any machine in your network. It acts as the routing hub — no inference happens here.

02

Connect inference nodes

Run vimin-core start-agent on each machine. Agents load models locally and register with the center — no shared storage needed.

03

Send tasks

POST to /api/broadcast. The center routes work to agents, collects results, and returns them — all on your network.

# 1. Start the center node
pip install "vimin-core[mlx]"
vimin-core start-center

# 2. Connect an agent (on any machine)
VIMIN_CENTER_URL=http://192.168.1.10:8080 vimin-core start-agent

# 3. Broadcast a task
curl -X POST http://192.168.1.10:8080/api/broadcast \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Summarize Q3 results.", "model_id": "meta-llama/Llama-3.2-3B-Instruct"}'
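The same broadcast can be scripted instead of typed; here is a minimal Python sketch using only the standard library, assuming the endpoint and auth header from the quickstart above. The helper names are ours, not part of vimin-core, and the shape of the center's JSON response is not documented on this page.

```python
import json
import urllib.request

CENTER_URL = "http://192.168.1.10:8080"  # your center node, from the quickstart


def build_broadcast(prompt: str, model_id: str) -> bytes:
    """Encode the JSON body for POST /api/broadcast."""
    return json.dumps({"prompt": prompt, "model_id": model_id}).encode("utf-8")


def broadcast(prompt: str, model_id: str, api_key: str) -> dict:
    """Send one task to every registered agent via the center node and
    return the center's JSON response."""
    req = urllib.request.Request(
        f"{CENTER_URL}/api/broadcast",
        data=build_broadcast(prompt, model_id),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Usage (requires a running center node):
# results = broadcast("Summarize Q3 results.",
#                     "meta-llama/Llama-3.2-3B-Instruct", api_key="...")
```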

Built for serious deployments

vimin's enterprise features are designed for teams that need more than broadcast dispatch.

Hierarchical node orchestration

Assign department-level administrative nodes that coordinate their own cluster of inference machines. Any graph topology works.
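The orchestration API itself isn't shown on this page, but the topology is just a graph of nodes; a short Python sketch of a department-level layout and the cluster each administrative node coordinates (all node names and the adjacency format are illustrative assumptions, not vimin's actual registration format):

```python
# Hypothetical topology: each key coordinates the nodes listed under it.
# Node names and this adjacency-list format are illustrative only.
topology = {
    "center": ["finance-admin", "research-admin"],
    "finance-admin": ["fin-node-1", "fin-node-2"],   # finance's own cluster
    "research-admin": ["res-node-1"],
}


def reachable(topology: dict, root: str) -> set:
    """All nodes a given administrative node can route work to,
    found by walking the graph depth-first."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        for child in topology.get(node, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen
```

A department admin only sees its own machines, while the center reaches everything, which is the point of the hierarchy.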

Data sovereignty controls

Choose exactly what flows back to the center node. Sensitive inference results can stay on the edge device entirely.
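The policy mechanics aren't specified here, but the idea can be pictured as an agent-side filter that decides which result fields are allowed to leave the device. The `SovereigntyPolicy` type and the field names below are hypothetical, for illustration only:

```python
from dataclasses import dataclass, field


@dataclass
class SovereigntyPolicy:
    """Hypothetical agent-side policy: only these result fields may be
    forwarded to the center node."""
    allowed_fields: set = field(default_factory=lambda: {"status", "latency_ms"})


def filter_result(result: dict, policy: SovereigntyPolicy) -> dict:
    """Drop everything the policy doesn't explicitly allow before the
    result is sent upstream; the raw inference output stays local."""
    return {k: v for k, v in result.items() if k in policy.allowed_fields}


# The completion text never leaves the edge device:
raw = {"status": "ok", "latency_ms": 412, "completion": "…sensitive text…"}
print(filter_result(raw, SovereigntyPolicy()))  # {'status': 'ok', 'latency_ms': 412}
```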

Plugin integrations

Connect vimin to your existing stack. First-party integrations include LiveKit for real-time audio and video inference pipelines. Build your own with the plugin API.

Fleet pipelines & workflows

Chain inference tasks across nodes so the output of one model becomes the input to the next. Define multi-step pipelines with conditional branching and per-step hardware routing.
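The pipeline schema isn't documented on this page; one way to picture a multi-step pipeline with conditional branching and per-step hardware routing is as plain data plus a branch resolver. Every key name below ("route", "branches", "goto", and so on) is a hypothetical illustration, not vimin's actual format:

```python
# Hypothetical pipeline spec — step ids, "route" hints, and "branches"
# are illustrative only; vimin's real schema may differ.
pipeline = {
    "name": "summarize-and-check",
    "steps": [
        {"id": "summarize",
         "model_id": "meta-llama/Llama-3.2-3B-Instruct",
         "route": {"backend": "mlx"},          # per-step hardware routing
         "next": "check"},
        {"id": "check",
         "model_id": "meta-llama/Llama-3.2-3B-Instruct",
         "route": {"backend": "llama-cpp"},
         "branches": [                         # conditional branching
             {"when_field": "verdict", "equals": "too_long",
              "goto": "summarize"}],
         "next": None},
    ],
}


def next_step(pipeline: dict, step_id: str, output: dict):
    """Given one step's output, resolve which step runs next: the first
    matching branch wins, otherwise fall through to the default 'next'."""
    for step in pipeline["steps"]:
        if step["id"] != step_id:
            continue
        for branch in step.get("branches", []):
            if output.get(branch["when_field"]) == branch["equals"]:
                return branch["goto"]
        return step.get("next")
    return None
```

Under this sketch, a "too_long" verdict loops the task back to the summarizer on the MLX node, while any other verdict ends the pipeline.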

OpenClaw agent coordination

vimin-core lets OpenClaw nodes receive broadcast tasks. The full vimin distribution adds agent-to-agent coordination: nodes can delegate work and pass context between inference calls without routing through the center node.

Custom models & hardware adaptation

Run fine-tuned or proprietary models alongside standard registry models. Hardware backends not covered by vimin-core can be added with dedicated integration work.

Priority support & SLA

Direct access to the engineering team. Response time SLAs with guaranteed escalation for production incidents.

vimin-core vs. vimin

Start free and self-hosted. Upgrade when you need more.

Feature | vimin-core (Free · Open source) | vimin (Enterprise)
Max nodes | 10 | Unlimited
Broadcast dispatch | ✓ | ✓
SSO & Role-based Access Control (RBAC) | — | ✓
Audit Logging & Compliance Reporting (SOC2) | — | ✓
High Availability (HA) Center Nodes | — | ✓
Enterprise Telemetry & Observability Export | — | ✓
Air-gapped deployment support | — | ✓
Per-node task targeting | — | ✓
Hierarchical & graph node topology | — | ✓
Fleet pipelines & workflows | — | ✓
Data sovereignty controls | — | ✓
OpenClaw node support | broadcast only | ✓ + agent coordination
LiveKit & plugin integrations | — | ✓
MLX · llama-cpp · ONNX backends | ✓ | ✓
Custom models & hardware backends | — | ✓
Priority support & SLA | — | ✓

Ready to scale beyond 10 nodes?

Book a 15-minute call to talk through your deployment.

Or email us at