Local AI inference,
across your entire fleet.

Run open-source AI across all your machines. No cloud service. Your data stays on your own hardware and network.

$ pip install "vimin-core[mlx] @ git+https://github.com/pberlizov/vimin-public.git"
$ vimin-core start-center && vimin-core start-agent

Your devices, working together

Most AI tools run on one machine at a time. vimin-core changes that.

Most homes and offices have more computing power sitting idle than they realise. vimin-core treats all of it as a single fleet: send a task to every machine at once, each one runs it locally with its own data, and the results come back in seconds. You decide what gets sent back to the center and what stays on the device.

In the enterprise version, agents talk to each other rather than just receiving tasks. One machine can break a problem into pieces and hand them off to others. You can route tasks to specific devices, set rules about what data moves where, and plug the whole thing into your existing tools and workflows.

How it works

One machine routes the work. The rest run the models.

01

Start a center node

Run vimin-core start-center on any machine in your network. It routes tasks. No inference happens here.

02

Connect inference nodes

Run vimin-core start-agent on each machine. Each one loads its own models and registers with the center. No shared storage needed.

03

Send tasks

POST to /api/broadcast. The center routes work to agents, collects results, and returns them. All on your network.

# 1. Install (pick your hardware)
pip install "vimin-core[mlx] @ git+https://github.com/pberlizov/vimin-public.git"       # Apple Silicon
pip install "vimin-core[llamacpp] @ git+https://github.com/pberlizov/vimin-public.git"  # Linux / Windows / CUDA / CPU

# 2. Start the center node
vimin-core start-center                         # localhost only (single machine)
vimin-core start-center --host 0.0.0.0          # accept agents from other machines

# 3. Connect agents (run on each inference machine)
vimin-core start-agent                          # same machine as center
vimin-core start-agent --center http://<center-ip>:8080  # remote machine

# 4. Broadcast a task
vimin-core broadcast "Summarize Q3 results." --mode return
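The broadcast CLI wraps the center's /api/broadcast endpoint mentioned in step 03. A minimal sketch of calling it directly from Python, assuming a JSON body — the field names `prompt` and `mode` are illustrative guesses, not a documented API:

```python
import json
import urllib.request

def build_broadcast_request(prompt, mode="return", center="http://localhost:8080"):
    """Build the POST for the center's /api/broadcast endpoint.

    The endpoint path and --mode value come from the docs above; the
    JSON field names ("prompt", "mode") are assumptions for illustration.
    """
    body = json.dumps({"prompt": prompt, "mode": mode}).encode()
    return urllib.request.Request(
        f"{center}/api/broadcast",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With a center running on this machine:
#   resp = urllib.request.urlopen(build_broadcast_request("Summarize Q3 results."))
#   results = json.load(resp)  # one entry per responding agent (shape assumed)
```

Because the center collects and returns agent results itself, a single blocking request is enough; no callback or polling machinery is needed on the client side.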

Built for serious deployments

Nodes that talk to each other and run multi-step jobs without a human in the loop.

Agent-to-agent coordination

Nodes can break a task into pieces and pass them to other nodes on the network. Each step runs locally. OpenClaw nodes join the same fleet alongside standard agents.

Fleet pipelines & workflows

Chain inference tasks across nodes so the output of one model becomes the input to the next. Define multi-step pipelines with conditional branching and per-step hardware routing.
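As a sketch of the idea only (the step format and `dispatch` hook below are illustrative, not the enterprise API): each step names a model and a hardware route, carries an optional predicate for conditional branching, and receives the previous step's output as its input.

```python
def run_pipeline(steps, first_input, dispatch):
    """Chain inference steps: each step's output feeds the next step.

    `steps` is a list of dicts with a model name, a hardware route, and
    an optional `when` predicate for conditional branching. `dispatch`
    stands in for the fleet's real task routing: it sends one task to a
    node matching the route and returns that node's result.
    """
    data = first_input
    for step in steps:
        if "when" in step and not step["when"](data):
            continue  # conditional branch: skip steps whose predicate fails
        data = dispatch(step["model"], step["route"], data)
    return data

# Toy local "models" so the data flow is visible without a fleet.
models = {
    "summarize": lambda text: text[:20],
    "translate": lambda text: text.upper(),
}

result = run_pipeline(
    [
        {"model": "summarize", "route": "gpu"},
        {"model": "translate", "route": "cpu", "when": lambda d: len(d) > 5},
    ],
    "Quarterly revenue grew 12% on strong demand.",
    dispatch=lambda model, route, data: models[model](data),
)
```

Swapping the toy `dispatch` for one that POSTs to a node is the only part that touches the network; the chaining and branching logic is the same either way.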

Hierarchical node orchestration

Assign department-level administrative nodes that coordinate their own cluster of inference machines. Any graph topology works.
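A toy sketch of that topology (all node names hypothetical): admin nodes are just entries in a graph, and the top-level center reaches inference machines through them.

```python
# Department-level admins coordinate their own clusters; because the
# fleet is just a graph, any depth or shape of hierarchy works.
fleet = {
    "center": ["finance-admin", "research-admin"],
    "finance-admin": ["fin-node-1", "fin-node-2"],
    "research-admin": ["lab-mac-1", "lab-gpu-1"],
}

def reachable(root, graph):
    """All nodes a center can reach through its admin layers."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen
```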

Data sovereignty controls

Choose exactly what flows back to the center node. Sensitive inference results can stay on the edge device entirely.
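A sketch of what the agent-side policy might look like (the mode name "edge" and the reply shape are assumptions; only `--mode return` appears in the quickstart above):

```python
def agent_reply(result, mode):
    """Decide what leaves the device under a data-sovereignty policy.

    "return" sends the full result back to the center; "edge" (name
    assumed for illustration) keeps the payload on the device and sends
    back only an acknowledgement with no content.
    """
    if mode == "return":
        return {"status": "ok", "result": result}
    # Sensitive output stays local; the center learns only that the task ran.
    return {"status": "ok", "result": None, "stored_on_device": True}
```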

Plugin integrations

Connect vimin to your existing stack. First-party integrations include LiveKit for real-time audio and video inference pipelines. Build your own with the plugin API.

Custom models & hardware adaptation

Run fine-tuned or proprietary models alongside standard registry models. Hardware backends not covered by vimin-core can be added with dedicated integration work.

Priority support & SLA

Direct access to the engineering team. Response time SLAs with guaranteed escalation for production incidents.

vimin-core vs. vimin

Start free and self-hosted. Upgrade when you need more.

Feature                                        vimin-core                     vimin
                                               Free · Source available        Enterprise

Max nodes                                      10                             Unlimited
Broadcast dispatch                             ✓                              ✓
OpenClaw node support                          broadcast only                 ✓ + agent coordination
SSO & Role-based Access Control (RBAC)         ✗                              ✓
Audit Logging & Compliance Reporting (SOC2)    basic local audit log          ✓
High Availability (HA) Center Nodes            ✗                              ✓
Enterprise Telemetry & Observability Export    ✗                              ✓
Air-gapped deployment support                  self-hosted                    ✓
Per-node task targeting                        ✗                              ✓
Manual approval for new agents                 ✗                              ✓
Hierarchical & graph node topology             ✗                              ✓
Fleet pipelines & workflows                    basic center-driven pipelines  ✓
Data sovereignty controls                      edge-only result mode          ✓
LiveKit & plugin integrations                  ✗                              ✓
MLX · llama-cpp · ONNX backends                ✓                              ✓
Custom models & hardware backends              ✗                              ✓
Priority support & SLA                         ✗                              ✓

Ready to scale beyond 10 nodes?

Book a 15-minute call to talk through your deployment.

Or email us at