vimin turns a network of machines into a coordinated AI inference cluster. No cloud dependency. Your data stays on your network.
Architecture
One center node. Many inference nodes. Your data never leaves.
Run vimin-core start-center on any machine in your network. It acts as the routing hub — no inference happens here.
Run vimin-core start-agent on each machine. Agents load models locally and register with the center — no shared storage needed.
POST to /api/broadcast. The center routes work to agents, collects results, and returns them — all on your network.
# 1. Start the center node
pip install vimin-core[mlx]
vimin-core start-center

# 2. Connect an agent (on any machine)
VIMIN_CENTER_URL=http://192.168.1.10:8080 vimin-core start-agent

# 3. Broadcast a task
curl -X POST http://192.168.1.10:8080/api/broadcast \
  -H "Authorization: Bearer $API_KEY" \
  -d '{"prompt": "Summarize Q3 results.", "model_id": "meta-llama/Llama-3.2-3B-Instruct"}'
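The same broadcast can be issued from Python. This is a minimal stdlib-only sketch of the curl call above; only the endpoint, the `Authorization` header, and the `prompt`/`model_id` fields come from the quickstart, and the response shape is an assumption.

```python
import json
import urllib.request

def build_broadcast_payload(prompt, model_id):
    """JSON body for POST /api/broadcast, matching the quickstart's fields."""
    return {"prompt": prompt, "model_id": model_id}

def broadcast(center_url, api_key, prompt, model_id):
    """Send a broadcast task to the center node and return its parsed JSON reply."""
    body = json.dumps(build_broadcast_payload(prompt, model_id)).encode()
    req = urllib.request.Request(
        f"{center_url}/api/broadcast",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```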
Enterprise
vimin's enterprise features are designed for teams that need more than broadcast inference.
Assign department-level administrative nodes that coordinate their own cluster of inference machines. Any graph topology works.
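One way to picture a department-level topology is as an adjacency map from each coordinating node to the nodes it administers. The node names below are invented for illustration; the walk just shows that any node reachable from the center can receive routed work.

```python
from collections import deque

# Hypothetical topology: center -> department admins -> inference machines.
# Node names are illustrative only.
topology = {
    "center": ["finance-admin", "research-admin"],
    "finance-admin": ["fin-gpu-01", "fin-gpu-02"],
    "research-admin": ["res-mac-01", "res-mac-02", "res-gpu-01"],
}

def reachable_from(root, adjacency):
    """Breadth-first walk: every node the root can route work to."""
    seen, queue = {root}, deque([root])
    while queue:
        node = queue.popleft()
        for child in adjacency.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```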
Choose exactly what flows back to the center node. Sensitive inference results can stay on the edge device entirely.
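Conceptually, edge-side filtering is an allowlist applied to a result before anything leaves the device. The field names in this sketch are assumptions, not part of vimin's API.

```python
# Hypothetical edge-side filter: only allowlisted fields are forwarded
# to the center node; everything else stays on the device.
def filter_result(result, allowed_fields):
    """Keep only the fields cleared to leave the edge device."""
    return {k: v for k, v in result.items() if k in allowed_fields}
```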
Connect vimin to your existing stack. First-party integrations include LiveKit for real-time audio and video inference pipelines. Build your own with the plugin API.
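As a rough sketch of what a custom integration might look like: a plugin that observes tasks and results as they pass through a node. The hook names (`on_task`, `on_result`) and payload shapes here are assumptions for illustration, not vimin's documented plugin API.

```python
# Hypothetical plugin: records every task and result seen by a node.
# Hook names and payload fields are illustrative assumptions.
class LoggingPlugin:
    def __init__(self):
        self.events = []

    def on_task(self, task):
        self.events.append(("task", task["prompt"]))

    def on_result(self, result):
        self.events.append(("result", result["text"]))
```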
Chain inference tasks across nodes so the output of one model becomes the input to the next. Define multi-step pipelines with conditional branching and per-step hardware routing.
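A multi-step pipeline with per-step routing could be declared as data. The schema below (step `id`, `input`, `hardware`, `when`) is a hypothetical sketch, not a documented vimin format; it only shows how one step's output can be wired into the next with a condition attached.

```python
# Hypothetical pipeline definition: field names are illustrative assumptions.
pipeline = {
    "steps": [
        {"id": "draft",
         "model_id": "meta-llama/Llama-3.2-3B-Instruct",
         "hardware": "any"},
        {"id": "review",
         "model_id": "meta-llama/Llama-3.2-3B-Instruct",
         "input": "draft",                # output of "draft" feeds this step
         "when": "draft.length > 200",    # conditional branch
         "hardware": "gpu"},              # per-step hardware routing
    ]
}

def inputs_of(pipeline):
    """Map each step id to the step whose output it consumes (None = raw prompt)."""
    return {s["id"]: s.get("input") for s in pipeline["steps"]}
```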
vimin-core lets OpenClaw nodes receive broadcast tasks. The full vimin distribution adds agent-to-agent coordination: nodes can delegate work and pass context between inference calls without routing through the center node.
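Agent-to-agent hand-off amounts to one node folding its results into the context a peer receives, without a round trip through the center. The `delegate` helper and the message shape below are assumptions sketched for illustration.

```python
# Hypothetical hand-off message builder: earlier results become context
# for the next inference call on a peer node.
def delegate(prior_results, context, next_prompt):
    """Build the message a peer agent would receive for a follow-up call."""
    return {
        "prompt": next_prompt,
        "context": context + [r["text"] for r in prior_results],
    }
```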
Run fine-tuned or proprietary models alongside standard registry models. Hardware backends not covered by vimin-core can be added with dedicated integration work.
Direct access to the engineering team. Response time SLAs with guaranteed escalation for production incidents.
Open-source vs. Enterprise
Start free and self-hosted. Upgrade when you need more.
Book a 15-minute call to talk through your deployment.