A multi-model orchestration layer where every AI action is classified by risk, routed by policy, and logged with cryptographic integrity. Local inference by default. Cloud access only with explicit approval.
Every AI request flows through a governed pipeline. The orchestrator classifies the task, checks policy, selects the appropriate model, enforces data boundaries, and logs the entire decision chain.
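The pipeline above can be sketched as follows. All class, function, and model names here are illustrative assumptions, not the project's actual API; the classification heuristic is a deliberately crude stand-in for what a fast local model would do.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    task_class: str
    model: str
    allowed: bool
    reason: str

def handle_request(prompt: str, sensitivity: str, policy: dict) -> Decision:
    # 1. Classify the task (a fast local model would do this in practice).
    task_class = "code" if "def " in prompt or "class " in prompt else "general"
    # 2. Check policy: cloud is opt-in, and only for non-sensitive data.
    cloud_ok = policy.get("allow_cloud", False) and sensitivity == "public"
    # 3. Select a model under the data boundary (local by default).
    model = "claude" if cloud_ok and task_class == "general" else "qwen2.5-coder:14b"
    # 4. Return the full decision chain (logged append-only in the real system).
    return Decision(task_class, model, True, "policy check passed")

print(handle_request("def add(a, b): ...", "internal", {"allow_cloud": True}))
```

Even with cloud allowed by policy, the internal-sensitivity request above stays on the local model: the data boundary overrides provider preference.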
Five production systems consolidated under a unified orchestration architecture. Each component addresses a distinct layer of the AI governance stack.
Project resolution, zone-based permissions (8 zone types), command safety classification, session continuity tracking, and fuzzy matching for project lookup. Python CLI with programmatic API.
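The command safety classification mentioned above might look like this minimal sketch. The tier names and pattern lists are assumptions; the blocked patterns mirror the banned operations enforced elsewhere in the stack (rm -rf, --force, DROP TABLE).

```python
import re

# Illustrative three-tier classifier; patterns and tiers are assumptions.
BLOCKED = [r"\brm\s+-rf\b", r"--force\b", r"\bDROP\s+TABLE\b"]
CAUTION = [r"\bgit\s+push\b", r"\bpip\s+install\b"]

def classify_command(cmd: str) -> str:
    if any(re.search(p, cmd, re.IGNORECASE) for p in BLOCKED):
        return "blocked"   # never executed, logged as a refusal
    if any(re.search(p, cmd, re.IGNORECASE) for p in CAUTION):
        return "caution"   # requires explicit operator approval
    return "safe"          # executable within the sandbox

print(classify_command("rm -rf /"))          # blocked
print(classify_command("git push origin"))   # caution
```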
Zero-trust security enclave for local LLM inference via Ollama. Policy-gated file access through YAML config, workspace bridge isolation (inbox/outbox/scratch), and hash-chain session audit trail. No outbound network.
Hybrid local+cloud AI workstation with multi-provider abstraction layer. Routes requests to Ollama (local), Anthropic Claude (cloud), or OpenAI-compatible endpoints. Streaming chat, SQLite persistence, conversation history.
Autonomous background worker with 6 specialized agents: Scanner, Planner, Executor, Tester, Reviewer, Monitor. SafetyGuard enforces path sandboxing, rate limits (5/hr), file size caps, kill switch at >50% failure rate.
37,500+ lines across 30 documents defining the complete operational model: 5-tier approval system, 8 session authority states, intent-based routing rules, agent boundary enforcement, and formal approval artifact format.
Session intelligence dashboard analyzing 80+ Claude Code sessions. Three.js 3D terrain visualization of activity patterns, D3 heatmaps, hour-of-day analysis. Zero external dependencies on the backend.
The orchestrator selects the appropriate model based on task classification, data sensitivity, and operator policy. Local inference handles the majority of requests. Cloud access requires explicit policy allowance.
| Model | Provider | Location | Use Cases | Data Policy |
|---|---|---|---|---|
| Qwen 2.5 Coder 14B | Ollama | Local (RTX 4070) | Code generation, file operations, planning, review | No data leaves machine |
| Llama 3 8B | Ollama | Local (RTX 4070) | Fast classification, summarization, routing decisions | No data leaves machine |
| Claude (Opus/Sonnet) | Anthropic API | Cloud (opt-in) | Complex reasoning, large context, architecture review | Policy-gated, audited |
| OpenAI-Compatible | Configurable | Cloud (opt-in) | Provider flexibility, evaluation benchmarks | Policy-gated, audited |
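A routing sketch consistent with the table above. The selection heuristics, task labels, and model tags are illustrative assumptions, not the orchestrator's real logic.

```python
# Hypothetical model selection under the data policy in the table above.
LOCAL_MODELS = {
    "code": "qwen2.5-coder:14b",  # code generation, planning, review
    "fast": "llama3:8b",          # classification, summarization, routing
}

def select_model(task: str, needs_large_context: bool,
                 cloud_allowed: bool) -> str:
    # Cloud models are opt-in: reachable only when policy explicitly allows.
    if needs_large_context and cloud_allowed:
        return "claude-sonnet"    # complex reasoning, large context
    if task in ("classify", "summarize", "route"):
        return LOCAL_MODELS["fast"]
    return LOCAL_MODELS["code"]   # local default: data never leaves machine

print(select_model("codegen", needs_large_context=False, cloud_allowed=False))
```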
Multiple independent safety layers ensure that autonomous AI actions remain bounded, auditable, and reversible.
Runtime enforcement layer in Auton that prevents dangerous operations: path sandboxing (restricted to D:\ProjectsHome), banned operations (rm -rf, --force, DROP TABLE), rate limiting (5 tasks/hr, 1 concurrent), file size caps (100KB/file, 10 files/task), a minimum free-disk threshold (1GB), and an automatic kill switch on >50% failure rate or 3 consecutive blocks.
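The rate limit and kill switch described above can be sketched as a small stateful guard. Class and method names are assumptions; the thresholds (5 tasks/hr, >50% failure rate, 3 consecutive blocks) come directly from the text.

```python
import time

class SafetyGuard:
    """Illustrative sketch, not Auton's actual implementation."""

    def __init__(self, max_per_hour: int = 5):
        self.max_per_hour = max_per_hour
        self.starts = []             # timestamps of recent task starts
        self.results = []            # True = success, False = failure
        self.consecutive_blocks = 0
        self.killed = False

    def allow_task(self) -> bool:
        if self.killed:
            return False
        now = time.time()
        self.starts = [t for t in self.starts if now - t < 3600]
        if len(self.starts) >= self.max_per_hour:
            self.consecutive_blocks += 1
            if self.consecutive_blocks >= 3:
                self.killed = True   # kill switch: 3 consecutive blocks
            return False
        self.consecutive_blocks = 0
        self.starts.append(now)
        return True

    def record(self, success: bool):
        self.results.append(success)
        failures = self.results.count(False)
        if len(self.results) >= 4 and failures / len(self.results) > 0.5:
            self.killed = True       # kill switch: >50% failure rate
```

Once `killed` is set, every subsequent `allow_task()` returns False until a human resets the guard, which is what makes the switch a hard stop rather than a backoff.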
Auton operates in DRY_RUN by default. Escalation to SUPERVISED requires: 10 successful reviews, 3 days of operation, <10% failure rate. Escalation to AUTONOMOUS requires: 50 successful tasks, 14 days, <5% failure rate, plus explicit human approval.
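The escalation gates above reduce to a simple predicate. Field and function names are illustrative; the thresholds are taken verbatim from the text.

```python
# Hypothetical escalation check; names are assumptions, thresholds are not.
def can_escalate(level: str, reviews_ok: int, days: int,
                 failure_rate: float, tasks_ok: int = 0,
                 human_approved: bool = False) -> bool:
    if level == "SUPERVISED":
        return reviews_ok >= 10 and days >= 3 and failure_rate < 0.10
    if level == "AUTONOMOUS":
        return (tasks_ok >= 50 and days >= 14
                and failure_rate < 0.05 and human_approved)
    return False

print(can_escalate("SUPERVISED", reviews_ok=12, days=5, failure_rate=0.04))
```

Note that AUTONOMOUS can never be reached by metrics alone: the `human_approved` flag makes explicit sign-off a hard requirement, not a default.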
LLM Enclave enforces strict file access via bridge_policy.yaml. Allowed roots: D:\ProjectsHome only. Deny patterns: .env, credentials, private keys, node_modules, .git directories. All file operations route through workspace bridge (inbox/outbox/scratch).
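Enforcement of a bridge_policy.yaml-style policy might look like this. The policy dict mirrors the rules stated above; loading it from YAML (e.g. with PyYAML) is elided to keep the sketch self-contained, and the glob patterns are illustrative approximations of the deny list.

```python
import fnmatch

# Assumed policy shape; the real bridge_policy.yaml schema may differ.
POLICY = {
    "allowed_roots": ["D:/ProjectsHome"],
    "deny_patterns": ["*.env", "*credentials*", "*private*key*",
                      "*node_modules*", "*.git*"],
}

def access_allowed(path: str) -> bool:
    p = path.replace("\\", "/")
    if not any(p.startswith(root) for root in POLICY["allowed_roots"]):
        return False                       # outside the sandbox root
    return not any(fnmatch.fnmatch(p.lower(), pat)
                   for pat in POLICY["deny_patterns"])

print(access_allowed(r"D:\ProjectsHome\app\main.py"))   # inside root, allowed
print(access_allowed(r"D:\ProjectsHome\app\.env"))      # denied pattern
```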
Every state transition, permission grant, and AI decision is logged in append-only NDJSON format. Each entry includes a SHA-256 hash of the previous entry, creating a tamper-evident chain. Integrity is verified on every read. Chain breaks are flagged immediately.
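A minimal sketch of such a hash-chained NDJSON log: each entry embeds the SHA-256 of the previous line, and verification walks the chain from a genesis value. Entry fields and the genesis constant are assumptions, not the system's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed sentinel for the first entry's "prev" field

def append_entry(chain: list, event: dict) -> list:
    prev = chain[-1] if chain else None
    prev_hash = hashlib.sha256(prev.encode()).hexdigest() if prev else GENESIS
    # sort_keys gives a canonical serialization, so hashes are reproducible
    entry = json.dumps({"prev": prev_hash, **event}, sort_keys=True)
    return chain + [entry]

def verify(chain: list) -> bool:
    prev_hash = GENESIS
    for line in chain:
        if json.loads(line)["prev"] != prev_hash:
            return False                 # chain break: flag immediately
        prev_hash = hashlib.sha256(line.encode()).hexdigest()
    return True

chain = append_entry([], {"event": "session_start"})
chain = append_entry(chain, {"event": "permission_grant"})
print(verify(chain))                     # True: chain intact
chain[0] = chain[0].replace("start", "tampered")
print(verify(chain))                     # False: edit broke the chain
```

Editing any entry invalidates every later entry's `prev` link, which is what makes the log tamper-evident rather than merely append-only.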
| Layer | Technology | Purpose | Status |
|---|---|---|---|
| Inference | Ollama + RTX 4070 | Local LLM execution, GPU-accelerated | Operational |
| Server | Fastify (TypeScript) | API, WebSocket streaming, model proxy | Active |
| Frontend | React + Vite + Tailwind | Chat UI, model selector, dashboard | Active |
| Database | SQLite (sql.js WASM) | Conversations, sessions, audit trail | Active |
| Governance | Python + YAML | Router, enforcement, zone model | Operational |
| Agents | Python + asyncio | 6 autonomous agents, event bus | DRY_RUN |
The orchestrator sits between users/agents and all AI resources. Nothing reaches a model without passing through the policy engine first.