We design, build, and deploy AI assistants, autonomous agents, and backend infrastructure — engineered for your workflow, running on your servers, integrated with your tools.
Every service we offer is derived from production systems we've already built and run. No guesswork — only proven patterns.
Custom AI assistants with defined personas, domain knowledge, and real tool access via MCP. Deploy on Telegram, Discord, Slack, WhatsApp, or web — with multi-provider LLM routing and automatic failover.
Complete voice processing from raw audio through Deepgram STT transcription, LLM summarisation, SQLite FTS5 search indexing, and nightly knowledge graph ingestion — fully automated via systemd.
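The stage ordering above can be sketched as a small pipeline runner. This is a minimal illustration only: the stage callables are injected stand-ins for the real Deepgram STT call, LLM summariser, and FTS5 indexer, and the function names are hypothetical.

```python
def run_voice_pipeline(audio_path: str, transcribe, summarise, index) -> dict:
    """Run one recording through the pipeline stages in order.

    `transcribe`, `summarise`, and `index` are injected callables standing in
    for the real STT, summarisation, and search-indexing steps, so the
    orchestration logic stays independent of any one provider.
    """
    transcript = transcribe(audio_path)      # raw audio -> text
    summary = summarise(transcript)          # text -> short summary
    index(audio_path, transcript, summary)   # persist for full-text search
    return {"transcript": transcript, "summary": summary}
```

Keeping the stages injectable is what lets a systemd timer drive the same runner nightly against whichever providers are configured.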
Production-grade Linux VPS configuration: Docker Engine with compose orchestration, systemd service and timer management, private bridge networking, and socat proxy patterns for safe cross-network access.
Self-hosted Cognee deployment combining LanceDB vector storage, KuzuDB graph database, and SQLite metadata. Your assistant builds semantic memory from documents and conversations, queryable in real time.
Lightweight, purpose-built Python REST APIs that expose your data — email, calendar, voice notes, research findings, and LLM usage metrics — to your AI assistant, dashboard, or any other consumer.
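A "lightweight, purpose-built" API in this sense can be as small as a single WSGI callable. The sketch below is illustrative, not production code: the `/voice-notes` endpoint and its payload are invented for the example.

```python
import json

def app(environ, start_response):
    """Minimal WSGI app exposing one read-only JSON endpoint.

    The endpoint name and data are illustrative; a real deployment would
    sit behind a proper server and read from the actual data store.
    """
    if environ["PATH_INFO"] == "/voice-notes":
        body = json.dumps([{"id": 1, "title": "standup recap"}]).encode()
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]
```

Because it is plain WSGI, the same callable can be served by gunicorn, uWSGI, or the stdlib `wsgiref` server, and consumed identically by an assistant tool or a dashboard widget.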
Pre-send token estimation, model registry, multi-provider smart routing, rate-limit-aware fallback chains, and a full call ledger — every LLM call tracked with estimated vs actual tokens, cost, and latency.
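The routing pattern above can be sketched in a few lines. Everything here is a simplified stand-in: the exception type, the chars-per-token heuristic, and the ledger fields are illustrative, not the actual implementation or any provider's SDK.

```python
import time

class RateLimited(Exception):
    """Raised by a provider call when its rate limit is hit (illustrative)."""

def estimate_tokens(text: str) -> int:
    # Rough pre-send estimate: ~4 characters per token for English prose.
    return max(1, len(text) // 4)

def call_with_fallback(prompt: str, providers: list, ledger: list) -> str:
    """Try each (name, fn) provider in order; log every attempt to the ledger."""
    for name, fn in providers:
        start = time.monotonic()
        try:
            reply = fn(prompt)
        except RateLimited:
            ledger.append({"provider": name, "status": "rate_limited"})
            continue  # fall through to the next provider in the chain
        ledger.append({
            "provider": name,
            "status": "ok",
            "est_tokens": estimate_tokens(prompt),
            "actual_tokens": estimate_tokens(reply),  # stand-in for provider-reported usage
            "latency_s": time.monotonic() - start,
        })
        return reply
    raise RuntimeError("all providers exhausted")
```

The key property is that rate-limited attempts are recorded, not swallowed, so the ledger shows both what was spent and what was retried.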
Self-hosted ToolJet dashboards that aggregate email, calendar, project tickets, voice notes, and AI usage into one command centre — with action buttons that trigger your AI assistant directly from any row.
IMAP and CalDAV integration connecting your inbox and calendar directly to your AI assistant and dashboard. Smart newsletter filtering, timezone-aware event parsing, and AI draft-reply workflows.
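Newsletter filtering of the kind described above often starts from standard bulk-mail headers. The heuristic below is one possible sketch using the stdlib `email` module, not the actual filter in use.

```python
from email.message import Message

def is_newsletter(msg: Message) -> bool:
    """Heuristic newsletter check based on common bulk-mail headers.

    `List-Unsubscribe` and `Precedence: bulk/list` are widely set by
    mailing-list software; a real filter would layer sender rules on top.
    """
    return (
        msg.get("List-Unsubscribe") is not None
        or msg.get("Precedence", "").lower() in {"bulk", "list"}
    )
```

Messages fetched over IMAP can be parsed with `email.message_from_bytes` and passed straight to a check like this before they ever reach the assistant.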
AI-connected project tracking via MCP — your assistant can list issues, read project status, and create tickets without leaving the conversation. When a decision is made, the assistant suggests and creates the ticket.
Your AI assistant automatically saves every research finding — title, topic, markdown content, sources, and tags — to a persistent, FTS5-searchable library. Everything it has ever researched grows into a living knowledge base.
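The shape of such a library can be shown with SQLite's FTS5 extension directly. This sketch uses an in-memory database and invented column names; a production version would use a file-backed database with migrations.

```python
import sqlite3

# In-memory DB for the sketch; production would persist to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE findings USING fts5(title, topic, content, sources, tags)"
)

def save_finding(title, topic, content, sources="", tags=""):
    """Store one research finding in the full-text index."""
    conn.execute(
        "INSERT INTO findings VALUES (?, ?, ?, ?, ?)",
        (title, topic, content, sources, tags),
    )

def search_findings(query):
    """Ranked full-text search across every indexed column via FTS5 MATCH."""
    return conn.execute(
        "SELECT title FROM findings WHERE findings MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()
```

Because FTS5 ships inside SQLite, the whole library is a single file with no extra search service to operate.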
We start with a discovery call to understand your workflow, your tools, and exactly what you want your AI to do — or stop doing manually.
We design the architecture: which models, which tools, which integrations, which hosting pattern — and walk you through every decision before writing a line of code.
We build, test, and deploy the system on your infrastructure. We document everything, hand it over fully, and remain available for iteration.
Every technology in our stack earns its place. We use what works at production scale on real infrastructure.
You need an AI assistant that knows your business — your tickets, your emails, your meetings, your research. We build a private, self-hosted assistant that aggregates everything into one command centre and actually acts on it.
You use ChatGPT or a commercial AI tool but want to own your data, control the model, and integrate your internal systems. We migrate you to a self-hosted stack with equivalent or better capability — no lock-in.
Your product needs AI functionality — voice processing, document intelligence, a knowledge base, or an internal assistant. We design and build the backend AI layer so your team ships faster without reinventing infrastructure.
You want to offer AI assistant and automation services but don't have the deep infrastructure expertise in-house. We build the stack; you deliver it under your brand. White-label engineering partnerships welcome.
AI-Von is a specialist AI engineering practice focused on production-grade, self-hosted AI systems. We design and build AI assistants, autonomous agents, voice pipelines, knowledge graphs, REST APIs, and operational dashboards — all running on infrastructure you control.
Every system we build is engineered to run reliably on a Linux VPS, integrated with your real tools, and designed for long-term operation. We work in Python, Docker, and modern LLM APIs. We do not sell SaaS. We deliver systems.
Based in South Africa, working globally.
Start a Discovery Call
Every engagement starts with a discovery call to understand your workflow, your tools, and what you want your AI to do. No commitment, no pitch deck — just a conversation.
Or talk to Phoebe right now — click the microphone button in the bottom-right corner.
phil@ai-von.com