We design, build, and deploy AI assistants, autonomous agents, and backend infrastructure — engineered for your workflow, running on your servers, integrated with your tools.
Every service we offer is derived from production systems we've already built and run. No guesswork — only proven patterns.
Custom AI assistants with defined personas, domain knowledge, and real tool access via MCP. Deploy on Telegram, Discord, Slack, WhatsApp, or web — with multi-provider LLM routing and automatic failover.
Complete voice processing from raw audio through Deepgram STT transcription, LLM summarisation, SQLite FTS5 search indexing, and nightly knowledge graph ingestion — fully automated via systemd.
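The stage ordering above can be sketched as a small pipeline runner. This is a minimal illustration only: the stage callables are injected stand-ins for the real Deepgram STT call, LLM summariser, and FTS5 indexer, and the function names are hypothetical.

```python
def run_voice_pipeline(audio_path: str, transcribe, summarise, index) -> dict:
    """Run one recording through the pipeline stages in order.

    `transcribe`, `summarise`, and `index` are injected callables standing in
    for the real STT, summarisation, and search-indexing steps, so the
    orchestration logic stays independent of any one provider.
    """
    transcript = transcribe(audio_path)      # raw audio -> text
    summary = summarise(transcript)          # text -> short summary
    index(audio_path, transcript, summary)   # persist for full-text search
    return {"transcript": transcript, "summary": summary}
```

Keeping the stages injectable is what lets a systemd timer drive the same runner nightly against whichever providers are configured.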
Production-grade Linux VPS configuration: Docker Engine with compose orchestration, systemd service and timer management, private bridge networking, and socat proxy patterns for safe cross-network access.
Self-hosted Cognee deployment combining LanceDB vector storage, KuzuDB graph database, and SQLite metadata. Your assistant builds semantic memory from documents and conversations, queryable in real time.
Lightweight, purpose-built Python REST APIs that expose your data — email, calendar, voice notes, research findings, and LLM usage metrics — to your AI assistant, dashboard, or any other consumer.
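A "lightweight, purpose-built" API in this sense can be as small as a single WSGI callable. The sketch below is illustrative, not production code: the `/voice-notes` endpoint and its payload are invented for the example.

```python
import json

def app(environ, start_response):
    """Minimal WSGI app exposing one read-only JSON endpoint.

    The endpoint name and data are illustrative; a real deployment would
    sit behind a proper server and read from the actual data store.
    """
    if environ["PATH_INFO"] == "/voice-notes":
        body = json.dumps([{"id": 1, "title": "standup recap"}]).encode()
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]
```

Because it is plain WSGI, the same callable can be served by gunicorn, uWSGI, or the stdlib `wsgiref` server, and consumed identically by an assistant tool or a dashboard widget.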
Pre-send token estimation, model registry, multi-provider smart routing, rate-limit-aware fallback chains, and a full call ledger — every LLM call tracked with estimated vs actual tokens, cost, and latency.
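The routing pattern above can be sketched in a few lines. Everything here is a simplified stand-in: the exception type, the chars-per-token heuristic, and the ledger fields are illustrative, not the actual implementation or any provider's SDK.

```python
import time

class RateLimited(Exception):
    """Raised by a provider call when its rate limit is hit (illustrative)."""

def estimate_tokens(text: str) -> int:
    # Rough pre-send estimate: ~4 characters per token for English prose.
    return max(1, len(text) // 4)

def call_with_fallback(prompt: str, providers: list, ledger: list) -> str:
    """Try each (name, fn) provider in order; log every attempt to the ledger."""
    for name, fn in providers:
        start = time.monotonic()
        try:
            reply = fn(prompt)
        except RateLimited:
            ledger.append({"provider": name, "status": "rate_limited"})
            continue  # fall through to the next provider in the chain
        ledger.append({
            "provider": name,
            "status": "ok",
            "est_tokens": estimate_tokens(prompt),
            "actual_tokens": estimate_tokens(reply),  # stand-in for provider-reported usage
            "latency_s": time.monotonic() - start,
        })
        return reply
    raise RuntimeError("all providers exhausted")
```

The key property is that rate-limited attempts are recorded, not swallowed, so the ledger shows both what was spent and what was retried.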
Self-hosted ToolJet dashboards that aggregate email, calendar, project tickets, voice notes, and AI usage into one command centre — with action buttons that trigger your AI assistant directly from any row.
IMAP and CalDAV integration connecting your inbox and calendar directly to your AI assistant and dashboard. Smart newsletter filtering, timezone-aware event parsing, and AI draft-reply workflows.
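Newsletter filtering of the kind described above often starts from standard bulk-mail headers. The heuristic below is one possible sketch using the stdlib `email` module, not the actual filter in use.

```python
from email.message import Message

def is_newsletter(msg: Message) -> bool:
    """Heuristic newsletter check based on common bulk-mail headers.

    `List-Unsubscribe` and `Precedence: bulk/list` are widely set by
    mailing-list software; a real filter would layer sender rules on top.
    """
    return (
        msg.get("List-Unsubscribe") is not None
        or msg.get("Precedence", "").lower() in {"bulk", "list"}
    )
```

Messages fetched over IMAP can be parsed with `email.message_from_bytes` and passed straight to a check like this before they ever reach the assistant.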
AI-connected project tracking via MCP — your assistant can list issues, read project status, and create tickets without leaving the conversation. When a decision is made, the assistant suggests and creates the ticket.
Your AI assistant automatically saves every research finding — title, topic, markdown content, sources, and tags — to a persistent, FTS5-searchable library. Everything it has ever researched grows into a living knowledge base.
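The shape of such a library can be shown with SQLite's FTS5 extension directly. This sketch uses an in-memory database and invented column names; a production version would use a file-backed database with migrations.

```python
import sqlite3

# In-memory DB for the sketch; production would persist to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE findings USING fts5(title, topic, content, sources, tags)"
)

def save_finding(title, topic, content, sources="", tags=""):
    """Store one research finding in the full-text index."""
    conn.execute(
        "INSERT INTO findings VALUES (?, ?, ?, ?, ?)",
        (title, topic, content, sources, tags),
    )

def search_findings(query):
    """Ranked full-text search across every indexed column via FTS5 MATCH."""
    return conn.execute(
        "SELECT title FROM findings WHERE findings MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()
```

Because FTS5 ships inside SQLite, the whole library is a single file with no extra search service to operate.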
We start with a discovery call to understand your workflow, your tools, and exactly what you want your AI to do — or stop doing manually.
We design the architecture: which models, which tools, which integrations, which hosting pattern — and walk you through every decision before writing a line of code.
We build, test, and deploy the system on your infrastructure. We document everything, hand it over fully, and remain available for iteration.
Every technology in our stack earns its place. We use what works at production scale on real infrastructure.
You need an AI assistant that knows your business — your tickets, your emails, your meetings, your research. We build a private, self-hosted assistant that aggregates everything into one command centre and actually acts on it.
You use ChatGPT or a commercial AI tool but want to own your data, control the model, and integrate your internal systems. We migrate you to a self-hosted stack with equivalent or better capability — no lock-in.
Your product needs AI functionality — voice processing, document intelligence, a knowledge base, or an internal assistant. We design and build the backend AI layer so your team ships faster without reinventing infrastructure.
You want to offer AI assistant and automation services but don't have the deep infrastructure expertise in-house. We build the stack; you deliver it under your brand. White-label engineering partnerships welcome.
AI-Von is a specialist AI engineering practice focused on production-grade, self-hosted AI systems. We design and build AI assistants, autonomous agents, voice pipelines, knowledge graphs, REST APIs, and operational dashboards — all running on infrastructure you control.
Every system we build is engineered to run reliably on a Linux VPS, integrated with your real tools, and designed for long-term operation. We work in Python, Docker, and modern LLM APIs. We do not sell SaaS. We deliver systems.
Based in South Africa, working globally.
Start a Discovery Call
Every engagement starts with a discovery call to understand your workflow, your tools, and what you want your AI to do. No commitment, no pitch deck — just a conversation.
Or talk to Phoebe right now — click the microphone button in the bottom-right corner.
phil@ai-von.com