CLEMENT CHIBOKO — AI ENGINEER

I build AI systems
that run at the edge.

NOW

Shipping on-device medical inference and automated model fine-tuning pipelines. actively building

SELECTED

Three projects — medgemma, DomainForge, local-agent-core — making powerful AI local, private, and affordable. See the work →

GET IN TOUCH

Open to freelance projects, collaborations, and full-time roles in AI engineering — clementchiboko@gmail.com

Selected work

03 PROJECTS / 2024 — 2025
medgemma
New
medgemma
────────────────────────
status active
runtime mlx · apple silicon
target on-device inference
────────────────────────
$ pip install medgemma
SOFTWARE · 2025 Open →

medgemma

Medical AI on Apple Silicon. Runs Google's MedGemma 4B entirely on your Mac — no cloud API, no data sent anywhere. Ask medical questions, analyze X-rays, get evidence-based responses from the terminal.

PythonMLXApple SiliconMedical AIMultimodalOn-device
DomainForge
DomainForge
────────────────────────
status in development
stack next.js · qlora · pytorch
target mobile ai · edge gallery
────────────────────────
$ npm run train
SOFTWARE · 2025 Open →

DomainForge

Upload data, generate training sets, fine-tune Gemma 3n via QLoRA on spot GPUs, export production-ready mobile models. 90–97% cheaper than SageMaker, under 48 hours end-to-end.

Next.jsPythonPyTorchQLoRAMobile AIGoogle AI Edge
local-agent-core
local-agent-core
────────────────────────
status active
runtime ollama · qwen3:8b
target local tool-calling
────────────────────────
$ pip install local-agent-core
SOFTWARE · 2025 Open →

local-agent-core

A Python framework for reliable Qwen3-8B tool-calling via Ollama. Solves the three hard problems of local agentic loops — infinite loops, crashed tool calls, and lost session state — so production agents run cleanly on local hardware.

PythonOllamaQwen3Tool-callingLocal LLMFlask
BACKGROUND

About me

I'm an AI engineer focused on the space where machine learning meets real hardware constraints. I build tools that make sophisticated models practical — running privately on a laptop, deploying cheaply on mobile devices, or training efficiently on spot GPU instances.

My work sits at the intersection of model fine-tuning, on-device inference, and developer tooling. I'm particularly drawn to problems where the default answer is "use a cloud API" — and finding a better one.

When I'm not building I'm reading about quantization techniques, new edge inference frameworks, or thinking about how small teams can own and operate AI systems without the enterprise price tag.

TECH I WORK WITH
PythonTypeScriptPyTorchMLXNext.jsNode.jsQLoRA / PEFTPostgreSQLDockerCloudflare
GET IN TOUCH

Let's talk.

Have a project in mind, want to collaborate, or just want to say hello? I'd love to hear from you.

Currently open to freelance projects, interesting collaborations, and full-time roles in AI engineering.

WRITING

KANUNI.

Arte, scientia, et arte aedificandi
in nova aetate consilium et analysisem.

Dispatches on AI, edge systems, and the ideas behind the work — exploring the overlap between technology, craft, and the act of building. Published on Substack as KANUNI.

KANUNI · ON SUBSTACK