Projects

Side work I keep coming back to. Each one started as something I wanted to use myself, then turned into a place to think out loud about agents, infrastructure, and the boring parts of shipping.

langgraph-rag-agent

Source ↗
Python · LangGraph · LangChain · ChromaDB · Anthropic API · OpenAI Embeddings · Pydantic · Langfuse · AWS Lambda · Docker · GitHub Actions

Multi-agent RAG over the LangChain Python docs. Five-node LangGraph with model routing: Haiku for verification, Sonnet for planning. The planner/verifier loop is cyclic — the planner sees the verifier's reason on rejection, so the second attempt is informed instead of identical.
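A minimal sketch of that feedback loop in plain Python, not the LangGraph code itself. The names `plan_verify_loop`, `Verdict`, `fake_plan`, and `fake_verify` are hypothetical, and the stubs stand in for the Sonnet planner and the Haiku verifier. The point is only that the rejection reason is threaded back into the next planning call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Verdict:
    approved: bool
    reason: str = ""


def plan_verify_loop(
    question: str,
    plan: Callable[[str, Optional[str]], str],
    verify: Callable[[str, str], Verdict],
    max_cycles: int = 3,  # assumed budget, not taken from the project
) -> Tuple[Optional[str], List[str]]:
    """Run planner -> verifier until approval or the cycle budget is spent."""
    feedback: Optional[str] = None
    rejections: List[str] = []
    for _ in range(max_cycles):
        draft = plan(question, feedback)   # planner sees the last rejection reason
        verdict = verify(question, draft)
        if verdict.approved:
            return draft, rejections
        feedback = verdict.reason          # carried into the next attempt
        rejections.append(verdict.reason)
    return None, rejections                # budget exhausted without approval


# Stubs standing in for the real model calls.
def fake_plan(question: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "search: retrievers"
    return f"search: retrievers; address: {feedback}"


def fake_verify(question: str, draft: str) -> Verdict:
    if "address:" in draft:
        return Verdict(True)
    return Verdict(False, "plan does not cite a source")


answer, rejections = plan_verify_loop("How do retrievers work?", fake_plan, fake_verify)
```

With these stubs the first draft is rejected and the second one, which incorporates the verifier's reason, is approved.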

A human-in-the-loop interrupt, checkpointed with SqliteSaver, pauses execution one cycle before the budget runs out and offers three resume paths (approve, rewrite, reject). Pydantic structured outputs keep the verifier honest; Langfuse traces sit on the driver for observability.
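A rough sketch of the pause rule and the three resume paths, in plain Python rather than LangGraph's own interrupt and checkpoint machinery. Everything here is illustrative: `should_pause`, `apply_resume`, the state keys, and in particular what each resume path does are assumptions, since the description above only names the three options.

```python
from enum import Enum
from typing import Optional


class Resume(Enum):
    APPROVE = "approve"
    REWRITE = "rewrite"
    REJECT = "reject"


def should_pause(cycles_used: int, max_cycles: int) -> bool:
    """True when exactly one cycle of budget is left."""
    return max_cycles - cycles_used == 1


def apply_resume(state: dict, decision: Resume, new_question: Optional[str] = None) -> dict:
    """Return the next state for a human decision (semantics are assumed)."""
    if decision is Resume.APPROVE:
        # Assumed: spend the final cycle on the question as it stands.
        return {**state, "status": "continue"}
    if decision is Resume.REWRITE:
        # Assumed: swap in a human-edited question and clear stale feedback.
        if not new_question:
            raise ValueError("rewrite needs a new question")
        return {**state, "question": new_question, "feedback": None, "status": "continue"}
    # Assumed: reject ends the run without an answer.
    return {**state, "status": "stopped"}


state = {"question": "How do retrievers work?", "feedback": "no source cited", "status": "paused"}
pause_points = [c for c in range(4) if should_pause(c, max_cycles=3)]
rewritten = apply_resume(state, Resume.REWRITE, "How do I configure a retriever?")
rejected = apply_resume(state, Resume.REJECT)
```

With a budget of three cycles, the only pause point is after the second cycle, which leaves one cycle for whatever the human decides.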

Deployed on AWS Lambda behind a key-throttled API Gateway. CI runs mocked pytest plus a 10-case real-API eval gate that calls Sonnet on every PR before promoting to Lambda.

LLM Shield

Source ↗
TypeScript · Node.js 20 · Express · Redis · ioredis · Vitest

Resilience proxy in front of a flaky upstream, built against the OpenAI API as the test target. The premise: I wanted to see how my own LLM integrations would behave when the provider has a bad day, and the only way to know was to put a layer between them.

Three patterns wired together from scratch. Idempotency keys in Redis so client retries don't duplicate upstream calls. Exponential backoff with jitter on transient errors. A circuit breaker that trips on an upstream error-rate threshold, fails fast while open, and lets a single probe through before closing again.
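The three patterns can be sketched compactly. This is an illustration in Python, not the project's TypeScript, and it simplifies in ways worth naming: an in-memory dict stands in for Redis (a real version would likely use an atomic write such as `SET` with `NX` and an expiry), the error rate is cumulative rather than over a sliding window, and all class names, thresholds, and defaults are hypothetical.

```python
import random
import time
from typing import Any, Callable, Dict


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


class IdempotencyCache:
    """In-memory stand-in for Redis: one upstream call per idempotency key."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def get_or_call(self, key: str, call: Callable[[], Any]) -> Any:
        if key not in self._store:
            self._store[key] = call()
        return self._store[key]


class CircuitBreaker:
    """closed -> open on error rate; open -> half_open after cooldown; one probe decides."""

    def __init__(self, threshold: float = 0.5, min_calls: int = 4,
                 cooldown: float = 30.0, clock: Callable[[], float] = time.monotonic) -> None:
        self.threshold, self.min_calls = threshold, min_calls
        self.cooldown, self.clock = cooldown, clock
        self.state = "closed"
        self.calls = 0
        self.errors = 0
        self.opened_at = 0.0

    def allow(self) -> bool:
        if self.state == "closed":
            return True
        if self.state == "open" and self.clock() - self.opened_at >= self.cooldown:
            self.state = "half_open"   # let exactly one probe through
            return True
        return False                   # still cooling down, or a probe is in flight

    def record(self, ok: bool) -> None:
        if self.state == "half_open":  # the probe result decides the next state
            if ok:
                self.state, self.calls, self.errors = "closed", 0, 0
            else:
                self._trip()
            return
        self.calls += 1
        self.errors += 0 if ok else 1
        if self.calls >= self.min_calls and self.errors / self.calls >= self.threshold:
            self._trip()

    def _trip(self) -> None:
        self.state = "open"
        self.opened_at = self.clock()


# Walk the breaker through its states with a fake clock.
now = [0.0]
breaker = CircuitBreaker(clock=lambda: now[0])
for ok in (True, False, False, False):
    breaker.record(ok)
state_after_errors = breaker.state
allowed_while_open = breaker.allow()
now[0] = 31.0                          # cooldown has elapsed
probe_allowed = breaker.allow()
second_probe_allowed = breaker.allow()
breaker.record(True)                   # probe succeeded
state_after_probe = breaker.state

upstream_calls = []
cache = IdempotencyCache()
first = cache.get_or_call("req-1", lambda: upstream_calls.append(1) or "response")
retry = cache.get_or_call("req-1", lambda: upstream_calls.append(1) or "response")
```

Injecting the clock keeps the breaker testable without sleeping; the retry with the same key returns the stored response instead of hitting the upstream again.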

The proxy speaks the OpenAI API verbatim, so existing clients point at it without code changes.

Python · Groq Whisper API · OpenAI/Groq chat APIs · PyYAML · pytest · ruff · GitHub Actions

Push-to-talk voice dictation for Linux. Whisper transcribes through Groq; an LLM step (Groq and OpenAI interchangeable) cleans spoken punctuation, handles bilingual translate-back, and does light rephrasing.

Hard-learned guardrails: skip on empty input, reject cleaned output that balloons in length relative to the raw transcript (a hallucination signal), fall back to raw transcription on API failure. Prompts are versioned by (mode, language); evals on a fixed test set replace vibes.
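Those three guardrails fit in one small wrapper. This is a sketch under assumptions: `clean_transcript` and the `cleanup` callable are hypothetical names, the 2.0 length ratio is a made-up threshold, and returning the raw transcript when the ratio check fails is one reasonable reading of "reject".

```python
from typing import Callable


def clean_transcript(
    raw: str,
    cleanup: Callable[[str], str],   # hypothetical wrapper around the LLM cleanup call
    max_ratio: float = 2.0,          # assumed threshold, tune against the eval set
) -> str:
    text = raw.strip()
    if not text:
        return ""                    # empty input: skip the API call entirely
    try:
        cleaned = cleanup(text).strip()
    except Exception:
        return text                  # API failure: fall back to the raw transcript
    if len(cleaned) > max_ratio * len(text):
        return text                  # output ballooned: treat it as hallucinated
    return cleaned


def failing_cleanup(text: str) -> str:
    raise RuntimeError("upstream unavailable")


ok = clean_transcript("hello comma world", lambda t: "Hello, world")
skipped = clean_transcript("   ", failing_cleanup)
expanded = clean_transcript("hello world", lambda t: t * 10)
fallback = clean_transcript("hello world", failing_cleanup)
```

Every failure mode degrades to the raw transcript, so the worst case is unpolished text rather than lost or invented text.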

Costs about a dollar a month with daily use.

Roast My Repo

Source ↗
TypeScript · AWS Lambda · API Gateway · Groq · GitHub Actions

Roast My Repo is a live serverless API that analyzes any public GitHub repository and returns a brutally honest, LLM-generated code review — the kind a senior engineer would give if they had zero filter.

The backend is a TypeScript Lambda deployed on AWS behind API Gateway, with cold-start times under 300ms. It fetches the repo tree, samples the most relevant files, and passes them to Groq's inference API for fast, structured critique.
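One way the sampling step could work, sketched in Python rather than the Lambda's TypeScript. The scoring rules, the extension list, the skip list, and the byte budget are all invented for illustration; the description above does not say how relevance is decided.

```python
from typing import Iterable, List, NamedTuple


class RepoFile(NamedTuple):
    path: str
    size: int  # bytes


# Hypothetical heuristics, not the project's actual rules.
SOURCE_EXTS = (".ts", ".js", ".py", ".go", ".rs", ".java")
SKIP_PARTS = ("node_modules/", "dist/", "vendor/", ".min.")


def score(f: RepoFile) -> int:
    if any(part in f.path for part in SKIP_PARTS):
        return -1                              # vendored or generated code
    if f.path.lower().startswith("readme"):
        return 3                               # top-level README sets the context
    if f.path.endswith(SOURCE_EXTS):
        return 2                               # hand-written source
    return 0                                   # lockfiles, assets, everything else


def sample_files(tree: Iterable[RepoFile], byte_budget: int = 20_000) -> List[str]:
    """Pick the highest-scoring files that fit in the prompt's byte budget."""
    ranked = sorted((f for f in tree if score(f) > 0), key=lambda f: (-score(f), f.size))
    picked: List[str] = []
    used = 0
    for f in ranked:
        if used + f.size <= byte_budget:
            picked.append(f.path)
            used += f.size
    return picked


tree = [
    RepoFile("README.md", 1_200),
    RepoFile("src/index.ts", 4_000),
    RepoFile("node_modules/x/index.js", 900),
    RepoFile("src/big.ts", 50_000),
    RepoFile("package-lock.json", 300_000),
]
picked = sample_files(tree)
```

Here the README and the small source file are kept, while the vendored file, the lockfile, and the file that would blow the budget are left out.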

The whole thing ships through a GitHub Actions pipeline I wrote from scratch — lint, test, build, deploy to Lambda — so every push to main hits production within two minutes.

Try the live demo →