An applied AI lab — with clients.
We treat applied AI as its own craft, distinct from research and distinct from plain software engineering. Most of what we do lives in four areas, and all four share the same two concerns: what happens when the model is wrong, and how you would know.
Things we’ve shipped.
A short list. The work below spans direct engagements and prior roles the team held; both shape what we know how to do, and we’re happy to walk through attribution on a call.
AI for international student management at a top-50 R1 university.
Active engagement with the University at Buffalo on bringing AI into the workflows that support international students — admissions correspondence, visa-status tracking, advising-load triage, and the long tail of repetitive document review that bottlenecks the office today. The lab’s thesis: don’t replace the advisors; give them an LLM-aware workspace that reads the full document trail, surfaces what matters, and hands off cleanly when the student needs a human. Scoped, evaluated, shipped behind feature flags.
- Document understanding & classification
- Conversational triage
- RAG over institutional records
- Eval design with the office
- Privacy & FERPA-aware design
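The handoff thesis above fits in a few lines of code. A toy sketch only: the records, the token-overlap retriever, and the 0.2 threshold are all invented for illustration; a real deployment would retrieve over embedded institutional records, not two hard-coded strings.

```python
from dataclasses import dataclass

# Toy stand-ins for the institutional record store (invented for the sketch).
RECORDS = {
    "i20-extension": "I-20 extensions must be requested before the program end date.",
    "cpt-eligibility": "CPT requires one academic year of full-time enrollment.",
}

@dataclass
class Answer:
    text: str
    needs_human: bool

def retrieve(query: str) -> tuple[str, float]:
    """Score each record by crude token overlap and return the best match."""
    q = set(query.lower().split())
    best_key, best_score = "", 0.0
    for key, text in RECORDS.items():
        score = len(q & set(text.lower().split())) / max(len(q), 1)
        if score > best_score:
            best_key, best_score = key, score
    return best_key, best_score

def triage(query: str, threshold: float = 0.2) -> Answer:
    key, score = retrieve(query)
    if score < threshold:
        # Low retrieval confidence: route to an advisor instead of guessing.
        return Answer("Routing to an advisor.", needs_human=True)
    return Answer(RECORDS[key], needs_human=False)
```

The design point is the first branch: in a FERPA-sensitive domain the system declines to answer when retrieval confidence is low, rather than improvising.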
A code-stub generator that cut content prep by 70%.
iMocha’s content team writes coding-assessment problems for engineering hiring. The bottleneck was boilerplate: every problem needed working starter stubs in a dozen languages. Built a multi-language code-stub generator that produces correct, idiomatic stubs across 13+ programming languages, reducing stub-generation time by 70% and freeing roughly 150 hours of content-engineering work per cycle. Also wrote the validation framework that auto-generates and checks unit tests across the question library — 1,000+ problems, 80% reduction in QA cycles.
- Multi-language codegen
- Schema-validated content pipeline
- Automated test generation
- iMocha MS Teams app integration
- Python · Django
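At its core, multi-language stub generation is templates keyed by a language-neutral signature spec. A minimal sketch with an invented spec shape and two invented templates (iMocha’s actual schema and 13-language set are not shown here):

```python
# Illustrative template-driven stub generator. The spec format and both
# templates are assumptions made for this sketch, not iMocha's schema.
STUB_TEMPLATES = {
    "python": "def {name}({params}):\n    # TODO: implement\n    pass\n",
    "java": (
        "public static {ret} {name}({typed_params}) {{\n"
        "    // TODO: implement\n"
        "    throw new UnsupportedOperationException();\n"
        "}}\n"
    ),
}

def render_stub(lang: str, name: str, params: list[tuple[str, str]], ret: str) -> str:
    """Render one starter stub from a (name, [(param, type)], return-type) spec."""
    return STUB_TEMPLATES[lang].format(
        name=name,
        ret=ret,
        params=", ".join(p for p, _ in params),
        typed_params=", ".join(f"{t} {p}" for p, t in params),
    )
```

One spec, many targets: `render_stub("java", "twoSum", [("nums", "int[]"), ("target", "int")], "int[]")` emits a compilable Java stub from the same record that produced the Python one, which is what keeps a 13-language library maintainable.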
YouTube live integration for a screen-recording workflow.
Contracted to integrate live YouTube feeds directly into a third-party screen recorder so creators could splice in live broadcast streams without leaving the recording surface. The interesting part was not the API plumbing — it was the timing model and the failure modes when an upstream feed silently degrades. Shipped, handed off, kept the team on email if anything broke.
- Live video ingestion
- Cross-app integration
- Failure-mode handling
- Clean handoff
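One concrete shape of that silent degradation: the connection stays open but frames stop arriving. A hedged sketch of the stall check, with the threshold invented for illustration:

```python
def feed_is_stalled(frame_times: list[float], now: float, max_gap: float = 2.0) -> bool:
    """True when no frame has arrived within max_gap seconds of `now`.

    frame_times are monotonically increasing arrival timestamps. The
    2.0-second default is an illustrative assumption, not the shipped value.
    """
    if not frame_times:
        return True  # never received a frame: treat as stalled
    return (now - frame_times[-1]) > max_gap
```

On a stall, the recording surface can fall back to a placeholder or alert the creator instead of silently splicing dead air into the capture.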
Turn any document into a personalized assessment.
An AI-driven quiz platform that reads any user-uploaded text document and generates contextually relevant quiz questions for educational use cases. Built around semantic similarity search with sentence embeddings indexed in FAISS to extract key topics, then targeted prompt strategies to generate questions that test understanding rather than recall. The interesting design question wasn’t generation — it was how to score “is this question actually good?” without a human in the loop.
- Document chunking & topic extraction
- FAISS vector indexing
- Prompted question generation
- Auto-scoring of question quality
- Open source
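A flavor of how “is this question actually good?” can be scored without a human: cheap structural checks against the source chunk the question was generated from. The checks and weights below are illustrative assumptions, not the shipped scorer:

```python
def question_quality(question: str, answer: str, chunk: str) -> float:
    """Heuristic quality gate for a generated question and its expected answer.

    The three checks and their weights are invented for this sketch.
    """
    score = 1.0
    q, a, c = question.lower().strip(), answer.lower(), chunk.lower()
    if not q.endswith("?"):
        score -= 0.3  # malformed: not phrased as a question
    if a not in c:
        score -= 0.4  # expected answer is not grounded in the source chunk
    if q.strip("?").strip() in c:
        score -= 0.3  # verbatim recall: copies the chunk instead of testing understanding
    return max(score, 0.0)
```

The grounding check penalizes hallucinated answers, and the recall check penalizes questions a student could answer by pattern-matching the text — the “understanding rather than recall” goal, reduced to a gate.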
Three shapes, one bar.
Most engagements start as one shape and evolve into another. We price the outcome, not the hours. The bar does not move: we only take work where the AI question is the main question.
A focused answer.
You have a sharp AI question — feasibility, cost, approach, or a contentious architectural call. We return working code, a short memo, and a recommendation you can act on. Scope is deliberately tight; the deliverable is clarity.
Take it to production.
We embed alongside your engineers and build the AI layer of your product with them. Eval harness from week one, weekly demos, clean handoff. You own the code at the end; we stay long enough to watch it run.
A second set of eyes, kept.
Retained monthly. Weekly calls, async reviews, the occasional pairing day. For teams shipping AI in earnest who want a sharper partner on the hard calls — and someone who will push back when the internal consensus is wrong.