Published Work
Production applications and research across AI engineering, computer vision, and education technology.
Built a real-time multi-agent orchestration runtime where four thread-resumed AI musicians improvise together while the server enforces deterministic routing, state continuity, and final pattern composition.
- Architected a dual-mode Codex and Strudel runtime that powers both assistant interactions and live multi-agent jam sessions.
- Implemented server-owned orchestration for four isolated Codex-backed agent sessions with deterministic @mention routing and structured composition safeguards.
- Added real-time streaming, multimodal control, and validation gates to keep the system responsive, reliable, and safe under live conditions.
Next.js, TypeScript, Codex, Strudel, MCP, WebSockets
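A minimal sketch of how deterministic @mention routing might look, assuming a message goes to the mentioned agents in a fixed roster order and is broadcast when no valid agent is mentioned. The agent names, the `route` function, and the broadcast fallback are illustrative assumptions, not the project's actual code.

```python
import re

# Hypothetical roster; the project description only says there are four musicians.
AGENTS = ("drums", "bass", "melody", "chords")
MENTION = re.compile(r"@(\w+)")


def route(message: str) -> list[str]:
    """Return the agents that should receive `message`, always in roster order."""
    mentioned = {name.lower() for name in MENTION.findall(message)}
    # Iterating over the roster (not the message) keeps the order deterministic.
    targets = [agent for agent in AGENTS if agent in mentioned]
    # Assumption: a message with no valid mention is broadcast to every agent.
    return targets or list(AGENTS)
```

For example, `route("@bass and @drums, go half-time")` returns `["drums", "bass"]` regardless of the order in which the mentions were typed, which is what makes the routing reproducible on the server side.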
Deployed a stateful LLM application that turns deterministic story and lesson content into 10-chapter educational adventures, using validate-then-stream delivery, task-specific Gemini routing, and persistent session state to make non-deterministic generation reliable for real users.
- Validate-then-stream delivery generates a full chapter, checks structural correctness with retries, and only then streams approved content to the user.
- Deterministic story YAML and lesson CSV inputs are wrapped in task-specific Gemini routing, keeping content grounded without relying on embeddings or RAG.
- Supabase-backed state, reconnect-safe WebSockets, deferred background tasks, and telemetry keep adventures resumable, responsive, and observable in a production-style stack.
FastAPI, Supabase, Gemini 2.5, Imagen 4, Railway, React, WebSockets
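The validate-then-stream idea can be sketched as a generator that only starts yielding once a complete draft has passed validation. This is a sketch under stated assumptions: `generate` and `is_valid` are hypothetical callables standing in for the model call and the structural check, and the retry count and chunk size are arbitrary.

```python
from typing import Callable, Iterator


def validate_then_stream(
    generate: Callable[[], str],
    is_valid: Callable[[str], bool],
    max_attempts: int = 3,
    chunk_size: int = 32,
) -> Iterator[str]:
    """Yield chunks of the first draft that passes validation."""
    for _ in range(max_attempts):
        chapter = generate()  # produce the whole chapter before sending anything
        if is_valid(chapter):
            # Only approved content is ever streamed to the caller.
            for start in range(0, len(chapter), chunk_size):
                yield chapter[start:start + chunk_size]
            return
    # Raised when the generator is consumed and every attempt failed validation.
    raise RuntimeError("no valid chapter after retries")
```

The trade-off is latency for reliability: the user waits for a full draft to be generated and checked, but never sees a chapter that is later rejected.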
NANA is a public bring-your-own-key (BYOK) study prototype that transforms dense PDFs into personalized study workspaces, combining adaptive notes, document overviews, and inline AI explanations around real study workflows.
- Two-phase PDF pipeline with large-file preprocessing and splitting for cost-efficient note generation.
- Personalized document overviews and study notes adapted to learner background and goals.
- Inline AI commands, browser-resumable study state, and exportable Markdown notes built for real study workflows.
Python, FastAPI, TypeScript, React, Vite, Gemini 3 Flash
End-to-end affective computing pipeline that turns noisy real-world video into music recommendations through dual-pathway ML, variance-weighted fusion, and full-stack delivery.
- CLIP scene analysis + EmoNet facial cues fused through uncertainty-aware dual-pathway ML
- Variance-weighted fusion with temporal smoothing reduced VEATIC MAE by ~19% vs scene-only baselines
- Delivered a React + Flask demo with synced playback, emotion visualizations, and DEAM music matching
PyTorch, React, Flask, OpenCV, CLIP ViT-B/32, EmoNet, MediaPipe
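One common way to realize variance-weighted fusion is inverse-variance weighting, followed by an exponential moving average for temporal smoothing. The sketch below assumes that formulation; the project's exact weighting and smoothing scheme is not stated, and the function names and `alpha` value are illustrative.

```python
def fuse(scene: float, scene_var: float, face: float, face_var: float,
         eps: float = 1e-8) -> float:
    """Inverse-variance weighted average of two pathway predictions."""
    w_scene = 1.0 / (scene_var + eps)  # a more certain pathway gets a larger weight
    w_face = 1.0 / (face_var + eps)
    return (w_scene * scene + w_face * face) / (w_scene + w_face)


def smooth(values: list[float], alpha: float = 0.3) -> list[float]:
    """Exponential moving average over per-frame fused predictions."""
    smoothed: list[float] = []
    previous = None
    for value in values:
        previous = value if previous is None else alpha * value + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed
```

With equal variances the fused value is the plain mean of the two pathways; when one pathway's variance is much smaller, the result moves toward that pathway's prediction.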
Values-alignment journal analyzing whether daily behavior reflects stated priorities across ten Schwartz dimensions.
- VIF: ordinal MLP heads with MC Dropout uncertainty estimation
- Synthetic data generation: 204 personas, 1,651 journal entries
- Automated LLM judge labeling pipeline via Claude Code subagents
PyTorch, Shiny, Polars, nomic-embed-text, OpenAI API, Three.js
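MC Dropout estimates uncertainty by keeping dropout active at inference time and summarizing several stochastic forward passes. The sketch below illustrates only that sampling idea with a toy single linear unit in plain Python; it is not the project's PyTorch model, it leaves out the ordinal heads, and all names and parameters are hypothetical.

```python
import random
import statistics


def dropout_forward(x: list[float], weights: list[float], p: float,
                    rng: random.Random) -> float:
    """One stochastic pass: drop each input with probability p (inverted dropout)."""
    total = 0.0
    for xi, wi in zip(x, weights):
        if rng.random() >= p:  # this input survives the dropout mask
            total += wi * xi / (1.0 - p)  # rescale so the expected output is unchanged
    return total


def mc_dropout_predict(x: list[float], weights: list[float], p: float = 0.2,
                       passes: int = 100, seed: int = 0) -> tuple[float, float]:
    """Return (mean, variance) of the prediction over repeated dropout passes."""
    rng = random.Random(seed)
    samples = [dropout_forward(x, weights, p, rng) for _ in range(passes)]
    return statistics.fmean(samples), statistics.pvariance(samples)
```

The variance across passes serves as the uncertainty signal: with `p=0.0` every pass is identical and the variance is zero, while a larger `p` spreads the samples out.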
Expert-grounded ViT attention study testing whether vision models look at the architectural evidence human experts mark as diagnostic.
- Benchmarked DINOv2, DINOv3, MAE, CLIP, SigLIP, SigLIP2, and ResNet-50 on 139 images with 631 expert boxes.
- Found DINOv3 strongest as a frozen spatial-prior model; CLIP and MAE improve after adaptation but on different evidence.
- Used IoU, Coverage, MSE, KL, EMD, calibrated baselines, bootstrap CIs, Wilcoxon tests, and Holm correction.
PyTorch, FastAPI, React, SQLite, Vision Transformers
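Of the statistical tools listed above, Holm correction is the simplest to show compactly. The following is a sketch of the standard Holm-Bonferroni step-down adjustment for a family of p-values, such as those from pairwise Wilcoxon tests; the function name is hypothetical and the study's own analysis code may differ.

```python
def holm_adjust(p_values: list[float]) -> list[float]:
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease along the ranking.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

A comparison is then declared significant when its adjusted p-value falls below the chosen level. Holm controls the family-wise error rate like plain Bonferroni but is never less powerful.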