Hi, welcome to my website! :)
Research
My research explores efficiency problems in machine learning systems, focusing on how intelligent algorithmic design can reduce the resource requirements that limit broader participation in AI development and deployment.
I'm currently investigating several interconnected approaches: agreement-based cascading methods for efficient model routing, semantic approaches to extracting parallelism from natural language queries, forward-pass techniques for model compression, memory-efficient KV-cache management for long-context reasoning, and compression for on-device (edge) deployment. My broader research vision centers on demonstrating that accessibility and performance are not opposing forces, and that thoughtful design choices can achieve both.