AI/ML Engineer specializing in LLM evaluation, RAG architectures, and production MLOps. I build tested systems end to end, from model development to containerized deployment, rather than stopping at notebooks.
What I do
- Build and evaluate LLM systems: prompting strategies, RAG pipelines (FAISS, LangChain), and agentic workflows (LangGraph), with local model serving via Ollama and vLLM.
- Ship production ML: FastAPI services, Docker, GitHub Actions CI/CD, and MLflow, with tested, modular codebases.
- Work across the ML lifecycle: transformer fine-tuning, multimodal models, and on-device/edge inference.
Research
Two papers accepted at ICTIS 2026 (Springer): one on evaluating prompt-design strategies for LLM-based code summarization, and one on multimodal emotion recognition using passive smartphone sensors.
Selected projects
- RetrainRadar: a model-agnostic MLOps service that detects data drift using the Population Stability Index (PSI) and the Kolmogorov-Smirnov (KS) test, and scores retrain-vs-wait cost trade-offs, with an agentic decision module. Built with FastAPI and Docker, with CI/CD and a tested core.
- monoaudit: an open-source toolkit for auditing ML systems for hidden bias, using disaggregated metrics and statistical baselines.
- Multimodal emotion-aware recommendation: an Android app fusing phone sensors and on-device facial detection, serving recommendations via a FastAPI and FAISS pipeline.
Education & certifications
M.Tech, Computer Engineering (CGPA 9.15). AWS Cloud Practitioner Essentials certified. GCP Vertex AI (in progress).
I am open to remote roles and opportunities with visa sponsorship.