AI Engineer specializing in Generative AI and agentic AI systems.
With 1.5 years of hands-on experience across the full AI project lifecycle, I build, optimize, and deploy scalable AI-driven solutions.
- Agentic AI & Orchestration: CrewAI, LangChain Agents
- LLM Inference & Optimization: llama.cpp server, vLLM, Ollama
- Generative AI Applications: Retrieval-Augmented Generation (RAG), Conversational AI, Autonomous Agents
- Fine-tuning & Cloud Training: RunPod GPU Cloud, LoRA/QLoRA fine-tuning
- Deployment & Frontend: Streamlit Apps, Docker, API-driven integrations
- Cloud & Infra: AWS ECR, AWS ECS (Fargate/EC2), Task Definitions, Application Load Balancer, Auto Scaling
- Programming Languages: Python (core), with a focus on ML/AI frameworks
- 1.5 years building and deploying LLM-powered solutions in production.
- Designed & optimized RAG-based chatbots, knowledge assistants, and multi-agent workflows using CrewAI/LangChain.
- Optimized inference paths with llama.cpp/vLLM/Ollama to reduce latency and cost across multi-model deployments.
- Built Streamlit dashboards and conversational UIs for rapid iteration and stakeholder demos.
- Implemented AWS ECR + ECS deployments with robust Task Definitions, environment secrets, autoscaling, and ALB routing.
- Fine-tuned and deployed models on RunPod cloud GPUs, using LoRA/QLoRA for cost-efficient training.