AI Engineering Manager | AI Architect | Principal Data Scientist

Chirag Jain

15+ years building AI solutions • Leading teams to deploy LLMs, AI Agents, and GenAI systems at scale

About Me

Data Science Leader with 15+ years of experience (11 full-time, 4 freelance) designing, deploying, and scaling ML, DL, and LLM-based solutions across diverse industries. Currently focused on Generative AI, with deep expertise in GPT, LLMs, and AI Agents to accelerate innovation and deliver measurable business impact.

I have led high-performing teams at top organizations including Bain & Company, Citi, Barclays, and dunnhumby (Tesco Group), building enterprise-grade AI solutions such as Text-to-SQL systems, multi-domain RAG chatbots with AI agents, and voice analytics pipelines. My work has delivered over $90M in combined revenue impact and cost savings.

Proven track record of helping organizations achieve both topline growth and bottom-line efficiency through strategic AI implementations. I specialize in LLMOps, fine-tuning (LoRA, RLHF), RAG systems, and deploying scalable AI solutions with Docker, Kubernetes, and modern MLOps practices.

Skills & Expertise

LLMs & AI Agents

GPT-4
Llama
Claude
Mistral
LangChain
CrewAI

RAG & Fine-Tuning

Vector Databases
Embeddings
Retrieval Systems
Context Management
Similarity Search
Document Processing

MLOps & LLMOps

Docker
Kubernetes
CI/CD
Model Deployment
Monitoring
A/B Testing

Team Leadership

Team Management
Cross-functional Collaboration
Project Planning
Stakeholder Management
Mentoring
Agile Methodologies

Deep Learning & NLP

Transformers
BERT
PyTorch
TensorFlow
Text Classification
Named Entity Recognition

Data Engineering

PySpark
SQL
ETL Pipelines
Data Warehousing
Big Data Processing
Real-time Streaming

Technical Stack

Python • LangChain • CrewAI • AutoGen • GPT-4 • Llama • Claude • Mistral • Hugging Face • PyTorch • TensorFlow • Docker • Kubernetes • PySpark • AWS • GCP • Azure • Airflow • MLflow • Weights & Biases • SQL • Tableau • Power BI • Git

Featured Projects

Multi-Domain RAG Chatbots with AI Agents

Designed and deployed multi-domain RAG chatbots fine-tuned with LoRA at Bain & Company. Integrated with n8n, LangGraph, and CrewAI for workflow orchestration and multi-agent collaboration, using MCP for context exchange across enterprise systems.

LangGraph • CrewAI • LoRA • n8n • MCP

Text-to-SQL with GenAI & LLMs

Built enterprise Text-to-SQL solutions using Generative AI/LLMs fine-tuned on domain-specific schemas. Improved query accuracy by 30-40% and enabled self-service analytics across business teams, transforming natural language into complex SQL queries.

GPT-4 • LangChain • Fine-tuning • RAG

GenAI Voice Analytics Pipeline

Architected and deployed a GenAI-powered voice analytics pipeline at Citi, leveraging Whisper, LLMs, and embedding-based clustering to process 50M+ customer calls. Extracted complaint themes and sentiment, delivering $14M in cost savings and $72M in incremental revenue.

Whisper • BERT • Flan T5 • Llama • Docker

Personalized Recommendation System

Built a recommendation system for Tesco at dunnhumby, identifying top personalized offers for millions of users. Leveraged collaborative filtering, user-item embeddings, and purchase history analytics on 100B+ retail transactions per year.

PySpark • Collaborative Filtering • GCP • Embeddings

Experience

Data Science Manager - AI Architect

Bain & Company - NPS Prism

Apr 2024 - Present
  • Built and led a high-performing AI team (8 Data Scientists, 2 Data Engineers) to modernize NPS Prism with AI-driven automation, deploying scalable solutions using Docker and Kubernetes
  • Built Text-to-SQL solutions using Generative AI/LLMs fine-tuned on domain-specific schemas, improving query accuracy by 30–40% and enabling self-service analytics across business teams
  • Designed and deployed multi-domain RAG chatbots fine-tuned with LoRA, integrated with n8n, LangGraph, and CrewAI for workflow orchestration and multi-agent collaboration, using MCP for context exchange
  • Improved survey data quality by detecting fraudulent and low-value responses through AI-based content validation, saving $4M annually and enabling more reliable insights
  • Implemented RLHF pipelines to optimize LLM responses for business-specific tasks, aligning model outputs with analyst preferences and improving factual accuracy and relevance by 25%

Data Science Manager (AVP)

Citi

Feb 2023 - Apr 2024
  • Managed a team of 7 data scientists to architect and deploy a GenAI-powered voice analytics pipeline leveraging speech-to-text models, LLMs, and embedding-based clustering to process 50M+ customer calls and extract complaint themes, sentiment, and intent at scale
  • Optimized end-to-end LLMOps workflows using Docker, Kubernetes, and CI/CD for scalable model training and deployment, enabling targeted AI-driven interventions that reduced complaint volume and delivered $14M in cost savings and $72M in incremental revenue
  • Data size: 12 million calls per year, 30 million customers, and 2 billion touchpoints. Tech stack: OpenAI's Whisper, BERT-based Q&A, Flan T5, and Llama

Research Data Science Manager (AVP)

Barclays Investment Bank

Jan 2022 - Dec 2022
  • Led and managed a team of 5 research data scientists to develop deep learning models (LSTMs and transformer-based architectures) that analyzed news articles and quarterly reports to quantify their influence on stock price movements
  • Authored two research papers for Barclays Data & Investment Science, exploring the use of Natural Language Processing to predict stock prices
  • Data size: 500 million news articles and 600 billion market data points. Tech stack: Python, PySpark, Azure, NLP, Transformers, LLMs, Topic Modeling, Regression, Neural Networks

Lead Applied Data Scientist

dunnhumby (part of Tesco Group)

Oct 2018 - Jan 2022
  • Worked with large transaction-level retail datasets to develop and implement ML/DL models that identify optimal pricing, promotion, and assortment strategies for leading global retailers
  • Built a recommendation system for Tesco to identify top personalized offers for users, leveraging collaborative filtering, user-item embeddings, and purchase history analytics
  • Data size: ~100 billion item-level transactions per year. Tech stack: Python, PySpark, GCP

Senior Data Scientist

Tredence

Sep 2015 - Sep 2018
  • Developed machine learning classification models to classify 3M SKUs into a predefined product hierarchy
  • Built unified customer profiles using demographic segmentation and behavioral clustering
  • Designed and enhanced multiple dashboards for a leading retailer tracking customer experience metrics

Data Scientist

Forgify 3D Tech

Jul 2014 - Sep 2015
  • Improved performance accuracy by 25% by creating 20+ dashboards for real-time metric tracking
  • Improved ROAS by 20% by optimizing spending across 3 marketing channels

Freelance Data Scientist

Codementor

Jul 2010 - Jun 2014
  • Completed 300+ Data Science projects with an average rating of 4.98/5
  • Profile: https://www.codementor.io/@chiirag
  • Developed and delivered 150+ projects using various ML/DL techniques
  • Created 50+ dashboards using Tableau, Power BI, Dash, Plotly, Matplotlib, Seaborn

Let's Connect

Interested in collaborating or have a project in mind? I'd love to hear from you!