SURENDIRAN SELVAM

AI Lead Engineer

Building and delivering end-to-end AI platforms with hybrid RAG, scalable AI infrastructure, agentic workflows and multi-agent systems.

Trusted in enterprise-grade AI platforms and live production environments.

10+

Years Engineering

Software & AI

3+

AI Projects

RAG • Agentic AI • Multi-Agent Systems • AI Infra

10+

Automation Projects

Infra • Platform • Dev Productivity

5+

Domain Expertise

ERP • BFSI • Health
Logistics • CMS

2+

Cloud Platforms

GCP • Azure

What I Do

AI Platform Architecture

Architecting enterprise-grade AI platforms from the ground up. Designing scalable backends, distributed services, orchestration layers, cloud-native pipelines, and production observability systems that power mission-critical applications at scale.

RAG & LLMOps

Building production RAG systems with hybrid search, semantic indexing, and comprehensive evaluation workflows. Delivering end-to-end LLMOps solutions with monitoring, versioning, and optimization for reliable AI responses in production environments.

Multi-Agent Systems

Designing intelligent agentic systems with reasoning graphs, tool-calling workflows, and autonomous multi-agent orchestration. Creating systems that handle complex decision-making, planning, and coordination for enterprise-scale use cases.

Infrastructure & Automation

Engineering robust CI/CD systems, automation frameworks, and platform tooling that enhance reliability and accelerate deployment. Building infrastructure that enables teams to ship faster with confidence and maintain production stability.

Featured Projects

🚀 Multi-Agent Product Intelligence Platform

Production-ready multi-agent AI platform demonstrating enterprise-grade architecture with intelligent orchestration, hybrid search, and scalable infrastructure. Built with end-to-end observability and type-safe AI systems.

OpenAI • FastAPI • LangChain • Qdrant • LangSmith • Ragas • Docker • GCP

Technical Skills

AI & LLM Engineering

  • RAG Pipelines
  • Hybrid Search (Semantic + Keyword)
  • LangChain / LangGraph
  • LangSmith (Observability)
  • Embedding Models
  • Prompt Engineering & Context Design
  • Vector Databases (Qdrant)
  • RAG Evaluation (RAGAS)
  • Generative AI / Foundation Models
  • Agent-to-Agent Protocol (A2A)
  • Model Context Protocol (MCP)

Backend & API Engineering

  • Python
  • Java
  • FastAPI
  • REST / GraphQL
  • Async Architectures & Middleware Systems
  • SQL Databases

Cloud & Infrastructure

  • GCP Vertex AI
  • Microsoft Azure
  • Docker
  • Kubernetes
  • Jenkins
  • Prometheus / Grafana
  • Terraform
  • Linux
  • LLM Serving & Inference (vLLM)
  • ELK Stack (Elasticsearch, Logstash, Kibana)

Automation & Dev Productivity

  • CI/CD Automation
  • Selenium
  • Rest Assured
  • Pytest
  • JUnit
  • GitHub Actions
  • AI-Assisted Development (GitHub Copilot, Cursor)

Experience

Senior Engineer

Current

Jan 2024 – Present

EPAM Systems
Python • LangChain • LangGraph • LangSmith • OpenAI • Gemini • Groq • LiteLLM • Instructor • Pydantic • Qdrant • Hybrid Search (BM25 + Semantic) • RRF • RAGAS • FastAPI • PostgreSQL • EliteA
  • Led end-to-end Gen AI platform delivery for a Fortune client: Confluence-to-RAG ingestion with near real-time KB updates, multi-agent workflows (90% accuracy), and hybrid retrieval optimization (25–35% improvement)
  • Architected AI governance framework with PII redaction, bias mitigation, and reliability controls, ensuring production compliance and improving query quality
  • Optimized platform performance (20–25% latency reduction) and operational efficiency (50–60% triage time reduction) using LLMOps observability with LangSmith
  • Drove AI vendor selection and cost optimization (40% reduction) while maintaining quality, influencing vector database and retrieval architecture decisions

Lead Automation Engineer (Automation Platform)

Aug 2021 – Jan 2024

Encora Innovation Labs
Azure DevOps • Azure Pipelines • Jenkins • Docker • Docker Compose • Kubernetes • Helm • ELK Stack • Prometheus • Grafana • SonarQube • Selenium Grid • Java 17 • Selenium • Rest Assured • Spring Boot • JMeter • Postman
  • Built Gen AI-powered RAG system for CI/CD failure triage: indexed pipeline logs, ELK dashboards, and known-fix notes to generate root causes and remediation steps with citations, reducing SME dependency and improving incident response speed
  • Developed and operated a shared automation platform for 8 teams (99%+ success rate): optimized CI/CD pipelines (50–70% manual effort reduction), standardized Kubernetes deployments (setup time cut from hours to under 1 hour), and scaled Selenium Grid (50+ parallel sessions)
  • Centralized observability infrastructure: ingested tens of GBs of logs daily into ELK/Kibana, implemented Prometheus/Grafana monitoring, reducing diagnosis time by 40–50% and infrastructure failures by 30–40%

Software Engineer (Automation & Platform)

Aug 2019 – 2021

ASG Technologies
Java • Selenium WebDriver • JUnit 4 • Maven • Jenkins • Docker • Kubernetes • Helm • Tomcat • Oracle WebLogic • PostgreSQL • SQL Server • Apache httpd • Nginx • SoapUI Pro • Postman • Swagger • JMeter
  • Led a team of 3 engineers, delivering an automation platform with containerized environments (Docker/Kubernetes) and CI/CD pipelines, reducing provisioning time by 60% and enabling zero-downtime deployments

Software Engineer

Jun 2018 – Jul 2019

Wipro
Java • Selenium WebDriver • JUnit 4 • Maven • Jenkins • Docker • Kubernetes • Helm • Tomcat • Oracle WebLogic • PostgreSQL • SQL Server • Apache httpd • Nginx • SoapUI Pro • Postman • Swagger • JMeter
  • Contributed to Mobius View (Federated Enterprise Content Search & Archive): built containerized test environments, developed CI/CD pipelines, and maintained API automation for multi-source content management platform

Software Engineer

Mar 2018 – May 2018

OASYS Cybernetics
Java 1.8 • Selenium 3.6.0 • TestNG • Maven • Jenkins • PostgreSQL 9 • Apache Tomcat 8 • Postman • SQL
  • Worked on TNPDS (Tamil Nadu Public Distribution System): developed test automation for large-scale statewide retail network (~25,000 distribution points) using Java, Selenium, and CI/CD pipelines

Software Engineer

Jun 2015 – Feb 2018

Finatel Technologies
Java 1.8 • Selenium 3.6.0 • TestNG • Maven • Jenkins • PostgreSQL 9 • Apache Tomcat 8 • Postman • SQL
  • Developed test automation for TNPDS (Tamil Nadu Public Distribution System): achieved 70% automation coverage using Java/Selenium, reducing maintenance effort by 40% and supporting large-scale statewide retail network

Certifications

Google Cloud Professional Machine Learning Engineer

Google Cloud

October 2025

Certified Kubernetes Administrator (CKA)

Cloud Native Computing Foundation

September 2025

Microsoft Azure AI Engineer Associate

Microsoft

April 2025

Microsoft Azure Fundamentals

Microsoft

April 2025

About Me

I architect and build enterprise AI platforms that solve complex business challenges at scale. With 10+ years of engineering experience, I specialize in designing production-grade AI systems from the ground up — combining deep technical expertise with strategic thinking to deliver solutions that are both innovative and reliable.

My approach centers on clean architecture, scalable infrastructure, and production-first design. I've led the development of multi-agent AI systems, hybrid RAG platforms, and observability frameworks that power real-world applications serving thousands of users.

Beyond code, I focus on the entire lifecycle: architecture design, implementation, deployment, monitoring, and continuous optimization. I believe great AI systems are built on solid engineering foundations — proper observability, robust error handling, and thoughtful design patterns that enable teams to iterate quickly and deploy with confidence.

Connect

Let's Build Something Great Together

Interested in discussing AI architecture, production systems, or technical consulting? Let's explore how we can collaborate on your next AI initiative.

© 2026 Surendiran Selvam. All rights reserved.

Next.js • TypeScript • Engineered with AI

Built with ❤️ by Surendiran