AI & ML
GenAI Developer / AI-ML Engineer
We are looking for a GenAI/AI-ML engineer to build and own core LLM/RAG services, agentic workflows, and the integration layer that connects AI capabilities to our platform. You will work Python-first, ship streaming APIs, manage prompt lifecycles, and ensure safety, observability, and performance at scale.
What We're Looking For
Everything you need to know about this role — responsibilities and the skills we value.
Key Responsibilities
- Build LLM/RAG services in Python (FastAPI/asyncio, Pydantic) with clean APIs and tests.
- Implement agentic AI workflows — tool-using agents with planning, memory, multi-step execution, and recovery/fallback paths.
- Stream responses server-side from model to UI (SSE/WebSockets with retries, backoff, partial responses, and cancellation).
- Build RAG pipelines: ingestion, chunking, embeddings, indexing, reranking, and grounded answers with citations (pgvector/FAISS/Pinecone).
- Manage prompt versioning, templates/params, safe fallbacks, and rollbacks.
- Run evaluations: hallucination/groundedness checks, regression suites for prompts and retrievers.
- Implement guardrails: PII detection/redaction, content safety, and domain constraints.
- Optimise for performance and cost: context reduction, caching/batching, request pacing, and rate-limit handling.
- Design and ship REST/gRPC endpoints that orchestrate tools, retrieval, and post-processing (citations/formatting).
- Implement and consume MCP (Model Context Protocol) tool adapters — files, web, DB connectors — with capability negotiation, auth/permissions, and resource limits.
- Integrate vector stores (pgvector/FAISS/Pinecone) plus DB/file stores and enterprise connectors (SharePoint, Confluence, Slack, Jira).
- Own AuthN/Z and tenancy: JWT/OAuth, role/tenant isolation, secrets management, and audit logging for AI actions.
- Set up observability: logs, metrics, traces, and dashboards/alerts for latency, error rate, and token spend.
- Manage queues/workflows for background ingestion/summarisation with idempotency and retries.
- Enforce data governance: input/output validation, schema contracts, PII handling, and retention policies.
- Maintain CI/CD for prompts, retrieval configs, and API changes; support blue/green or canary releases.
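To give a flavour of the RAG ingestion work above, here is a minimal sketch of the chunking step (splitting documents into overlapping windows before embedding and indexing). The function name, sizes, and overlap strategy are illustrative, not a prescribed implementation; production pipelines typically chunk on token counts and document structure rather than raw characters.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size chunks for embedding/indexing.

    Overlap preserves context across chunk boundaries so that a passage
    straddling two chunks can still be retrieved from either one.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks
```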
Required Skills & Qualifications
- Strong Python skills — FastAPI, asyncio, Pydantic; clean, tested, production-ready code.
- Hands-on experience building LLM-powered applications (OpenAI, Gemini, Llama, or similar).
- Experience with RAG architectures and vector databases (pgvector, FAISS, Pinecone, or similar).
- Familiarity with agentic frameworks (LangChain, LlamaIndex, CrewAI, or custom implementations).
- Knowledge of streaming APIs: SSE and WebSockets.
- Understanding of prompt engineering, prompt versioning, and evaluation techniques.
- Experience integrating REST/gRPC APIs and enterprise connectors.
- Awareness of AI safety, guardrails, PII handling, and responsible AI practices.
- Familiarity with observability tooling (OpenTelemetry, Grafana, Prometheus, or similar).
- Good understanding of CI/CD pipelines and DevOps practices in an AI/ML context.
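As a concrete example of the rate-limit handling mentioned above, a retry wrapper with exponential backoff and jitter might look like the following. This is a minimal stdlib-only sketch; the function name and parameters are illustrative, and real services would catch provider-specific rate-limit errors rather than a bare `Exception`.

```python
import random
import time


def call_with_backoff(fn, max_attempts: int = 5, base_delay: float = 0.5):
    """Retry fn with exponential backoff plus jitter.

    Delay doubles each attempt; random jitter spreads retries out so
    many clients hitting the same rate limit don't retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            time.sleep(delay)
```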
Job Overview
Department
AI & ML
Location
Bengaluru, India
Job Type
Full-Time
Work Mode
On-site