Senior AI Engineer
How to Apply:
Please submit your application to [email protected]
Job Title: Senior AI Engineer
Location: Bangalore, India
Department: IT
Position Summary:
As PennEngineering accelerates its Speed of Now transformation (respond in 1 hour, quote in 1 day, samples in 1 week, finished product in 1 month), we are building an internal capability to design, develop, and deploy AI-powered workflows, automation, and agentic solutions that improve speed, consistency, and quality across the business.
The Senior AI Engineer plays a pivotal role in this transformation: setting technical direction for the AI engineering team, designing systems that are secure, observable, and maintainable at enterprise scale, and ensuring that agentic solutions deliver reliable, measurable business value. You bring a combination of deep AI engineering capability and strong engineering fundamentals — distributed systems design, API architecture, infrastructure, and data engineering — that allows you to own technical quality end-to-end. You are a force multiplier: your architecture decisions, code reviews, and technical mentorship raise the output quality of the entire team.
Key Responsibilities:
- AI Architecture & Technical Leadership
- Define and evolve the AI engineering architecture for PennEngineering’s agentic application platform — including agent orchestration patterns, memory and context management, tool registries, and evaluation infrastructure
- Lead the technical design of complex, multi-agent systems involving planning, delegation, parallelism, and dynamic tool selection
- Establish engineering standards for prompt management, agent versioning, evaluation harnesses, and production observability
- Drive architecture decisions that balance capability, cost, latency, safety, and maintainability across the agent portfolio
- Evaluate and adopt emerging tools, frameworks, and patterns including Model Context Protocol (MCP) and new model releases — with sound technical judgement
- Own the team’s AI-assisted coding toolchain (Cursor, Claude Code, Amazon Kiro, or similar): define, document, and continuously improve the coding workflows and standards the team uses with these tools, and stay current with how the tooling landscape evolves
- End-to-End System Design
- Design scalable backend systems and service architectures that support AI workloads including asynchronous processing, event-driven architectures, and stateful orchestration
- Own the design of data pipelines that supply AI agents with clean, governed, timely data, from ingestion and transformation through to vector storage and retrieval
- Design robust API layers, integration patterns, and service boundaries that allow AI agents to interact safely with enterprise systems at scale
- Architect infrastructure for AI environments using Terraform or AWS CDK — including networking, IAM, secrets management, compute, and storage
- Define and implement deployment strategies (blue/green, canary, feature flags) appropriate for AI systems, where model behavior changes require careful rollout
- Agile Delivery & Integration
- Operate in an iterative agile model:
- User-story intake and prioritization
- AI system design, build, and SME validation
- Pilot deployment and data-driven refinement
- Establish a predictable pipeline and regular release cadence
- Ensure traceability from user story to deployed solution to measured business outcome
- Integrate AI agents with enterprise platforms including ERP, CRM, document management, and structured/unstructured data stores
- Collaborate with IS solution architects to align designs with enterprise architecture standards
- Ensure all AI solutions meet corporate security, compliance, and audit requirements
- Production Reliability & Observability
- Establish observability standards: structured logging of agent reasoning traces, token usage tracking, latency profiling, cost attribution, and quality drift detection
- Design and implement automated evaluation pipelines that run regression tests against production agent behavior on every deployment
- Define SLOs and operational runbooks for AI services; lead incident response and root-cause analysis for production issues
- Implement guardrails, circuit breakers, and fallback strategies for agent systems operating in high-stakes enterprise contexts
- Partner with IS/IT security and compliance teams to perform risk assessments and support internal audits of AI systems
- Team Leadership & Mentorship
- Provide technical mentorship to AI Engineers and Associate AI Engineers through code reviews, pairing sessions, and design discussions
- Lead architectural review sessions and champion engineering quality, testing discipline, and documentation standards
- Collaborate with the Principal Systems Architect on roadmap prioritization, resource planning, and cross-functional delivery
Requirements:
- 6–8 years of overall software engineering experience, with at least 3 years focused on AI/LLM application development and 1+ years designing multi-agent or complex agentic systems in production
- Proven ability to design and deliver end-to-end technical systems from data and infrastructure through application logic to monitoring and operations
- Deep expertise in Python and strong proficiency in at least one additional language (TypeScript/Node.js, Java, or Go) used in backend or integration contexts
- Advanced experience with agentic frameworks: LangGraph, CrewAI, AutoGen, AWS Bedrock Agents, or custom orchestration — including multi-agent coordination, state management, and tool-use patterns
- Production-grade experience with RAG systems at scale: advanced retrieval strategies, hybrid search, re-ranking pipelines, evaluation, and knowledge base maintenance
- Hands-on infrastructure engineering experience: Terraform or AWS CDK, CI/CD pipeline design, container orchestration (ECS or EKS), and IAM/security configuration on AWS
- Experience designing and operating distributed backend systems: event-driven architectures, async processing, API design, and service integration patterns
- Strong track record of production observability: structured logging, distributed tracing, metrics, alerting, and cost management for cloud-native AI workloads
- Deep, hands-on experience with AI-assisted coding tools (Cursor, Claude Code, Amazon Kiro, or similar), including the ability to design, document, and govern team-wide coding workflows that leverage these tools, evaluate new entrants in the space, and drive adoption best practices across the engineering team
- Demonstrated ability to mentor engineers and lead technical design discussions with diverse stakeholders
- Bachelor’s degree in Computer Science, Engineering, or a related technical field; advanced degree a plus
Preferred Qualifications:
- Experience with AWS Quick Suite or similar enterprise AI platforms
- Experience designing AI platforms or internal developer tooling that enables other engineers to build and deploy agents more efficiently
- Familiarity with Model Context Protocol (MCP), agent interoperability standards, and tool-use specifications
- Experience with fine-tuning, RLHF, or adapting open-source models for domain-specific enterprise tasks
- Knowledge of data platform engineering: Snowflake, Databricks, dbt, or similar for managing the data layer that feeds AI systems
- Experience with AI governance frameworks, responsible AI practices, or ML model risk management in regulated or enterprise environments
- Experience in manufacturing, supply chain, or industrial environments
- Exposure to frontend or UI engineering sufficient to design thin interfaces, dashboards, or operator tooling that complements AI back-end systems
