Responsibilities
- Develop, deploy, and maintain LLM-powered applications supporting natural language understanding, conversational search, and intelligent automation.
- Build and optimize RAG pipelines, including embeddings, chunking strategies, retrieval logic, ranking, evaluation, and continuous improvement loops.
- Design and implement agent-based workflows for reasoning, tool usage, routing, and structured task execution.
- Apply advanced prompt engineering techniques, including chain prompting, few-shot prompting, and structured prompting, to improve LLM reliability and output quality.
- Architect and implement high-performance applications and scalable backend systems that incorporate AI capabilities into diverse solutions.
- Build full-stack features using React (or similar), Node.js/Python, REST/GraphQL APIs, and modern backend patterns.
- Deploy and operate AI applications in cloud environments (AWS, Azure, GCP) using Docker, Kubernetes, and CI/CD tooling.
- Ensure system-level reliability with robust monitoring, observability, telemetry, and performance metrics for LLM workloads.
- Partner with product managers, business teams, and engineers to align AI initiatives with business priorities and user needs.
- Translate technical concepts through clear presentations, design reviews, and documentation, ensuring shared understanding across teams.
- Participate in scoping, requirements definition, and architectural discussions, influencing design decisions and system roadmaps.
- Implement evaluation frameworks for RAG, agents, and LLM responses (latency, retrieval accuracy, hallucination detection, etc.).
- Continuously refine data quality, retrieval accuracy, indexing strategies, and model selection processes.
- Drive process improvements across the AI development lifecycle to enhance system robustness and engineering productivity.
- Diagnose and resolve issues across retrieval layers, LLM integration, application logic, and infrastructure.
Core Qualifications
- Experience building scalable, cloud-native applications.
- Experience designing and deploying Generative AI applications or LLM-powered systems.
- Practical experience with OpenAI GPT, Azure OpenAI, AWS Bedrock, Gemini, Vertex AI, or similar LLM ecosystems.
- Strong background in designing and optimizing RAG architectures and retrieval workflows.
- Experience with at least one orchestration or agent framework.
- Experience with vector databases, such as Pinecone, Weaviate, FAISS, Milvus, or similar technologies.
- Experience implementing and evaluating embedding models and retrieval pipelines (dense, hybrid, cross-encoder).
- Proficiency with prompt engineering, including chain prompting, iterative refinement, and structured output patterns.
- Strong frontend and backend development experience, including Node.js/Python, REST APIs, and SQL/NoSQL databases.
- Deployment experience with Docker, Kubernetes, and CI/CD pipelines across AWS, Azure, or GCP.