We're looking for someone who goes beyond prompt tuning: an engineer who has built, deployed, and optimized real RAG systems and LLM pipelines at scale.
Meruya, West Jakarta
Full-time
Apply / refer
What You'll Do:
- Lead the design and deployment of production-grade RAG and LLM systems
- Optimize retrieval accuracy, context quality, and model performance at scale
- Build robust data pipelines for ingestion, chunking, embedding, and indexing
- Collaborate with data engineers, software teams, and product leads to integrate AI into core features
- Set best practices for prompt engineering, evaluation, and monitoring
- Mentor junior engineers and drive technical direction for GenAI projects
Who You Are:
- 4+ years in ML engineering, with 1.5+ years focused on LLMs and RAG systems
- Deep hands-on experience with LangChain, LlamaIndex, and vector databases (Pinecone, Weaviate, FAISS)
- Strong in Python, PyTorch/TensorFlow, and MLOps tools (MLflow, Kubernetes, Docker)
- Experience deploying models on AWS, GCP, Azure, or hybrid environments
- Familiar with evaluation frameworks, A/B testing, and latency optimization
- Bonus: experience with fine-tuning, LoRA, or distillation for domain-specific performance
- Practical mindset: you care about reliability, cost, and impact, not just novelty
Perks & Benefits:
- Hybrid work mode (2 days WFH, 3 days WFO)
- Daily meals provided
- Health insurance coverage
- Clear career development path
- A collaborative and growth-driven team culture