The Machine Learning Engineer (MLE) ensures that AI and machine learning models move from experimentation to production through well-structured, automated, and continuously improving pipelines. The role is responsible for code refactoring, CI/CD updates, pipeline monitoring in the experiment layer, and model retraining, so that AI systems remain scalable, stable, and high-performing throughout their lifecycle. The MLE makes model code production-ready, keeps pipelines automated and monitored, and retrains models regularly to sustain accuracy and business value. The role combines strong engineering discipline with proactive problem-solving to make AI operationally reliable and value-generating across all domains of the AI CoE.
Responsibilities:
- Refactor and modularize data science code for scalability, efficiency, and deployment readiness.
- Maintain and enhance CI/CD pipelines to automate model testing, packaging, and deployment.
- Monitor ML pipelines in the Vertex AI experiment layer; ensure job completion, accuracy, and resource optimization.
- Execute scheduled or trigger-based retraining to maintain model performance and counteract drift.
- Manage code and pipeline versioning; update documentation and logs after every retrain or release.
- Partner with Data Scientists and MLE Ops to troubleshoot deployment or retraining issues.
Qualifications:
- Bachelor's degree in Computer Science, Data Engineering, or Artificial Intelligence.
- 5–8 years of experience in machine learning engineering, MLOps, or AI deployment environments.
- Proficient in Python, Git, and cloud ML platforms such as Vertex AI (GCP) or equivalent (AWS SageMaker, Azure ML).
- Strong experience in CI/CD pipeline automation and code optimization for production deployment.
- Understanding of model retraining workflows, monitoring, and data drift detection.
- Excellent problem-solving, documentation, and collaboration skills in agile, cross-functional teams.