Oona Insurance is seeking a Lead DevOps Engineer to drive the design, implementation, and operation of our next-generation cloud infrastructure. This role combines hands-on technical expertise with leadership responsibilities, guiding a high-performing DevOps team in building secure, scalable, and resilient environments on AWS, with a strong focus on serverless architectures.
You'll play a key role in shaping our cloud strategy, mentoring engineers, and ensuring seamless integration between infrastructure and applications, while championing automation, observability, and reliability across the platform.
Key Responsibilities
- Lead and mentor the DevOps team in designing, deploying, and managing infrastructure on AWS serverless services (Lambda, API Gateway, EventBridge).
- Manage VPCs, load balancers, networking, and IAM policies to keep the platform reliable and secure.
- Drive adoption of Infrastructure as Code (IaC) using Terraform for provisioning, configuration, and scaling.
- Design, implement, and maintain CI/CD pipelines in Jenkins to ensure efficient and reliable software delivery.
- Implement robust monitoring, logging, and alerting solutions (CloudWatch, X-Ray, ELK, Grafana, Prometheus).
- Collaborate with development and architecture teams for seamless application-to-infrastructure integration.
- Lead incident response efforts, conduct root cause analysis, and implement reliability improvements.
- Continuously optimize AWS infrastructure for performance, security, and cost efficiency.
- Foster a culture of ownership, accountability, and continuous learning within the DevOps team.
- Act as a bridge between engineering leadership and technical teams, aligning DevOps practices with broader business goals.
Qualifications
- Bachelor's degree in Computer Science, IT, or a related field (or equivalent work experience).
- 6+ years in DevOps, SRE, or Cloud Engineering, including 1–2 years in a leadership role.
- Deep expertise with AWS services, particularly serverless (Lambda, API Gateway, EventBridge).
- Strong hands-on experience with Terraform (modules, state management, automation).
- Proficiency in Jenkins and CI/CD pipeline design.
- Solid understanding of VPC design, networking, load balancers (ALB/NLB), and IAM security.
- Scripting/programming proficiency (Python, Bash, or similar).
- Experience with observability tools (CloudWatch, ELK, Grafana, Prometheus).
- Familiarity with containers (ECS, EKS, Docker) is a plus.
- Strong leadership, communication, and collaboration skills.
- Proven ability to inspire, coach, and develop high-performing engineering teams.
- Skilled in balancing technical depth with strategic decision-making to guide long-term platform vision.