Senior DevOps / Jakarta

Responsibilities

  • Champion and implement a culture of SRE to maintain a high-quality platform infrastructure
  • Champion and implement application and infrastructure monitoring and alerting to prevent client-impacting issues, ensuring system availability, performance, and scalability in line with SLOs and SLAs
  • Optimize application performance at scale
  • Define and support continuous integration and deployment (CI/CD) pipelines; handle bare-metal hardware, cloud instances, containers, sandboxes, networking, VPNs, storage, databases, caches, websites, monitoring, logging, backups, ETL, security of web services, etc.
  • Dive deep into technology and stay on the forefront of the latest tools, technologies, and strategies; help evaluate, prototype, and integrate them into work processes
  • Perform with broad independence and deliver on project milestones and tasks on schedule while communicating progress regularly
  • Build strong relationships with SRE team members and software engineering teams to hold each other accountable for quality expectations
  • Evangelize best practices, eliminate bottlenecks, and improve processes
  • Maintain monitoring, logging, and backup systems
  • Respond quickly to breakage and security incidents
  • Author documentation and guides for infrastructure and tooling
  • Collaborate with developer teams to ensure timely delivery for Sandbox and Production
  • Write automated tests to ensure error-free, performant code
  • Implement best practices in security and data protection

Requirements

  • Bachelor's or Master's degree in Computer Science, or equivalent work experience
  • 5+ years demonstrating hands-on technical leadership and business impact in combining software skills with systems to solve complex automation and reliability challenges
  • 5+ years working with various cloud providers, containerization technologies, automated deployment frameworks, orchestration frameworks, monitoring, logging, alerting, system internals, networking, databases, distributed systems, and service-oriented architecture
  • 5+ years instrumenting proactive alerting and monitoring systems (e.g., Splunk, Grafana, New Relic, Datadog, VictoriaMetrics)
  • Strong experience with automation tools such as Rundeck and Ansible
  • 3+ years of experience writing software in a modern programming language or framework such as C#.NET, Java, JavaScript, or React
  • 3+ years of experience with CI/CD tools such as Jenkins, GitLab, and GitHub Actions
  • Proven track record of implementing load, stress, performance, and reliability testing standards at scale to improve service, platform, and infrastructure resiliency
  • Strong experience with the ELK stack and familiarity with microservices architectures