Site Reliability Engineer (Marketplace) - Sea Labs

Engineering and Technology - Sea Labs Indonesia, Jakarta

Entry Level

The Engineering and Technology team is at the core of the Shopee platform development. The team is made up of a group of passionate engineers from all over the world, striving to build the best systems with the most suitable technologies. Our engineers do not merely solve problems at hand; we build foundations for a long-lasting future. We don't limit ourselves on what we can or can't do; we take matters into our own hands even if it means drilling down to the bottom layer of the computing platform. Shopee's hyper-growing business scale has transformed most "innocent" problems into huge technical challenges, and there is no better place to experience it first-hand if you love technologies as much as we do.

About Team

The mission of the SRE (Site Reliability Engineer) team is to keep Shopee running efficiently and sustainably 24x7, and to build and maintain large-scale, highly available, high-performance distributed systems with a focus on system availability and performance. The team combines traditional software engineering with technical operations. SREs dive deep into the Shopee development lines to ensure that the system remains highly scalable as it evolves rapidly. From the perspective of stability and performance, the work covers the design of business applications, components of the basic platform (middleware, container scheduling, caching, object storage, etc.), OS optimisation, and data center and network optimisation. We replace the inefficient and complicated procedures of the traditional operation and maintenance model with engineering and service-based approaches, and we are committed to building a sound monitoring system to improve the efficiency of incident handling.

Job Description
  • Deep dive into development lines, learn and understand the mechanism of every application component, and promote product scalability, stability, and performance.
  • Set up, manage, and maintain Shopee product/middleware/big-data applications and services.
  • Perform regular and ad-hoc server-side deployments, make improvements to performance, and troubleshoot.
  • Design and develop automated technical operation platforms.
  • Manage capacity and resources.
  • Run full-chain stress tests to improve application performance and remove redundancy.
  • Prepare routine operation documentation.
Job Requirements
  • Bachelor's degree or above in Computer Science, Engineering, Information Systems, or related fields.
  • More than 2 years of relevant experience (candidates with no working experience are welcome to apply).
  • Extensive hands-on knowledge of Linux operating systems (Ubuntu, CentOS, etc.).
  • Highly familiar with computer networks (TCP/IP, DNS, etc.), computer organisation, and operating systems.
  • Hands-on experience with at least one of the following programming languages: Bash, Python, Go.
  • Strong analytical and problem-solving skills with the ability to thrive in a dynamic work environment.
  • Passionate, with a strong sense of responsibility.
  • Fast learner and a good team player.
  • Agile and detail-oriented.

Skills below are optional but preferred:

  • Experience with automation tools such as Ansible or SaltStack.
  • Experience with monitoring tools such as Prometheus, Zabbix, Grafana, etc.
  • Experience with load balancing tools such as LVS, Nginx, OpenResty, or HAProxy.
  • Experience with container technology such as Docker or Kubernetes.
  • Experience with High Availability system design and Server Deployment Process.
  • Experience with SRE.
  • Experience with Ops PaaS platform or Ops automation platform (e.g., CMDB).

*Please note that you can only apply for up to 2 positions in a given period.
