Site Reliability / Jakarta

About the job: Site Reliability Engineer (SRE) Lead – Data Center Operations

VIDA Digital Identity is Indonesia's leading provider of digital identity verification, digital signature, and trust services, serving enterprises and government institutions with high standards of security, compliance, and reliability.

We are seeking an experienced Site Reliability Engineering (SRE) Lead to drive the reliability, scalability, and operational excellence of VIDA's core infrastructure across both data centers and cloud environments.

The ideal candidate will have deep expertise in data center operations, infrastructure reliability, and automation, with strong experience in regulated SaaS environments.

Responsibilities

1. Site Reliability & Infrastructure Management

  • Lead the SRE function to maintain high availability and performance across all environments.
  • Manage robust, scalable, and secure infrastructure supporting VIDA's digital identity and trust platforms.
  • Establish monitoring, alerting, and incident response systems to proactively detect and mitigate service disruptions.
  • Drive automation in deployment, scaling, and recovery processes to reduce manual effort.

2. Data Center Operations

  • Oversee VIDA's physical and hybrid data center operations, ensuring performance, security, and uptime SLAs.
  • Collaborate with network engineers, cloud architects, and system admins to maintain seamless connectivity and integration.
  • Establish and maintain Disaster Recovery (DR) and Business Continuity Plans (BCP) aligned with service obligations.

3. Reliability Engineering & Continuous Improvement

  • Build and maintain observability frameworks for system health and performance monitoring.
  • Conduct root cause analyses (RCA) for incidents and implement corrective actions.
  • Partner with development teams to embed reliability and performance improvements into the software delivery process.

4. Leadership & Team Development

  • Lead and mentor a team of SREs and infrastructure engineers.
  • Collaborate cross-functionally with Engineering, Security, Compliance, and Product teams.
  • Establish and maintain documentation and standard operating procedures (SOPs) for infrastructure management.

Qualifications & Experience

Must Have:

  • Bachelor's degree in Computer Science, Information Systems, or a related technical field.
  • 8+ years of experience in SRE, Infrastructure, or DevOps roles — with at least 3 years in a leadership position.
  • Strong technical expertise in data center operations, networking, load balancing, storage systems, and server infrastructure.
  • Strong knowledge of networking (TCP/IP, BGP routing, switching, VLANs, firewalls, VPNs, Transit IP).
  • Experience managing hybrid infrastructure environments (on-premise and cloud).
  • Experience with Linux systems administration, containerization (Docker/Kubernetes), and Infrastructure as Code (Terraform, Ansible).

Preferred:

  • Experience in SaaS or regulated industries.
  • Familiarity with cryptographic systems, PKI, and HSM management.