Senior Server / Jakarta

We are seeking a Senior Hardware & Software Operations Engineer to take responsibility for the maintenance and repair of B200 GPU/X86 servers. The role requires deep hardware knowledge and strong troubleshooting skills to keep server performance optimal.

Key Responsibilities

  • Diagnose circuit board faults in B200 GPU/X86 servers.
  • Perform regular hardware maintenance and inspections to prevent potential issues.
  • Manage hardware assets, including repair records, part replacements, and inventory tracking.
  • Work with vendors and manufacturers to resolve complex hardware issues, including warranty and RMA services.
  • Prepare and update hardware maintenance reports; record, track, and summarize repair, maintenance, and testing results in a results-oriented format.
  • Provide technical support for urgent hardware-related incidents.
  • Analyze hardware failure patterns and propose improvement measures to reduce future fault rates.
  • Participate in hardware upgrades and configuration management to ensure continuous optimization.
  • Provide remote Tier-2 support for service tickets in other regions as needed.

Qualifications

  • Bachelor's degree or above in Electronics Engineering, Computer Science, or related fields.
  • Minimum 3 years of experience in server/Linux system operations, particularly in GPU server environments.
  • Proficient in analyzing hardware and OS logs to extract key information and identify fault points.
  • Ability to read and understand circuit diagrams and hardware specifications.
  • Strong communication and teamwork skills.
  • Willingness to work on-call, including night shifts and weekends when required.

Technical Skills

  • Familiar with major server brands (H3C, Inspur, Lenovo, DELL, HyperFusion, etc.); capable of independent server hardware diagnosis and repair, including firmware upgrades and troubleshooting.
  • In-depth understanding of B200 GPU/X86 server architecture and components.
  • Experienced with server hardware monitoring and management tools.
  • Basic networking knowledge and ability to resolve server network connectivity issues.
  • Familiar with asset management systems and related tools.
  • Strong documentation skills, attention to detail, a sense of responsibility, and excellent service awareness and teamwork.

Preferred Qualifications

  • Relevant IT hardware certifications; vendor repair engineer or Red Hat certification is a plus.
  • Experience in data center design and construction.
  • Programming skills for developing automation scripts to enhance operational efficiency.
  • Strong communication and customer service orientation.
  • Good command of English, both written and spoken.