About LINE MAN Wongnai

LINE MAN Wongnai is Thailand’s leading on-demand delivery and lifestyle e-commerce platform. We build technology to help Thai people live better and to empower local businesses by creating an end-to-end food ecosystem through our channels, LINE MAN and Wongnai. We connect consumers, riders, and local businesses with restaurants nationwide, improving daily life for all parties. And because we are local, we provide the deepest variety and services that are tailor-made for Thai people.

We are looking for a candidate with experience in Identity Access Management roles, a solid grounding in security principles and baselines, and expertise in security concepts. Working in a fast-paced environment, you will bring your expertise and skills to tackle challenges that impact millions of people on our journey to become the No.1 food platform in Thailand.

Position Overview:

The Head of Site Reliability Engineering (SRE) is a pivotal leadership role responsible for the performance, reliability, and scalability of our critical systems and services. This individual will lead a team of SREs to design, build, and maintain infrastructure, ensuring high availability and optimal performance. The ideal candidate will have a blend of technical expertise, strategic vision, and leadership skills.

Key Responsibilities:

Leadership and Strategy:

  • Develop and execute the SRE strategy to support company goals.
  • Lead and mentor the SRE team, fostering a culture of continuous improvement and innovation.
  • Collaborate with other engineering and product teams to align on priorities and deliverables.

Infrastructure and Operations:

  • Oversee the design, implementation, and maintenance of scalable, reliable infrastructure.
  • Ensure the efficient operation of systems, focusing on automation, monitoring, and performance optimization.
  • Implement best practices for incident response and management, including root cause analysis and post-mortems.

Performance and Reliability:

  • Define and monitor key performance indicators (KPIs) and service-level objectives (SLOs).
  • Proactively identify and mitigate potential reliability issues before they impact customers.
  • Drive the adoption of robust monitoring and alerting solutions to ensure system health and performance.

Security and Compliance:

  • Ensure systems are secure and compliant with relevant regulations and standards.
  • Collaborate with the security team to implement and maintain best-in-class security practices.

Collaboration and Communication:

  • Act as a liaison between engineering, product, and other key stakeholders.
  • Communicate effectively with technical and non-technical audiences, including executive leadership.

Qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
  • Minimum of 10 years of experience in software engineering, infrastructure, or operations roles.
  • At least 5 years in a leadership position within a high-growth technology company.
  • Deep understanding of cloud platforms (AWS, GCP, Azure) and of containers and container orchestration (Docker, Kubernetes).
  • Proficiency in scripting and programming languages (Python, Go, Ruby, etc.).
  • Experience with CI/CD pipelines, infrastructure as code (Terraform, Ansible), and monitoring tools (Prometheus, Grafana, Datadog).
  • Strong leadership and people management skills.
  • Excellent problem-solving and analytical abilities.
  • Effective communication and collaboration skills.