Senior SRE Engineer
FULL-TIME | TAOYUAN, Taiwan
We are looking for an experienced Senior SRE Engineer with a minimum of 3 years of cloud SRE or DevOps-related experience. The ideal candidate will have a proven track record in automating the build, delivery, and management of infrastructure and applications across Non-Production and Production environments. Strong expertise in monitoring and logging solutions, public cloud platforms, microservices, containers, and cloud networking concepts is essential. This role also involves incident management and on-call support. Proficiency in both Mandarin and English, along with excellent communication skills, is required.
Job Duties:
- Incident Management: Participate in on-call support, incident management, investigation, diagnosis, and resolution to minimize system downtime and ensure reliability.
- Monitoring and Logging: Implement and maintain monitoring and logging solutions, including Prometheus, Grafana, and Datadog, to ensure the health and performance of systems.
- Cloud Expertise: Leverage your experience in building and supporting solutions on public cloud platforms, such as AWS or Azure.
- Automation: Drive the automation of infrastructure and application build, delivery, and management processes across Non-Production and Production environments.
- Microservices and Containers: Work with microservice architectures and container technologies like Kubernetes and Docker to optimize application deployments.
- Operating Systems and Networking: Demonstrate knowledge of operating systems, specifically Linux, and cloud networking concepts to troubleshoot and optimize system performance.
Qualifications:
- Bachelor’s or master’s degree in computer science, information systems, or a related field.
- Minimum of 3 years of cloud SRE or DevOps-related experience and a minimum of 5 years of overall work experience.
- Experience with monitoring and logging solutions, including Prometheus, Grafana, and Datadog.
- Hands-on experience building and supporting solutions on public cloud platforms (AWS or Azure).
- On-call support experience and proficiency in incident management.
- Strong written and spoken communication skills in both Mandarin and English.
- Familiarity with microservice and container technologies such as Kubernetes and Docker.
- Logical thinking, excellent troubleshooting skills, and the ability to resolve complex issues efficiently.
- Availability to be on call outside of regular working hours in case of emergencies.
Interested? Please email your resume to [email protected].