- Experience
- 5+ yrs
- Salary
- —
- Job openings
- 1
- Posted
- 2 hours ago
- Work mode
- On-site (in office)
- Resume
- Required to apply
Job description
Role overview
This position focuses on keeping production environments highly available, fast, and dependable through cloud operations, automation, monitoring, and disciplined incident handling.
What you'll do
- Design, build, and maintain scalable AWS-based infrastructure using Terraform or CloudFormation.
- Set up and operate observability and monitoring platforms such as Prometheus, Grafana, Splunk, or Datadog.
- Respond to incidents, perform root cause analysis, participate in on-call rotations, and work with SLIs, SLOs, and error budgets.
- Automate recurring operational work to increase reliability, efficiency, and recovery speed.
- Support Kubernetes, Docker, CI/CD pipelines, runbooks, and ITIL-based operational processes.
Skills and experience needed
- Hands-on background in SRE, DevOps, production support, or cloud operations with AWS exposure.
- Working knowledge of Kubernetes, Docker, Linux, and core networking concepts.
- Ability to script in Python, Bash, or Go.
- Experience with monitoring platforms and incident resolution / RCA workflows.
- Familiarity with infrastructure as code, CI/CD tools, and enterprise support systems is preferred.
Experience
A minimum of 5 years of experience in SRE, DevOps, cloud, or production support roles is required.