- Experience: 5+ years
- Salary: not specified
- Openings: 1
- Posted: 3 days ago
- Work mode: On-site
- Resume: Required with application
Job description
Role overview
This position focuses on keeping production environments highly available, fast, and dependable through cloud operations, automation, monitoring, and disciplined incident handling.
What you'll do
- Design, build, and maintain scalable AWS-based infrastructure using Terraform or CloudFormation.
- Set up and operate observability and monitoring platforms such as Prometheus, Grafana, Splunk, or Datadog.
- Respond to incidents, perform root cause analysis, participate in on-call rotations, and work with SLIs, SLOs, and error budgets.
- Automate recurring operational work to increase reliability, efficiency, and recovery speed.
- Support Kubernetes, Docker, CI/CD pipelines, runbooks, and ITIL-based operational processes.
Skills and experience needed
- Hands-on background in SRE, DevOps, production support, or cloud operations with AWS exposure.
- Working knowledge of Kubernetes, Docker, Linux, and core networking concepts.
- Ability to script in Python, Bash, or Go.
- Experience with monitoring platforms and incident resolution / RCA workflows.
- Familiarity with infrastructure as code, CI/CD tools, and enterprise support systems is preferred.
Experience
A minimum of 5 years of experience in SRE, DevOps, cloud, or production support roles is required.