Site Reliability Engineer
PT. Indosat Tbk
On-site
Posted 3 weeks ago
Apply by June 11, 2026
Job Description
Role Summary
We are seeking a skilled and passionate Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of our hybrid and cloud-native infrastructure. You will play a critical role in automating operations, improving system resilience, and supporting mission-critical services running across Kubernetes and cloud environments. This role is ideal for engineers who enjoy solving complex infrastructure challenges, building automation, and improving platform reliability at scale.
Job Description (1/2)
Reliability & System Performance
- Maintain high availability, scalability, and performance of production systems.
- Define and monitor SLIs, SLOs, and error budgets to ensure service reliability.
- Perform root cause analysis, incident response, and postmortem reviews.
- Implement reliability improvements and proactive failure prevention.
- Manage and optimize workloads running on Google Kubernetes Engine (GKE) and OpenShift.
- Support multi-cluster and hybrid infrastructure environments.
- Implement autoscaling and high-availability architecture.
- Design and maintain CI/CD pipelines using GitLab CI/CD.
- Implement GitOps deployment workflows using Argo CD.
- Implement safe deployment strategies including:
- Provision and manage infrastructure using Terraform / OpenTofu.
- Develop and maintain Helm charts for Kubernetes deployments.
- Automate operational tasks using Python scripting to reduce manual toil.
Observability, Monitoring & Distributed Tracing
- Implement centralized logging using Grafana Loki and ELK Stack.
- Build dashboards and alerts using Grafana and Datadog.
- Implement distributed tracing using OpenTelemetry to improve system visibility.
- Improve monitoring coverage and alert accuracy.
- Conduct load and stress testing using tools such as k6, Locust, or JMeter.
- Analyze performance bottlenecks and implement tuning strategies.
- Support capacity planning and performance optimization.
- Support Change Data Capture (CDC) and real-time data streaming pipelines.
- Work with Confluent Platform / Apache Kafka to ensure reliable event-driven data flow.
- Manage secrets securely using Google Cloud Secret Manager, Kubernetes Secrets, and HashiCorp Vault.
- Implement secure CI/CD and platform access practices.
Bachelor's degree in Computer Science, Informatics, Information Systems, Electrical Engineering, Mathematics/Statistics, or a related field.
Experience
- 0–4 years of experience in SRE, DevOps, Cloud Engineering, or Platform Engineering.
- Hands-on experience supporting production systems and cloud infrastructure.
- Strong Linux system administration and networking fundamentals.
- Hands-on experience with Kubernetes and containerized environments.
- Experience designing and maintaining CI/CD pipelines.
- Infrastructure as Code experience with Terraform and Ansible.
- Helm chart development and Kubernetes deployment management.
- Monitoring, logging, and observability best practices.
- Programming/scripting skills in Bash and Python (Go is a plus).
- Familiarity with Google Cloud Platform (GCP).