Senior Site Reliability Engineer (SRE)

Indonesia · Sleman · Permanent · Senior

Engineering · DevOps Engineer

POSITION SUMMARY:

AccelByte is building a 24x7 operations team for AAA multiplayer video games. For this position, we need a driven Site Reliability Engineer who can actively participate in the day-to-day combat of maintaining high reliability for our service and driving the prioritization of fixes for what is broken today. The role also calls for envisioning, designing, and implementing processes and technologies that improve our ability to identify, isolate, correlate, and mitigate service-impacting problems in the system. The Site Reliability Engineer must also be able to write code to automate routine tasks in gathering, correlating, organizing, and presenting service metrics, and to perform detailed, in-depth root cause analysis.

ESSENTIAL FUNCTIONS/RESPONSIBILITIES:

The Senior Site Reliability Engineer (SRE) is accountable for the following functions and responsibilities:

Design, implement, and maintain infrastructure for applications
Architect, implement and maintain a highly scalable deployment framework that improves our products' stability, reliability, and availability.
Build and run service deployment using K8s and other CNCF projects
Provide a secure, highly scalable, and cost-effective cloud platform
Build effective systems to monitor the health of our systems and applications and to handle outages
Solve problems occurring in all our environments and create solutions to prevent them from happening again
Produce automation and innovative tools to assist the product development teams and to deliver operational excellence
Create and maintain infrastructure-related documentation and SRE runbooks
Collaborate with other stakeholders to provide cost-effective, operationally excellent, and performance-efficient infrastructure solutions to improve our products
Identify technology and process gaps and opportunities for improvement
Liaise, communicate, and work directly with our clients
Perform any other design-related duties as required
Envision, design, and implement AIOps solutions to enhance operational efficiency and predictive maintenance.

QUALIFICATIONS/EXPERIENCE REQUIRED

5+ years of cloud engineering or DevOps experience with AWS and 2+ years with Kubernetes; AWS certification preferred
Degree in Computer Science or equivalent experience
Deep knowledge of cloud service providers and best practices around implementation and configuration, preferably managing AWS and Kubernetes
Familiarity with infrastructure management and operations lifecycle concepts and ecosystem; deep understanding of IaC and GitOps
Proven track record of building infrastructure as code (Terraform is a must), configuration management, and package management (e.g., Helm charts)
Experience in delivering products against a plan in a fast-paced, multi-disciplined, and often ambiguous environment
Experience working independently to design, plan, and execute technical projects
Demonstrated deep knowledge of technical program management and engineering best practices
Innovative thinking balanced with a strong focus on customers, quality, and cost efficiency
Comfort and experience with cross-organizational communication; excellent written and verbal communication skills
Working experience with some of the following technologies and tools: Docker, Kubernetes, Git, Redis, MongoDB, PostgreSQL, Elasticsearch, GitLab CI, Nexus, SonarQube, Terraform, Helm, Prometheus, ELK/EFK, Grafana, CloudWatch
Solid grasp of security best practices
Strong proficiency in Go, including the ability to conduct high-quality code reviews. Experience with Python and Bash is also required.
Keen problem-solving skills with the ability to work under pressure (during a production event)
Flexibility in working with people in different time zones
Experience with AIOps and building/harnessing AI tools to automate and optimize operational tasks.

QUALIFICATIONS/EXPERIENCE PREFERRED

Previous experience working in the game industry
Working experience with one or more of the following: Emissary, Linkerd, Istio, Nomad, Kafka, Flux, ArgoCD, GitOps, DevSecOps
Familiarity with web service patterns and architectures such as REST and SOAP
Experience working with auto-scaling workloads both in containers and VMs
Experience with other cloud technologies and infrastructure: GCP, Azure
Experience with Confluence, Jira, and Bitbucket
Experience with IT standards, methodologies, cryptographic key management regulations, and audits would be an asset