Senior Staff Engineer – Site Reliability | Nagarro | Remote (Colombia)
Company Description
We are a Digital Product Engineering company that is scaling in a big way! We build products, services, and experiences that inspire, excite, and delight. We work at scale across all devices and digital mediums, and our people are everywhere in the world (19,000+ experts across 33 countries, to be exact). Our work culture is dynamic and non-hierarchical. We are looking for great new colleagues. That is where you come in!
Job Description
- Experienced L3 SRE working on a business-critical SaaS application.
- Capacity to provide L3 support across the full stack, including infrastructure, backend, and front-end, before escalating to the engineering business unit.
- Capacity to automate SRE tooling to provide proactive L3 support, aligned with our technical monitoring strategy.
- Capacity to work under business pressure on business-critical applications.
- Capacity to communicate appropriately with L1, L2, Engineering, Product Managers, leadership, and end users during troubleshooting.
Qualifications
Must-have skills: Kubernetes (Expert), GitHub Actions, Terraform (Expert), and AWS.
- Capacity to communicate appropriately.
- Experience with incident and problem management.
- Experience with multitenant applications.
- Solid understanding of networking concepts (TCP/IP, DNS, routing, etc.), including VPCs, subnets, firewalls, load balancing, and TLS/SSL.
- Experience with CI/CD pipelines (e.g., Jenkins, GitHub Actions) and version control.
- Python and React/Next.js.
- Monitoring and logging to analyze and track resource utilization and application performance and to identify potential issues, using Grafana, Prometheus, Loki, or ELK.
- Experience with AWS, particularly EKS, serverless, queues, and various databases.
- Solid knowledge of Kubernetes.
Additional Information