Sr. Site Reliability Engineer

Technical Tools

ansible, argocd, aws, azure, datadog, gcp, github-actions, helm, kubernetes, packer, postgresql, python, sql, terraform, typescript, vault, ci-cd, distributed-systems, linux, mentoring, microservices, networking, performance-optimization

Transforming the insurance industry is ambitious, we know. That’s why at Applied, we’re building a team that shows up every day ready to learn, willing to try new things, and driven to deliver innovative software and services that make us indispensable to our customers – all within a culture built on values that make us indispensable to each other too. With 40+ years of experience in the insurtech game, we’re not just redefining what’s achievable, we’re creating a place where amazing career moments are made possible. 

We're seeking a Senior Site Reliability Engineer to join our SRE team. Applied Systems is committed to delivering innovative solutions that empower insurance agencies and carriers to streamline operations, improve customer experiences, and drive business growth. As part of our team, you will play a critical role in ensuring the reliability, scalability, and efficiency of our software applications, enabling us to deliver best-in-class services to our customers.

Responsibilities

- Infrastructure as Code (IaC): Develop and maintain IaC using Terraform, Terraform CDK with TypeScript, Packer, and Ansible to automate on-prem and cloud infrastructure provisioning and management
- System Reliability & Scalability: Collaborate with development and platform teams to design scalable, reliable systems with fault tolerance, high availability, and performance optimization
- Monitoring & Observability: Implement and manage monitoring solutions using Datadog, including tracing instrumentation, to track system performance and adherence to SLIs, SLOs, and SLAs
- Service Discovery & Networking: Use HashiCorp Consul for service discovery, dynamic configuration, and network automation across distributed systems
- Disaster Recovery: Define and implement best practices for disaster recovery and high availability across hybrid environments
- CI/CD Pipelines: Build and maintain CI/CD pipelines using tools like GitLab and GitHub Actions to streamline deployments and ensure code quality
- Automation: Automate repetitive tasks to increase efficiency and reduce human error, using tools like Python, Go, Bash, and PowerShell
- Kubernetes Expertise: Manage Kubernetes environments, including Helm charts and ArgoCD for application deployment and orchestration
- Mentorship & Collaboration: Mentor junior engineers, lead technical discussions, and collaborate across teams to drive consensus on design decisions and technical initiatives
- Documentation: Create and maintain accurate documentation for workflows, procedures, and infrastructure standards to support internal teams and customers
- On-Call Support: Participate in the on-call rotation to provide production support and resolve complex engineering challenges
- Vendor Collaboration: Work with third-party vendors to evaluate and integrate their products and services into the infrastructure ecosystem

Experience:
- 5+ years of experience in DevOps, SRE, or Infrastructure Engineering roles
- Strong foundations in incident management, troubleshooting, and observability of software applications
- Experience with cloud platforms (GCP, AWS, Azure), including traffic management solutions
- Familiarity with distributed systems, microservices architecture, and related technologies

Technical Skills:
- Proficiency in Python, Go, Bash, and PowerShell
- Expertise in Windows and Linux system administration
- Advanced knowledge of IaC tools such as Terraform (HCL and Terraform CDK with TypeScript) and Packer
- Knowledge of CI/CD pipelines and version control systems (GitLab, GitHub Actions, etc.)
- Familiarity with monitoring tools (Datadog) and security solutions (HashiCorp Vault, Cloud Armor)
- Experience with SQL Server and PostgreSQL for database management
- Kubernetes expertise, including Helm charts and ArgoCD for application deployment and orchestration

Soft Skills:
- Excellent communication skills to collaborate with engineers, product managers, and business stakeholders
- Strong organizational skills and attention to detail
- Ability to prioritize tasks and make accurate decisions under pressure
- Passion for mentoring and guiding team members

We know that talent comes from all backgrounds and experience levels. We encourage military members and their spouses as well as candidates without a degree or a background in tech to apply!

Candidates must reside in North America; the working arrangement is remote.

Applied Systems is proud to be an Equal Employment Opportunity Employer. Diversity and inclusion are a business imperative and part of building our brand and reputation. At Applied, we don’t discriminate, and we are committed to recruiting, developing, retaining, and promoting regardless of race, religion, color, national origin, sexual orientation, gender identity, disability, age, veteran status, and other protected status as required by applicable law.
