Aylo

SRE Developer


Technical Tools
Confluence, Docker, Grafana, Jira, Kafka, Kubernetes, Laravel, MySQL, PHP, Python, Redis, CI/CD, microservices, performance optimization, security best practices

Established in 2004, we are a tech pioneer offering world-class adult entertainment and games on some of the internet’s safest and most popular platforms. With the support of an international team of dynamic and collaborative innovators, we are on a mission to enable safe user experiences and empower our communities by celebrating diversity, inclusion, and expression — all while maintaining robust trust-and-safety protocols. 

We embrace the best of both worlds! Local talent can thrive in our collaborative office space with the flexibility of a hybrid work environment, while remote team members play an integral role in shaping our dynamic culture from afar. We have offices in Montreal (Quebec), Austin (Texas) and Nicosia (Cyprus). 

We are seeking a highly skilled Site Reliability Engineer (SRE) to support and enhance the reliability, scalability, and performance of our production systems. In this role, you will be central to incident response, root cause analysis, and the continuous improvement of operational processes, while leveraging cutting-edge tooling and AI-assisted solutions.

Responsibilities

  • Own the reliability, availability, and performance of production systems in a containerized, microservices-based environment
  • Monitor system health using Grafana dashboards, alerts, and observability tools; proactively identify and resolve issues
  • Manage and operate Kubernetes clusters (via Rancher), including deployments, scaling, and troubleshooting
  • Lead and participate in incident management using OpsGenie, including on-call rotations, escalations, and post-incident reviews
  • Troubleshoot issues across application, infrastructure, messaging, database, and container layers
  • Build and maintain automation scripts and tools using Bash, Go, and/or Python to improve operational efficiency
  • Support and optimize CI/CD pipelines using GitLab, ensuring smooth deployment and release processes
  • Collaborate with development teams to improve application reliability, performance, and observability
  • Work with databases and data systems (MySQL, Redis) for performance monitoring and issue resolution
  • Support distributed messaging systems such as Kafka and RabbitMQ
  • Contribute to and maintain operational documentation, runbooks, and knowledge bases using Jira and Confluence
  • Perform root cause analysis (RCA) and implement preventative measures
  • Ensure systems operate in alignment with security, compliance, and data privacy standards
  • Leverage AI-powered engineering tools to accelerate troubleshooting, documentation, and workflows

Requirements

  • 3+ years of experience in Site Reliability Engineering, DevOps, Production Support, or Systems Engineering
  • Bachelor’s degree in computer science or related field
  • Hands-on experience with Grafana, Kubernetes and Docker
  • Experience with OpsGenie for incident management and on-call coordination
  • Strong experience with GitLab/Git, including CI/CD pipelines and release processes
  • Proficiency with Atlassian tools (Jira, Confluence) for tracking and documentation
  • Solid knowledge of MySQL
  • Experience with Kafka and/or RabbitMQ
  • Familiarity with Redis for caching and performance optimization
  • Working knowledge of Temporal or similar workflow orchestration tools
  • Strong scripting skills in Bash
  • Proficiency in Go and/or Python for automation and tooling
  • Familiarity with PHP applications (Symfony, Laravel) for production support
  • Proven ability to troubleshoot complex systems across multiple layers
  • Excellent documentation habits (runbooks, playbooks, system diagrams)

Nice to Have

  • Knowledge of FTC data protection principles
  • Understanding of NIST frameworks and security best practices
  • Familiarity with GDPR requirements (data handling, logging, retention, privacy)

Location & Eligibility

Where is the job: Montréal, Canada (on-site at the office)
Who can apply: CA

Listing Details

Posted: May 12, 2026
First seen: May 12, 2026
Last seen: May 13, 2026
