Site Reliability Software Engineer – Core Database Team | Temporal Technologies | Remote (Worldwide)

Application ends: August 10, 2024

Job Description

With over 33,000 stars on GitHub, ClickHouse is the fastest and most resource-efficient open-source real-time data warehouse. Our Core Engineering teams own the heart of our ClickHouse Open Source project. We are looking for exceptional SRE/DevOps software engineers to join our remote-first, global team and continue scaling and growing our open-source and ClickHouse Cloud offerings.

What you will do:

  • Collaborate closely with engineering teams to design and develop highly resilient and performant systems at scale.
  • Use your understanding of distributed systems to identify and resolve low-level challenges quickly. Dive into complex issues such as networking, load balancing, and hardware maintenance, and demonstrate your troubleshooting and problem-solving skills.
  • Identify bottlenecks and repetitive patterns in existing support processes and reduce costs by introducing better automation.
  • Participate in existing cluster operations, including monitoring cluster health, investigating issues, and resolving bugs. This commitment extends to providing on-call support.
  • Contribute to our open-source repositories.

What you bring along:

  • 5+ years of experience in Software Engineering, Site Reliability Engineering, or a development-focused DevOps role.
  • Experience with highly distributed systems and databases/data stores.
  • Proficiency in Python or Go.
  • Experience with alerting and monitoring tools such as Prometheus.
  • Strong working knowledge of Linux, containers, Bash, and administration tools.
  • A comprehensive knowledge of Kubernetes.
  • Readiness to occasionally read code in C++ for reference and a better understanding of our internals.
  • Knowledge of standard algorithms and data structures.