Site Reliability Engineer | CloudRaft | Remote (India)
Job Description
About Us:
CloudRaft is a dynamic and innovative company focused on delivering cutting-edge cloud-native solutions. We thrive on collaboration, creativity, and excellence, and aim to provide top-tier services to our clients. We are looking for a talented and experienced Site Reliability Engineer to join our team immediately and help us scale our operations.
Role:
- You have a good understanding of, and professional experience in, running Kubernetes on-prem and in the cloud (OpenShift, EKS, AKS, and GKE).
- You are comfortable with programmable infrastructure and can program in Golang or Python.
- You are experienced in production-grade CI/CD with tools such as GitHub Actions, Argo CD, and GitLab, and have explored advanced deployment strategies.
- You can set up observability pipelines and backends using popular products such as Vector, Fluentd, OpenTelemetry, Prometheus, and Grafana.
- You can take observability to the next level with products such as VictoriaMetrics, Thanos, and SigNoz.
- You have production experience in troubleshooting and resolving system issues.
- You have a good understanding of, and implementation experience with, SRE concepts such as SLIs and SLOs.
- You can represent the organization and collaborate with and coach customer teams.
- You have the curiosity to learn and develop skills in upcoming fields such as AI, MLOps, and Edge Computing.
- You like sharing your work through technical writing and speaking sessions in the community and at conferences.
Qualifications:
- Bachelor’s degree in Computer Science, IT, or a related field
- 2-5 years of experience in DevOps/SRE
- Strong understanding of at least two of AWS, OpenShift, Azure, and Google Cloud
- Hands-on production experience in designing and managing Kubernetes clusters
- Hands-on experience in CI/CD and setting up developer tooling
- Programming skills in any modern programming language (Python or Golang or Node)
- Infrastructure as Code (Terraform, CDK, Pulumi, etc)
- Understanding of security concepts and tooling
- Excellent problem-solving and troubleshooting skills
- Strong communication and teamwork skills
- Ability to write well, as we prefer async communication
- A product mindset and customer empathy are a big plus