AlphaSense · India · Posted 2 months ago

Cloud Reliability & Recovery Engineer

India · Remote · Mid-level

Technical Tools

ArgoCD, AWS, DynamoDB, GitHub Actions, Kubernetes, Python, Terraform, CI/CD, networking

The world’s most sophisticated companies rely on AlphaSense to remove uncertainty from decision-making. With market intelligence and search built on proven AI, AlphaSense delivers insights that matter from content you can trust. Our universe of public and private content includes equity research, company filings, event transcripts, expert calls, news, trade journals, and clients’ own research content.

The acquisition of Tegus by AlphaSense in 2024 advances our shared mission to empower professionals to make smarter decisions through AI-driven market intelligence. Together, AlphaSense and Tegus will accelerate growth, innovation, and content expansion, with complementary product and content capabilities that enable users to unearth even more comprehensive insights from thousands of content sets. Our platform is trusted by over 6,000 enterprise customers, including a majority of the S&P 500. Founded in 2011, AlphaSense is headquartered in New York City with more than 2,000 employees across the globe and offices in the U.S., U.K., Finland, India, Singapore, Canada, and Ireland. Come join us!

We are seeking an experienced Cloud Engineer to design, implement, and continuously improve our Business Continuity Planning (BCP) and Disaster Recovery (DR) capabilities across AWS cloud environments.

This is a hands-on technical role requiring deep AWS expertise, strong scripting skills, and a passion for building highly available, fault-tolerant, and resilient cloud architecture using container orchestration with Kubernetes and infrastructure as code with Terraform. You should have a good understanding of CI/CD pipelines that enable rapid, reliable deployments and minimize downtime, and be adept at implementing DR strategies, including multi-region failover, backup and restore automation, and recovery testing aligned with industry BCP/DR standards. You will collaborate closely with security, infrastructure, and application teams to ensure our systems can withstand and rapidly recover from any disruption.

Responsibilities

Design and implement multi-region, multi-AZ AWS architectures that meet RTO/RPO targets
Engineer active-active and active-passive failover patterns using Route 53, Global Accelerator, and CloudFront
Build automated DR runbooks and playbooks using AWS Systems Manager Automation and Step Functions
Implement chaos engineering practices using AWS Fault Injection Simulator (FIS) to validate resiliency
Architect cross-region replication strategies for S3, DynamoDB Global Tables, RDS, and Aurora Global Database
Review containerized workloads using Kubernetes, ensuring resilience through self-healing, auto-scaling, and multi-cluster or multi-region deployments

Administer AWS Backup across all services (EC2, EBS, RDS, EFS, FSx, DynamoDB, Aurora) with policy-based automation
Design immutable backup vaults and cross-account/cross-region backup replication pipelines
Develop and automate data recovery testing procedures, ensuring integrity and meeting defined SLAs
Implement point-in-time recovery (PITR) for databases and storage; validate via regular restore drills
Maintain Business Continuity Plans (BCP) and Disaster Recovery (DR) strategies, including tracking RTO (Recovery Time Objective) and RPO (Recovery Point Objective).

Author and maintain Terraform/CloudFormation templates for all BCP/DR infrastructure components
Automate DR testing pipelines through CI/CD (CodePipeline, CodeBuild, GitHub Actions)
Write Python/Bash/PowerShell scripts to orchestrate failover, failback, and health-check workflows
Manage infrastructure state in AWS Control Tower and implement Landing Zone DR patterns

Build CloudWatch dashboards, alarms, and composite alarms for availability and DR-readiness indicators
Integrate AWS Health and Personal Health Dashboard events into PagerDuty/OpsGenie alerting workflows
Participate in on-call rotations and lead DR incident response; conduct post-incident reviews (PIRs)
Develop and maintain runbooks for AWS service degradations, regional outages, and data corruption events

Conduct regular BCP/DR tabletop exercises and full failover simulations to validate recovery procedures and improve organizational readiness; document results and action items
Ensure DR controls meet SOC 2, ISO 22301, NIST 800-53, and HIPAA/PCI requirements as applicable
Maintain current and accurate DR documentation: BIAs, BCPs, DRP runbooks, and recovery evidence
Collaborate with audit and compliance teams to provide DR evidence and remediation tracking

Requirements

AWS Certified Solutions Architect – Professional or AWS Certified DevOps Engineer – Professional
AWS Certified Advanced Networking – Specialty certification
Experience with AWS Resilience Hub for automated resilience assessments and policy enforcement
Familiarity with CloudEndure / AWS Elastic Disaster Recovery (DRS) for workload replication
Knowledge of Kubernetes-based DR (EKS multi-region, Velero backups, ArgoCD GitOps failover)
Hands-on experience with serverless DR patterns (Lambda, API Gateway, DynamoDB)

5+ years in cloud infrastructure, SRE, or IT disaster recovery engineering roles
3+ years of hands-on AWS experience in production environments at scale
Proven delivery of multi-region DR architectures with defined and tested RTO/RPO targets
Expert-level proficiency with core AWS resilience services (see skills matrix below)
Strong scripting skills: Python, Bash, or PowerShell for automation and orchestration
Experience with Infrastructure as Code: Terraform and/or AWS CloudFormation
Solid understanding of networking fundamentals: VPC, TGW, Direct Connect, VPN, DNS failover
Excellent written and verbal communication; able to produce executive-level DR reports

You'll be joining a security organization that emphasizes automation, engineering-driven approaches, and systematic problem-solving. Our team operates at the intersection of security operations, detection engineering, incident response, and infrastructure security. We value practical solutions, measurable outcomes, and continuous improvement.

Reporting to the Director of Event Response, you'll execute on strategic initiatives to operationalize mature BCP/DR capabilities that protect AlphaSense's mission-critical operations and support our commitment to customer trust. This role sits at the critical intersection of incident response and business continuity, ensuring our ability to respond to and recover from major disruptions.

AlphaSense is an equal-opportunity employer. We are committed to a work environment that supports, inspires, and respects all individuals. All employees share in the responsibility for fulfilling AlphaSense’s commitment to equal employment opportunity. AlphaSense does not discriminate against any employee or applicant on the basis of race, color, sex (including pregnancy), national origin, age, religion, marital status, sexual orientation, gender identity, gender expression, military or veteran status, disability, or any other non-merit factor. This policy applies to every aspect of employment at AlphaSense, including recruitment, hiring, training, advancement, and termination.

In addition, it is the policy of AlphaSense to provide reasonable accommodation to qualified employees who have protected disabilities to the extent required by applicable laws, regulations, and ordinances where a particular employee works.

Recruiting Scams and Fraud

We at AlphaSense have been made aware of fraudulent job postings and individuals impersonating AlphaSense recruiters. These scams may involve fake job offers, requests for sensitive personal information, or demands for payment. Please note:

AlphaSense never asks candidates to pay for job applications, equipment, or training.
All official communications will come from an @alpha-sense.com email address.
If you’re unsure about a job posting or recruiter, verify it on our Careers page.

If you believe you’ve been targeted by a scam or have any doubts regarding the authenticity of any job listing purportedly from or on behalf of AlphaSense, please contact us. Your security and trust matter to us.