Senior Site Reliability Engineer (SRE)



Founded in 2004, Bicom Systems has grown into a global communications company with team members and customers around the world. We create reliable, easy-to-use tools that help businesses stay connected through calling, messaging, and modern collaboration solutions. Our products are used in thousands of workplaces every day, supporting millions of people as they communicate and work together. What sets us apart is the strong partnerships we build and the supportive global ecosystem behind everything we do.

At Bicom Systems, we believe results come from teams that take ownership, act with trust, and are accountable for their work. We invest in professional growth, encourage thoughtful decision-making, and support a healthy work-life balance.

We are looking for a senior professional with 5+ years of experience who wants to take initiative, make meaningful contributions, and grow with a supportive team. Join us!

Responsibilities

  • Operate and support a 24/7 production VoIP hosting environment, including monitoring, incident response, and on-call rotation;
  • Build, maintain, and optimize virtualization and container platforms supporting VoIP softswitches;
  • Design and maintain monitoring, alerting, and observability across infrastructure and network layers;
  • Automate infrastructure provisioning and lifecycle using IaC and scripting tools;
  • Troubleshoot complex Linux, storage, and network issues with a focus on latency, QoS, and reliability;
  • Lead platform upgrades, maintenance, and migration with minimal or no downtime;
  • Perform capacity planning and performance optimization;
  • Create and maintain operational documentation and runbooks.

Requirements

  • Strong Linux systems administration (Debian/Ubuntu, RHEL/CentOS/Alma, kernel troubleshooting).
  • Deep experience with LXC/LXD containers and at least one hypervisor (KVM/QEMU, Proxmox, VMware).
  • Hands-on experience with ZFS (pool design, tuning, snapshots, send/receive, scrub/repair).
  • Storage networking: iSCSI target/initiator administration and familiarity with NVMe over Fabrics (NVMe-oF, including NVMe/TCP) concepts and troubleshooting.
  • Proficiency with server-grade hardware: ECC memory, RAID controllers, NVMe drives, BMC remote management, and firmware/BIOS upgrades.
  • Scripting and development skills (Bash + one higher-level language such as Python, Go, or Ruby) for automation and tooling.
  • Experience implementing monitoring/observability (Prometheus, Grafana, ELK/EFK, or equivalent) and alerting thresholds.
  • Strong low-level troubleshooting skills: I/O performance profiling, CPU pinning, network capture/analysis (tcpdump, Wireshark), and interpreting kernel logs.

Nice to Have

  • Experience with VoIP systems and basic telecom/security practices.
  • Cluster administration, software-defined networking, or distributed storage.
  • CI/CD pipelines and on-call incident handling.
  • Familiarity with advanced monitoring tools.
  • Experience with cloud/bare-metal hybrid architectures and hardware automation.
  • Knowledge of NFV and performance tuning for virtualized workloads.
  • Familiarity with telecom security best practices.
  • Experience with container orchestration (Kubernetes).
  • Relevant certifications (RHCE, LPIC, CNCF, or vendor-specific).

Listing Details

First seen: March 26, 2026
Last seen: April 20, 2026

Bicom Systems
Employees: 125
Founded: 2003