Acquire BPO
Acquire BPO is an organisation devoted to helping you grow and scale your business through our wide range of offshoring solutions.
Job facts
- Location: Taguig City
- Type: Full-time
- Posted: May 05, 2026
Site Reliability Engineer (SRE)
at Acquire BPO
We’re an award-winning global outsourcer providing contact center and back office services on behalf of our global clients. Come work at a place where innovation and teamwork come together to support the most exciting missions in the world!
Role objective
The Site Reliability Engineer serves as the guardian of our production
systems, ensuring the reliability, scalability, and performance of our IoT
telemetry platform. You will define and enforce Service Level Objectives
(SLOs), automate operational processes, and build the infrastructure and
tooling that enable our engineering teams to deploy with confidence. By
implementing comprehensive monitoring, incident response procedures, and
reliability practices, you will play a pivotal role in maintaining the uptime
and data freshness that our customers depend on for their critical fleet
operations.
The role will focus on the following key areas:
• SLO Management
• Infrastructure Automation
• Incident Response
• Security & Compliance
Key Responsibilities
Responsibilities of the Site Reliability Engineer will include but are not
limited to:
Service Level Management & Reliability
• Define, monitor, and enforce Service Level Objectives (SLOs) and error
budgets across all production systems
• Track error budget burn rates and make data-driven decisions to halt risky
deployments when thresholds are exceeded
• Implement comprehensive monitoring and alerting strategies using Prometheus,
Grafana, and PagerDuty
• Establish and maintain reliability standards that support business-critical
uptime requirements
Infrastructure Automation & Management
• Design and implement Infrastructure as Code (IaC) solutions using Pulumi
with TypeScript
• Manage and optimize AWS services including EKS (Elastic Kubernetes Service),
MSK (Managed Streaming for Apache Kafka), SingleStore, MongoDB, and S3
• Automate operational processes to eliminate toil, targeting any task that
consumes more than 2 engineer-days per quarter
Incident Response & Post-Mortem Leadership
• Serve as incident commander during production outages and service
degradations
• Lead comprehensive post-mortem processes within 48 hours of incidents
• Drive "never-again" corrective actions to completion, ensuring systemic
improvements
• Maintain and improve incident response procedures and runbooks
Security & Compliance
• Implement and enforce least-privilege IAM policies across all AWS resources
• Manage security patch pipelines and vulnerability remediation processes
• Support compliance initiatives including SOC2 and ISO 27001 certification
requirements
• Ensure security best practices are embedded in all infrastructure and
operational procedures
On-Call & Operational Excellence
• Participate in follow-the-sun on-call rotation with one week
primary/secondary commitment every five weeks
• Provide 24×7 support coverage across AU/NZ, EU/ZA, and MX time zones
• Maintain operational runbooks and knowledge transfer documentation
• Continuously improve on-call experience and reduce alert fatigue
Join the A-Team and experience the A-Life!