Infrastructure Observability Engineer at Snapp

General Trias, Cavite, Philippines

Full Time

Start Date

Immediate

Expiry Date

06 Mar, 26

Salary

0.0

Posted On

06 Dec, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Infrastructure Monitoring, DevOps Automation, Prometheus, Grafana, ELK, Zabbix, Networking Fundamentals, Linux Administration, CI/CD Tools, Containerization, Scripting Languages, Root Cause Analysis, Incident Management, Cloud Infrastructure, Kubernetes, Reliability Engineering

Industry

Technology; Information and Internet

Description

Our Journey So Far

At Snapp, we’re redefining how cities move. Our ride-hailing and mobility platform connects millions of riders and drivers every day, delivering safe, reliable, and efficient transport solutions. Powered by real-time data and robust infrastructure, we make urban travel faster, simpler, and more sustainable. We operate with the mindset of a global tech leader and the agility of a startup, building services that scale across markets while staying responsive to local needs.

Your Impact

As an Infrastructure Observability Engineer within the Platform team, you will work across observability platforms, infrastructure monitoring, and DevOps automation to ensure comprehensive visibility and high system reliability. You will maintain and enhance monitoring and logging stacks, analyze infrastructure events, and drive proactive improvements that strengthen performance and resilience. This highly technical role emphasizes automation and continuous optimization rather than reactive support.

What You’ll Drive Forward

Build, operate, and optimize monitoring and logging systems (Prometheus, Grafana, ELK, Zabbix, etc.).
Ensure full observability coverage for infrastructure, networks, and services.
Maintain alerting rules, dashboards, SLO/SLA metrics, and anomaly detection.
Analyze logs and metrics to identify patterns and potential risks.
Monitor infrastructure health across compute, storage, virtualization, and network layers.
Perform root cause analysis of network-related incidents (routing/switching, load balancing, DNS, firewalls).
Collaborate with network and datacenter teams on incident follow-ups.
Maintain knowledge of network topologies, protocols, and traffic flows.
Support improvement of infrastructure reliability and performance.
Work with CI/CD pipelines to ensure reliable delivery and deployment processes.
Develop automation for observability, monitoring, and operational workflows.
Maintain Linux-based systems and automate routine infrastructure tasks.
Contribute to reliability engineering initiatives (IaC, Docker, GitOps, auto-remediation, etc.).

What Powers Your Drive

At least 2 years of experience in NOC/IOC, SRE, infrastructure operations, DevOps, or a similar technical role.
Strong hands-on experience with monitoring and logging stacks (Prometheus, Grafana, ELK, Zabbix, etc.).
Solid understanding of networking fundamentals at CCNA level (routing, switching, VLANs, BGP, OSPF, load balancing).
Strong Linux administration background.
Familiarity with CI/CD tools (GitLab CI, ArgoCD, Jenkins, GitHub Actions, etc.).
Hands-on experience with containerization (Docker) and service mesh tools.
Practical knowledge of automation using Bash, Python, or similar scripting languages.
Ability to read and interpret logs, metrics, traces, and alerts.
Strong communication and documentation skills, especially in technical reporting.

Preferred Qualifications (optional)

Experience designing observability architecture for large-scale infrastructure.
Experience contributing to reliability engineering initiatives (Terraform, Ansible, Docker, GitOps, auto-remediation, etc.).
Knowledge of ITIL Incident/Problem Management practices.
Experience with cloud infrastructure or private cloud platforms.
Experience with Kubernetes (cluster operation, troubleshooting, manifests, Helm, etc.).

Ready to Get on Board?

Help us shape the future of ride-hailing and urban mobility. Submit your CV and let’s build smarter cities together.

Responsibilities

The Infrastructure Observability Engineer will build, operate, and optimize monitoring and logging systems to ensure comprehensive visibility and high system reliability. This role involves analyzing infrastructure events and driving proactive improvements to strengthen performance and resilience.