Sr. Director, Site Reliability Engineering, Mobility at WEX

Maine, New York, United States

Full Time

Start Date

Immediate

Expiry Date

12 Jun, 26

Salary

239000.0

Posted On

14 Mar, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Site Reliability Engineering, SRE Roadmap, Global Scalability, Incident Management, Root-Cause Analysis, Predictive Incident Management, AI/ML, Error Budget Oversight, SLOs, SLIs, Infrastructure as Code, Cloud Cost Optimization, Kubernetes, Service Mesh, Leadership, Communication

Industry

Software Development

Description

Role: Senior Director, Site Reliability Engineering (Mobility)

Role Overview

As the Senior Director of SRE, you will be the architect of resilience for our global mobility platform, leading a high-performing organization of engineers dedicated to ensuring that our services, from real-time GPS tracking and dispatching to global payment processing, are seamless, scalable, and secure. You will bridge the gap between ambitious product velocity and the uncompromising "five-nines" availability required for a world in motion.

We are seeking a seasoned Sr. Director of Engineering in the WEX Mobility Engineering organization to lead engineering for one of the Mobility verticals that caters to payments solutions. The Mobility team spans the USA, India, and Brazil, providing SaaS and API solutions to various customers in North America. WEX Mobility products enable credit issuance to fleet companies and owner-operators in the form of WEX Credit Cards, widely accepted at fueling stations and select other merchants in the US. Fleet managers and operators can configure spend controls that restrict fleet members to using their cards only at preconfigured merchants, for configured product families, and for certain amounts.

Key Responsibilities

Strategic Leadership: Define the multi-year SRE roadmap, pivoting from reactive firefighting to proactive, automated platform health.
Global Scalability: Oversee infrastructure that supports millions of concurrent trips across diverse geographic regions, accounting for local regulatory and latency requirements.
Incident Management & Prevention: Own the end-to-end incident lifecycle. You won't just manage the "Big Outages"; you’ll foster a blameless culture focused on root-cause analysis (RCA) and permanent remediation.
Predictive Incident Management: Deploy AI/ML models to analyze historical telemetry data to predict capacity "hotspots" and system fatigue hours before they manifest.
Error Budget Oversight: Partner with Product and Engineering VPs to balance innovation speed with reliability via strictly enforced Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
Planning: Work closely with product and commercial partners to drive, prioritize, and work backwards from customer requirements and exceed expected outcomes.
Execution: Drive effective weekly, monthly, and quarterly mechanisms to plan, execute, and audit workstreams.
Cloud Cost & Efficiency: Optimize a massive global cloud footprint (AWS/GCP/Azure), ensuring performance doesn't come at the cost of unsustainable burn.
Platform Engineering: Champion "Infrastructure as Code" (IaC) and self-service tooling so that developers can deploy safely without manual intervention.
Team Management and Growth: Establish a robust and clear engineering roadmap to maintain clarity and motivation for the engineering team. Maintain career growth plans and provide monthly and quarterly feedback for individuals’ continual progress.
Productivity: Establish metrics-driven measurement of developer productivity across the Mobility SRE org.
Communication: Comfortably present, influence, and communicate to the senior leadership team. Provide regular updates and insights to senior leadership on the challenges and opportunities within the Mobility domain. Effectively manage up, across, and down with tangible written strategy documents or plans.

Qualifications & Experience

Education: BS/MS in Computer Science, Engineering, or equivalent practical experience.
Experience: 12+ years in SRE, with at least 5 years in a senior leadership role (Director or above) managing managers.
Scale: Proven track record of managing distributed systems at a "Hyper-scale" level (e.g., millions of requests per second).
Platform: Expertise in rapid development and deployment using cloud computing platforms such as AWS or Azure.
Technical Depth: Deep understanding of Kubernetes, service mesh (Istio/Linkerd), edge computing, and global traffic management.
Leadership: Excellent leadership, team-building, and dynamic decision-making skills. Ability to deal with ambiguity and thrive in a fast-paced, dynamic environment. Excellent verbal and written communication skills.

Preferred Qualifications

Experience with high-concurrency, geospatial, or real-time marketplace dynamics is a significant plus.
Experience building high-performance distributed systems at internet-scale companies.
Experience building credit card products, or experience developing solutions in a scheme/network.
Experience building or managing fleet systems.
Experience working on closed-loop card systems.

The base pay range represents the anticipated low and high end of the pay range for this position. Actual pay rates will vary and will be based on various factors, such as your qualifications, skills, competencies, and proficiency for the role. Base pay is one component of WEX's total compensation package. Most sales positions are eligible for commission under the terms of an applicable plan. Non-sales roles are typically eligible for a quarterly or annual bonus based on their role and applicable plan. WEX's comprehensive and market-competitive benefits are designed to support your personal and professional well-being. Benefits include health, dental, and vision insurance, retirement savings plan, paid time off, health savings account, flexible spending accounts, life insurance, disability insurance, tuition reimbursement, and more. For more information, check out the "About Us" section.

Pay Range: $195,000.00 - $239,000.00

WEX is a global commerce platform that helps businesses solve for operational complexities like employee benefits, managing and mobilizing fleets, and streamlining payments.
With over 6,500 employees, we work with large and small companies in more than 200 countries and territories, and can tailor our services to meet the unique needs of their businesses. We hire people who share our passion for continuous innovation and client service that is unparalleled in the industry. Our comprehensive and market-competitive benefits are designed to support your personal and professional well-being. If you’re looking for a growing career, come be part of WEX today.

WEX is an equal opportunity employer committed to diversity and inclusion in the workplace. All qualified applicants will receive consideration for employment without regard to sex, race, color, age, national origin, religion, sexual orientation, gender identity, protected veteran status, disability, or other protected status. WEX promotes a drug-free workplace.

Qualified individuals with a disability have the right to request a reasonable accommodation. If you require a reasonable accommodation as a result of your disability at any point in the job application process, please submit your request through our Reasonable Accommodation Request Form. This form is for accommodation requests only and cannot be used to inquire about the status of applications.

Responsibilities

This role involves defining the multi-year SRE roadmap, shifting focus from reactive firefighting to proactive automation for platform health across global mobility services. The director will own the end-to-end incident lifecycle, foster a blameless culture, and deploy predictive AI/ML models to prevent capacity issues.