Senior Software Engineer - Chaos Engineering at Microsoft
Redmond, Washington, United States
Full Time


Start Date

Immediate

Expiry Date

14 Jan, 26

Salary

0.0

Posted On

16 Oct, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Chaos Engineering, Service Reliability, System Resilience, Distributed Systems, Power Efficiency, Platform Costs, Networking Costs, Incident Response, Monitoring, Architecture Choices, Optimization, Microsoft Azure, Microsoft Research, M365, High Availability, Substrate

Industry

Software Development

Description
The High Availability (HA) team, part of M365 Core, is seeking a Senior Software Engineer - Chaos Engineering. This role is crucial, as HA has been a cornerstone of the Substrate backend solution. We continue to explore opportunities for improving and optimizing service reliability. Our continuous drive to provide the best service to our customers goes beyond optimizing the storage stack. We work relentlessly to reduce Microsoft’s capital and operational expenses, exploring more paths for optimization while maintaining a reliable 4.5 9s availability. To achieve that, HA has extended its charter beyond traditional database availability and redundancy solutions toward optimizing power efficiency, platform costs, and networking costs. The latter will be the major focus of the talented engineer who decides to join our team.
Chaos Engineering is the discipline of experimenting on a system to build confidence in the system’s capability to withstand turbulent conditions in production. As part of the Chaos team in HA, you will work closely with partners in Azure, Exchange Online (EXO), and Microsoft Research (MSR) to build the next generation of the Chaos platform for Substrate. The platform will validate the resilience, architecture choices, predictability, and even the monitoring and incident response processes of critical components in M365 distributed systems.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
As part of the Chaos team in HA, you will work closely with partners to build the next generation of the Chaos platform for Substrate. This platform will validate the resilience and architecture choices of critical components in M365 distributed systems.