Software Engineer at Microsoft
Atlanta, Georgia, United States
Full Time


Start Date

Immediate

Expiry Date

17 Feb, 26

Salary

0.0

Posted On

19 Nov, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Software Engineering, System Reliability, Service Level Objectives, Observability, Operability, Automation, AI, Distributed Software Design, Cloud Systems Architecture, Microservices, Containers, Data Technologies, Azure, Incident Management, Telemetry, Programming Languages

Industry

Software Development

Description
Contributes to defining system reliability goals through Service Level Objectives (SLOs) and enhancing production posture with targeted improvements in observability and operability (telemetry, alerting, incident/change management, safe deployment practices). Builds reusable automation and processes that help multiple teams meet their reliability goals. With guidance, influences product architecture and roadmaps to ensure customer-experienced reliability is a core design principle. Works directly on product code to achieve reliability outcomes. Leverages AI to proactively detect anomalies, predict incidents, and automate operational workflows - scaling reliability efforts across complex systems.
With guidance, supports the design and development of large-scale distributed software services and solutions. Delivers “best-in-class” engineering by ensuring services are modular, secure, reliable, testable, diagnosable, observable, and reusable. Collaborates with internal and external partners to support team goals. Balances pragmatism with vision - driving continuous improvements in process and codebase. Builds automation to prevent or remediate service issues before they impact users. Applies cutting-edge AI tools and techniques to reduce operational toil and scale reliability engineering across complex systems.
Bachelor's Degree in Computer Science, or related technical discipline with proven experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience. Familiarity with modern distributed software design patterns and cloud systems architecture, including microservices, containers, load balancing, queuing, and caching. Experience building, shipping, and operating reliable solutions. Experience with data technologies (SQL/NoSQL/etc.). Experience with Azure. Experience in AI adoption with tools like GitHub Copilot, Azure OpenAI, and custom copilots to streamline development and reduce toil.
Responsibilities
Contributes to defining system reliability goals and enhances production posture with improvements in observability and operability. Builds reusable automation and processes to help teams meet reliability goals and works directly on product code to achieve reliability outcomes.