Senior Software Engineer at Microsoft
United States
Full Time


Start Date

Immediate

Expiry Date

25 Feb, 26

Salary

0.0

Posted On

27 Nov, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Networking, AI Infrastructure, High Performance Computing, Distributed Systems, Machine Learning, C, C++, C#, Java, JavaScript, Python, Debugging, Scalability, Reliability, Performance Optimization, Observability

Industry

Software Development

Description
Design, develop, and optimize networking solutions tailored for large-scale AI training infrastructure.
Architect and implement high-performance, low-latency, and low-jitter communication frameworks for distributed systems.
Benchmark, analyze, and enhance the scalability and reliability of networking systems to handle petabyte-scale data transfer.
Debug and resolve complex networking issues in large-scale, high-performance environments.
Drive the identification of dependencies and the development of design documents for a product, application, service, or platform.
Create, implement, optimize, debug, refactor, and reuse code to establish and improve performance, maintainability, effectiveness, and return on investment (ROI).
Act as a Designated Responsible Individual (DRI) and guide other engineers by developing and following the playbook, working on call to monitor the system, product, or service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore the system, product, or service for simple and complex problems when appropriate.
Proactively seek new knowledge and adapt to new AI trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale.

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python.
1+ years Networking OR High Performance Computing experience.
Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.
Familiarity with Machine Learning, AI Infrastructure, Operating Systems fundamentals, and virtualization technologies.
1+ years experience with Distributed Systems.
1+ years experience with High Performance Computing / Machine Learning middleware.
Responsibilities
Design and develop networking solutions for large-scale AI training infrastructure. Debug and resolve complex networking issues while enhancing scalability and reliability of systems.