Senior Network Systems Engineer at Microsoft
Redmond, Washington, United States
Full Time


Start Date

Immediate

Expiry Date

17 Feb, 26

Salary

0.0

Posted On

19 Nov, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Networking, AI/ML, Hardware Design, Firmware, Software, Integration, Debugging, Risk Assessment, Validation Testing, Troubleshooting, High-Speed Interfaces, Data Center Infrastructure, Silicon Engineering, Collaboration, Architecture, Performance Analysis

Industry

Software Development

Description
Collaborate with architecture, silicon engineering, firmware, hardware design, hardware validation, OS (operating systems), manufacturing, and customer teams to build state-of-the-art accelerator hardware solutions.
Participate in architectural discussions and evaluate the networking performance required to meet the demands of AI/ML workloads.
Analyze new interfaces and subsystems to develop integration plans, analyze power efficiency, debug integration issues, and provide recommendations.
Perform NUDD (new, unique, different, and difficult) technology and feature analysis, and provide risk assessments and mitigations.
Drive technical requirements and ensure the solution is flexible and scalable across the full (HW/FW/SW) stack.
Enable platform- and solution-level discussions, influencing the architecture of the product and delivering on product goals across quality, reliability, and performance.
Collaborate with internal, external, and open-source partners to onboard innovative technologies seamlessly.

Master's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 3+ years technical engineering experience OR Bachelor's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 5+ years technical engineering experience OR equivalent experience.
5+ years of relevant experience in system-level (compute, networking, and/or accelerator) design and/or implementation across the hardware development lifecycle.
5+ years of hands-on experience developing AI/ML scale-up and scale-out networking for accelerator-based hardware systems, as well as developing MAC/PCS and SERDES for high-speed interfaces and high-density connectivity equipment.
5+ years of experience working across multiple disciplines (hardware, firmware, software, and/or data center infrastructure) to identify risks, drive discussions, detail system tradeoffs, and assess impact.
These requirements include, but are not limited to, the following specialized security screenings:

5+ years of experience developing GPU/FPGA-based accelerator platforms for AI/ML use cases.
Great to have: a proven track record of bringing up, integrating, and deploying new hardware technology.
Knowledge of high-volume silicon (SoCs, GPUs, or FPGAs), compute, storage, and/or networking design, manufacturing, and deployment.
Understanding of networking hardware: QSFP-DD cables, DACs, AECs, cable backplanes, NICs, PHYs, and switches.
Ability to define validation test cases to qualify the end-to-end network across functionality, performance, and scale testing.
Ability to troubleshoot network issues at multiple layers: physical, data link, network, and protocol.
Great to have: understanding of AI networks, network collectives, traffic profiles in AI networks, and Ultra Ethernet.
Knowledge of datacenters and operations at scale.
Experience managing hardware programs through the entire product lifecycle.
Proven ability to communicate effectively in verbal, written, and presentation formats.
Responsibilities
Collaborate with various teams to build advanced accelerator hardware solutions and evaluate networking performance for AI/ML workloads. Drive technical requirements and influence product architecture while ensuring quality, reliability, and performance.