Software Engineer - Host Networking at Meta
Menlo Park, California, United States
Full Time


Start Date

Immediate

Expiry Date

06 Feb, 26

Salary

0.0

Posted On

08 Nov, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

C/C++, Python, Shell Scripting, Network Devices, Routing Protocols, Test Suites, Systems Programming, TCP/IP, HTTP/HTTPS, Linux Kernel, Remote Direct Memory Access, FPGA Emulation, CUDA, ROCm, OpenCL

Industry

Software Development

Description
At Meta, we're building and operating one of the world's most dynamic and fast-paced networks, powering our global data centers and supporting cutting-edge technologies like AI, generative AI, recommendation engines, and the metaverse. Our network infrastructure teams are responsible for developing, deploying, and operating this complex system, covering the entire network lifecycle from hardware development to operation. We're seeking software engineers to join our teams and help build scalable distributed systems, develop innovative solutions to our challenges, and ship them into production. As part of our network engineering teams, you'll have the opportunity to work on cutting-edge switching technology, collaborate with talented engineers, and contribute to the development of Meta's hyper-scale network infrastructure.

Responsibilities
Design, develop, and validate drivers, firmware, and software for network devices, transport stacks, and AI workloads
Debug complex system-level issues and lead performance tuning exercises to optimize software stack performance
Understand software components from multiple partner teams, lead integration efforts, and drive continued development
Develop and automate test suites for the CI/CD framework and various components
Collaborate with partner teams to integrate software components, align on goals, and participate in on-call rotations
Design, develop, and deploy services to manage datacenter network switches and forwarding functions
Enhance HPC collective communication and parallel computing libraries (NCCL, RCCL, OneCCL, MPI)

Qualifications
Currently has, or is in the process of obtaining, a Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
2+ years of software development experience in industry settings, or a PhD degree plus 9 months of experience
Proficiency in C/C++ and at least one scripting language (Python/Shell Scripting)
Experience with network devices and products (routers, switches, adapters, load balancers) and an understanding of network routing protocols
Experience with developing and automating test suites
Systems programming, TCP/IP, HTTP/HTTPS, SPDY, DNS, and load balancers
Linux Kernel, especially drivers and network stack
Working knowledge of the transport stack, particularly Remote Direct Memory Access (RDMA) and/or RDMA over Converged Ethernet version 2 (RoCEv2)
Experience with QEMU or an FPGA emulation environment is a plus
Parallel computing platforms such as CUDA, ROCm, and OpenCL
Experience with one of: Platform services (program, control, and monitor Optics, Physical Layer (PHY), FPGAs, sensors, fan control, power, etc.), Board Support Package (BSP), Operating Systems, Kernel, Bootloader, Power Management, Real-Time Operating System (RTOS), Linux
Responsibilities
Design, develop, and validate drivers, firmware, and software for network devices while debugging complex system-level issues. Collaborate with partner teams to integrate software components and enhance HPC collective communication libraries.