Software Systems Engineer at Meta
Fremont, California, United States
Full Time


Start Date

Immediate

Expiry Date

04 Sep, 26

Salary

$173,000/year to $245,000/year + bonus + equity + benefits

Posted On

06 Jun, 26

Experience

10 years or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

C, C++, Python, Linux, Systems Software Engineering, Hardware Lifecycle Management, Distributed Systems, Kernel Development, Device Drivers, BMC/IPMI/Redfish, Infrastructure Automation, Telemetry Analysis, Fleet Health Monitoring, Firmware Integration, System Diagnostics, CI/CD

Industry

Software Development

Description
Meta is seeking a Software Systems Engineer to join our Production Systems Engineering organization, where you will work at the intersection of software and large-scale hardware infrastructure. In this role, you will design, build, and optimize the systems software that powers Meta's global production fleet, spanning servers, storage, networking, and custom silicon. You will drive reliability, efficiency, and performance improvements across the infrastructure stack, partnering closely with hardware engineering, data center operations, and platform teams to ensure Meta's production systems operate at scale.

Responsibilities
Design and develop systems software for managing, provisioning, and monitoring large-scale production hardware infrastructure, including compute, storage, and networking components
Build and maintain tooling for hardware lifecycle management, fleet health monitoring, and automated remediation of production system failures
Collaborate with hardware engineering teams to define software interfaces and firmware integration requirements for new servers and accelerator platforms
Develop and optimize low-level systems software, including kernel modules, device drivers, and platform management agents, to improve hardware utilization and reliability
Architect scalable infrastructure automation frameworks that reduce manual operational toil and accelerate hardware deployment across Meta's data center fleet
Identify and resolve systemic reliability and performance issues across production hardware by analyzing telemetry, failure patterns, and system-level diagnostics
Define technical direction for production systems software components, driving alignment across infrastructure engineering and data center operations stakeholders
Mentor other engineers on systems software design patterns, debugging methodologies, and production infrastructure best practices
Lead cross-functional efforts to evaluate and integrate new hardware technologies into the production environment, including bring-up, validation, and qualification workflows

Minimum Qualifications
Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
8+ years of experience in systems software engineering, including development in C, C++, or Python for Linux-based production environments
6+ years of experience with large-scale infrastructure systems, including hardware lifecycle management, fleet automation, or data center operations software
Experience developing or integrating with low-level system components, such as kernel interfaces, BMC/IPMI/Redfish management stacks, or hardware telemetry frameworks
Experience designing and operating distributed systems software at scale, including monitoring, alerting, and automated remediation pipelines
Experience communicating technical decisions and system designs through written documentation and cross-functional stakeholder alignment

Preferred Qualifications
Experience debugging and troubleshooting issues across hardware and software boundaries
Experience working on hardware/software projects in the manufacturing and hardware validation space
Familiarity with test automation frameworks and CI/CD pipelines
Experience with large-scale distributed systems

$173,000/year to $245,000/year + bonus + equity + benefits
Responsibilities
Design and develop systems software to manage, provision, and monitor Meta's global production hardware infrastructure. Collaborate with hardware engineering to optimize low-level software and architect scalable automation frameworks to reduce operational toil.