Site Reliability Engineer, ASE Block Storage
at Apple
Cupertino, California, USA
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 26 Nov, 2024 | USD 264,200 Annual | 31 Aug, 2024 | N/A | Capacity Planning, Storage Solutions, Code, Distributed Systems, Provisioning, Data Migration, Backup, Linux, Kubernetes, Disaster Recovery, Storage Systems, Microservices | No | No |
Required Visa Status:
Citizen | GC |
US Citizen | Student Visa |
H1B | CPT |
OPT | H4 Spouse of H1B |
GC Green Card |
Employment Type:
Full Time | Part Time |
Permanent | Independent - 1099 |
Contract – W2 | C2H Independent |
C2H W2 | Contract – Corp 2 Corp |
Contract to Hire – Corp 2 Corp |
Description:
SUMMARY
Posted: Jun 24, 2024
Weekly Hours: 40
Role Number: 200556709
Apple Cloud infrastructure is vast, and the storage SRE teams of Apple Cloud are building and running the next-generation distributed storage systems to support Apple’s most critical services. Operating at our scale, across multiple geographically dispersed data centers, and servicing users with exceptionally large data presents unique challenges. As a storage SRE at Apple, you’ll need to solve these problems using your deep understanding of storage, data analysis, programming, teamwork, and expertise in Linux system internals. Storage SREs at Apple involve themselves across the full infrastructure stack, from tuning the block storage layer to managing content delivery network traffic.
DESCRIPTION
We are looking for seasoned software and systems engineers to join the Block Storage SRE team at Apple. The role involves a tremendous amount of individual responsibility and influence over the direction of the platform, shaping its use by many critical Apple Cloud services for years to come. You are someone with ideas and a real passion for software delivered as a service to improve reuse, efficiency, and simplicity. This engineer’s work will affect hundreds of millions of users and be essential to the success of some of the most visible current and future Apple features. At Apple Cloud, we run a mix of open source, vendor-licensed, and internally developed tools to perform functions such as system configuration management, provisioning, software development and deployment, logging, and monitoring. You’ll learn these tools and have opportunities to improve them. We think critically and strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded.
- 5+ years of experience in a Site Reliability Engineer or Infrastructure Software Development role.
- Acute drive to automate manual operations and to improve them with well-defined and tested APIs.
- Awareness of best practices for deploying storage systems, including the implications of physical and virtual deployment models for change management, failure domains, hardware lifecycle management, etc.
- Experience with deploying, supporting and monitoring new and existing services, platforms, and application stacks.
- Experience with SRE principles such as monitoring, alerting, error budgets, and fault analysis, along with other common concepts in reliability engineering. Skilled at identifying opportunities to reduce manual work through enhancements in code and processes.
- Kubernetes Operator development experience.
- Familiarity with relational & non-relational databases (such as Cassandra, Postgres, & RocksDB).
- BS or MS in Computer Science or equivalent industry experience.
PREFERRED QUALIFICATIONS
- Experience in building, operating, and scaling distributed storage systems in a private, public, or hybrid cloud environment.
- The ability to design, author, review, and release code in one or more high-level languages (e.g., Go (preferred), Rust, Python, or Java).
- Good understanding of block, object, and file storage solutions in Linux (such as LVM, XFS, ext4, S3, Ceph, Gluster, NFS).
- Familiarity with microservices architecture and container orchestration with Kubernetes.
- Understanding of Linux internals, standard networking protocols, and distributed systems.
- Experience with provisioning, data migration, backup & recovery, at-scale testing, disaster recovery, and capacity planning.
REQUIREMENT SUMMARY
Min: N/A, Max: 5.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
BSc
Computer Science
Proficient
1
Cupertino, CA, USA