Software Developer 3 at Oracle Risk Management Services
United States
Full Time


Start Date

Immediate

Expiry Date

14 Feb, 26

Salary

0.0

Posted On

16 Nov, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Software Development, NCCL, Distributed Systems, AI, ML, HPC, Scalability, Agile, Collaboration, Performance Tuning, Collective Communication, High-Speed Networks, System Design, Adaptability, Self-Motivation, Code Quality

Industry

IT Services and IT Consulting

Description
The Oracle Cloud Infrastructure (OCI) Cluster Networking team is building an ultra-high-performance network to support AI/ML/HPC workloads. Join us to design systems that scale from tens to hundreds of thousands of GPUs without sacrificing performance. Our team develops and tunes the software and hardware stack for distributed workloads using libraries such as NCCL on high-speed networks. Strong knowledge of and practical experience with NCCL are essential for this role. You’ll apply collective communication libraries to tune system performance at a previously unheard-of scale; our approach to scaling is cutting edge. We’re looking for adaptable, self-motivated engineers who learn quickly, write solid code, and work across the stack. Ideal candidates have experience with distributed systems, value scalability and simplicity, and thrive in collaborative, agile environments.

How To Apply:

In case you would like to apply to this job directly from the source, please click here

Responsibilities
Design systems that scale from tens to hundreds of thousands of GPUs. Develop and tune the software and hardware stack for distributed workloads.