Senior Software Development Eng. - GPU Networking at Advanced Micro Devices
Dublin, County Dublin, Ireland
Full Time


Start Date

Immediate

Expiry Date

15 Sep, 25

Salary

0.0

Posted On

15 Jun, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

C++, C, HIP, Version Control, Software Development, Debugging, Documentation, CUDA, System Software, Optimization, Testing

Industry

Computer Software/Engineering

Description

PREFERRED EXPERIENCE:

  • Strong background developing system software in C/C++
  • Experience with at least one of the following:
      • Implementing communication middleware like MPI/SHMEM
      • Implementing lower-level communication frameworks like UCX and libfabric, or development using RDMA APIs
      • Development and optimization of communication collective algorithms (e.g. AllReduce)
  • Familiarity with GPU programming in HIP or CUDA
  • In-depth knowledge of best practices in software development, including testing, profiling, debugging, documentation, version control, issue tracking, and planning
  • Proven track record contributing to open-source projects
Responsibilities

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.
AMD together we advance_

THE ROLE:

As a GPU network software engineer, you will design, implement, and test features in communication libraries, middleware, and frameworks to provide best-in-class support for GPU applications running high-performance computing and machine learning workloads at scale. You will work with technical experts within AMD, our partners, and the open-source community to implement these features as part of AMD’s open-source ROCm stack for GPU computation.

KEY RESPONSIBILITIES:

  • Design, implement, and test features to enhance GPU support in communication libraries, middleware, and frameworks
  • Benchmark, profile, and optimize code to maximize performance of multi-node GPU applications
  • Deliver high-quality code and documentation following best practices for open-source software development
  • Work with key technical experts at our customers, across AMD, and with our industry partners in the Ultra Ethernet Consortium and Ultra Accelerator Link Consortium to advance scale-out and scale-up software and hardware solutions