Sr. Manager Software Development - GPU Communication Libraries at Advanced Micro Devices Inc
Austin, TX 78735, USA
Full Time


Start Date

Immediate

Expiry Date

25 Jul, 25

Salary

0.0

Posted On

25 Apr, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Version Control, Software Projects, Testing, HIP, Continuous Integration, Software Engineering Practices, OpenCL, Algorithms, Leadership Skills, Communication Skills, CUDA, Coding Standards

Industry

Computer Software/Engineering

Description

PREFERRED EXPERIENCE:

  • Proven leadership skills and a history of delivering software projects
  • Knowledge of professional software engineering practices and best practices for the full software development life cycle including requirements elicitation and analysis, scoping/estimation, coding standards, code reviews, version control, build processes, testing, and continuous integration
  • Experience managing the day-to-day activities of a software engineering team using Agile methods
  • Strong written and verbal communication skills
  • Experience with open-source development processes
  • Knowledge of one or more of the following: RDMA Verbs, Libfabric/UCX, MPI/SHMEM/PGAS, collective communication algorithms
  • Experience developing GPU-based parallel computing software using HIP, CUDA, or OpenCL
Responsibilities

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.
AMD together we advance_

THE ROLE:

As a software development manager, you will lead a team of talented computer scientists and software engineers developing GPU network software for high-performance computing and machine learning workloads as part of the AMD Radeon Open Ecosystem (ROCm). The work spans all levels of the stack, including the Linux kernel / network device driver level, RDMA Verbs, Libfabric and UCX, MPI/SHMEM, and machine-learning-specific collective libraries like RCCL.

KEY RESPONSIBILITIES:

  • Manage the day-to-day activities of the team within an Agile/Scrum environment
  • Work closely with senior developers, architects, and stakeholders to develop a long-term strategy and feature backlogs for your libraries, and translate them into achievable roadmaps and project plans in an Agile environment
  • Establish a software development process that spans the entire software development life cycle, and work with your engineers and key stakeholders from QA, DevOps, and program management to continuously improve it
  • Work with senior developers and architects to develop the best technical designs and approaches
  • Manage, execute, and report progress against project plans and delivery commitments
  • Build and track metrics to proactively drive process improvement
  • Hire, mentor, and develop software engineers