Research Intern - AI SW/HW Co-design at Microsoft
Hillsboro, Oregon, United States
Full Time


Start Date

Immediate

Expiry Date

20 Feb 2026

Salary

0.0

Posted On

22 Nov 2025

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

GPU Programming, AI Accelerator Programming, Analytical Modeling, Hardware Benchmarking, Compute Graphs, Generative AI Models, Workload Optimization, Model Sharding, Operator Fusion, Tiling, Triton, C Programming, Motivation, Agency

Industry

Software Development

Description
Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world's best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers but also contribute to exciting research and development strides. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research and are offered year-round, though they typically begin in the summer.

Currently enrolled in a Ph.D. program in Computer Science, Electrical Engineering, or a related STEM field.
1+ years of experience working in Graphics Processing Unit (GPU)/AI accelerator programming.

In addition to the qualifications below, you'll need to submit a minimum of two reference letters for this position, as well as a cover letter and any relevant work or research samples. After you submit your application, a request for letters may be sent to your list of references on your behalf. Note that reference letters cannot be requested until after you have submitted your application, and that they might not be automatically requested for all candidates. You may wish to alert your letter writers in advance so they will be ready to submit your letter.

Experience working with model serving stacks such as SGLang/vLLM.
Ability to identify workload bottlenecks through analytical modeling and actual hardware benchmarking.
Familiarity with compute graphs of Generative AI models, including LLMs and image/video diffusion models.
Understanding of common techniques used for workload optimization, such as model sharding, operator fusion, and tiling.
Proficiency in programming AI accelerators, coding in Triton or C to achieve desired goals.
High levels of motivation and agency.
Responsibilities
Research Interns collaborate with fellow doctoral candidates and researchers to advance their careers and contribute to research and development. They are expected to present findings and engage with the community during the internship.