Multimodal AI Researcher, Audio at Dolby Laboratories, Inc.

Atlanta, Georgia, United States

Full Time

Start Date

Immediate

Expiry Date

07 Jan, 26

Salary

0.0

Posted On

09 Oct, 25

Experience

5 years or more

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Generative Modeling, Multimodal Semantic Understanding, Deep Learning, Audio Processing, Self-Supervised Learning, Python, TensorFlow, PyTorch, Audio Enhancement, Speech Analysis, Music Information Retrieval, Publication Record, Collaboration Skills, Innovation, Mentoring, Research Collaboration

Industry

Computers and Electronics Manufacturing

Description

Generative modeling for audio applications (diffusion models, autoregressive models, masked generative transformers). Multimodal semantic understanding and multimodal reasoning. Multimodal representations (audio-video, audio-text, audio-video-text). Multimodal AI architectures, with a focus on generating audio, music, and speech (text-to-audio, video-to-audio, image-to-audio). Self-supervised and semi-supervised learning. AI-driven audio enhancement, processing, and generation for speech and music, such as speech enhancement and analysis, source separation, text-to-speech, text-to-music, music information retrieval, and audio classification. LLMs for audio applications.

Ph.D. in Computer Science or a similar field. A strong background in deep learning, in both conceptual understanding and practical experience. Technical knowledge of audio fundamentals. Deep passion for audio, music, and multimedia applications. Deep knowledge of the current machine learning literature. A strong publication record is desirable, with publications in major machine learning conferences (e.g., NeurIPS, ICLR, ICML) or top domain-specific conferences (e.g., ACL, CVPR, ICASSP, Interspeech). Highly skilled in Python and one or more popular deep learning frameworks (TensorFlow or PyTorch). Ability to envision new technologies and turn them into innovative products. Good communication and collaboration skills.

Use deep learning to create new solutions (including foundation models) and enhance existing applications. Push the state of the art and develop intellectual property. Transfer technology to product groups. Establish research collaborations with external university partners. Mentor interns on novel research problems. Publish papers in top-tier conferences and journals. Advise internal leaders on recent deep learning advancements in industry and academia to further influence research direction and business decisions.

Responsibilities

The role involves generative modeling for audio applications and developing multimodal AI architectures focused on generating audio, music, and speech. Responsibilities also include mentoring interns, publishing research, and advising on deep learning advancements.