Machine Learning Engineer (5+ years of experience) at Captions

New York, NY 10022, USA

Full Time

Start Date

Immediate

Expiry Date

12 Jun, 25

Salary

230000.0

Posted On

13 Mar, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Kubernetes, Cuda, Python, Containerization, Docker, Domain Experience

Industry

Information Technology/IT

Description

Captions is the leading video AI company, building the future of video creation. Over 10 million creators and businesses have used Captions to create videos for social media, marketing, sales, and more. We’re on a mission to serve the next billion.
We are a rapidly growing team of ambitious, experienced, and devoted engineers, researchers, designers, marketers, and operators based in NYC. You’ll join an early team and have an outsized impact on the product and the company’s culture.
We’re very fortunate to have some of the best investors and entrepreneurs backing us, including Index Ventures (Series C lead), Kleiner Perkins (Series B lead), Sequoia Capital (Series A and Seed co-lead), Andreessen Horowitz (Series A and Seed co-lead), Uncommon Projects, Kevin Systrom, Mike Krieger, Lenny Rachitsky, Antoine Martin, Julie Zhuo, Ben Rubin, Jaren Glover, SVAngel, 20VC, Ludlow Ventures, Chapter One, and more.
Check out our latest financing milestone and some other coverage:
The Information: 50 Most Promising Startups
Fast Company: Next Big Things in Tech
The New York Times: When A.I. Bridged a Language Gap, They Fell in Love
Business Insider: 34 most promising AI startups
Time: The Best Inventions of 2024

DOMAIN EXPERIENCE

Exposure to diffusion models, multimodal video generation, or large-scale generative architectures.
Experience with distributed training frameworks (FSDP, DeepSpeed, Megatron-LM) or HPC environments.

Responsibilities

ABOUT THE ROLE

Captions is seeking a Machine Learning Engineer to partner closely with our Researchers and bring large-scale multimodal video diffusion models into production. You’ll be responsible for optimizing and deploying state-of-the-art generative models (tens to hundreds of billions of parameters) to deliver low-latency, high-throughput inference at scale. This is a unique opportunity to work on cutting-edge AI—spanning audio-video generation, diffusion architectures, and temporal modeling—and ensure these innovations reach millions of creators worldwide.