Senior Machine Learning Engineer - Data at Colossyan
Budapest, Közép-Magyarország, Hungary
Full Time


Start Date

Immediate

Expiry Date

08 Jun, 25

Salary

0.0

Posted On

10 Feb, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Good communication skills

Industry

Information Technology/IT

Description

ABOUT US

At Colossyan, we’re building the future of workplace learning with AI video.

Top companies like P&G, Porsche, BASF, BDO, and Paramount already use Colossyan to create engaging and interactive video content faster and more cost-effectively than traditional video production. Nearly 1 million videos have been created using Colossyan, and we’ve been recognised as a G2 Leader in multiple product categories.

Here’s an overview of our standout features:

  • Create text-to-speech videos hosted by one of our 150+ AI avatars
  • Translate your video content to 70+ languages in just four clicks
  • Bring documents to life with our document-to-video feature
  • Personalize your videos by creating a custom avatar of yourself, complete with a cloned voice
  • Make learning content interactive with features like branching, multiple choice quizzes, and more

To learn more about our product features, visit colossyan.com.

Responsibilities

THE ROLE

We’re looking for a Senior ML Engineer - Data to play a key role in shaping the foundation of our AI models by curating, processing, and optimizing large-scale datasets.

In this role, you’ll work closely with research and product teams to ensure our models are trained on the highest quality data. You’ll design robust data pipelines, develop automated evaluation frameworks, and explore innovative techniques like semi-supervised learning and human-in-the-loop ML to continuously improve model performance.

This is an opportunity to make a real impact: your work will directly influence the effectiveness and accuracy of our AI-driven products.

KEY RESPONSIBILITIES:

  • Design and develop scalable data pipelines, including sourcing, scraping, filtering, post-processing, de-duplicating, and versioning of data for AI model training.
  • Build frameworks for data evaluation and quality assessment, ensuring that our models are trained on high-quality, reliable data.
  • Develop automated evaluation pipelines to benchmark new models before deployment in our production API.
  • Collaborate with research and product teams to incorporate their data needs and optimize pipelines for various tasks.
  • Conduct open-ended research on data quality improvements, including semi-supervised learning, human-in-the-loop ML, and fine-tuning with human feedback.