MLOps Engineer at AI Robot Association

Tokyo, Japan

Full Time

Start Date

Immediate

Expiry Date

27 Jan, 26

Salary

0.0

Posted On

29 Oct, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

MLOps, Machine Learning, Robotics, Python, PyTorch, JAX, Cloud Services, Distributed Systems, Data Pre-processing, Model Optimization, Performance Analysis, SQL, ROS, Sensor Fusion, Image Processing, Hyper-parameter Optimization

Industry

Technology; Information and Internet

Description

About AIRoA

The AI Robot Association (AIRoA) is launching a groundbreaking initiative: collecting one million hours of humanoid robot operation data with hundreds of robots, and leveraging it to train the world’s most powerful Vision-Language-Action (VLA) models.

What makes AIRoA unique is not only the unprecedented scale of real-world data and humanoid platforms, but also our commitment to making everything open and accessible. We are building a shared “robot data ecosystem” where datasets, trained models, and benchmarks are available to everyone. Researchers around the world will be able to evaluate their models on standardized humanoid robots through our open evaluation platform.

For researchers, this means an opportunity to:

- Work on fundamental challenges in robotics and AI: multimodal learning, tactile-rich manipulation, sim-to-real transfer, and large-scale benchmarking.
- Access state-of-the-art infrastructure: hundreds of humanoid robots, GPU clusters, high-fidelity simulators, and a global-scale evaluation pipeline.
- Collaborate with leading experts across academia and industry, and publish results that will shape the next decade of robotics.
- Contribute to an initiative that will redefine the future of embodied AI, with all results made open to the world.

Key Responsibilities

- Design, implement, and maintain large-scale ML pipelines and optimize model performance for training on massive robot datasets.
- Design, deploy, and maintain distributed training clusters to shorten model development cycles.
- Collaborate closely with VLA researchers to capture evolving requirements for ML infrastructure, data pre-processing, training, monitoring, evaluation, and deployment, and continuously improve the ML pipeline through analysis and experimentation.
- Optimize ML infrastructure and pipelines for cost, performance, and reliability.
- Design, develop, and maintain MLOps tools and platforms so that VLA researchers can efficiently visualize and analyze performance.

Required Qualifications

- Master’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
- 3+ years of professional experience as a software engineer in MLOps engineering, machine learning, or robotics.
- Experience developing high-quality, production-level software in a team environment.
- Experience deploying distributed systems to popular cloud services such as AWS, GCP, or Azure.
- Experience with orchestration tools such as Airflow, Dagster, or Kedro.
- High proficiency in Python.
- High proficiency in PyTorch or JAX.

Preferred Qualifications

- Experience with training and fine-tuning techniques for RL, VLM, and VLA models, including distillation, supervised fine-tuning, and policy optimization.
- Experience in hyper-parameter optimization.
- Experience with distributed training frameworks and cluster management for deep neural network training.
- Deep understanding of GPU memory management and optimization techniques.
- Experience in analyzing, monitoring, and managing data quality.
- Experience with processing robotics-related sensor data (e.g., RGB/Depth images, point clouds), including knowledge of image/signal processing, sensor fusion, and time synchronization.
- Experience with ROS/ROS2.
- High proficiency in SQL.
- Experience optimizing system performance using performance analysis tools.

Others (linguistic qualification, etc.)

- 【Highly appreciated】 English proficiency at business level

There are currently no comparable projects in the world that collect data and develop foundation models on such a large scale. This is one of Japan’s leading national projects, supported by a substantial investment of 20.5 billion yen from NEDO.

This position will play a crucial role in determining the success of the project. You will have broad discretion and responsibility, and we are confident that, if successful, you will gain both a great sense of achievement and the opportunity to make a meaningful contribution to society. Furthermore, we strongly encourage engineers to actively build their careers through this project, for example by publishing research papers and engaging in academic activities.

Responsibilities

Design, implement, and maintain large-scale ML pipelines while optimizing model performance for training on massive robot datasets. Collaborate closely with VLA researchers to continuously improve the ML infrastructure and pipelines through analysis and experimentation.