Large Model Algorithm Researcher (Multimodal & Code AI) - Soaring Star Talent at ByteDance

Singapore, Southeast, Singapore

Full Time

Start Date

Immediate

Expiry Date

30 Aug, 25

Salary

0.0

Posted On

30 May, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Natural Language Processing, Programming Languages, ML, Communication Skills, Computer Vision, Machine Learning, Technology, Team Spirit, Problem Analysis

Industry

Information Technology/IT

Description

QUALIFICATIONS

1. Hold a PhD degree. Priority will be given to candidates who have published papers in fields such as machine learning (ML), computer vision (CV), and natural language processing (NLP).
2. Possess excellent programming, data structure, and algorithm skills, and be proficient in C/C++ or Python. Priority will be given to candidates who have won awards in competitions such as ACM/ICPC, NOI/IOI, Top Coder, and Kaggle.
3. Have research experience in machine learning, particularly in large language models (LLMs) and generative artificial intelligence.
4. Be passionate about technology, have outstanding problem analysis and problem-solving abilities, be enthusiastic about tackling challenging problems, and possess good communication skills and team spirit.

Responsibilities

Team Introduction: The AI Innovation Center is a department focused on building AI infrastructure and driving cutting-edge AI research. We explore industry-leading AI technologies, including large language models (LLMs) and multimodal large models, with the goal of developing models that can understand multilingual content and vast amounts of video data, ultimately delivering a better content consumption experience for users. In the Code AI domain, we leverage the code understanding and reasoning capabilities of LLMs to improve program performance and R&D efficiency.

Project Introduction: Multimodal foundation large models (VLM) are a research hotspot in the industry and a critical technology for business applications. In 2024, the Innovation Center developed VFM V1, a multimodal large model tailored for TikTok’s business scenarios. It matches the performance of Qwen VL, the best open-source model, on public test sets, while significantly outperforming all other foundation models on business test sets. Going forward, we aim to keep developing foundation models with efficient perception and reasoning capabilities that can understand multilingual and massive video content, delivering a better content consumption experience for users.

Project Challenges:
1. Enhance the multimodal perception encoder. The current encoder uses a fixed frame rate. We need to explore more efficient adaptive frame rates while considering the integration of modalities such as audio and user behavior.
2. Fuse multimodal perception and thinking capabilities to give the model stronger overall perception and cognitive abilities.