Intelligent Computing Architecture and Operating System Researcher-Soaring at ByteDance
Singapore, Southeast, Singapore
Full Time


Start Date

Immediate

Expiry Date

30 Aug, 25

Salary

0.0

Posted On

30 May, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Computer Science, Data Structures, Operating Systems, Kernel, C, Mathematics, C++, Computer Architecture, Problem Analysis, Artificial Intelligence

Industry

Information Technology/IT

Description

QUALIFICATIONS

  1. Academic Background: A doctoral degree, preferably in software engineering, computer science, mathematics, artificial intelligence, or a related field; strong capabilities in computer architecture; excellent coding skills; a solid foundation in data structures and fundamental algorithms; proficiency in C/C++, Go, or Python.
  2. Technical Proficiency: Familiarity with Linux operating systems, the kernel, and networking; prior development experience in these areas is preferred. Experience in performance optimization is a plus.
  3. Problem-Solving Abilities: Outstanding problem analysis and problem-solving skills, with the ability to independently explore innovative solutions.
  4. Collaboration and Communication: Strong communication and teamwork skills; able to collaborate with cross-functional teams to explore new technologies and drive technological advancements.
  5. Professional Mindset: Strong psychological resilience and adaptability, with the courage to tackle challenges and the ability to remain calm, composed, and flexible in complex situations.
Responsibilities

Team Introduction:

The ByteDance System Department is responsible for the R&D, design, procurement, delivery, and operational management of the company's infrastructure, ranging from chips and servers to operating systems, networks, CDNs, and data centers. It provides efficient, stable, and scalable infrastructure to support global services such as Douyin, Toutiao, and Volcano Engine.

The team's current areas of operation include, but are not limited to: the design and construction of data centers, chip R&D, server development, network engineering, Volcano Engine's edge-cloud services, high-performance intelligent hardware development, intelligent delivery and operation of IDC resources, intelligent monitoring and early warning for hardware infrastructure, operating systems and kernels, virtualization technologies, compilation toolchains, supply chain management, and many other infrastructure-related areas.

Project Background:

In today's digital era, with the deep integration of cloud computing, artificial intelligence, and big data technologies, modern data centers face a sharpening conflict between exponentially growing computing demands and the efficiency bottlenecks of existing computing architectures. Traditional architectures centered on general-purpose CPUs expose numerous issues when handling diverse workloads: the "memory wall" effect caused by bandwidth and latency constraints in the memory subsystem continues to intensify; data movement overhead between heterogeneous computing units exceeds actual computation time; the performance overhead of secure and trusted execution environments exceeds 30%; and improvements in per-rack compute density are limited by power density thresholds. Meanwhile, emerging workloads such as AI training, graph computing, and time-series databases exhibit dynamic, heterogeneous characteristics and impose differentiated requirements on computing architectures, which traditional fixed architectures struggle to serve at optimal energy efficiency.

As critical software infrastructure and a core technology in computer architecture, operating systems (OSes) also face enormous challenges in this context. With growing computing demands and continued technological advancement, traditional homogeneous computing environments can no longer satisfy increasingly complex computational tasks. Modern computing scenarios feature highly heterogeneous hardware architectures, including CPUs, GPUs, FPGAs, TPUs, NPUs, and DPUs, while edge and cloud computing together form distributed networks; traditional OSes struggle to manage resources efficiently across nodes and architectures. In addition, scenarios such as AI training require low-latency, high-throughput, secure, trusted, and dynamically elastic distributed system support, which demands that OSes provide unified abstraction and scheduling capabilities across heterogeneous resources. Both academia and industry are actively exploring next-generation OSes in areas such as distributed microkernel architectures, heterogeneous resource scheduling algorithms, cross-layer optimization and compiler support, security and trust technologies, virtualization and serverless computing, AI-driven OS kernel optimization, and OS-built-in AI inference engines.
