Site Reliability Engineer - Video Infrastructure at ByteDance

San Jose, California, USA

Full Time

Start Date

Immediate

Expiry Date

07 Sep, 25

Salary

355000.0

Posted On

07 Jun, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Linux, Computer Science, Java, C, C++, MySQL, MongoDB, Google Cloud, Redis, Incident Handling, Capacity Management, Cloud Services, Distributed Systems, Python, AWS

Industry

Information Technology/IT

Description

QUALIFICATIONS

Minimum Qualifications:
- Bachelor's degree in Computer Science or a related technical field involving software or systems engineering, or equivalent working experience.
- Extensive knowledge of SRE responsibilities, such as monitoring, incident handling, capacity management, and disaster recovery.
- Extensive knowledge of networking, operating systems, database systems, and container technology.
- Good understanding of every aspect of microservice architecture, and hands-on experience troubleshooting large-scale distributed systems.

Preferred Qualifications:
- Good programming experience with at least one of the following languages: C, C++, Java, Python, or Go.
- Hands-on experience with common open-source systems such as Linux, MySQL, MongoDB, Redis, and ELK; experience building solutions with AWS, Google Cloud, Azure, and other cloud services is a plus.
- Passionate and self-motivated, with strong ownership and good teamwork skills.

Responsibilities

Team Introduction
The Video Cloud Infra team is focused on business experience and cost. It builds a competitive video transmission network and multimedia processing platform, builds data foundations and analysis capabilities, drives refined product operations, and reduces costs while increasing efficiency.

Responsibilities
- Build global infrastructure for multimedia transport, storage, and processing to serve billions of users around the world.
- Engage in global production system management, such as monitoring, emergency response, capacity planning, and optimization.
- Build tools, automation, visualizations, and monitors to facilitate the operation and optimization of the global infrastructure.
- Engage in and improve the whole service lifecycle, from inception and design through deployment, operation, and refinement.
- Scale systems sustainably through mechanisms like automation, and initiate changes that improve system reliability and processing speed.