Alibaba Cloud-Control System Site Reliability Engineer-Seattle at Alibaba

Seattle, Washington, USA

Full Time

Start Date

Immediate

Expiry Date

09 Sep, 25

Salary

219600.0

Posted On

09 Jun, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Acp, Python, Performance Tuning, System Administration, Communication Skills, System Configuration, Base Pay, Scripting Languages, Access

Industry

Information Technology/IT

Description

We are a dynamic and creative group dedicated to building and optimizing the company’s core cloud computing infrastructure. Our mission is to deliver secure, stable, and cost-effective computing services to our customers.

Our team focuses on the following key areas:

Management Control Technology: Leveraging Alibaba Cloud’s industry-leading computing, networking, and storage capabilities, we efficiently orchestrate and manage services to deliver outstanding API services to our customers. This not only enhances the efficiency and flexibility of using cloud resources but also significantly improves the overall user experience.

Architecture Design: We develop and optimize our internal control systems by utilizing general control components, service orchestration strategies, and application deployment models to support the automated management, monitoring, and operation of cloud services, thereby enhancing overall service quality.

CloudOps Tool Development: We design and develop advanced CloudOps tools, including but not limited to elastic scaling services, tagging services, and Workbench, to enhance the flexibility and efficiency of cloud resource management. Our goal is to simplify operations and enhance visualization, providing customers with a superior management experience.

CloudOps Practices: By implementing DevOps and Site Reliability Engineering (SRE) best practices, we improve the operability and reliability of our systems, ensuring the continuous and smooth operation of our infrastructure. We are committed to advancing automated workflows, change management, and rapid incident response to enhance overall service quality.

We encourage team members to actively engage with the latest technology trends and tools, aiming to continuously innovate in a fast-paced and challenging environment. We offer an open team culture that values creativity and innovation, providing unlimited support and opportunities for everyone’s growth and success. Whether you are an expert in cloud infrastructure or an engineer eager to enhance your skills, we look forward to having you join us to drive the frontier of technology forward!

Department/Team

1. Responsible for the daily maintenance, monitoring and alerting, and version iteration upgrades of the virtualization technology stack underlying ECS (Elastic Compute Service), as well as problem investigation, diagnosis, and resolution during these processes.
2. Responsible for addressing customer-side technical inquiries related to ECS virtualization, handling work orders, and responding to emergencies.
3. Responsible for the localization and systematic optimization of the ECS virtualization operation and maintenance system, as well as the development and enhancement of the data-driven system.

Responsibilities

Please refer to the job description for details.