Staff Observability Engineer at Micron Technology
Hyderabad, Telangana, India
Full Time


Start Date

Immediate

Expiry Date

09 Mar, 26

Salary

0.0

Posted On

09 Dec, 25

Experience

10 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Observability, Incident Resolution, Troubleshooting, AIOps, SRE Principles, Automation, Python, PowerShell, SLIs, SLOs, ITIL Processes, Leadership, Stakeholder Management, Monitoring, Dashboarding, Incident Management

Industry

Semiconductor Manufacturing

Description
Provide the ability to gather, report, and alarm on infrastructure software or hardware to ensure reliable IT operations. Support incident resolution and troubleshooting.

Lead Observability Strategy: Define and execute the observability roadmap aligned with business and IT goals, integrating AIOps and SRE principles.
Tool Ownership & Integration: Manage and optimize observability tools including OpsRamp, Splunk, AppDynamics, NetBrain, and ThousandEyes, and explore new platforms such as BigPanda and ServiceNow AIOps.
Automation Leadership: Drive automation of L1/L2 operational tasks using Python and PowerShell, improving efficiency and reducing manual intervention.
SRE Adoption: Collaborate with cross-functional teams to implement Site Reliability Engineering (SRE) practices, including SLIs/SLOs, error budgets, and incident response automation.
Monitoring & Dashboarding: Design and maintain comprehensive dashboards and alerting mechanisms for infrastructure, application, and network performance.
Incident & Problem Management: Partner with ITSM teams to enhance incident detection, root cause analysis, and resolution workflows.
Mentorship & Collaboration: Lead and mentor a team of observability engineers, fostering a culture of innovation, ownership, and continuous improvement.

10+ years of experience in IT operations, observability, or infrastructure monitoring.
Strong hands-on experience with tools such as Splunk, OpsRamp, AppDynamics, NetBrain, and ThousandEyes.
Experience with AIOps platforms (BigPanda, ServiceNow AIOps preferred).
Proficiency in Python and PowerShell for automation and scripting.
Familiarity with SRE principles and implementation strategies.
Solid understanding of ITIL processes (Incident, Change, and Problem Management).
Excellent communication, leadership, and stakeholder management skills.
Responsibilities
The Staff Observability Engineer will lead the observability strategy, defining and executing the roadmap aligned with business and IT goals. They will also manage observability tools and drive automation of operational tasks to improve efficiency.