Lead SRE (Network Focused)
at Gartner
Egham, England, United Kingdom
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 19 Dec, 2024 | Not Specified | 24 Sep, 2024 | 7 year(s) or above | Good communication skills | No | No |
Description:
Hiring near our Egham, UK office. Hybrid, flexible environment.
Responsibilities:
ABOUT THIS ROLE:
We are seeking a Principal SRE (Network Focused) who will play a crucial role in supporting the production and operations of our conference IT platforms and our enterprise network (both cloud and on-prem). During live conferences, the candidate will work closely with network operations team members to keep the network infrastructure running smoothly by optimizing and maintaining its performance and reliability. During non-conference periods, the candidate will focus on the operational readiness of the network infrastructure, including observability, performance, and resiliency, using chaos engineering techniques.
WHAT YOU’LL DO
- As part of the SRE scrum team, troubleshoot and resolve complex network performance and reliability issues, working closely with network operations and engineering teams
- Function as the SME in using NPM tools to drive forensics on network performance and reliability issues that affect end user experience and/or application health
- Work closely with the Conference Network team and the Enterprise Network team to maintain and enhance a comprehensive knowledge of the systems and infrastructure
- Work closely with the Observability team to maintain and improve our dashboard and alerting posture
- Monitor operational dashboards and alerts during conferences and respond to alerts
- Collaborate to design and develop chaos test cases that simulate real-world scenarios and identify potential vulnerabilities and areas for improvement
- Execute chaos tests and analyze the results using NPM, APM, and other monitoring tools to identify performance and stability issues
- Use breadth of knowledge and experience to accurately connect the dots between application and network performance issues
- Use strong network forensics knowledge to cross-train other IT engineers
- Use data-driven analysis to drive continuous improvement in network observability, performance, reliability, and resilience
- Analyze previous incidents to understand root causes, and use automation to detect problems faster and, where possible, reduce the probability and/or impact of recurrence
- Support and drive advancement of our NPM tools and services
- Be available to work flexible hours as required for operational support and during select conferences to ensure coordination across a globally distributed team
- Participate in the on-call schedule, ensuring that issues are addressed promptly and effectively
REQUIREMENT SUMMARY
Min: 7.0, Max: 12.0 year(s)
Information Technology/IT
IT Software - Network Administration / Security
Networks
Graduate
Computer science or a related field required
Proficient
1
Egham, United Kingdom