Site Reliability Engineer
at CVS Health
Woonsocket, Rhode Island, USA
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 19 Nov, 2024 | USD 72100 Annual | 22 Aug, 2024 | 2 year(s) or above | Jenkins, Continuous Improvement, Java, Splunk, Change Management, Computer Science, Languages, Python, AppDynamics, Platforms, Harness, IT Service Management, Software, Problem Management, Git | No | No |
Required Visa Status:
Citizen | GC |
US Citizen | Student Visa |
H1B | CPT |
OPT | H4 Spouse of H1B |
GC Green Card |
Employment Type:
Full Time | Part Time |
Permanent | Independent - 1099 |
Contract – W2 | C2H Independent |
C2H W2 | Contract – Corp 2 Corp |
Contract to Hire – Corp 2 Corp |
Description:
Bring your heart to CVS Health. Every one of us at CVS Health shares a single, clear purpose: Bringing our heart to every moment of your health. This purpose guides our commitment to deliver enhanced human-centric health care for a rapidly changing world. Anchored in our brand — with heart at its center — our purpose sends a personal message that how we deliver our services is just as important as what we deliver.
Our Heart At Work Behaviors™ support this purpose. We want everyone who works at CVS Health to feel empowered by the role they play in transforming our culture and accelerating our ability to innovate and deliver solutions to make health care more personal, convenient and affordable.
This position will be part of the PCW Pharmacy Technology Site Reliability Engineering team with focus on improving reliability and stability of the application portfolio. The ideal candidate will be a highly technical visionary, who is committed to ensuring seamless experiences for consumers and will be passionate about continuous improvement through automation, performance enhancements, and innovation.
Key Responsibilities:
- Identify, maintain, and manage to SLOs, SLIs, and operational KPIs.
- Establish and maintain strong partnerships with Product, Engineering, Infrastructure, and Service Management teams, with the ability to influence key decisions.
- Proactively review the existing environment, and engage on enhancements and new services, to identify and remediate stability, reliability, and performance improvement opportunities.
- Continuously review system telemetry and alerting to ensure actionable engagement by operations teams.
- Identify and develop automation solutions to address potential problems before they result in a service interruption.
- Investigate the root cause of major incidents, identify remediation plans, and share knowledge across platforms.
- Provide technical coaching and direction to organizational resources.
- Stay current with emerging technologies and market trends to best position the organization.
- Review capacity models frequently to ensure production results are within expected bounds.
- Ensure incident response processes and associated playbooks are current and effective.
Required Qualifications:
- 3+ years of experience in a Site Reliability Engineer or Application Operations role
- 2+ years of demonstrated experience scripting or developing software in languages such as Java and Python
- 2+ years of experience managing and improving cloud-deployed services on platforms such as AKS and GCP, as well as monolithic systems
- 2+ years of experience with configuring, customizing, and extending monitoring platforms such as AppDynamics, Splunk, Grafana, ELK, or similar.
Preferred Qualifications:
- Experience managing version control systems such as Git
- Experience with tools such as Jenkins and Harness
- Continuous-improvement orientation, ranging from ideation to implementation
- Ability to engage cross-functional teams to champion the resolution of issues and design solutions
- Strong communication, organizational, analytical, and problem-solving skills
- Knowledge of IT Service Management best practices such as change management and problem management
Education:
- Bachelor’s degree or equivalent experience required
- Bachelor’s degree in Computer Science preferred
REQUIREMENT SUMMARY
Min: 2.0, Max: 3.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
Graduate
Computer science preferred
Proficient
1
Woonsocket, RI, USA