Site Reliability Engineer

at CVS Health

Woonsocket, Rhode Island, USA

Start Date: Immediate
Expiry Date: 19 Nov, 2024
Salary: USD 72100 Annual
Posted On: 22 Aug, 2024
Experience: 2 year(s) or above
Skills: Jenkins, Continuous Improvement, Java, Splunk, Change Management, Computer Science, Languages, Python, AppDynamics, Platforms, Harness, IT Service Management, Software, Problem Management, Git
Telecommute: No
Sponsor Visa: No
Required Visa Status:
Citizen, GC, US Citizen, Student Visa, H1B, CPT, OPT, H4 Spouse of H1B, GC Green Card
Employment Type:
Full Time, Part Time, Permanent, Independent - 1099, Contract - W2, C2H Independent, C2H W2, Contract - Corp 2 Corp, Contract to Hire - Corp 2 Corp

Description:

Bring your heart to CVS Health. Every one of us at CVS Health shares a single, clear purpose: Bringing our heart to every moment of your health. This purpose guides our commitment to deliver enhanced human-centric health care for a rapidly changing world. Anchored in our brand — with heart at its center — our purpose sends a personal message that how we deliver our services is just as important as what we deliver.
Our Heart At Work Behaviors™ support this purpose. We want everyone who works at CVS Health to feel empowered by the role they play in transforming our culture and accelerating our ability to innovate and deliver solutions to make health care more personal, convenient and affordable.
This position will be part of the PCW Pharmacy Technology Site Reliability Engineering team, with a focus on improving the reliability and stability of the application portfolio. The ideal candidate will be a highly technical visionary who is committed to ensuring seamless experiences for consumers and is passionate about continuous improvement through automation, performance enhancements, and innovation.

Key Responsibilities:

  • Identify, maintain, and manage to SLOs, SLIs, and operational KPIs
  • Establish and maintain strong partnerships with Product, Engineering, Infrastructure, and Service Management teams, with the ability to influence key decisions
  • Proactively review the existing environment, and engage on enhancements and new services, to identify and remediate stability, reliability, and performance improvement opportunities
  • Continuously review system telemetry and alerting to ensure actionable engagement by operations teams
  • Identify and develop automation solutions to address potential problems before they result in a service interruption
  • Investigate the root cause of major incidents, identify remediation plans, and share knowledge across platforms
  • Provide technical coaching and direction to organizational resources
  • Stay current with emerging technologies and market trends to best position the organization
  • Review capacity models frequently to ensure production results are within expected bounds
  • Ensure incident response processes and associated playbooks are current and effective

Required Qualifications:

  • 3+ years of experience in a Site Reliability Engineer or Application Operations role
  • 2+ years of demonstrated experience scripting or developing software in languages such as Java and Python
  • 2+ years of experience managing and improving cloud-deployed services on platforms such as AKS and GCP, as well as monolithic systems
  • 2+ years of experience configuring, customizing, and extending monitoring platforms such as AppDynamics, Splunk, Grafana, ELK, or similar

Preferred Qualifications:

  • Experience managing version control systems such as Git
  • Experience with tools such as Jenkins and Harness
  • Continuous improvement orientation, ranging from ideation to implementation
  • Ability to engage cross-functional teams to champion the resolution of issues and design solutions
  • Strong communication, organizational, analytical, and problem-solving skills
  • Knowledge of IT Service Management best practices such as change management and problem management

Education:

  • Bachelor’s degree or equivalent experience required
  • Bachelor’s degree in Computer Science preferred


REQUIREMENT SUMMARY

Min: 2.0, Max: 3.0 year(s)

Information Technology/IT

IT Software - Other

Software Engineering

Graduate

Computer science preferred

Proficient

1

Woonsocket, RI, USA