Cloud Engineer - Observability Platforms

at CVS Health

Saint Paul, Minnesota, USA

Start Date: Immediate
Expiry Date: 08 Jul, 2024
Salary: USD 64890 Annual
Posted On: 09 Apr, 2024
Experience: 2 year(s) or above
Skills: Python, Postgresql, Docker, Aws, Microservices, Code, Mysql, Computer Science, Google Cloud, Infrastructure, Reliability Engineering, Kubernetes
Telecommute: No
Sponsor Visa: No
Required Visa Status:
Citizen, GC, US Citizen, Student Visa, H1B, CPT, OPT, H4 Spouse of H1B, GC Green Card
Employment Type:
Full Time, Part Time, Permanent, Independent - 1099, Contract – W2, C2H Independent, C2H W2, Contract – Corp 2 Corp, Contract to Hire – Corp 2 Corp

Description:

A BRIEF OVERVIEW

Location: Hybrid or Remote potential

  • Woonsocket, RI
  • Florham Park, NJ
  • Chantilly, VA
  • Alpharetta, GA
  • Soho, NY
  • Hartford, CT

CVS Health Enterprise Engineering plays a critical part in shaping the future of CVS Health. If you’re looking for the chance to leverage advanced technology to redefine the CVS Application Platform landscape, enhance the customer experience, and improve engineers’ lives on a day-to-day basis, this is the opportunity for you. Join us and challenge your IT expertise and analytical skills to help create a better engineering experience for our customers within CVS Health.
The platform is focused on providing a seamless developer experience, identifying and analyzing system design weaknesses, and troubleshooting complex technical issues. In addition, this role will primarily support the team technically on Observability Platform capabilities and Alerting and Monitoring Solutions. You should be able to learn quickly, shadow senior engineers on the team, and communicate clearly. You should also have excellent software engineering and collaboration skills. A successful candidate will be a highly motivated, collaborative individual who is driven to achieve results in a fast-paced environment.

PREFERRED SKILLS:

  • Exposure to MySQL, PostgreSQL or any other RDS databases
  • 2+ years of experience with CD tools such as Argo CD, Harness, etc. Argo CD is preferred.
  • 2+ years of experience with OpenTelemetry (OTel) tools
  • 2+ years of experience with open-source frameworks and 1+ years of experience with Tempo
  • 2+ years of strong Linux OS-level, command-line, and scripting knowledge (e.g., Go, Python, Bash), plus configuration management principles
  • Experience with SaaS Architecture and with the development and operation of high-traffic backend systems
  • Experience with infrastructure-as-code tools such as Ansible and Terraform
  • Strong exposure to microservices and web app architecture
  • Experience in architecting, implementing, and managing monitoring tools such as Prometheus/Grafana, CloudWatch, New Relic, and ELK in the cloud

EDUCATIONAL QUALIFICATION:

  • A Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, a related field, or equivalent industry experience.

Responsibilities:

  • Implement technical capabilities and collaborate with other Observability Engineers on the team on technical solutions.
  • Collaborate with the team, Principal Engineer, Architect, and Engineering Leaders to understand the roadmap and help the team deliver those capabilities on time.
  • Build and maintain monitoring and alerting systems that provide timely feedback on the performance and health of our systems/applications. Continuously improve infrastructure and applications to ensure 99.99% uptime while removing architectural complexity.
  • Adopt and implement best coding practices and documentation standards.
  • Improve the Application Observability product. You will build dashboards and provide guidance on how to monitor various technologies covered by OpenTelemetry components.
  • Develop and maintain non-functional requirements for observability metrics, such as SLAs and SLOs.
  • Help monitor and maintain production environments using experience with Loki, Grafana, Prometheus, and Alertmanager.
  • Hands-on experience creating Grafana operational dashboards and data visualizations.
  • Achieve material improvements in system performance based on insights from observability metrics.
  • Provide 24/7 operations support for production and other critical environments to ensure 99.99% availability of our systems.
  • Work on OpenTelemetry-based solutions. We have plans to ship Grafana OpenTelemetry distributions for Java, React, Angular, Node.js, etc.
  • Clear understanding of SRE best practices, performance management, capacity analysis, and creating fault-tolerant deployment patterns.
  • Write documentation. As an instrumentation expert, you will write documentation that makes it easy for Grafana Cloud users to instrument their applications with OpenTelemetry and get started with Grafana Cloud.
  • Teach others. Share knowledge of OpenTelemetry, semantic conventions, and various technologies and frameworks with both Grafana squads and customers.
  • Help customers on a priority basis when they reach out on support channels.
  • A passion for staying up to date with the latest trends and technologies in public cloud environments.
  • Exceptional analytical skills, able to apply knowledge and experience in decision-making to arrive at creative and commercial solutions
  • Excellent verbal and written communication skills

WHAT YOU’LL DO DAY TO DAY

  • Design and implement observability platform strategies.
  • Improve reliability, stability, and performance of production systems.
  • Implement automation of engineering and operations processes for the observability platform.
  • Optimize observability practices.
  • Provide 24/7 on-call support as needed. On-call support is rotated among team members.
  • Maintain and administer source control systems for the observability platform.
  • Help customers on a priority basis when they reach out on support channels.

MINIMUM REQUIREMENTS FOR THIS ROLE:

  • 3+ years of experience in Cloud Engineering or Site Reliability Engineering
  • 2+ years of experience with Observability Platform tooling such as Grafana, Loki, and Prometheus
  • 2+ years of experience with Docker, Kubernetes, and Helm
  • 2+ years of experience with a cloud platform (Google Cloud or AWS). GCP is preferred.


REQUIREMENT SUMMARY

Min: 2.0, Max: 3.0 year(s)

Information Technology/IT

IT Software - Other

Software Engineering

Graduate

Computer Science, Electrical, Electrical Engineering, Engineering

Proficient

1

Saint Paul, MN, USA