Site Reliability Engineer at Fidelity TalentSource
Westlake, TX 76262, USA
Full Time


Start Date

Immediate

Expiry Date

28 Nov, 25

Salary

0.0

Posted On

28 Aug, 25

Experience

8 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Distributed Systems, Resiliency, Scripting Languages, Groups, Splunk, Instrumentation, Computer Science, Automation, Logging, Software Development, Python, Shell Scripting, Infrastructure, Cloud Computing, AWS, Cloud Development, Communication Skills, Kubernetes, Java

Industry

Information Technology/IT

Description

Fidelity TalentSource is your destination for discovering your next temporary role at Fidelity Investments. We are currently sourcing for a Site Reliability Engineer to work in Fidelity’s Enterprise Infrastructure Group in Westlake, TX or Merrimack, NH!

THE SKILLS YOU BRING

  • Ability to automate with various scripting languages (Python, Shell scripting, etc…)
  • Experience managing systems using infrastructure as code tools (IAM, ARM, Terraform, Chef, …)
  • Solid understanding of Cloud Computing and DevOps concepts including CI/CD pipelines
  • Hands-on Kubernetes skills and knowledge.
  • Hands-on experience with one or more observability tools (Prometheus, Grafana, ELK/OpenSearch, OpenTelemetry, Datadog, etc…)
  • Experience with instrumentation, plus systems skills in building and operating monitoring, logging, and alerting services for distributed systems at scale
  • Proven experience in maintaining the scalability and resiliency of complex environments.
  • Proven experience in implementing advanced observability practices and techniques at scale.
  • Demonstrated ability to apply modern monitoring tools (Datadog, Prometheus, Splunk, …)
  • Ability to triage, complete root cause analysis, and be decisive under pressure
  • Experience managing and interpreting large datasets using query languages and visualization tools
  • Proficient communication skills with an ability to reach both technical and non-technical audiences
  • Ability to learn new software, methods, and practices and bring them to our developers
  • Ability to work with a variety of individuals and groups, both in person and virtually, in a constructive and collaborative manner and build and maintain effective relationships

Responsibilities

Our Site Reliability Engineering group within Enterprise Infrastructure combines Operations Excellence with the Development Experience to deliver resilient services at high scale and high availability by using automation and Infrastructure as Code. We build reliability into our ecosystem by applying standard methodologies in Resiliency Engineering, Automation, Observability & Chaos Testing.
The team comes from diverse technical backgrounds, and the responsibilities provide the opportunity for a variety of challenges. Ideal candidates will have a background in either software engineering or systems engineering with a desire to learn the other, or previous experience as an SRE. We are looking for a systems-thinking Site Reliability Engineer who has helped teams scale through production insights, operational automation, developer guidance, real-time metrics, automation, automation, automation…!
