Site Reliability Engineer

at Planet DDS

Glasgow, Scotland, United Kingdom

Start Date: Immediate
Expiry Date: 17 Jul, 2024
Salary: GBP 82918 Annual
Posted On: 17 Apr, 2024
Experience: 3 year(s) or above
Skills: Cloud Security, Cloud, Microsoft Teams, RCA, Security, Email, Ansible, PowerShell, B2B, Synthetics, Service Availability, Infrastructure Management, Infrastructure, Root Cause Analysis, Communication Skills
Telecommute: No
Sponsor Visa: No

Description:

ABOUT US:

Planet DDS is the leading provider of cloud-enabled dental software solutions, serving over 10,000 practices in North America with over 60,000 users. The company delivers a complete platform of solutions for dental practices, including Denticon Practice Management, Apteryx XVWeb Digital Imaging, and Legwork Patient Relationship Management. Planet DDS is committed to creating value for its dental practice clients by solving the most urgent challenges facing today’s dental practices in North America.

QUALIFICATIONS

  • 3+ years of experience operating and troubleshooting Azure App Services, Azure Functions, Azure Logic Apps, Azure SQL, Azure Storage, Application Insights, Azure Redis, VNets and Azure App Gateway.
  • 3+ years of experience with Reliability concepts to ensure high performance and high service availability, including the ability to define, implement, and improve business performance SLOs.
  • 3+ years of experience with Observability across multiple domains (APM, Infrastructure, Synthetics, Logs, etc.) within cloud and on-premise environments using Datadog, Azure Monitor and Application Insights. New Relic and Grafana are nice to have.
  • 3+ years of experience with Production operations including 24x7 on-call support, escalation/paging with OpsGenie, incident management, RCA (Root Cause Analysis) and retrospective analysis.
  • 3+ years in hands-on technical roles (such as site reliability engineer, software engineer, DevOps engineer, infrastructure engineer).
  • Experience with infrastructure management across multiple cloud and on-premise environments using tools such as Terraform, Bicep, PowerShell, and Ansible.
  • Knowledge of fundamental cloud security (e.g., identity and access management, firewalls). Security is part of everything we do.
  • Strong collaboration and communication skills in a hybrid environment using Microsoft Teams, email and calendar.
  • Bachelor’s Degree in a relevant major or equivalent years of experience.

Any of the following would be a plus:

  • Dental industry knowledge
  • Azure certifications
  • Experience working in B2B SaaS companies
  • Experience with cloud containers, specifically Kubernetes

Responsibilities:

Develop:

  • Architecture, strategy and implementations to enable or enhance the Observability and Reliability of applications and services running on IaaS and PaaS in Microsoft Azure. AWS and GCP are nice to have.
  • Service Level Objectives and indicators focused on improving business workflow performance and availability.
  • Technical and business dashboards, metrics, and actionable alerting.
  • Processes and automation for increasing uptime and availability, reducing toil and improving all phases of incident and problem management.

24x7 Support:

  • Perform deep dives into systemic and latent reliability issues, incident management, and problem management.
  • Participate in all aspects of incident management including awareness, communication, remediation, retrospective / root cause analysis.
  • Identify and implement process improvements that reduce MTTA (Mean Time to Acknowledge) and MTTR (Mean Time to Resolve).
  • Support operations & engineering teams on Azure. AWS and GCP are nice to have.
  • Support applications written in .NET, .NET Core, MVC and JavaScript.
  • Train and mentor peers and less experienced engineers.
  • Support production environments with on-call rotations.

Advocacy:

  • Train and mentor engineering teams on modern observability practices and techniques.
  • Define and socialize SRE culture, best practices, architectural and security standards.
  • Assess and raise risks across the organization.

Partnership with:

  • Internal engineering, architecture and operations teams to ensure alignment.
  • External teams to support their work and ensure compliance with our standards.

Optimize & manage:

  • Multi-product observability platforms supporting cloud and on-premise infrastructure, services and applications, including observability cost optimization.
  • Measuring and monitoring availability, latency, and overall system health across multiple product lines.
  • Other duties as assigned.


REQUIREMENT SUMMARY

Min: 3.0, Max: 8.0 year(s)

Information Technology/IT

IT Software - Application Programming / Maintenance

Software Engineering

Graduate

A relevant major or equivalent years of experience

Glasgow, United Kingdom