Manager, Site Reliability Engineering

at Plume

Remote, Oregon, USA

Start Date: Immediate
Expiry Date: 12 Sep, 2024
Salary: USD 200,000 Annual
Posted On: 15 Jun, 2024
Experience: 10 year(s) or above
Skills: Java, PHP, Debian, Teams, Working Experience, Python, Operating Systems, Perl
Telecommute: No
Sponsor Visa: No
Required Visa Status: Citizen, GC, US Citizen, Student Visa, H1B, CPT, OPT, H4 Spouse of H1B, GC Green Card
Employment Type: Full Time, Part Time, Permanent, Independent - 1099, Contract – W2, C2H Independent, C2H W2, Contract – Corp 2 Corp, Contract to Hire – Corp 2 Corp

Description:

LIFE AT PLUME

At Plume, we believe that technology isn’t about moving faster, it’s about making life’s moments better. Which is why we’ve built the world’s first, and only, open and hardware-independent service delivery platform for smart homes, small businesses, enterprises, and beyond. Our SaaS platform uses WiFi, advanced AI, and machine learning to create the future of connected spaces—and human experiences—at massive scale.
We now deliver services to over 50 million locations globally and have managed over 2.5 billion devices on our platform. We’re expanding rapidly, pioneering a new category, and we achieved our Series F funding in just four years. Our customers include many of the world’s largest Communications Service Providers (CSPs) who look to Plume to help them evolve their smart home offerings while gleaning insights from their own data.
With a bias for action and a love for being trailblazers, the team at Plume embodies a combination of relentless curiosity and imaginative innovation. We challenge ourselves to think in ways that other companies don’t, work to do what should be done (rather than what can), and if we can’t do it exceptionally well, we don’t do it. It’s how we’ve assembled a team of world-class builders, thinkers, and doers. And it’s how we’re reinventing what’s possible every day.
We’re looking for a seasoned technical manager, experienced in customer-facing environments, to captain our Site Reliability Engineering team. This team is focused on deployments, fixes, and sustainability. The right candidate has strong technical knowledge in key areas and keeps a focus on customer satisfaction.

DESIRED SKILL SET

  • 10+ years of experience with production troubleshooting
  • 3+ years of experience leading or managing teams
  • Bachelor’s degree in a related field or equivalent experience; advanced degree preferred
  • This is a leadership role, but you must have technical knowledge of and working experience with:
      • Kubernetes (operations)
      • Terraform (basic knowledge)
      • Programming or scripting in at least one language (e.g., Perl, Python, PHP, Go, Java)
      • Modern cloud infrastructure, preferably AWS
      • Modern Linux operating systems (Enterprise Linux or Debian based)
      • Setting up and using self-managed monitoring and observability tools (e.g., Nagios/Icinga, Grafana, Prometheus)

Responsibilities:

  • Supervise a team of Site Reliability Engineers who provide first-line support to customer clouds. Routine tasks include deployments, on-call, and application provisioning.
  • Attend and conduct customer meetings for project and roadmap specification.
  • Manage the growth and performance of SRE team members.
  • Be able to step in and execute or triage issues alongside the engineers. Past hands-on experience is beneficial. Some examples:
      • Provision and scale multi-datacenter Kubernetes infrastructure and applications (EKS)
      • Deploy software in multiple production environments
      • Own monitoring and alerting for production systems, including improvements and changes
      • Contribute improvements to the current automation
      • Contribute improvements to our on-call process and alerting
  • Play a key role in the recruitment and retention of top talent.


REQUIREMENT SUMMARY

Min: 10.0, Max: 15.0 year(s)

Information Technology/IT

IT Software - Other

Other

Graduate

Proficient

1

Remote, USA