Senior Site Reliability Engineer - Auth0 at Okta
Sydney, New South Wales, Australia - Full Time


Start Date

Immediate

Expiry Date

30 Jul, 25

Salary

0.0

Posted On

01 May, 25

Experience

1 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Production Systems, Technical Writing, Drive, Ownership, Authentication, Scripting, App, Microservices, Databases, Azure, SSL, Kubernetes, System Architecture, Communication Skills, Reliability, AWS, Scalability, Web Technologies, Automation, Docker, Perspectives, Routing

Industry

Information Technology/IT

Description

GET TO KNOW OKTA

Okta is The World’s Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our Workforce and Customer Identity Clouds enable secure yet flexible access, authentication, and automation that transform how people move through the digital world, putting Identity at the heart of business security and growth.
At Okta, we celebrate a variety of perspectives and experiences. We are not looking for someone who checks every single box - we’re looking for lifelong learners and people who can make us better with their unique experiences.
Join our team! We’re building a world where Identity belongs to you.
As a Senior Site Reliability Engineer, you will champion all things reliability at Okta for Auth0. You will work closely with the Product Engineering, Quality Engineering, Platform Engineering and Architecture teams. Your primary focus will be keeping production systems operational at all times while continually setting and achieving long-term performance, reliability and scalability goals on a platform with an exponential growth plan for the coming years.
Okta is increasingly dedicated to exceeding customer availability expectations. You will play a key role as we evolve our system architecture to meet the demands of enormous growth and support the hundreds of millions of users who rely on us for uninterrupted access to business-critical enterprise and consumer applications.

SKILLS

  • Exceptional communication skills, including technical writing in the English language
  • Systematic problem-solving approach, coupled with a strong sense of ownership and drive
  • Understanding of microservices, cloud infrastructure (AWS, Azure), databases (SQL, NoSQL, key/value), containers (Docker, Kubernetes), web technologies (WebSockets, HTTP) and networking (SSL, routing, VPN)
  • Live and breathe SLIs, SLOs, error budgets and SLAs
  • Strong belief in automating everything and reducing toil for yourself and teammates
  • Enjoyment of teamwork, combined with the ability to work effectively in a remote environment where tasks may be self-driven

EXPERIENCE

  • 2+ years as a Site Reliability Engineer or in a Cloud Operations/DevOps role
  • 1+ years using Go, shell scripting and Terraform
  • 2+ years as a software developer in a SaaS environment
  • 3+ years in a production environment supporting large-scale, mission-critical applications

RESPONSIBILITIES

  • Work with other teams to run, own and improve incident response processes
  • Participate in regular on-call rotations to ensure 24/7 coverage of all critical systems
  • Use existing monitoring tools to identify problems, then resolve them or escalate them to service teams
  • Implement changes to enable or improve infrastructure resilience, monitoring, and alerting