Site Reliability Engineer at NEP Australia
Southbank VIC 3006, Australia
Full Time


Start Date

Immediate

Expiry Date

19 Nov, 25

Salary

0.0

Posted On

19 Aug, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Go, Software Development, Kubernetes, Augmented Reality, Computer Engineering, Professional Development, Software, Ansible, Cloud, Python, Automation, Digital Assets, App, Systems Engineering

Industry

Information Technology/IT

Description

OUR COMPANY

NEP is Australia’s leading provider of outsourced television production services.
We are always looking for great people to join our team: people with a passion for people and teamwork who help us deliver exceptional results for our clients.
NEP Australia is currently looking for a Site Reliability Engineer to join our team at the Andrew’s HUB in Southbank!

REQUIRED SKILLS AND ATTRIBUTES:

  • Bachelor's or Master's degree in Computer Engineering (or equivalent experience)
  • 2+ years in Software or Systems Engineering
  • Automation for scaling using tools like Ansible, Terraform, Helm, and ArgoCD.
  • Software development in at least one language such as Go or Python
  • Experience in building and maintaining container platforms, such as Kubernetes
  • Expertise in observability platforms such as Grafana and Prometheus
  • Experienced in using and tuning cloud native technology
  • Solid understanding of basic Linux and cloud networking (e.g., routing, firewalls, DNS, VPCs, subnets, load balancers).

NEP believes that, first and foremost, the efforts of our people are what contribute to our successes. We offer a range of benefits that assist our team in their professional development and wellbeing, including:

  • Salary continuance insurance
  • NEP Days – additional 5 days of leave per year (conditions apply)
  • NEP Travel benefits & discounts including Qantas Club Membership
  • Discounts through Employment Hero Work app
  • Employee Assistance Programme

This is a full-time role and a unique opportunity for the right person. If you want to be part of a global company, apply today!
You must have the right to live and work in Australia to apply for this job.
Only shortlisted candidates will be contacted.

Responsibilities
  • Design, implement, and maintain developer-friendly tools to improve productivity, code quality, and deployment efficiency for Kubernetes-based workloads.
  • Identify bottlenecks in integration and deployment pipelines and implement enhancements to support faster, more reliable deployments to on-premise and cloud Kubernetes clusters.
  • Collaborate with development teams to enable self-service tooling for managing deployments, logs, and infrastructure resources in Kubernetes environments.
  • Continuously improve build, test, and deployment automation for Kubernetes infrastructure across on-premise and cloud environments (AWS/GCP).
  • Provide better visibility into Kubernetes environments through improved observability tools, dashboards, and metrics.
  • Manage and improve Kubernetes orchestration across on-premise infrastructure and AWS/GCP clusters to ensure reliability, scalability, and consistency.
  • Enhance observability by implementing robust monitoring, logging, and alerting solutions tailored to Kubernetes workloads using tools like Grafana, Loki or cloud-native tools like CloudWatch (AWS) and Stackdriver (GCP).
  • Collaborate with Engineering Leadership to implement reliability engineering practices such as load testing, chaos testing, and recovery mechanisms for Kubernetes services.