Infrastructure & Reliability Engineer at Partly

Christchurch, Canterbury, New Zealand

Full Time

Start Date

Immediate

Expiry Date

06 Aug, 25

Salary

0.0

Posted On

06 May, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Code, Python, Communication Skills, Knowledge Sharing, Leadership, Collaboration, ISO, Infrastructure, Cloud, IT, Servers, Security Compliance, Teamwork, Ownership, Bash, Security, Teams, Critical Systems, Engineers, DevOps, Kafka, Containerization

Industry

Information Technology/IT

Description

Note: Partly is headquartered in Christchurch but has employees across NZ, AU, IN, PH, JP, UK and the EU. If you are not based in Christchurch, we will fly you to HQ for 2 weeks for onboarding, as well as 1 week per quarter for our “Season Openers” (we pay for your travel and accommodation). If you are relocating to Christchurch from NZ or from overseas, we can also assist with relocation costs.

YOUR SKILLS

Experience Level: Professional experience in infrastructure, DevOps, or site reliability engineering roles. We’re seeking a track record of ownership over critical systems and successful delivery of complex projects. You should be comfortable operating independently and influencing technology direction.
DevOps & SRE Expertise: Hands-on experience with modern DevOps/SRE practices and tooling – for example, continuous integration pipelines (GitLab CI or similar), containerization (Docker/Kubernetes), infrastructure-as-code (Terraform), and GitOps workflows (ArgoCD or equivalent). You have designed, built, and maintained scalable infrastructure and CI/CD systems in a cloud environment.
Networking & IT Administration: Strong understanding of networking fundamentals (TCP/IP, DNS, firewalls, VPNs).
Cloud & Systems Knowledge: Deep familiarity with at least one major cloud platform and Linux systems administration. You can tune servers, manage databases/storage, and wrangle Kubernetes clusters. You are confident scripting or coding (e.g. in Bash, Python, or Go) to automate tasks and build tooling.
Security Mindset: Practical experience implementing security best practices in an IT or cloud environment. You understand concepts like least privilege, secrets management, network segmentation, and OS hardening, and you’ve helped an organisation stay compliant with frameworks such as ISO 27001, SOC 2, or similar.
Ownership & Leadership: High degree of ownership and bias for action, with a proactive approach to solving problems. You take initiative and don’t wait to be told what to do. You have demonstrated leadership through mentoring junior engineers or leading small teams/projects, even if not formally a manager.
Collaboration & Communication: Excellent communication skills (written and verbal) and a collaborative attitude. You can work across teams and departments – from explaining technical issues to non-technical colleagues, to coordinating with engineers on deployments. You value teamwork and knowledge sharing.
Adaptability: Willingness to wear multiple hats and adapt to evolving needs. In a fast-growing startup environment, requirements can change – you’re excited by the chance to learn new skills, take on new challenges, and grow with the role.
Bonus Points: Experience in a high-growth startup environment, which means you’re used to the pace and ambiguity. Any prior experience maintaining security compliance and certifications in a company is a plus. If you have used specific tools we use (GCP, ArgoCD, GitLab CI, Kafka, etc.), that’s great – if not, you can learn quickly.
Please note: if you don’t have all the skills/experience listed above but believe you could be outstanding in this role, please still consider applying. Many folks, especially those from underrepresented or marginalised groups, often count themselves out. Please allow us to learn more about you and why you’re exceptional!

Responsibilities

THIS ROLE

We are looking for an exceptional Infrastructure & Reliability Engineer to take ownership of Partly’s cloud and on-prem infrastructure. Reporting to the SRE team lead, you will play a crucial role in maintaining and enhancing our systems across both on-premise and cloud environments. You’ll ensure our networks, platforms, and tools are scalable, secure, and reliable, enabling our engineers to focus on building impactful software. This is a senior role, which means you are expected to operate with a high level of autonomy, leadership, and strategic thinking in your domain. If you are excited by the prospect of designing and supporting the backbone that connects the world’s parts, this role is for you!

WHAT YOU WILL DO

Cloud Infrastructure & Automation: Ensure the stability, scalability, and security of our cloud infrastructure in Google Cloud Platform (GCP). Leverage Infrastructure-as-Code and automation (Terraform, GitOps with ArgoCD, etc.) to deploy and manage our Kubernetes clusters and other cloud resources in a repeatable, automated way. Continuously improve our deployment processes and tooling to support a fast-paced engineering team.
CI/CD & Developer Tooling: Take ownership of running development-critical systems and tools, such as GitLab and CI worker infrastructure, to maximise developer productivity. Maintain and improve developer environments, making it easy for engineers to build, test, and deploy code efficiently.
Security & Compliance: Lead the infrastructure side of our security and compliance processes. Implement best practices for network and cloud security, manage access controls, and own the preparation and execution of security audits (e.g. ISO 27001). You’ll ensure we meet or exceed requirements for security compliance and maintain documentation/policies to pass audits.
On-Premise Infrastructure: Manage our on-premise infrastructure – including servers and our development Kubernetes clusters – to ensure a great developer experience and seamless operations.
Cost Optimisation: Monitor and optimise costs across our cloud and on-prem infrastructure, ensuring we get maximum value from our investments. Make recommendations for resource allocation or architecture changes to improve cost-efficiency without sacrificing reliability or performance.
Cross-Functional Collaboration: Work closely with developers, data engineers, and leadership to plan infrastructure needs and improvements. Provide tooling, guidance and training to the engineering team on DevOps best practices, and collaborate during software delivery to ensure smooth integrations from code to production. When you see a problem or an opportunity to improve, you drive the solution.
Want to learn more about the problems we’re solving and the culture we’re building at Partly? Hear directly from our team here: https://shorturl.at/iAFUX