DevOps Engineer at NVIDIA
Yokneam Ilit, Haifa District, Israel
Full Time


Start Date

Immediate

Expiry Date

01 Aug, 26

Salary

0.0

Posted On

03 May, 26

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Kubernetes, Azure AKS, GitLab, CI/CD, Linux, Bash, ArgoCD, Infrastructure-as-code, Datadog, Docker, GitOps, Azure Key Vault, Networking, GPU-enabled workloads, Sealed Secrets

Industry

Computer Hardware Manufacturing

Description
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

NVIDIA's Manufacturing Information Systems team builds the data and automation backbone that keeps global manufacturing operations running, from CI/CD and container platforms to the on-prem environments that production workflows depend on. Our DevOps team leads that infrastructure end-to-end: Azure cloud, Kubernetes at scale, delivery coordinated via GitLab, and multiple business-critical on-prem sites. With major initiatives on the roadmap (GPU-enabled Kubernetes, per-site cluster rollouts, AKS upgrades, and a broader Vault rollout), the work ahead offers a rare mix of greenfield infrastructure and hands-on stewardship of systems NVIDIA relies on daily.

The team is small, senior, and deeply accountable, with a strong mentorship culture under an experienced tech lead. We are adding a third DevOps engineer to increase delivery capacity, reduce single-person risk on critical systems, and grow our database infrastructure capability from within. If building resilient cloud and on-prem systems at scale sounds like the right challenge, we'd like to hear from you.

What you'll be doing:
Design, build, and operate Kubernetes infrastructure across Azure AKS and on-prem clusters, including ingress, autoscaling with KEDA, TLS management, and GPU-enabled workloads
Extend and harden CI/CD pipelines in GitLab, manage runners across multiple environments, and evolve GitOps-based deployments through ArgoCD
Maintain and improve the critical on-prem infrastructure (Linux servers, NGINX, container platforms, and networking) that several production workflows depend on
Partner with development, data, and architecture teams to streamline delivery, improve observability across Datadog, and shorten time-to-recovery during incidents
Contribute to flagship initiatives on the roadmap: per-site Kubernetes cluster rollouts, AKS upgrades and node pool reorganization, GPU cluster enablement, and secret management with Azure Key Vault and Sealed Secrets
Automate provisioning and configuration across Azure resources and on-prem systems using infrastructure-as-code and scripting
Troubleshoot across the full stack, from networking and certificates to container runtime and pipeline internals, turning incidents into durable improvements

What we need to see:
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience)
3+ years in a DevOps, SRE, or infrastructure engineering role
Hands-on proficiency with Kubernetes and container tooling (Docker, for example) in production environments
Track record of building and maintaining CI/CD pipelines, ideally in GitLab, including runner management and pipeline-as-code
Fluency using AI-assisted development tools (such as Cursor, Codex or Claude) as a regular part of daily engineering work
Solid Linux administration skills and fluency in Bash
Practical background with a major cloud platform, Azure preferred (or AWS or GCP)
Working knowledge of GitOps workflows and tooling such as ArgoCD or Flux
Collaboration and ownership mentality, with the accountability needed to operate business-critical systems

Ways to stand out from the crowd:
Hands-on experience with on-prem Kubernetes at scale, including cluster bootstrap, MetalLB, and ingress configuration
Familiarity with secret management via HashiCorp Vault, Azure Key Vault, or Sealed Secrets
Operational background with SQL (PostgreSQL, MySQL) and/or MongoDB, including backups, replication, or performance tuning
Contributions to observability improvements with Datadog

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer to you and your family: www.nvidiabenefits.com/

NVIDIA pioneered accelerated computing. Today, our AI infrastructure powers global intelligence, transforming every industry. Learn more about NVIDIA.

Responsibilities
Design, build, and operate Kubernetes infrastructure across Azure AKS and on-prem clusters while maintaining critical Linux-based production systems. Partner with development and data teams to streamline delivery, improve observability, and automate infrastructure provisioning.