Sr. Infrastructure & Security Engineer at HerculesAI
Campbell, California, United States
Full Time


Start Date

Immediate

Expiry Date

30 Jul, 26

Salary

$230,000

Posted On

01 May, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Terraform, Pulumi, Kubernetes, AWS, Azure, GCP, Python, Bash, PowerShell, Cloudflare, Zero-trust architecture, IAM, RBAC, CI/CD, GPU orchestration, Security scanning

Industry

Technology; Information and Internet

Description
What you'll do

- Provision and optimize GPU compute across AWS, Azure, GCP, and specialized providers (CoreWeave, Lambda Labs), including Kubernetes GPU orchestration and hardware evaluation (NVIDIA H100/B200, AMD MI300X, Intel Gaudi)
- Design and maintain IaC foundations (Terraform, Pulumi, Helm) for agentic AI systems, including agent orchestration platforms, RAG stacks, vector databases, and model serving endpoints
- Implement policy-as-code guardrails (OPA, Sentinel, Kyverno) for autonomous agent workloads
- Design and enforce zero-trust architectures with network segmentation, IAM/RBAC least-privilege, and secrets management (Vault, AWS Secrets Manager)
- Configure and manage Cloudflare (or equivalent) for DDoS protection, WAF, bot management, SSL/TLS termination, and Zero Trust access
- Manage DNS security (DNSSEC, DMARC, SPF, DKIM), certificate lifecycle, and API security controls (mTLS, token management)
- Lead vulnerability management, penetration testing coordination, and CIS benchmarking
- Partner with customer success teams to assess, secure, and threat-model customer deployment environments
- Build and maintain CI/CD pipelines (GitHub Actions, GitLab CI) with integrated security scanning (SAST, DAST, SCA, container scanning)
- Deploy and manage Kubernetes clusters across cloud and on-prem with security-hardened, GPU-enabled configurations
- Implement observability (Prometheus, Grafana, Splunk, Datadog) and SIEM integrations
- Lead incident response and drive compliance (SOC 2, ISO 27001, HIPAA, FedRAMP) through audit automation

Qualifications

- Proven expertise with Terraform/Pulumi, IaC, policy-as-code, and scripting (Python, Bash, PowerShell)
- Hands-on GPU compute provisioning across major cloud and specialized providers
- Experience with Cloudflare or equivalent CDN/WAF/DDoS platforms for perimeter security and Zero Trust
- Strong background in AWS, Azure, GCP, and on-prem infrastructure with secure architecture focus
- Proficiency in Kubernetes and Docker, including container security, GPU scheduling, and runtime protection
- Deep understanding of network security, zero-trust principles, IAM/RBAC, and secrets management
- CI/CD experience with integrated security scanning
- Ability to conduct security assessments, threat modeling, and work directly with customers

Pay Range

$175,000 - $230,000
Responsibilities
The role involves provisioning and optimizing GPU compute across multiple cloud providers while designing secure infrastructure-as-code foundations. You will also lead incident response, manage vulnerability assessments, and implement zero-trust security architectures for autonomous agent workloads.