Principal Platform Engineer at AVANT LLC
Chicago, IL 60654, USA
Full Time


Start Date

Immediate

Expiry Date

15 Sep, 25

Salary

190000.0

Posted On

17 Jun, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Good communication skills

Industry

Information Technology/IT

Description

The Infrastructure team is responsible for the platform, tools, and infrastructure supporting Avant’s customer-facing products. As a team member, you will develop tools to create and maintain cloud infrastructure, automate the management of complex service-oriented applications, and build frameworks to ensure the Avant platform’s stability and scalability.

Responsibilities
  • Design and implement production-grade Kubernetes platforms on AWS EKS using Terraform
  • Develop automated infrastructure provisioning workflows using Terraform, Helm, and ArgoCD for declarative configuration management
  • Architect and maintain cloud-native platform services including service meshes, API gateways, and identity management systems
  • Build developer self-service platforms with custom operators, platform APIs, and internal developer portals using internally developed tools
  • Implement comprehensive observability solutions with Datadog and Sentry, enabling automated alerting, distributed tracing, and performance optimization
  • Design and maintain secure multi-tenant environments with robust network policies, pod security standards, and least-privilege access controls
  • Create and maintain infrastructure scaling solutions using Karpenter, Cluster Autoscaler, and KEDA for optimized resource utilization
  • Implement robust disaster recovery and business continuity strategies including multi-region deployments and automated failover systems
  • Establish platform security practices including supply chain security, image scanning, and runtime threat detection
  • Lead the migration of traditional workloads to containerized environments with minimal disruption and maximum automation
  • Continuously improve platform reliability through chaos engineering, failure testing, and SLO/SLI measurements
  • Research and evaluate emerging cloud-native technologies, providing technical leadership on adoption strategies