Senior Principal Cloud Operations Engineer, Infrastructure (Multiple Openings) at Pegasystems
Dulles, VA 20166, USA
Full Time


Start Date

Immediate

Expiry Date

07 Sep, 25

Salary

0.0

Posted On

08 Jun, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Good communication skills

Industry

Information Technology/IT

Description

WHO YOU ARE:

  • Requires a Bachelor’s degree in Computer Science, Computer Engineering, or a related field of study, and 7 years of experience in any job title, occupation, or position involving an enterprise cloud environment supporting SaaS applications, focusing on operational delivery excellence and customer service.
  • Experience specified must include 7 years of experience with each of the following:
  • Supporting, configuring, troubleshooting, and tuning mission critical production Java applications and Apache Tomcat application servers in a global enterprise;
  • Hands-on operational experience with Amazon Web Services (AWS);
  • Linux systems administration;
  • Cloud capacity management for optimizing the performance and cost of cloud resources, including conducting performance reviews and recommending the appropriate sizing of cloud resources to achieve an optimal performance-to-cost ratio;
  • Cloud infrastructure, platform, and application operational admin tasks, including monitoring & observability, high availability, connection pooling, and application load balancing;
  • Systems engineering including Linux-based system performance, memory management, I/O tuning, configuration, clusters and troubleshooting;
  • Analyzing application performance using tools such as thread and heap dumps, with an understanding of JVM memory structure and garbage collection concepts;
  • Administration of web servers running Tomcat, Apache or Nginx; and
  • Working with cross-functional global and remote teams.
  • Experience specified must also include 5 years of experience with scripting languages such as Bash/Shell, Python or similar; and
  • 2 years of experience with microservices architecture with Kubernetes.
  • Telecommuting permitted up to 3 days per week.
  • Must work shifts on rotation including weekend coverage.
Responsibilities
  • Ensure the reliability, availability, and security of Pegasystems cloud services.
  • Act as a mentor in the Pega Cloud area for other parts of the Pega organization.
  • Recommend optimal usage of cloud resources by performing capacity planning and reviews, including the cost of running the infrastructure.
  • Handle alerts, incidents, service requests, and changes within SLA.
  • Own customer escalations.
  • Provision and upgrade infrastructure components and solutions.
  • Troubleshoot and resolve customer environment issues, perform root cause analysis, and conduct blameless post-mortems.
  • Influence product teams on defects, features, and enhancement requests to help build scalable, reliable, observable, available, and highly performant services.
  • Create, review, and update operational runbooks and Standard Operating Procedures.
  • Participate in pre-release product enhancement testing with Engineering.
  • Identify needs and build tools to automate repetitive operational tasks and reduce toil.
  • Manage multiple projects simultaneously and adapt to changing business goals.