Senior Principal Cloud Operations Engineer, Infrastructure (Multiple Openin
at Pegasystems
Dulles, VA 20166, USA - Full Time
Start Date
Immediate
Expiry Date
07 Sep, 25
Salary
0.0
Posted On
08 Jun, 25
Experience
2 year(s) or above
Remote Job
Yes
Telecommute
Yes
Sponsor Visa
No
Skills
Good communication skills
Industry
Information Technology/IT
Description
WHO YOU ARE:
Requires a Bachelor’s degree in Computer Science, Computer Engineering, or a related field of study, and 7 years of experience in any job title/occupation/position involving an enterprise cloud environment supporting SaaS applications, focusing on operational delivery excellence and customer service.
Experience specified must include 7 years of experience with each of the following:
Supporting, configuring, troubleshooting, and tuning mission critical production Java applications and Apache Tomcat application servers in a global enterprise;
Hands-on operational experience with Amazon Web Services (AWS);
Linux systems administration;
Cloud capacity management for optimizing the performance and cost of cloud resources, including conducting performance reviews and recommending appropriate sizing of cloud resources for an optimal performance-to-cost ratio;
Cloud infrastructure, platform, and application operational admin tasks, including monitoring & observability, high availability, connection pooling, and application load balancing;
Systems engineering including Linux-based system performance, memory management, I/O tuning, configuration, clusters and troubleshooting;
Analyzing application performance using thread and heap dumps, with knowledge of JVM memory structure and garbage collection concepts;
Administration of web servers running Tomcat, Apache or Nginx; and
Working with cross-functional global and remote teams.
Experience specified must also include 5 years of experience with scripting languages such as Bash/Shell, Python or similar; and
2 years of experience with microservices architecture with Kubernetes.
Telecommuting permitted up to 3 days per week.
Must work shifts on rotation including weekend coverage.
Responsibilities
Ensure the reliability, availability, and security of Pegasystems cloud services.
Act as a mentor in the Pega Cloud area for other parts of the Pega organization.
Recommend optimal usage of cloud resources by performing capacity planning and reviews that account for the cost of running infrastructure.
Handle alerts, incidents, service requests, and changes within SLA.
Own customer escalations.
Provision and upgrade infrastructure components and solutions.
Troubleshoot and resolve customer environment issues, perform root cause analysis, and conduct blameless post-mortems.
Influence product teams on defects, features, and enhancement requests to help build scalable, reliable, observable, available, and highly performant services.
Create, review, and update operational runbooks and Standard Operating Procedures.
Participate in pre-release product enhancement testing with Engineering.
Identify needs and build tools to automate repetitive operational tasks and reduce toil.
Manage multiple projects simultaneously and adapt to changing business goals.