URGENT: Senior Mainframe Performance & Capacity Management Engi at Prav

Remote, British Columbia, Canada

Full Time

Start Date

Immediate

Expiry Date

30 Nov, 25

Salary

0.0

Posted On

31 Aug, 25

Experience

1 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Job Scheduling, REXX, Batch Processing, Data Visualization, Reporting, Adabas, WLM, Python, VLOOKUP, Communication Skills, PowerPivot, Capacity Management, PowerPoint, Optimization Techniques

Industry

Information Technology/IT

Description

EXPERIENCE LEVEL

Level 4 (Advanced): 7–15 years

WHAT SKILLS YOU BRING

10+ years of mainframe systems experience in large, multi-processor, multi-LPAR, Parallel Sysplex z/OS environments
Proven track record in real-time performance monitoring, capacity management, and tuning via WLM and batch initiator adjustments
Deep knowledge of SMF/RMF data, WLM service definitions, PR/SM, and z/OS workload behavior
Proficiency in REXX or Python scripting, JCL, and DB2 utilities
Strong understanding of batch processing, job scheduling, and resource optimization techniques
Advanced Excel skills (PivotTables, VLOOKUP, PowerPivot) and PowerPoint for data visualization and reporting
Excellent analytical, problem-solving, and strategic thinking abilities with attention to detail
Outstanding written and verbal communication skills; comfortable presenting to technical and non-technical audiences

Responsibilities

WHY THIS ROLE MATTERS

You will safeguard the performance and scalability of mission-critical mainframe workloads, preventing incidents before they happen and ensuring seamless operations across Morgan Stanley’s enterprise infrastructure. Your expertise in real-time monitoring, anomaly detection, and capacity forecasting will directly impact uptime, cost efficiency, and service quality for high-value business processes.

WHAT YOU WILL DO

Monitor and analyze real-time z/OS health across CPU, memory, DASD, and WLM workloads using RMF, SmartIS, IzPCA, MICS, and internal tools
Detect, troubleshoot, and resolve performance anomalies and workload degradation in production, partnering with incident response teams
Develop and implement tuning strategies: adjust service definitions, dispatching priorities, and workload placements to optimize throughput
Forecast resource demand and model capacity requirements to support long-term infrastructure sizing, cost modeling, and vendor reporting (SCRT)
Collect, visualize, and present performance KPIs and capacity metrics, creating dashboards and reports for senior leadership and technical stakeholders
Participate in on-call rotations, responding promptly to performance and observability alerts
Lead migration of performance and capacity tooling into Git-based change management and DevOps deployment pipelines