Lead Systems Engineer at DTCC Candidate Experience Site
Coppell, Texas, United States
Full Time


Start Date

Immediate

Expiry Date

12 Aug, 26

Salary

0.0

Posted On

14 May, 26

Experience

10 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

IBM MQ, IBM App Connect Enterprise, L3 Troubleshooting, Linux/Unix, Ansible, Chef, Terraform, Splunk, Grafana, Prometheus, OpenShift, Kubernetes, SRE, High-Availability Design, CI/CD, Problem Management

Industry

Financial Services

Description
Are you ready to make an impact at DTCC? Do you want to work on innovative projects, collaborate with a dynamic and supportive team, and receive investment in your professional development? At DTCC, we are at the forefront of innovation in the financial markets. We are committed to helping our employees grow and succeed. We believe that you have the skills and drive to make a real impact. We foster a thriving internal community and are committed to creating a workplace that looks like the world that we serve.

The Information Technology group delivers secure, reliable technology solutions that enable DTCC to be the trusted infrastructure of the global capital markets. The team delivers high-quality information through activities that include development of essential, building infrastructure capabilities to meet client needs and implementing data standards and governance.

Pay and Benefits:
* Competitive compensation, including base pay and annual incentive
* Comprehensive health and life insurance and well-being benefits, based on location
* Pension / Retirement benefits
* Paid Time Off and Personal/Family Care, and other leaves of absence when needed to support your physical, financial, and emotional well-being
* DTCC offers a flexible/hybrid model of 3 days onsite and 2 days remote (onsite Tuesdays, Wednesdays and a third day unique to each team or employee)

Job Description:
We are seeking an experienced L3 Messaging Platform Engineer to support and evolve our enterprise IBM MQ messaging ecosystem. This role is designed for a deeply technical engineer who goes beyond reactive support and actively drives the team toward a proactive, automation-first mindset and a reliability-focused operating model. The ideal candidate thrives in complex production environments, has strong L3 troubleshooting expertise, and brings curiosity and enthusiasm for modern messaging technologies, including containers, cloud platforms, and next-generation messaging architectures.
This role provides hands-on opportunities to influence platform strategy, tooling, and operational maturity while partnering closely with L2 teams, application owners, and infrastructure partners.

Primary Responsibilities
* Serve as the L3 technical SME for IBM MQ / Messaging Technologies, providing deep troubleshooting, design guidance, and resolution of the most complex messaging incidents across production and non-production environments.
* Partner closely with L2 support teams to uplift operational maturity by shifting knowledge left through improved runbooks, tooling, automation, and clear escalation patterns.
* Lead problem management and root-cause analysis efforts, ensuring recurring incidents are fully understood, permanently remediated, and prevented from reoccurring.
* Design and implement proactive monitoring, alerting, and health indicators that detect leading signals of failure and reduce customer-impacting incidents.
* Identify repetitive operational failure patterns and engineer self-healing automation to automatically detect, mitigate, or recover from known failure scenarios.
* Build and maintain automation for MQ lifecycle operations, including provisioning, configuration validation, certificates, health checks, and recovery workflows.
* Drive reduction of operational toil by continuously replacing manual intervention with policy-driven, automated, and resilient solutions.
* Actively contribute to incident postmortems, blameless retrospectives, and reliability reviews with a focus on systemic improvements and long-term fixes.
* Support and influence platform modernization initiatives, including adoption of containerized messaging, cloud and hybrid architectures, and improved CI/CD integration where applicable.
* Collaborate with engineering, infrastructure, security, and application teams to ensure secure, resilient, and standards-compliant messaging solutions.
* Mentor engineers across L2/L3 teams, promoting best practices in reliability engineering, automation, and proactive operations.
* Operate with a Site Reliability Engineering (SRE) mindset, focusing on improving platform reliability, availability, scalability, and resilience rather than reactive incident handling alone.

Qualifications:
* Minimum of 10 years of related experience
* Strong expertise with IBM MQ (including MQ IPT and NativeHA) and IBM App Connect Enterprise (ACE)
* Proven L3 troubleshooting experience, including performance analysis and failure recovery
* Solid experience in Linux/Unix environments
* Strong understanding of high-availability, fault-tolerant system design
* Experience with automation tools such as Ansible, Chef, and Terraform
* Experience with observability and monitoring (e.g., Splunk, Grafana, Prometheus, APM6)
* Exposure to containerized messaging platforms and modern deployment models (e.g., OpenShift/Kubernetes, cloud or hybrid environments)
* Ability to identify recurring operational issues and design long-term, sustainable fixes instead of short-term workarounds
* Proven ability to remain calm, structured, and decisive during major incidents, providing technical leadership and clear communication
* Strong Site Reliability Engineering (SRE) mindset, with a proven track record of improving system reliability, stability, and availability through engineering solutions rather than manual intervention
* Strong ownership mindset with the ability to drive outcomes without excessive oversight

The salary range is indicative for roles at the same level within DTCC across all US locations. Actual salary is determined based on the role, location, individual experience, skills, and other considerations.

We are an equal opportunity employer and value diversity at our company.
We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Serves as a dedicated technology resource for advancing DTCC’s business opportunities and providing industry thought leadership for leveraging new technology. The goal of this new department is to partner internally with IT, our business and regulatory divisions and externally with clients, regulators, and fintech vendors, to help build new platforms and business models to advance DTCC’s mission to support the financial markets.
Responsibilities
Serve as the L3 technical SME for IBM MQ and messaging technologies to resolve complex incidents and drive platform reliability. Lead automation efforts to reduce operational toil and implement proactive monitoring and self-healing solutions.