Senior Linux HPC Systems Engineer at Cognizant - Thailand
Cogswell, North Dakota, United States
Full Time


Start Date

Immediate

Expiry Date

06 Feb, 26

Salary

115000.0

Posted On

08 Nov, 25

Experience

10 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

RedHat Enterprise Linux (RHEL 8 & 9) Administration, High-Performance Computing (HPC) Techniques, Clustering, Troubleshooting And Performance Optimization, Scientific Application Support, Stakeholder Relationship Management, ServiceNow, Networking Concepts, Performance Monitoring Tools, GPU Technologies

Industry

IT Services and IT Consulting

Description
Role: Senior Linux HPC Systems Administrator/Engineer
Location: Boston, MA

Overview
We are seeking an experienced Senior Linux HPC Systems Administrator/Engineer with a minimum of 10 years of enterprise IT experience to manage and support our critical Linux-based infrastructure. The role covers our advanced computing environments, which are pivotal to scientific research and high-performance computing (HPC) initiatives. It requires hands-on expertise with high-end workstation hardware and scientific applications, as well as a strong background in HPC techniques, including clustering and workload management with tools like Slurm. The ideal candidate will be proficient in RedHat Enterprise Linux (RHEL 8 & 9) and have experience with scientific and high-performance computing environments. They will also have excellent stakeholder relationship skills and the ability to communicate complex technical concepts effectively to various stakeholders, ensuring our scientists receive top-tier in-person support onsite.

Key Responsibilities
· Enterprise Linux Administration:
o Administer, configure, and maintain RHEL environments (specifically RHEL 8 & 9), ensuring stability, performance, and security.
o Provide hands-on support with high-end workstation hardware for scientists, promptly addressing hardware and software issues.
· Scientific and HPC Support:
o Offer technical support to scientific users, bridging the gap between research demands and IT infrastructure.
o Leverage any scientific computing experience to optimize system performance and manage specialized applications.
o Assist with management of high-performance compute resources, including experience with Slurm, clustering, and related HPC technologies.
· Collaboration and Stakeholder Management:
o Work closely with other technical teams and stakeholders to align IT services with organizational needs.
o Build and maintain strong stakeholder relationships, communicating complex technical concepts.
o Provide in-person support onsite to ensure effective resolution of issues and a high level of customer satisfaction.
· Service Management and Process Improvement:
o Utilize ServiceNow for tracking incidents, managing change requests, and ensuring timely resolution of service tickets.
o Implement and follow IT best practices for incident management, performance monitoring, and network troubleshooting.
· Additional Technical Duties:
o Manage SSL certificates and configure web servers as needed.
o Monitor and troubleshoot system performance issues, including understanding the impact of GPUs, networking, and other hardware components.
o Handle vendor relationships effectively, coordinating with external partners to resolve issues and optimize service delivery.
o Maintain familiarity with macOS systems to provide assistance when necessary.

Required Qualifications
· Technical Expertise:
o Minimum 10 years of enterprise IT experience with extensive hands-on expertise in RedHat Enterprise Linux (RHEL), specifically RHEL 8 & 9.
o Proven experience with high-end workstation hardware setups and scientific application support.
o Demonstrated knowledge of scientific computing and experience in high-performance compute environments, including experience with Slurm and clustering, is highly desirable.
o Strong troubleshooting skills for both hardware and software issues.
· Interpersonal Skills:
o Excellent communication skills with a proven ability to engage and build relationships with stakeholders at various levels.
o Experience working collaboratively with other technical teams to resolve complex problems and drive operational improvements.
o Strong stakeholder relationship building skills and the ability to manage vendor relationships effectively.
· Additional Desirable Skills:
o Working knowledge of ServiceNow and its application in incident and service management.
o Familiarity with networking concepts, performance monitoring tools, and GPU technologies.
o Any experience with scientific applications will be a significant advantage.
o Exposure to macOS environments is useful but not essential.
· Onsite Requirement:
o Must be able to work onsite to provide in-person technical support to scientists and ensure optimal system performance.

Mandatory Skills (Top 5 Keywords or skills)
Skill: Years of Experience, Proficiency (Basic Knowledge / Medium / Expert)
· RedHat Enterprise Linux (RHEL 8 & 9) Administration: 10 years, Expert
· High-Performance Computing (HPC) Techniques: 5 years, Expert
· Clustering: 3 years, Medium
· Troubleshooting and Performance Optimization: 10 years, Expert

Salary and Other Compensation:
Applications will be accepted until 11/14/25. The annual salary for this position is between $99,000 and $115,000, depending on experience and other qualifications of the successful candidate. This position is also eligible for Cognizant’s discretionary annual incentive program, based on performance and subject to the terms of Cognizant’s applicable plans.

Benefits:
Cognizant offers the following benefits for this position, subject to applicable eligibility requirements:
· Medical/Dental/Vision/Life Insurance
· Paid holidays plus Paid Time Off
· 401(k) plan and contributions
· Long-term/Short-term Disability
· Paid Parental Leave
· Employee Stock Purchase Plan

Disclaimer: The salary, other compensation, and benefits information is accurate as of the date of this posting. Cognizant reserves the right to modify this information at any time, subject to applicable law.

How To Apply:

In case you would like to apply to this job directly from the source, please click here

Responsibilities
The Senior Linux HPC Systems Engineer will manage and support critical Linux-based infrastructure, ensuring stability, performance, and security. They will provide hands-on support for high-end workstation hardware and assist scientific users with technical support and HPC resource management.