Lead Specialist Engineer - HPC & Cloud at UK Health Security Agency
London, England, United Kingdom
Full Time


Start Date

Immediate

Expiry Date

16 Jul, 25

Salary

54416.0

Posted On

16 Apr, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Ubuntu, Lustre, Ceph, SMB, RHEL, Performance Tuning, NFS, SUSE, Apache, Debian, CentOS

Industry

Other Industry

Description

JOB SUMMARY

We pride ourselves on being an employer of choice where Everyone Matters, promoting equality of opportunity and actively encouraging applications from everyone, including groups currently underrepresented in our workforce.
UKHSA's ethos is to be an inclusive organisation for all our staff and stakeholders, and to create, nurture and sustain an inclusive culture where differences drive innovative solutions to meet the needs of our workforce and wider communities. We do this by celebrating and protecting differences, removing barriers, and promoting equity and equality of opportunity for all.
The Digital and Data Directorate has primary responsibility for scientific computing and research computing services and support. Its key functions are to provide and support the platforms required by the staff of the UK Health Security Agency, and to provide the technical capabilities that enable public health services, both within the Organisation and between the Organisation and its customers and stakeholders.

JOB DESCRIPTION

  • Plan, configure, manage and maintain all hardware and software components of all High Performance Computing (HPC), UNIX operating system, Virtualization and Cloud platforms in UKHSA to deliver optimum system availability to users, and ensure that all supplier-provided patches and upgrades to the operating system, database, tools and utilities are applied in a timely manner. Support High Performance and High Throughput computing operations.
  • Maintain the security and integrity of all HPC, UNIX, Virtualization and Cloud platforms in UKHSA, including managing all user access rights and implementing backup regimes and other disaster recovery procedures.
  • Provide technical and administrative support for all HPC, UNIX, Virtualization and Cloud platforms within UKHSA to all levels of staff. Ensure systems are documented and formulate relevant procedures and protocols.
  • Liaise with the relevant HPC specialist suppliers to ensure that the organisation is equipped with the correct and appropriate technology to support the achievement of UKHSA’s objectives.
  • Create and maintain comprehensive documentation, including procedures and protocols for technical staff and users, on the licensing, components, connectivity, configurations and operation of specialist systems and services, and on the supporting hardware and software. Keep this documentation up to date and in an auditable condition. Provide training, where appropriate, to technical staff and users to enable them to use HPC systems and services optimally.
  • Monitor and manage the performance and capacity growth of HPC, UNIX, Virtualization and Cloud platforms, advising on necessary upgrades and replacement of hardware and software to maintain the ability of these platforms to support UKHSA business. Implement hardware changes, upgrades, database upgrades and migrations to maintain system performance and growth capacity.
  • Ensure compliance with all relevant UKHSA policies on the usage of HPC, UNIX, Virtualization and Cloud platforms.
  • Maintain awareness of technical developments and research new technologies in HPC, UNIX, Virtualization and Cloud platforms with a view to advising on suitable deployment strategies for UKHSA. Advise on the choice of software solutions and hardware platforms for the management of Big Data and analytics platforms and solutions.
  • Deliver work that adheres to high standards and best practices, in line with the SLAs agreed with UKHSA users.

The main purpose of the role is to manage, support and maintain the hardware and software components of the mission-critical High Performance Computing (HPC), Unix/Linux, virtualization and cloud platforms required for the execution of UKHSA business. The post holder will be responsible for availability, performance, efficiency, monitoring, capacity planning, change management and emergency response, and will be expected to work in conjunction with other UKHSA departments to ensure that the organisation is equipped with state-of-the-art technology to support the rapidly expanding public health services.
The role holder will also ensure that the HPC and Unix/Linux systems are correctly maintained and managed to provide authorized users with optimum levels of access to data and applications as and when required, in order to conduct UKHSA business effectively.
An in-depth working knowledge of Linux clustered computing environments, hybrid networks (Ethernet and InfiniBand), high-performance parallel filesystems, software-defined storage and enterprise-class open-source technologies is an essential requirement of this role.
This role will also support the expansion of the HPC Cloud computing platform and associated environments to support the wider achievement of UKHSA business objectives. Software engineering skills are desirable to solve problems relating to mission-critical services and to build automation that prevents problem recurrence, with the goal of automating the response to all non-exceptional service conditions.

KNOWLEDGE/SUBSTANTIVE EXPERIENCE OF

  • Enterprise-class Linux distributions such as RHEL, CentOS, SUSE, Debian and Ubuntu
  • Basic storage configuration: LVM, iSCSI
  • UNIX/Linux scripting
  • TCP/IP, DHCP, VLANs, Spanning Tree Protocol, and link aggregation for performance (MTU settings) and reliability requirements
  • Designing and implementing UNIX/Linux system and services open-source solutions, and performance tuning
  • Open-source storage technologies such as Lustre, Ceph, NFS, SMB, Apache, NGINX, HAProxy

Desirable criteria may be used in the event of a large number of applications or a large number of successful candidates (see attached job description).
If you are successful at this stage, you will progress to interview and assessment.
Please do not exceed 500 words. We will not consider any words over and above this number.
Feedback will not be provided at this stage.

NATIONALITY REQUIREMENTS

This job is broadly open to the following groups:

  • UK nationals
  • nationals of the Republic of Ireland
  • nationals of Commonwealth countries who have the right to work in the UK
  • nationals of the EU, Switzerland, Norway, Iceland or Liechtenstein and family members of those nationalities with settled or pre-settled status under the European Union Settlement Scheme (EUSS)
  • nationals of the EU, Switzerland, Norway, Iceland or Liechtenstein and family members of those nationalities who have made a valid application for settled or pre-settled status under the European Union Settlement Scheme (EUSS)
  • individuals with limited leave to remain or indefinite leave to remain who were eligible to apply for EUSS on or before 31 December 2020
  • Turkish nationals, and certain family members of Turkish nationals, who have accrued the right to work in the Civil Service


Responsibilities

Engineering
Information Technology
Other
