Technology/Domain Specialist II (Site Reliability Engineer) at Nedbank
Johannesburg, Gauteng, South Africa
Full Time


Start Date

Immediate

Expiry Date

09 Aug, 25

Salary

0.0

Posted On

03 Jul, 25

Experience

3 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Docker, Linux, Asset Management, Working Experience, Decision Making, Bash, It, Information Technology, Microservices, Framework, Windows, Data Warehousing, Conflict, Containerization, Architecture

Industry

Information Technology/IT

Description

JOB CLASSIFICATION

140754 - Technology Domain Specialist (Site Reliability Engineer)
Closing date - 10 July 2025
Job Family
Information Technology
Career Stream
Application Development
Leadership Pipeline
Manage Self: Technical

TECHNICAL SKILLS

  • Working experience with an operating system (Linux or Windows)
  • Knowledge of microservices and containerization (Kubernetes or Docker)
  • Troubleshooting and root cause analysis
  • SRE best practices
  • In-depth knowledge of the DevOps framework
  • Experience with and knowledge of programming languages (C#, Java, Python, Bash)
  • Proactivity in seeking improvement opportunities
  • Experience with troubleshooting production systems/applications

ESSENTIAL QUALIFICATIONS - NQF LEVEL

  • Advanced Diplomas/National 1st Degrees
  • Professional Qualifications/Honours Degree

Preferred Qualification

  • Degree or Diploma in IT

Preferred Certifications

  • Certificate in relevant Technology or Domain

Minimum Experience Level

  • Minimum 5 years of IT experience, with 3 years in a relevant technology or domain

TECHNICAL / PROFESSIONAL KNOWLEDGE

  • Asset management
  • IT Assets management processes
  • Data Warehousing
  • Information Technology (IT) Architecture

Behavioural Competencies

  • Decision Making
  • Courage
  • Stress Tolerance
  • Quality Orientation
  • Technical/Professional Knowledge and Skills
  • Emotional Intelligence Essentials
  • Resolving Conflict


Responsibilities

JOB PURPOSE

To actively own and participate in the overall evolution of the Technology or Domain asset while influencing and maintaining the health of the asset. Play a leadership role in the associated COEs.

JOB RESPONSIBILITIES

  • Collaborate with stakeholders, engineers, and operational SMEs to keep all relevant parties up to date on what is top of mind within the reliability service offerings.
  • Evolve services based on customer needs and technology to ensure we remain competitive in the market.
  • Influence and collaborate with squads during service or platform design to proactively prevent system failures and enhance performance.
  • Engage with Asset/Journey squads to adopt SRE practices, with a core focus on contributing to incident management and advocating for blameless post-mortems.
  • Engage and influence squads with regard to observability and high availability, utilising new or existing technology, and improve disaster recovery plans.
  • Implement automation-based solutions to improve the availability, efficiency, and performance of systems and to reduce cost.
  • Coach squads on best practices within the organisation via internal forums to establish fundamental SRE knowledge and promote enterprise-wide knowledge sharing.
  • Assist with creating and maintaining system health and performance metrics reflecting real-time data, enabling proactive resolution and faster troubleshooting.
  • Collaborate and partner with the DevOps engineer/coach to ensure efficient CI/CD pipelines, resolve any failures, and make improvements.
  • Take charge of technical leadership, engage with squads to identify the best solutions, and support and guide junior SREs.
  • Assist in defining and implementing metrics related to the performance of services, such as SLIs and SLOs.
  • Define and deliver Site Reliability Engineering technical standards in partnership with all disciplines of software engineering.
  • Participate in and work closely with relevant COEs to improve the release of new features and time to market.
  • Build and maintain strategic relationships with business units and vendors in order to stay in sync on current ways of work and the business decisions being adopted.
  • Conduct assessments within squads to measure SRE maturity, provide a report, and outline a plan to help them move to the next level, with continuous feedback.
  • Adhere to and comply with Nedbank Group information management, data integrity, and security policies and best practices.
  • Participate in and support corporate responsibility initiatives for the achievement of business strategy.
  • Manage multiple concurrent objectives, projects, groups, or activities, making effective judgements as to prioritisation and time allocation.