Site Reliability Engineer
at Police Digital Service
Remote, Scotland, United Kingdom
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 10 Feb, 2025 | GBP 80000 Annual | 10 Nov, 2024 | N/A | Good communication skills | No | No |
Description:
JOIN POLICE DIGITAL SERVICE AS A SITE RELIABILITY ENGINEER - STARTING SALARY £80,000
As a Site Reliability Engineer (SRE), you will be a cornerstone of the Technical Operations team, dedicated to ensuring the seamless operation and reliability of the systems that deliver critical services to our Policing customers. This role is at the heart of our technological infrastructure and requires a blend of software engineering skill and operational acumen.
KEY RESPONSIBILITIES
- Design Scalable Infrastructure: Architect and engineer cloud solutions that are inherently scalable, ensuring they can manage varying loads and demands with ease, while maintaining performance and reliability.
- Automate Operations: Develop and implement robust scripts and automation tools to streamline deployment, configuration, and management tasks, thereby increasing efficiency and reducing the potential for human error.
- Monitor System Health: Use comprehensive monitoring solutions to continuously track system performance and health indicators, allowing for proactive identification and resolution of potential issues.
- Lead Incident Response: Take charge during service disruptions, coordinating and leading the response to ensure rapid resolution, minimal impact, and clear communication throughout the incident.
- Enforce Security Standards: Vigilantly uphold security protocols and compliance standards to protect sensitive data and infrastructure against threats and vulnerabilities.
- Plan for Capacity: Engage in strategic capacity planning to accurately predict and prepare for future infrastructure needs, scaling resources accordingly to handle increased load and service demands.
- Document Systems: Create and maintain clear, detailed, and up-to-date documentation of cloud infrastructure, including architecture designs, configurations, and operational procedures.
- Mentor Team Members: Provide expert guidance and mentorship to less experienced team members, promoting a culture of knowledge sharing, continuous learning, and technical excellence.
- Research New Technologies: Actively investigate and evaluate new technologies, tools, and practices that can enhance system reliability, efficiency, and the overall cloud service offering.
- Develop Resilience Strategies: Formulate and implement strategies to enhance the resilience and fault-tolerance of cloud services, ensuring they can withstand and recover from unexpected disruptions.
- Problem Management: Lead comprehensive post-mortem analysis following incidents to identify root causes, extract lessons learned, and implement preventive measures against recurrence.
WHAT YOU NEED TO SUCCEED IN THE ROLE
- Technical Expertise: In-depth knowledge of Azure cloud infrastructure, including services like Azure Compute, Azure Storage, and Azure Networking. Familiarity with implementing and managing Azure solutions such as Azure Kubernetes Service, Azure Functions, and Azure DevOps is crucial.
- Software Engineering Skills: Strong coding skills in languages such as PowerShell, Python, Go, or Ruby, and experience with software development life cycles and agile methodologies. Understanding of Azure SDKs and APIs for integration and automation purposes.
- Automation and Orchestration: Experience with automation tools like Azure Resource Manager, Azure Automation, Ansible, or Chef, and with orchestration platforms like Kubernetes or Docker Swarm. Proficiency in Azure Bicep would be a significant advantage.
- Monitoring and Analytics: Proficiency with Azure monitoring tools such as Azure Monitor, Application Insights, and Network Watcher. Ability to analyse and interpret complex datasets to inform decision-making.
- Continuous Learning: A commitment to continuous professional development, staying abreast of the latest industry trends and emerging technologies in cloud computing and SRE practices, particularly within the Azure ecosystem.
- Leadership and Mentorship: The ability to lead initiatives, mentor junior team members, and contribute to a culture of technical excellence and continuous improvement.
REQUIREMENT SUMMARY
Min: N/A, Max: 5.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
Graduate
Proficient
1
Remote, United Kingdom