Site Reliability Engineer

at Nooxit

10407 Berlin, Prenzlauer Berg, Germany

Start Date: Immediate
Expiry Date: 18 Feb, 2025
Salary: Not Specified
Posted On: 19 Nov, 2024
Experience: N/A
Skills: Automation, Resource Management, Python, Code, Azure, Metrics, System Monitoring, Scripting, Distributed Systems
Telecommute: No
Sponsor Visa: No

Description:

Full-time (40 h), permanent, starting as soon as possible; based in Berlin or remote from a home office.
We’re seeking an experienced Site Reliability Engineer (SRE) with a solid foundation in Python, a passion for performance optimization, and a proactive approach to infrastructure management. In this role, you’ll work closely with development and operations teams to maintain, monitor, and improve the reliability of our systems, leveraging cutting-edge tools and methodologies to ensure peak performance.

Tasks

  • Design, implement, and optimize systems to improve the reliability, performance, and scalability of our services.
  • Build and maintain observability solutions using tools like Jaeger, Prometheus, and Grafana to enhance monitoring, tracing, and alerting across applications.
  • Collaborate with development teams to build, manage, and scale Kubernetes environments, ensuring high availability and robust service delivery.
  • Develop automation scripts and tools in Python to enhance system reliability and reduce manual intervention.
  • Diagnose and resolve incidents, conduct root-cause analysis, and implement measures to prevent recurrence.
  • Participate in on-call rotations, ensuring rapid response to system issues while continuously improving incident management processes.

Requirements

  • Proficiency in Python for scripting and automation.
  • Experience with tracing tools such as Jaeger or similar to troubleshoot and monitor complex distributed systems.
  • Experience with monitoring tools such as Prometheus or similar for collecting and alerting on metrics.
  • Experience with dashboarding tools such as Grafana or similar for creating visualizations that aid in system monitoring and diagnostics.
  • Experience working in Kubernetes environments, with an understanding of container orchestration, scaling, and resource management.

PREFERRED QUALIFICATIONS (OPTIONAL):

  • Hands-on experience with CI/CD pipelines and DevOps practices.
  • Familiarity with cloud platforms (AWS, GCP, Azure) and infrastructure-as-code tools like OpenTofu.

Benefits

  • Competitive salary
  • Flexible work hours and remote work opportunities
  • A beautiful Gather remote office
  • An ambitious and helpful team
  • Opportunity to work with cutting-edge technologies and make a significant impact in a fast-growing startup environment

Are you interested?
Then apply right now by sending your CV. If available, please include a GitHub link. A cover letter is not necessary.
If you have any questions, please contact us or just give us a call.



REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - Application Programming / Maintenance

Software Engineering

Graduate

Proficient

1

10407 Berlin, Germany