Senior Site Reliability Engineer (SRE)

at Qdrant

Home Office, Nordrhein-Westfalen, Germany

Start Date: Immediate
Expiry Date: 04 Jul, 2024
Salary: Not Specified
Posted On: 05 Apr, 2024
Experience: N/A
Skills: Programming Languages, Infrastructure, Code, Python, AWS, Docker, Database Systems, Platforms, Communication Skills, Azure, Ansible, Reliability Engineering
Telecommute: No
Sponsor Visa: No

Description:

Qdrant is an Open-Source Vector Database.
We help businesses take advantage of modern AI technologies. We are developing neural search solutions that allow everyone to use state-of-the-art neural network encoders at production scale. At the same time, we help companies integrate our technology into their infrastructure. Our flagship product is the open-source Vector Database: https://github.com/qdrant/qdrant
Among the technical challenges we are facing is the implementation of our cloud infrastructure to serve our engine as a scalable cloud API solution. We are looking for a Site Reliability Engineer to ensure the stable and secure operation of our managed solutions. If you’re passionate about Site Reliability Engineering, Python, Go, Kubernetes, and contributing to the growth of a cutting-edge Database as a Service, we want to hear from you! Apply now and become a key player in shaping the reliability and scalability of our DBaaS platform.

Tasks

  • Infrastructure Automation: Design, implement, and manage infrastructure code using Terraform, focusing on the reliability and scalability of our Database as a Service (DBaaS) platform.
  • Programming Mastery: Utilize Python and Go to improve our service quality and develop automation scripts and tools for monitoring, deployment, and maintenance tasks specific to database operations.
  • Kubernetes Expertise: Demonstrate a deep understanding of Kubernetes, ensuring optimal performance, scalability, and reliability for our DBaaS platform.
  • Operator Frameworks: Develop and maintain Kubernetes Operators for automating database platform operations, enhancing the reliability of our services.
  • Multi-Cloud Management: Architect and maintain infrastructure in multi-cloud environments (AWS, GCP, Azure) to provide a resilient and available DBaaS solution.
  • Monitoring and Incident Response: Implement effective monitoring solutions tailored for database services and collaborate on incident response procedures to maintain the high availability of our systems.
  • Service Level Objectives (SLOs) and Agreements (SLAs): Define, measure, and maintain SLOs and SLAs specific to database performance and reliability, actively monitoring and optimizing systems to meet these targets.

Requirements

  • Site Reliability Engineering Focus: Proven experience in a Site Reliability Engineering or similar role, with a strong emphasis on database systems.
  • Programming Languages: Proficiency in Python and Go; experience with other languages is a plus.
  • Kubernetes Skills: Proven hands-on experience managing and optimizing Kubernetes clusters, particularly in the context of database services.
  • Operator Frameworks: Strong background in developing and maintaining Kubernetes Operators, with a focus on database automation.
  • Infrastructure as Code (IaC): Solid understanding and experience with Terraform, Ansible, or Pulumi, specifically applied to database infrastructure.
  • Multi-Cloud Expertise: Experience working with multi-cloud environments (AWS, GCP, Azure), ensuring seamless database operations across platforms.
  • Container Orchestration: Deep understanding of containerization concepts and orchestration tools (Docker, Kubernetes) within the DBaaS context.
  • SLOs and SLAs: Demonstrated experience in defining, implementing, and meeting Service Level Objectives and Agreements, particularly in the context of database reliability.
  • Problem Solving: Strong analytical and problem-solving skills, with keen attention to detail.
  • Communication Skills: Excellent communication and collaboration skills, with the ability to convey complex technical concepts to diverse audiences.

Benefits

  • Working in a passionate international team
  • Competitive salary plus perks
  • Flexible working hours
  • Company events
  • Choose any hardware
  • Remote first/home office
  • Relocation option


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - DBA / Datawarehousing

Software Engineering

Graduate

Proficient

1

Home Office, Germany