Site Reliability Engineer (Remote)

at Tinybird

Home Office, Nordrhein-Westfalen, Germany

Start Date: Immediate
Expiry Date: 31 Jan, 2025
Salary: Not Specified
Posted On: 31 Oct, 2024
Experience: N/A
Skills: Cloud, Load, Linux, Python, Redis, Software, IT, Infrastructure, C++, Provisioning, Virtual Machines, Storage, Computing, Disaster Recovery
Telecommute: No
Sponsor Visa: No

Description:

ABOUT TINYBIRD:

At Tinybird, we help developers and data teams take flight by unlocking the power of real-time data to quickly build data pipelines and innovative data products. With Tinybird, you can effortlessly ingest multiple data sources at scale, query and shape them using the 100% pure SQL you already know and love, and publish results as low-latency, high-concurrency APIs for your applications to chirp about. Developers can create fast APIs, faster: what used to take hours and days now only takes minutes! Tinybird is the essential tool that data engineers and software developers have been waiting for, enabling you to drive innovation with ease.

How To Apply:

In case you would like to apply to this job directly from the source, please click here

Responsibilities:

We run our stack on Linux. We try to keep things simple. Technologies we use:
  • OpenResty: SSL termination and load balancing
  • Varnish: load balancing and, sometimes, caching
  • Redis: metadata store
  • Python: most of our backend uses Python except some small bits that rely on C++ for hot paths
  • ClickHouse: our main data store
  • ZooKeeper: coordination of ClickHouse replicas
  • Grafana, Loki and Mimir: monitoring and alerting
  • Terraform: cloud provisioning (virtual machines, networks, Kubernetes clusters)
  • Ansible: deploys, plus software and config provisioning

Our number of machines is still manageable, but it keeps growing as we add customers. This role is less about managing infrastructure than about making sure that our software uses hardware resources wisely and flexibly. This means you will not only automate machines, but also help the product team design and develop the architecture of the system as a whole. That will require you to work with our backend code and to understand how ClickHouse works.

Some challenges and things we want to improve:

  • High availability and elasticity: as we keep adding customers, we need to architect our system to be more efficient and flexible
  • Observability: from specific resource usage to a bird’s-eye view of the whole platform. This requires good knowledge of storage, networking, and computing
  • Disaster recovery: improving our tooling to manage and discover problems, but also improving our on-call procedures
  • As a specific challenge: when our customers grow, we need to upgrade their accounts. Today we do this manually. It is not manual in the traditional sense, because we have tools that automate much of the process, but we still have to handle one customer at a time: deciding which machines to spin up, how much compute capacity to provision, etc. Ideally, our architecture should let customers upgrade themselves, with more resources assigned to them dynamically, safely and transparently.


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Computer Software/Engineering

IT Software - System Programming

Software Engineering

Graduate

Proficient

1

Home Office, Germany