Data Engineer

at Immunai

Praha, Czech Republic

Start Date: Immediate
Expiry Date: 15 Feb, 2025
Salary: Not Specified
Posted On: 18 Nov, 2024
Experience: 3 year(s) or above
Skills: Apache Spark, Coding Experience, Programming Languages, Bioinformatics, Communication Skills, Python, Relational Databases, ETL Tools, PostgreSQL, Data Science, Computer Science, SQL, Data Storage Technologies
Telecommute: No
Sponsor Visa: No

Description:

ABOUT IMMUNAI:

Immunai is an engineering-first platform company aiming to improve therapeutic decision-making throughout the drug discovery and development process. We are mapping the immune system at unprecedented scale and granularity and applying machine learning to this massive clinico-immune database in order to generate novel insights into disease pathology for our partners: pharma companies and research institutes. We provide a comprehensive, end-to-end solution, from data generation and curation to therapeutics development, that continuously supports and validates the capabilities of our platform.
As drug development is becoming increasingly inefficient, our ultimate goal is to help bring breakthrough medicines to patients as quickly and successfully as possible.
Immunai is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

REQUIRED QUALIFICATIONS:

  • Bachelor’s or Master’s degree in Computer Science, Data Science, Software Engineering, Bioinformatics, or a related field
  • 3+ years of experience as a Data Engineer, ideally with a strong track record in handling complex datasets and mastering sophisticated data processes
  • Good programming skills (5+ years of experience) with Python, building modular and reusable code by leveraging standard data libraries (e.g. Pandas)
  • Proficiency in SQL, experience with relational databases (PostgreSQL preferred)
  • Familiarity with other data storage technologies (data warehousing technologies, ideally BigQuery, and NoSQL databases, especially MongoDB)
  • Some knowledge of data orchestration tools (e.g. Apache Airflow, Dagster)
  • An analytical mindset with attention to detail
  • Very good English communication skills (Czech/Slovak is an advantage)

PREFERRED QUALIFICATIONS:

  • Experience with cloud environments (Google preferred) is highly desirable
  • Coding experience in other programming languages (Java / R) is a big plus
  • Experience with ETL tools (e.g. Apache Beam, Apache Spark) is also a big plus
  • Familiarity with biotech or healthcare data is a plus

Responsibilities:

ABOUT THE ROLE:

The Metadata Developers Team, part of the Immunai Software group, focuses on advanced tools and solutions to retrieve, store, handle, and analyze complex descriptive biological data, such as laboratory and clinical metadata, with extensive use of domain ontologies. In close collaboration with Immunai’s biocurators and biology experts, we are continuously advancing our technology and data models to improve the consistency and descriptive power of our vast clinical database, as well as the usability and accessibility of data that feed the company’s most advanced algorithms for therapeutic decision-making and discovery.
As a Data Engineer at Immunai, you will specialize in designing, building, and maintaining top-notch, scalable data pipelines and software solutions for biological data and clinical metadata, ensuring robust and resilient data flows and processes, as well as a smooth integration of different data sources.
You will act as a partner for our Prague-based biocuration team to maximize the exchange of knowledge that will drive advances and breakthroughs in our metadata curation and delivery infrastructure. You will understand their operational and data governance needs and dive deep into the characteristics of our highly specialized data models. You will design and create lean and modular solutions that leverage a larger ecosystem of integrated data tools and empower our internal analysts and most valuable customers in their research.
Location: Prague

WHAT WILL YOU DO?

  • Collaborate with other Metadata Dev team members in an agile setting to build and maintain new metadata pipelines and infrastructure, or to enhance existing ones
  • Interact with biologists and bioinformaticians to understand their data needs and tooling requirements, propose and discuss solutions, and bring new ideas
  • Strengthen the liaison between the Prague site and the other Metadata Dev team members located in Zurich by collaborative work and continuous knowledge sharing
  • Provide support, training, and guidance to internal users and other stakeholders
  • Collaborate with the Metadata Operations team to ensure smooth data processing, ingestion, and delivery, striving for the resolution of any time-critical issues
  • Develop and maintain documentation for our tools and products


REQUIREMENT SUMMARY

Experience: Min 3.0, Max 8.0 year(s)

Information Technology/IT

IT Software - DBA / Datawarehousing

Software Engineering

Graduate

Computer Science, Software Engineering, Engineering

Proficient

1

Praha, Czech