Data Scientist / Curator (contract) at Pfizer
Cambridge, MA 02139, USA
Full Time


Start Date

Immediate

Expiry Date

14 Oct, 25

Salary

40.0

Posted On

14 Jul, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Biochemistry, Graph Databases, AWS, Analytical Chemistry, Mass Spectrometry, Bioinformatics, Shell Scripting, Molecular Biology, Biology, Scripting Languages, Cell Biology, Cluster

Industry

Information Technology/IT

Description

SUMMARY:

We are looking for a Data Scientist / Curator to help annotate proteomics data at the MLCS. The contractor will combine proteomics-related knowledge with bioinformatics skills to curate and register high-dimensional proteomics data into an internal data registry. The deliverable will facilitate data processing, analysis, and machine learning with high reproducibility and scalability, as well as data management and visualization.

QUALIFICATIONS:

  • B.S. or advanced degree in Bioinformatics, Biology, Molecular Cell Biology, Biochemistry, Analytical Chemistry, or a related field
  • Familiarity with mass spectrometry-based proteomics data types
  • Wet-lab experience in Biology, Biochemistry, Molecular Biology, Analytical Chemistry, or a related field is strongly preferred
  • Knowledge of relational or graph databases is preferred
  • Experience with scripting languages such as R or Python is desirable
  • Familiarity with workflow languages (Nextflow, CWL, etc.) is desirable
  • Familiarity with shell scripting, cluster or cloud computing infrastructure (AWS, GCP) is desirable

RESPONSIBILITIES:

  • Annotate internal mass spec proteomics datasets to be compliant
  • Register datasets into internal databases
  • Transform and engineer datasets so they are ready for downstream analysis
  • Extract, transform and load high value external datasets
  • Document workflows and ensure data ingestion, metadata capture, version control, and curation