Data Scientist (NLP)

at Binance

Remote, Maluku Utara, Singapore

Start Date: Immediate
Expiry Date: 07 May, 2025
Salary: Not Specified
Posted On: 08 Feb, 2025
Experience: N/A
Skills: Learning Techniques, Apache Spark, Programming Languages, Java, Computational Linguistics, Python, Statistics, Data Science, Natural Language Processing, Apache Kafka, Scikit Learn, Mathematics, NLTK, Neural Networks, ML, Computer Science, Machine Learning
Telecommute: No
Sponsor Visa: No

Description:

Binance is a leading global blockchain ecosystem behind the world’s largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 250 million people in 100+ countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products. Binance offerings range from trading and finance to education, research, payments, institutional services, Web3 features, and more. We leverage the power of digital assets and blockchain to build an inclusive financial ecosystem to advance the freedom of money and improve financial access for people around the world.
This position sits within our Risk AI team and focuses on NLP-related projects. You will use internal data to train models and develop applications based on these models. As a data scientist, you will leverage petabyte-scale data and state-of-the-art machine learning infrastructure to create data products for millions of cryptocurrency users. You will collaborate with engineers, data analysts, business operations, and product/marketing managers to define and build solutions, features, algorithms, and products.

Requirements:

  • Holds a Master’s degree or higher in Computer Science, Data Science, Statistics, Mathematics, Computational Linguistics, or a related field.
  • A minimum of 3 years of relevant industry experience in AI/ML and Natural Language Processing is required. Experience in multimodal AI is highly preferred.
  • Proficient in big data technologies such as Apache Spark, Apache Hadoop, and Apache Kafka, as well as vector databases.
  • Deep understanding of modern machine learning techniques and their mathematical underpinnings, such as classification, neural networks, and hyperparameter optimisation.
  • Solid understanding and practical experience with deep learning architectures, including transformer models (e.g., BERT, GPT). Ability to implement, optimize, and fine-tune these models for various tasks using techniques such as LoRA.
  • Proficiency in programming languages such as Python, Java, or similar, with experience in machine learning (ML) and natural language processing (NLP) libraries and deep learning frameworks such as TensorFlow, PyTorch, scikit-learn, spaCy, and NLTK.
  • Demonstrated experience in handling severely imbalanced datasets. Knowledge of techniques and strategies to address imbalances in data.

Responsibilities:

  • Apply Natural Language Processing (NLP) techniques to preprocess, analyse, and extract insights from large textual datasets. Develop and fine-tune Large Language Models (LLMs) and multimodal models to derive actionable insights and enhance business decision-making processes.
  • Work closely with business units to identify opportunities for leveraging company data and AI models to drive innovative business solutions and improve decision-making processes.
  • Perform data cleaning, transformation, and preprocessing to create high-quality datasets for analysis and modeling. Ensure data integrity and consistency throughout the process.
  • Conduct exploratory data analysis to uncover patterns, trends, and relationships within the data. Generate visualisations and summaries to effectively communicate findings to stakeholders and support data-driven decision-making.
  • Stay abreast of the latest developments in artificial intelligence, with a particular focus on advancements in multimodal AI, to ensure the integration of cutting-edge technologies and methodologies into our data-driven solutions.
  • Develop and apply feature engineering techniques to create meaningful features that improve the performance of models. This includes deriving new features from raw data, selecting relevant features, and transforming existing features to enhance model accuracy and efficiency.


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - Other

Software Engineering

Graduate

Computer science, data science, statistics, mathematics, computational linguistics, or a related field

Proficient

1

Remote, Singapore