Data Scientist at Homethrive
Remote, Oregon, USA
Full Time


Start Date

Immediate

Expiry Date

22 Jul, 25

Salary

0.0

Posted On

22 Apr, 25

Experience

3 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

JSON, Business Intelligence, High-Pressure Environment, Research, Deep Learning, Data Warehousing, Data Science, Web Application Development, Statistical Modeling, NLP, NumPy, Natural Language Understanding, Computer Science, Natural Language Processing, Machine Learning, SQL, R

Industry

Information Technology/IT

Description

Homethrive was born from personal experience. Our founders grappled with the overwhelming challenges of caregiving for family members while balancing their work lives. The journey was fraught with confusion, a myriad of unanswered questions, and countless hours delving into endless online searches. After taking numerous days off and spending extended hours on the phone, the answers remained elusive. They recognized the need for a streamlined, more efficient solution. Enter Homethrive!

POSITION OVERVIEW

Homethrive is growing, and we need to bring on additional talent to help us on our growth journey! We are looking for a Data Scientist who will be responsible for collecting, analyzing, and interpreting complex data sets to drive business decisions and strategies. We are looking for someone with strong hands-on experience in all layers of data integration, analytics, and ML/AI!

The technology we currently utilize includes:

  • Python
  • Snowflake
  • AWS RDS (MySQL), MongoDB Atlas
  • Salesforce CRM, Tableau
  • Cloud hosting on AWS using a mixture of Lambda, Glue, DynamoDB, and S3

REQUIREMENTS

  • Education: BS/MS in Computer Science, Data Science, or another related discipline, or equivalent experience
  • 3+ years of related professional experience required
  • Strong proficiency in programming languages such as Python, R, or SQL
  • Experience with libraries like Pandas, NumPy, Scikit-learn, PyTorch/TensorFlow
  • Expertise in machine learning algorithms, statistical modeling, and data mining techniques (Hypothesis Testing, Regression, Classification, Clustering)
  • Experience with data visualization tools (e.g., Tableau, Power BI, D3.js)
  • Familiarity with cloud computing platforms (e.g., AWS, GCP, Azure) and big data technologies
  • Strong knowledge of Data Warehousing (Snowflake) and Data Modeling techniques and principles
  • Strong background in Natural Language Processing (NLP) and Natural Language Understanding (NLU) techniques
  • Experience with Retrieval Augmented Generation (RAG) models and Large Language Models (LLMs)
  • Familiarity with state-of-the-art NLP libraries and frameworks (e.g., Hugging Face Transformers, spaCy, NLTK)
  • Preferred experience in developing and deploying large language models, generative AI models, or chatbot systems
  • Knowledge of machine learning and deep learning concepts, including model training, evaluation, and deployment
  • Experience with model selection, tuning, and evaluation (cross-validation, ROC, AUC, etc.)
  • Experience with text data preprocessing, feature engineering, and model development
  • Ability to handle and process large-scale text data efficiently
  • A successful history of integrating source systems and delivering self-serve Business Intelligence
  • Experience building data pipelines and deploying to public clouds such as AWS
  • 1+ years in web application development, including implementing Application Programming Interfaces (APIs)
  • Experience and knowledge in back-end and front-end: SQL, Python, REST services, JSON, etc.
  • Self-directed and comfortable supporting the data needs of cross-functional teams, systems, and products in a high-pressure environment
  • Experience analyzing and documenting business processes
  • Passion for learning and a results-oriented mindset
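The evaluation experience listed above (cross-validation, ROC, AUC) can be illustrated with a minimal scikit-learn sketch. This uses a synthetic dataset and a logistic-regression baseline purely as placeholders, not anything specific to Homethrive's data or models:

```python
# Minimal sketch: cross-validated ROC AUC with scikit-learn.
# The dataset and model choice are illustrative assumptions only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic binary-classification data standing in for real features
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation scored by ROC AUC
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print("ROC AUC per fold:", scores.round(3))
print("Mean ROC AUC:", round(scores.mean(), 3))
```

Swapping `scoring="roc_auc"` for other metric strings (e.g., `"f1"`, `"accuracy"`) covers the other evaluation techniques the role mentions.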
RESPONSIBILITIES
  1. Data Collection and Preprocessing:
  • Acquire, clean, and transform structured and unstructured data from various sources.
  • Identify and address data quality issues, missing values, and outliers.
  • Develop and maintain data pipelines and data warehousing solutions.
  2. Exploratory Data Analysis and Modeling:
  • Perform exploratory data analysis to identify patterns, trends, and relationships within data sets.
  • Design and implement statistical and machine learning models to solve complex business problems.
  • Evaluate and optimize model performance using appropriate techniques and metrics.
  3. Data Visualization and Storytelling:
  • Create compelling data visualizations and dashboards to effectively communicate findings and insights.
  • Collaborate with cross-functional teams to translate data-driven insights into actionable recommendations.
  • Present complex analytical results to both technical and non-technical audiences.
  4. Model Deployment and Monitoring:
  • Deploy and integrate machine learning models into production environments.
  • Monitor model performance, identify potential issues, and iterate on models as needed.
  • Collaborate with engineering teams to ensure smooth integration and scalability of data solutions.
  5. Generative AI, Large Language Model (LLM), and Chatbot Technologies:
  • Explore and implement novel approaches for fine-tuning and adapting LLMs to specific domains, tasks, and use cases.
  • Collaborate with our engineering teams to integrate LLM models into production systems and chatbot interfaces.
  • Conduct research and experimentation to enhance the performance, safety, and robustness of our generative AI models.
  6. Continuous Learning and Innovation:
  • Stay up-to-date with the latest developments in data science, machine learning, and related technologies.
  • Explore and experiment with new techniques and tools to enhance data-driven decision making.
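The first responsibility (cleaning data, handling missing values and outliers) might look like the following pandas sketch. The toy DataFrame and the fill/clip choices are illustrative assumptions, not Homethrive's actual pipeline:

```python
# Minimal sketch: handling missing values and outliers with pandas.
# The data and thresholds are hypothetical, for illustration only.
import pandas as pd

# Toy records with one missing age and one likely entry error (410)
df = pd.DataFrame({
    "age": [34.0, None, 71.0, 29.0, 410.0],
    "visits": [2, 5, 1, 4, 3],
})

# Fill missing ages with the column median
df["age"] = df["age"].fillna(df["age"].median())

# Clip extreme values to the 1st-99th percentile range
low, high = df["age"].quantile([0.01, 0.99])
df["age"] = df["age"].clip(low, high)

print(df)
```

In practice the fill and clip rules would depend on the source system and domain; median imputation and percentile clipping are just two common starting points.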