Data Scientist at Ford Global Career Site

Dearborn, Michigan, United States

Full Time

Start Date

Immediate

Expiry Date

14 Jan, 26

Salary

$107,848.00 - $182,338.56/yr

Posted On

16 Oct, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Natural Language Processing, Python, Machine Learning, SQL, Google Cloud Platform, Data Processing Pipelines, Feature Engineering, Model Evaluation, Text Preprocessing, ETL Processes, Git, Containerization, Large Language Models, CI/CD Pipelines, Cloud Functions, OpenShift

Industry

Motor Vehicle Manufacturing

Description

Note, this is a purely telecommuting/work-from-home position whereby the employee may reside anywhere within the U.S.

Develop and deploy Natural Language Processing (NLP) models to extract insights from unstructured textual data. Collaborate with cross-functional teams to identify opportunities and develop strategies for applying NLP techniques to enhance quality analytics. Design and implement data pre-processing and feature engineering techniques for NLP tasks. Utilize supervised and unsupervised machine learning techniques to solve complex NLP problems. Evaluate and fine-tune models for performance optimization, accuracy, and efficiency. Contribute maintainable code to existing and new pipelines. Stay up-to-date with the latest advancements in NLP and contribute to the continuous improvement of methodologies and algorithms. Communicate findings, insights, and recommendations to stakeholders in a clear and concise manner.

Master's degree or foreign equivalent in Computer Science, Data Science, Computer Engineering or related field and 3 years of experience in the job offered or related occupation.

3 years of experience with each of the following skills is required:
1. Developing NLP pipelines in Python using at least 2 of the following: NLTK, spaCy, Gensim, or HuggingFace.
2. Implementing text preprocessing workflows, creating feature extraction algorithms, and building and training models with scikit-learn, TensorFlow, or PyTorch.
3. Developing reusable modules for NLP and writing production-ready code.
4. Querying large datasets in SQL to extract textual information.
5. Designing database schemas optimized for NLP applications.
6. Writing complex queries to join structured and unstructured data sources.
7. Creating ETL processes for text data, optimizing query performance for large text corpora, and implementing database operations in analytics pipelines.
8. Applying supervised Machine Learning techniques to NLP problems, implementing unsupervised methods for text analysis, evaluating model performance with appropriate metrics, building ensemble models, conducting hyperparameter optimization, and applying transfer learning with pre-trained embeddings.
9. Deploying NLP models and pipelines on Google Cloud Platform (GCP) infrastructure.
10. Utilizing AI Platform for training and serving ML models.
11. Managing data storage with Cloud Storage, BigQuery, or Cloud SQL.
12. Implementing data processing pipelines with Dataflow or Dataproc.

2 years of experience with each of the following skills is required:
1. Managing code versioning for collaborative NLP model development, implementing code review processes, and resolving merge conflicts in multi-developer environments.
2. Using Git for CI/CD integration with model deployment and organizing repositories for maintainable ML codebases.

1 year of experience with each of the following skills is required:
1. Using Cloud Functions for serverless text processing and monitoring model performance.
2. Containerizing NLP applications for deployment, creating and managing deployment configurations, and setting up routes and services with OpenShift.
3. Implementing resource allocation and scaling strategies, configuring persistent storage for models and data, and managing deployments with rolling updates.
4. Fine-tuning pre-trained Large Language Models (BERT, GPT, or T5) for domain-specific tasks.
5. Implementing prompt engineering techniques and evaluating LLM outputs for accuracy.
6. Creating embeddings for semantic search, optimizing inference for production, and reducing hallucinations and improving factuality.
7. Building CI/CD pipelines in Tekton for NLP model deployment.
8. Creating reusable pipeline components for text processing, managing workflow triggers, and implementing testing and validation steps.
9. Configuring resource requirements, integrating model evaluation metrics, and setting up automated retraining pipelines.

We are offering a salary of $107,848.00 - $182,338.56/yr.

As an established global company, we offer the benefit of choice. You can choose what your Ford future will look like: will your story span the globe, or keep you close to home? Will your career be a deep dive into what you love, or a series of new teams and new skills? Will you be a leader, a changemaker, a technical expert, a culture builder or all the above? No matter what you choose, we offer a work life that works for you, including:
Immediate medical, dental, and prescription drug coverage
Flexible family care, parental leave, new parent ramp-up programs, subsidized back-up child care and more
Vehicle discount program for employees and family members, and management leases
Tuition assistance
Established and active employee resource groups
Paid time off for individual and team community service
A generous schedule of paid holidays, including the week between Christmas and New Year's Day
Paid time off and the option to purchase additional vacation time.

https://fordcareers.co/GSR-HTHD

Verification of employment eligibility will be required at the time of hire.

Responsibilities

Develop and deploy NLP models to extract insights from unstructured textual data. Collaborate with cross-functional teams to enhance quality analytics through NLP techniques.