Senior Data Scientist at ATT

Dallas, Texas, USA

Full Time

Start Date

Immediate

Expiry Date

22 Jun, 25

Salary

0.0

Posted On

22 Mar, 25

Experience

3 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Physics, Statistics, Data Science

Industry

Information Technology/IT

Description

JOB DESCRIPTION:

As a Senior Data Scientist, your tasks may include, but are not limited to, the following: translating business problems into actionable insights through a comprehensive workflow involving coding, data extraction, cleansing, feature engineering, exploratory data analysis, model creation and tuning, visualization, and deployment, and leveraging statistical analysis, machine learning, and big data technologies to drive informed decision-making and innovation.

EDUCATION/EXPERIENCE:

Master’s degree (MS/MA) required from an accredited university in a quantitative field of study such as Data Science, Math, Statistics, Engineering, or Physics. 3+ years of related experience. Certification is required in some areas.

Responsibilities

Data Extraction and Preparation: Collect data from various structured and unstructured sources (data lakes, databases, and data warehouses, whether cloud-based, internal, or external) and ensure its quality for analysis through cleaning and preprocessing. Design, build, and analyze large (e.g., hundreds of terabytes or more as technology advances) and complex data sets while thinking strategically about data use and data design. Tools can include
Coding Solutions, Algorithms and Feature Engineering: Create relevant features and conduct exploratory data analysis. Code solutions following the typical workflow: data extraction, cleansing, feature engineering, exploratory data analysis, model selection/creation, hyperparameter tuning, model interpretation, model retraining, business process and/or system implementations, high-level proofs of concept and trials, visualization, deployment to production, and post-deployment MLOps monitoring, diagnosis, and resolution. Coding proficiency is required in at least one data science language (Python, R, Scala, etc.), as well as expertise with modern ML packages and libraries (Spark, scikit-learn, pandas, PyTorch, tidyverse, TensorFlow, Keras, Shiny, and/or AutoML tools).
Model Development, Deployment and Optimization: Build, evaluate, and optimize machine learning models through hyperparameter tuning. Implement models into production, continuously monitor their performance, and ensure they remain explainable and reliable to minimize model decay. Ability to develop custom Machine Learning (ML). Highly proficient in the full AI workflow, including (1) data extraction, cleansing, feature engineering, exploratory data analysis, model selection/creation, hyperparameter tuning, model interpretation, and model retraining, and (2) using tools like MLflow to log metrics. Well-versed in integrated development environments (IDEs) such as Databricks Workspaces or Visual Studio Code. Proficiency in algorithm categories such as Supervised Learning, Unsupervised Learning, Optimization Algorithms, Deep Learning, AI-Computer Vision, Natural Language Processing, Deep Reinforcement Learning, Search Algorithms, and AI-Knowledge Graphs.
Visualization and Collaboration: Create visualizations and reports for stakeholders while working closely with cross-functional teams to align efforts with business objectives. Can use advanced coding methods to produce visualizations (e.g., ggplot, D3.js).
Generative AI: Develop and implement generative AI models, focusing on creating new content or augmenting existing data. Generative Models: understanding of GANs (Generative Adversarial Networks), VAEs (Variational Autoencoders), and Transformers. Fine-Tuning: techniques for adapting pre-trained models to specific tasks using smaller, task-specific datasets. Agentics: understanding of agentic architecture, concepts, and optimization of solutions. Prompt Engineering: crafting effective prompts to guide generative models in producing desired outputs. Retrieval-Augmented Generation (RAG): combining generative models with retrieval systems to enhance performance and relevance. Text Generation: proficiency in using models like GPT-3/4 for generating human-like text. Image Generation: familiarity with tools like DALL-E and Stable Diffusion for creating images from text descriptions.