Data & Applied Scientist II at Microsoft
Bengaluru, Karnataka, India
Full Time


Start Date

Immediate

Expiry Date

23 Feb, 26

Salary

0.0

Posted On

25 Nov, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Python, PySpark, SQL, ML, GenAI, MLOps, AIOps, Docker, Kubernetes, Azure, AWS, GCP, ETL, Apache Airflow, Git, CI/CD

Industry

Software Development

Description
Work with key stakeholders to understand the underlying business needs and formulate them into discrete, manageable problems with well-defined, measurable objectives and outcomes.
Transform formulated problems into implementation plans by defining success metrics, applying or creating the appropriate methods, algorithms, and tools, and delivering statistically valid and reliable results.
Write robust, reusable, and extensible code to support analysis and modeling.
Develop new ML or GenAI based models using advanced statistical and ML techniques.
Lead the evaluation of various GenAI based solutions, diagnosing issues and identifying root causes to support potential fine-tuning or reinforcement learning based fixes.
Use AI-powered tools in your daily work to accelerate coding, analysis, and other tasks.

Bachelor's degree in Data Science, Mathematics, Statistics, Computer Science, or related field AND 5+ years related experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results)
OR Master's degree in Data Science, Mathematics, Statistics, Computer Science, or related field AND 4+ years related experience
OR Doctorate in Data Science, Mathematics, Statistics, Computer Science, or related field AND 3+ years related experience.

Expertise in Python, PySpark, SQL, and LLMs.
Experience in Spark, with the ability to write, maintain, and understand declarative Spark code and to write and maintain clean SQL.
Knowledge of the finer operational aspects of Spark, such as setting optimal cluster size, executor memory, and number of executors; experience in Azure Synapse/Databricks (or similar) is a bonus.
Build and deploy end-to-end ML pipelines and GenAI solutions for both batch and real-time use cases, ensuring scalability and reliability.
Apply MLOps and AIOps best practices, including model versioning, prompt versioning, experiment tracking, environment management, and an understanding of CI/CD automation on cloud platforms (AWS, Azure, GCP).
Manage environment dependencies and containerization (Docker/Kubernetes) for reproducible, scalable deployments on any of the cloud platforms (AWS, Azure, GCP, etc.).
Exposure to ETL pipelining frameworks such as Azure Data Factory, Apache Airflow, etc.
Expertise in designing and implementing Generative AI pipelines (e.g., RAG systems, domain/task-specific fine-tuning).
Experience in managing codebases in Git and creating release pipelines on CI/CD frameworks like Azure DevOps, Jenkins, etc., along with creating and managing Python deployment scripts.
Responsibilities
Work with key stakeholders to understand business needs and formulate them into manageable problems. Develop new ML or GenAI based models and lead the evaluation of various GenAI solutions.