Data Engineer II at Microsoft
North Carolina, United States
Full Time


Start Date

Immediate

Expiry Date

13 Jul, 25

Salary

USD $98,300

Posted On

13 Apr, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Artificial Intelligence, Data Modeling, Computer Engineering, Business Analytics, Data Science, Neo4j, Graph Databases, Data Governance, Metadata Management, Computer Science, Machine Learning

Industry

Information Technology/IT

Description

Does pioneering new and innovative ways to reimagine and transform end-user productivity across the breadth and depth of Microsoft’s global workforce sound exciting to you? Are you passionate about the future of work, driving innovation, and showcasing an employee experience blueprint that inspires customers and partners to navigate their digital transformation? If so, the Microsoft Digital (MSD) team is an excellent place to grow your career as a Data Engineer II.
Microsoft Digital (MSD)’s mission is to power, protect, and transform the employee experience at Microsoft around the world. Come build community, explore your passions, and do your best work as part of the MSD team. MSD innovates, creates, and delivers the vision for Microsoft’s employee experience, human resources, corporate and legal affairs, and global real estate products; runs Microsoft’s internal network and infrastructure; and builds campus modernization and hybrid solutions. You will leverage the latest technologies and focus on empowering Microsoft employees with the tools and services that define both the physical and digital future of work.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

REQUIRED QUALIFICATIONS:

  • Bachelor’s Degree in Computer Science, Math, Software Engineering, Computer Engineering, or a related field AND 2+ years of experience in business analytics, data science, data modeling, or data engineering work
  • OR Master’s Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field
  • OR equivalent experience.
  • 2+ years of experience using Python or C#.
  • 2+ years of experience with Spark platforms such as Azure Synapse, Fabric, or Databricks.
  • Experience with secure cloud authentication and secure data management.
  • Experience with metadata management, data lineage, and principles of data governance.

OTHER REQUIREMENTS:

Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

Citizenship & Citizenship Verification: This position requires verification of U.S. citizenship due to citizenship-based legal restrictions. Specifically, this position supports United States federal, state, and/or local government agency customers and is subject to certain citizenship-based restrictions where required or permitted by applicable law. To meet this legal requirement, citizenship will be verified via a valid passport, other approved documents, or a verified U.S. government clearance.

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

PREFERRED QUALIFICATIONS:

  • Experience with vector databases, particularly LanceDB and Pinecone.
  • Experience with graph databases and graph query languages such as Neo4j or Apache Gremlin.
  • Experience with document databases, including Azure Cosmos DB, Amazon DynamoDB, and Amazon DocumentDB.
  • Experience in Agile development practices and Continuous Integration/Continuous Deployment (CI/CD).
  • Experience with machine learning, artificial intelligence, and data science.
  • 2+ years of experience using the Azure cloud platform.
  • Demonstrated ability to communicate clearly and effectively, both orally and in writing, with individuals and groups, in order to share information and knowledge with a diverse group of colleagues.
Data Engineering IC3 - The typical base pay range for this role across the U.S. is USD $98,300 - $193,200 per year. A different range applies in specific work locations (the San Francisco Bay Area and the New York City metropolitan area), where the base pay range for this role is USD $127,200 - $208,800 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay
Microsoft will accept applications for the role until April 21, 2025.
Responsibilities
  • Follows data modeling and data handling procedures to maintain compliance with applicable laws and policies across assigned workstreams. Works with others to tag data based on categorization (e.g., personally identifiable information [PII], pseudo-anonymized, financial). Helps others document data type, classifications, and lineage to ensure traceability. Governs accessibility of data within assigned data pipelines and/or data model(s). Contributes to the relevant data glossary to document the origin, usage, and format of data for each program.
  • Applies standard modification techniques and operations (e.g., inserting, aggregating, joining) to transform raw data into a form that is compatible with downstream data sources, databases, and formats. Uses software, query languages, and computing tools to transform raw data from assigned pipelines, under direction from others. Assesses data quality and completeness using queries, data wrangling, and basic statistical techniques. Helps others merge data into distributed systems, products, or tools for further processing.
  • With guidance, implements basic code to extract raw data from identified upstream sources using common query languages or standard tools, and contributes to checks that support data accuracy, validity, and reliability across a data pipeline component. Participates in code reviews and provides constructive feedback to team members. Uses knowledge of one or more use cases to implement basic orchestration techniques that automate data extraction logic from one source to another. Uses basic data protocols and reduction techniques to validate the quality of extracted data across specific parts of the data pipeline, consistent with the Service Level Agreement. Uses existing approaches and tools to record, track, and maintain data source control and versioning. Applies knowledge of data to validate that the correct data is ingested and that the data is applied accurately across multiple areas of work.
  • Designs and maintains assigned data tools that are used to transform, manage, and access data. Writes efficient code to test and validate storage and availability of data platforms and implements sustainable design patterns to make data platforms more usable and robust to failure and change. Works with others to analyze relevant data sources that allow others to develop insights into data architecture designs or solution fixes.
  • Supports collaborations with appropriate stakeholders and records and documents data requirements. Evaluates project plan to understand data costs, access, usage, use cases, and availability for business or customer scenarios related to a product feature. Works with advisement to explore the feasibility of data needs and finds alternative options if requirements cannot be met. Supports negotiation of agreements with partners and system owners to understand project delivery, data ownership between both parties, and the shape and cadence of data extraction for an assigned feature. Proposes project-relevant data metrics or measures to assess data across varied service lines.
  • Contributes to the appropriate data model for the project and drafts design specification documents to model the flow and storage of data for specific parts of a data pipeline. Works with senior engineers and appropriate stakeholders (e.g., Data Science Specialists) to contribute basic improvements to design specifications, data models, or data schemas, so that data is easy to connect and ingest, has a clear lineage, and is responsive to work with. Demonstrates knowledge of the tradeoff between analytical requirements and compute/storage consumption for data and begins to anticipate issues in the cadence of data extraction, transformation, and loading into multiple, related data products or datasets in cloud and local environments. Demonstrates an understanding of the costs associated with data that are used to assess the total cost of ownership (TCO).
  • Performs root cause analysis in response to detected problems/anomalies to identify the reason for alerts and implement basic solutions that minimize points of failure. Implements and monitors improvements across assigned product feature to retain data quality and optimal performance (e.g., latency, cost) throughout the data lifecycle. Uses cost analysis to suggest solutions that reduce budgetary risks. Works with others to document the problem and solution through postmortem reports and shares insights with team or leadership. Provides data-based insights into the health of data products owned by the team according to service level agreements (SLAs) across assigned features.
  • Follows existing documentation to implement performance monitoring protocols across a data pipeline. Builds basic visualizations and smart aggregations (e.g., histograms) to monitor issues with data quality and pipeline health that could threaten pipeline performance. Contributes to troubleshooting guides (TSGs) and operating procedures for reviewing, addressing, and/or fixing basic problems/anomalies flagged by automated testing. Contributes to the support and monitoring of platforms.
  • Embodies our culture and values.
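As a hedged illustration of the transformation responsibility above (applying operations such as joining and aggregating to raw data), here is a minimal plain-Python sketch. The record shapes and sample values are invented for the example; a real pipeline at this level would express the same step in Spark or SQL:

```python
from collections import defaultdict

# Hypothetical raw records from two upstream sources (invented sample data).
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 120.0},
    {"order_id": 2, "customer_id": "c2", "amount": 75.5},
    {"order_id": 3, "customer_id": "c1", "amount": 30.0},
]
customers = {"c1": "Contoso", "c2": "Fabrikam"}

def join_and_aggregate(orders, customers):
    """Join orders to customer names, then aggregate total spend per customer."""
    totals = defaultdict(float)
    for row in orders:
        # Join step: resolve the customer_id against the reference table.
        name = customers.get(row["customer_id"], "unknown")
        # Aggregation step: sum amounts per resolved name.
        totals[name] += row["amount"]
    return dict(totals)

print(join_and_aggregate(orders, customers))
# {'Contoso': 150.0, 'Fabrikam': 75.5}
```

The same join-then-aggregate shape maps directly onto a Spark `join` followed by `groupBy().sum()` in the platforms the qualifications list.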
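The extraction bullet above mentions checks that support data accuracy, validity, and reliability. A minimal sketch of such a validation gate, with invented field names and no real upstream source, might look like:

```python
def extract_and_validate(rows, required_fields=("id", "value")):
    """Simulated extraction step: keep rows that pass basic validity checks,
    and collect rejects so data quality can be tracked against an SLA."""
    valid, rejected = [], []
    for row in rows:
        # A row is valid only if every required field is present and non-null.
        if all(row.get(f) is not None for f in required_fields):
            valid.append(row)
        else:
            rejected.append(row)
    return valid, rejected

raw = [{"id": 1, "value": 10}, {"id": 2, "value": None}, {"id": 3, "value": 7}]
valid, rejected = extract_and_validate(raw)
print(len(valid), len(rejected))
# 2 1
```

In practice the reject count would feed a data-quality metric rather than being discarded, which is how an SLA on pipeline completeness becomes measurable.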
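The monitoring bullet above calls for smart aggregations such as histograms to watch pipeline health. A hedged sketch of bucketing pipeline latencies (the bucket width and sample values are assumptions for illustration) could be:

```python
from collections import Counter

def latency_histogram(latencies_ms, bucket_ms=100):
    """Bucket latencies into fixed-width bins; a histogram that skews toward
    higher buckets is a cheap early signal of degrading pipeline performance."""
    # Each latency is floored to the start of its bucket, e.g. 450 -> 400.
    return Counter((ms // bucket_ms) * bucket_ms for ms in latencies_ms)

sample = [35, 80, 120, 450, 90, 130]
print(dict(latency_histogram(sample)))
# {0: 3, 100: 2, 400: 1}
```

A monitoring job would emit these bucket counts each interval and alert when the distribution shifts, rather than alerting on individual slow requests.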