
Microsoft Data Engineer II 
Taiwan, Taoyuan City 
395551069

17.04.2025

Microsoft is an excellent place for you to grow your career as a Data Engineer II.

Microsoft Digital's mission is to power, protect, and transform the employee experience at Microsoft around the world. Come build community, explore your passions, do your best work, and be part of the team within Microsoft Digital (MSD). MSD is the team that innovates, creates, and delivers the vision for Microsoft's employee experience, human resources, corporate and legal affairs, and global real estate products; it runs Microsoft's internal network and infrastructure, and builds campus modernization and hybrid solutions. You will work with the latest technologies and focus on empowering Microsoft employees with the tools and services that define both the physical and digital future of work.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Required Qualifications:

  • Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 2+ years experience in business analytics, data science, data modeling or data engineering work
    • OR Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field
    • OR equivalent experience.
  • 2+ years of experience using Python / C#.
  • 2+ years of experience with Spark platforms such as Azure Synapse, Fabric, or Databricks.
  • Experience with secure cloud authentication and secure data management.
  • Experience with metadata management, data lineage, and principles of data governance.


Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

    • Citizenship & Citizenship Verification: This position requires verification of U.S. citizenship due to citizenship-based legal restrictions. Specifically, this position supports United States federal, state, and/or local United States government agency customers and is subject to certain citizenship-based restrictions where required or permitted by applicable law. To meet this legal requirement, citizenship will be verified via a valid passport, other approved documents, or a verified U.S. government clearance.
    • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

  • Experience with vector databases, particularly LanceDB and Pinecone.
  • Experience with graph databases and query frameworks such as Neo4j or Apache TinkerPop (Gremlin).
  • Experience with document databases including Azure Cosmos DB, AWS DynamoDB, and AWS DocumentDB.
  • Experience in Agile development practices and Continuous Integration/Continuous Deployment (CI/CD).
  • Experience with machine learning, artificial intelligence, and data science.
  • 2+ years of experience using the Azure cloud platform.
  • Demonstrated ability to communicate clearly and effectively, both orally and in writing, in order to share information and knowledge with a diverse group of colleagues.

Data Engineering IC3 - The typical base pay range for this role across the U.S. is USD $98,300 - $193,200 per year. A different range applies to specific work locations within the San Francisco Bay Area and the New York City metropolitan area; the base pay range for this role in those locations is USD $127,200 - $208,800 per year. Certain roles may be eligible for benefits and other compensation. Microsoft will accept applications for the role until April 21, 2025.


Responsibilities
  • Follows data modeling and data handling procedures to maintain compliance with applicable laws and policies across assigned workstreams. Works with others to tag data based on categorization (e.g., personally identifiable information [PII], pseudo-anonymized, financial). Helps others document data type, classifications, and lineage to ensure traceability. Governs accessibility of data within assigned data pipelines and/or data model(s). Contributes to the relevant data glossary to document the origin, usage, and format of data for each program.
  • Applies standard modification techniques and operations (e.g., inserting, aggregating, joining) to transform raw data into a form that is compatible with downstream data sources, databases, and formats. Uses software, query languages, and computing tools to transform raw data from assigned pipelines, under direction from others. Assesses data quality and completeness using queries, data wrangling, and basic statistical techniques. Helps others merge data into distributed systems, products, or tools for further processing.
  • With guidance, independently implements basic code to extract raw data from identified upstream sources using common query languages or standard tools, and contributes to checks that support data accuracy, validity, and reliability across a data pipeline component. Participates in code reviews and provides constructive feedback to team members. Uses knowledge of one or more use cases to implement basic orchestration techniques that automate data extraction logic from one source to another. Uses basic data protocols and reduction techniques to validate the quality of extracted data across specific parts of the data pipeline, consistent with the Service Level Agreement. Uses existing approaches and tools to record, track, and maintain data source control and versioning. Applies knowledge of data to validate that the correct data is ingested and that the data is applied accurately across multiple areas of work.
  • Designs and maintains assigned data tools that are used to transform, manage, and access data. Writes efficient code to test and validate storage and availability of data platforms, and implements sustainable design patterns to make data platforms more usable and robust to failure and change. Works with others to analyze relevant data sources that allow others to develop insights into data architecture designs or solution fixes.
  • Supports collaborations with appropriate stakeholders and records and documents data requirements. Evaluates the project plan to understand data costs, access, usage, use cases, and availability for business or customer scenarios related to a product feature. Works with advisement to explore the feasibility of data needs and finds alternative options if requirements cannot be met. Supports negotiation of agreements with partners and system owners to understand project delivery, data ownership between both parties, and the shape and cadence of data extraction for an assigned feature. Proposes project-relevant data metrics or measures to assess data across varied service lines.
  • Contributes to the appropriate data model for the project and drafts design specification documents to model the flow and storage of data for specific parts of a data pipeline. Works with senior engineers and appropriate stakeholders (e.g., Data Science Specialists) to contribute basic improvements to design specifications, data models, or data schemas, so that data is easy to connect and ingest, has a clear lineage, and is responsive to work with. Demonstrates knowledge of the tradeoff between analytical requirements and compute/storage consumption for data, and begins to anticipate issues in the cadence of data extraction, transformation, and loading into multiple, related data products or datasets in cloud and local environments. Demonstrates an understanding of costs associated with data that are used to assess the total cost of ownership (TCO).
  • Performs root cause analysis in response to detected problems/anomalies to identify the reason for alerts, and implements basic solutions that minimize points of failure. Implements and monitors improvements across assigned product features to retain data quality and optimal performance (e.g., latency, cost) throughout the data lifecycle. Uses cost analysis to suggest solutions that reduce budgetary risks. Works with others to document the problem and solution through postmortem reports and shares insights with the team or leadership. Provides data-based insights into the health of data products owned by the team according to service level agreements (SLAs) across assigned features.
  • Follows existing documentation to implement performance monitoring protocols across a data pipeline. Builds basic visualizations and smart aggregations (e.g., histograms) to monitor issues with data quality and pipeline health that could threaten pipeline performance. Contributes to troubleshooting guides (TSGs) and operating procedures for reviewing, addressing, and/or fixing basic problems/anomalies flagged by automated testing. Contributes to the support and monitoring of platforms.
  • Embody our culture and values.
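As a rough illustration of the transformation duties above (applying inserting, aggregating, and joining operations to shape raw data for downstream use), here is a minimal plain-Python sketch; the datasets, field names, and values are invented for the example and are not from any real pipeline:

```python
from collections import defaultdict

# Illustrative "raw" extracts only: toy orders and customers records.
orders = [
    {"order_id": 1, "customer_id": "a", "amount": 10.0},
    {"order_id": 2, "customer_id": "a", "amount": 5.0},
    {"order_id": 3, "customer_id": "b", "amount": 7.5},
]
customers = [
    {"customer_id": "a", "region": "east"},
    {"customer_id": "b", "region": "west"},
]

# Join: enrich each order with its customer's region (a simple hash join).
region_of = {c["customer_id"]: c["region"] for c in customers}
joined = [{**o, "region": region_of[o["customer_id"]]} for o in orders]

# Aggregate: total order amount per region, a downstream-friendly shape.
totals = defaultdict(float)
for row in joined:
    totals[row["region"]] += row["amount"]

print(dict(totals))  # {'east': 15.0, 'west': 7.5}
```

In a Spark environment such as Azure Synapse or Databricks, the same join-then-aggregate shape would typically be expressed with DataFrame operations rather than plain dictionaries.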
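The extraction and data-quality duties can be sketched in the same spirit as a basic post-extraction validation pass; the required fields, thresholds, and sample batch below are hypothetical:

```python
# Hypothetical sketch of post-extraction data-quality checks.
def validate_batch(rows, required_fields, min_rows=1):
    """Return a list of human-readable problems; an empty list means the batch passes."""
    problems = []
    if len(rows) < min_rows:
        problems.append(f"expected at least {min_rows} rows, got {len(rows)}")
    for i, row in enumerate(rows):
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            problems.append(f"row {i} missing fields: {missing}")
    return problems

batch = [{"id": 1, "ts": "2025-04-17"}, {"id": 2, "ts": None}]
for problem in validate_batch(batch, required_fields=["id", "ts"]):
    print(problem)  # reports that row 1 is missing its "ts" field
```

Checks like these are usually wired into the pipeline so a failing batch is quarantined before it reaches downstream consumers, consistent with the Service Level Agreement.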
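Finally, the monitoring duty's "smart aggregations (e.g., histograms)" can be approximated with a tiny text histogram over latency samples; the latency values and bucket edges here are made up, and a real pipeline would pull them from telemetry:

```python
from collections import Counter

# Made-up latency samples and bucket edges for illustration.
latencies_ms = [12, 48, 95, 102, 110, 480, 35, 61, 990]
edges = [50, 100, 500, 1000]  # upper bound of each bucket, in milliseconds

def bucket(value, edges):
    """Return the label of the first bucket whose upper bound exceeds value."""
    for e in edges:
        if value < e:
            return f"<{e}ms"
    return f">={edges[-1]}ms"

# Aggregate samples into buckets and render a quick text histogram.
hist = Counter(bucket(v, edges) for v in latencies_ms)
for label in [f"<{e}ms" for e in edges]:
    print(f"{label:>8} | {'#' * hist[label]}")
```

A sudden shift of mass into the slowest buckets is the kind of pipeline-health signal this sort of aggregation is meant to surface.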