
Microsoft Senior Data Engineer
India, Telangana, Hyderabad
Job ID: 532762751
Posted: 10.09.2024
Qualifications
  • Required: Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 8+ years of experience in business analytics, data science, software development, data modeling, or data engineering work
  • OR Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 5+ years of experience in business analytics, data science, software development, or data engineering work
  • OR equivalent experience.

Data Requirements and Modeling

  • Collaborates with appropriate stakeholders across teams, assesses and conducts feature estimation, and escalates concerns around data requirements.
  • Assesses data costs, access, usage, use cases, dependencies across products, and availability for business or customer scenarios related to one or more product features. Informs clients on feasibility of data needs and suggests transformations or strategies to acquire data if requirements cannot be met.
  • Negotiates agreements with partners and system owners to align on project delivery, data ownership between both parties, and the shape and cadence of data extraction for one or more features.
  • Proposes new data metrics or measures to assess data across varied service lines.
  • Leads the design of a data model that is appropriate for the project and prepares design specification documents to model the flow and storage of data for a data pipeline.
  • Designs assigned components of the data model for a functional area of a project. Partners with stakeholders (e.g., Data Science Specialists) to make iterative improvements to design specifications, data models, or data schemas, so that data is easy to connect, ingest, has a clear lineage, and is responsive to work with.
  • Considers tradeoffs between analytical requirements and compute/storage consumption, and anticipates costs influenced by the cadence of data extraction, transformation, and loading into moderately complex data products or datasets in cloud and local environments.
  • Demonstrates an advanced understanding of costs associated with data that are used to assess the total cost of ownership (TCO).
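The extraction-cadence cost tradeoff described above can be illustrated with a rough back-of-the-envelope estimator. All rates and values below are hypothetical placeholders for illustration, not actual cloud pricing:

```python
# Rough monthly-cost estimator for a data pipeline, showing how extraction
# cadence drives compute and steady-state storage consumption.
# All rates below are hypothetical placeholders, not real cloud prices.

def estimate_monthly_cost(gb_per_run: float,
                          runs_per_day: int,
                          storage_rate_per_gb: float = 0.02,
                          compute_rate_per_run: float = 0.50,
                          retention_days: int = 30) -> float:
    """Return an approximate monthly cost (USD) for an ETL pipeline."""
    runs_per_month = runs_per_day * 30
    compute_cost = runs_per_month * compute_rate_per_run
    # With a fixed retention window, steady-state stored volume scales
    # with cadence as well as with per-run data size:
    stored_gb = gb_per_run * runs_per_day * retention_days
    storage_cost = stored_gb * storage_rate_per_gb
    return round(compute_cost + storage_cost, 2)

# Moving from hourly to daily extraction cuts both cost components:
hourly = estimate_monthly_cost(gb_per_run=5, runs_per_day=24)  # 432.0
daily = estimate_monthly_cost(gb_per_run=5, runs_per_day=1)    # 18.0
```

Sketches like this support the feasibility and TCO conversations with clients before any pipeline is built.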

Engineering Fundamentals

  • Performs root cause analysis in response to detected problems/anomalies to identify the reason for alerts and implement solutions that minimize points of failure.
  • Implements and monitors self-healing processes across multiple product features to prevent issues from recurring and to maintain data quality and optimal performance (e.g., latency, cost) throughout the data lifecycle.
  • Uses cost analysis to drive product/program level solutions that reduce budgetary risks. Documents the problem and solutions through postmortem reports and shares insights with team and the customer. Provides data-based insights into the health of data products owned by the team according to service level agreements (SLAs) across multiple features.
  • Writes code to implement performance monitoring protocols across data pipelines. Builds visualizations and smart aggregations (e.g., advanced statistics) to monitor issues with patterns in data quality and pipeline health that could threaten pipeline performance.
  • Develops and updates troubleshooting guides (TSGs) and operating procedures for reviewing, addressing, and/or fixing advanced problems/anomalies flagged by automated testing. Supports and monitors platforms.
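The performance-monitoring and smart-aggregation work described above might look something like the following minimal sketch, which flags latency anomalies using a trailing-window z-score (the window size and threshold are illustrative assumptions):

```python
import statistics

def flag_latency_anomalies(latencies_ms, window=5, threshold=3.0):
    """Return indices whose latency deviates more than `threshold`
    standard deviations from the trailing `window` of observations."""
    anomalies = []
    for i in range(window, len(latencies_ms)):
        trailing = latencies_ms[i - window:i]
        mean = statistics.mean(trailing)
        stdev = statistics.stdev(trailing)
        # Skip flat windows (stdev == 0) to avoid division by zero.
        if stdev > 0 and abs(latencies_ms[i] - mean) / stdev > threshold:
            anomalies.append(i)
    return anomalies

# A sudden latency spike stands out against the stable trailing window:
flag_latency_anomalies([100, 102, 98, 101, 99, 100, 500])  # -> [6]
```

In practice such checks would feed dashboards and alerting tied to the pipeline's SLAs rather than run as standalone scripts.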

Compliance

  • Anticipates the need for data governance and designs data modeling and data handling procedures, with direct support and partnership with Corporate, External, and Legal Affairs (CELA), to ensure compliance with applicable laws and policies across all aspects of the data pipeline.
  • Tags data based on categorization (e.g., personally identifiable information [PII], pseudonymized, financial).
  • Documents data type, classifications, and lineage to ensure traceability.
  • Governs accessibility of data within assigned data pipelines.
  • Provides guidance on contributions to the data glossary to document the origin, usage, and format of data for each program.
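A minimal sketch of the tagging and traceability work above, using rule-based column classification; the patterns and category names here are illustrative assumptions, not an actual Microsoft or CELA taxonomy:

```python
# Rule-based data tagging: map column names to a classification so the
# result can feed lineage documentation and access governance.
# Pattern lists and labels below are hypothetical examples.

PII_PATTERNS = ("email", "phone", "ssn", "name", "address")
FINANCIAL_PATTERNS = ("salary", "invoice", "card", "account")

def classify_column(column_name: str) -> str:
    """Classify a single column by substring match on its name."""
    name = column_name.lower()
    if any(p in name for p in PII_PATTERNS):
        return "PII"
    if any(p in name for p in FINANCIAL_PATTERNS):
        return "financial"
    return "general"

def tag_schema(columns):
    """Return {column: classification} for traceability documentation."""
    return {c: classify_column(c) for c in columns}

tags = tag_schema(["user_email", "invoice_total", "page_views"])
# tags == {"user_email": "PII", "invoice_total": "financial",
#          "page_views": "general"}
```

Real deployments would typically rely on a governance platform's classifiers and content inspection rather than name matching alone, but the output shape (column, classification, lineage) is the same.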

Data Management and Transformation

  • Identifies opportunities to leverage and contribute to the development of data tools that are used to transform, manage, and access data. Writes, implements, and validates code to test storage and availability of data platforms and drives the implementation of sustainable design patterns to make data platform