The ideal candidate will blend strong technical expertise in data solutioning with sharp business acumen, supporting data analysts, data scientists, and business partners. Key responsibilities include:
- Design and develop scalable data ingestion pipelines from multiple structured and unstructured sources such as Azure Data Lake, SQL Server, Kusto, and flat files.
- Implement data orchestration using Spark, PySpark, and Python.
- Implement ETL jobs that optimize data flow and reliability.
- Model and optimize data architecture by designing logical and physical data models supporting near real-time analytics.
- Perform data profiling and gap analysis to support migration from legacy BI platforms to next-generation platforms such as Microsoft Fabric and Keystone-based data sourcing.
- Ensure models support future scalability, privacy, and data lifecycle governance.
- Adhere to Microsoft’s SFI guidelines, data residency policies, and data privacy regulations.
- Ensure data security, privacy, and compliance by implementing data masking and encryption at the required levels.
- Collaborate with Engineering teams to ensure timely patches and system updates, and to incorporate audit trails and data lineage tracking mechanisms.
- Define and implement robust data validation, anomaly detection, and reconciliation logic, and monitor data pipeline performance.
- Enable self-service BI and analytics by partnering with SMEs and business stakeholders, using Power BI, Power Platform, and Azure Synapse.
- Create reusable datasets, certified data models, and intuitive visualizations that align with business priorities.
- Collaborate with Engineering and business stakeholders by translating business requirements into technical specifications and scalable data solutions.