As a motivated individual with a strong background in data platforms, ETL pipeline setup, and analysis, you will lead data catalog initiatives to establish and maintain a cohesive, effective data cataloging system that is integral to the broader data governance strategy and objectives.
This position requires excellent communication skills, a keen eye for detail, and the ability to work effectively in a fast-paced environment.
Job responsibilities
- Oversee the coordination of activities concerning metadata identification, standardization, and updates for the relevant data lakehouse tables.
- Develop and publish metrics for metadata completeness and data catalog usage, fostering its wider adoption.
- Serve as a subject matter expert for the data catalog, creating training materials and educating users as needed
- Execute software solutions, design, development, and technical troubleshooting with the ability to think beyond routine or conventional approaches to build solutions or break down technical problems
- Create secure and high-quality production code and maintain algorithms that run synchronously with appropriate systems
- Produce architecture and design artifacts for complex applications while being accountable for ensuring design constraints are met by software code development
- Gather, analyze, synthesize, and develop visualizations and reporting from large, diverse data sets in service of continuous improvement of software applications and systems
- Proactively identify hidden problems and patterns in data and use these insights to drive improvements to coding hygiene and system architecture
- Contribute to software engineering communities of practice and events that explore new and emerging technologies
- Add to team culture of diversity, equity, inclusion, and respect
Required qualifications, capabilities, and skills
- Formal training or certification on software engineering concepts and 5+ years of applied experience in ETL/data pipelines and data lake platforms like Databricks, Spark/Hadoop, and Snowflake.
- Strong understanding of data management principles, governance frameworks and best practices.
- Experience with data cataloging tools and systems (e.g., Collibra, Alation, Informatica, Atlan, Databricks Unity Catalog).
- Experience in developing, debugging, and maintaining code in a large enterprise with one or more modern programming languages (e.g., Java/Spring Boot, Python/Flask).
- Expert in SQL and data profiling tools.
- Solid understanding of SDLC practices, Application Resiliency, Security, Agile methodologies, and CI/CD.
- Knowledge of integration technologies (e.g., GraphQL, REST).
- Excellent analytical and problem-solving skills.
- Strong attention to detail and organizational skills.
- Effective communication and interpersonal skills.
Preferred qualifications, capabilities, and skills
- Practical cloud-native experience, with exposure to designing and deploying applications on AWS
- Hands-on experience with data reporting and BI tools such as Tableau and Alteryx