Job responsibilities
- Provides recommendations and insight on data management, governance procedures, and intricacies applicable to the acquisition, maintenance, validation, and utilization of data
- Designs and delivers trusted data collection, storage, access, and analytics data platform solutions in a secure, stable, and scalable way
- Defines database backup, recovery, and archiving strategy
- Generates advanced data models for one or more teams using firm-wide tooling, linear algebra, statistics, and geometrical algorithms
- Approves data analysis tools and processes
- Creates functional and technical documentation supporting best practices
- Advises junior engineers and technologists
- Evaluates and reports on access control processes to determine effectiveness of data asset security
- Adds to team culture of diversity, equity, inclusion, and respect
Required qualifications, capabilities, and skills
- Formal training or certification in data engineering concepts and 5+ years of applied experience
- Demonstrated leadership abilities, including leading and mentoring a team of data engineers
- Expertise in designing and implementing scalable data architectures and pipelines
- Strong proficiency in programming languages such as Python and PySpark
- Experience with the AWS cloud platform, including its data services
- In-depth knowledge of data warehousing solutions and ETL processes
- Ability to work collaboratively with cross-functional teams, including data scientists, analysts, and business stakeholders
- Experience with data governance and ensuring data quality and integrity
- Familiarity with machine learning frameworks and integrating them into data pipelines
- Excellent communication skills, with the ability to convey complex technical concepts to non-technical audiences
- Experience with version control systems like Git and CI/CD pipelines for data engineering workflows
Preferred qualifications, capabilities, and skills
- Experience with Databricks
- Proficiency in big data technologies, with a strong focus on performance optimization using best practices
- Solid understanding of the Parquet file format