Job responsibilities
- Provides recommendations and insights on data management, governance procedures, and the intricacies of acquiring, maintaining, validating, and utilizing data
- Designs and delivers trusted data platform solutions for data collection, storage, access, and analytics in a secure, stable, and scalable way
- Generates data models for their team using firmwide tooling, linear algebra, statistics, and geometrical algorithms
- Evaluates and reports on access control processes to determine the effectiveness of data asset security with minimal supervision
- Adds to team culture of diversity, equity, inclusion, and respect
Required qualifications, capabilities, and skills
- Formal training or certification in data engineering concepts and 5+ years of applied experience
- Strong experience with both relational and NoSQL databases
- Experience and proficiency across the data lifecycle
- Expertise in designing and implementing scalable data architectures and pipelines
- Strong proficiency in Python and in distributed data processing frameworks such as PySpark
- In-depth knowledge of data warehousing solutions and ETL processes
- Ability to work collaboratively with cross-functional teams, including data scientists, analysts, and business stakeholders
- Experience with data governance and ensuring data quality and integrity
- Excellent communication skills, with the ability to convey complex technical concepts to non-technical audiences
- Experience with version control systems like Git and CI/CD pipelines for data engineering workflows
Preferred qualifications, capabilities, and skills
- Experience with Trino / Presto
- Proficiency in big data technologies, with a strong focus on performance optimization best practices
- Solid understanding of the Apache Parquet file format