Job responsibilities
- Generates data models for their team using firmwide tooling, linear algebra, statistics, and geometrical algorithms
- Delivers data collection, storage, access, and analytics data platform solutions in a secure, stable, and scalable way
- Implements database back-up, recovery, and archiving strategy
- Evaluates and reports on access control processes to determine effectiveness of data asset security with minimal supervision
- Adds to team culture of diversity, equity, inclusion, and respect
Required qualifications, capabilities, and skills
- At least 5 years of recent hands-on professional experience (actively coding) working as a data engineer
- Hands-on experience with Apache Kafka and Spark, and experience using Databricks for big data analytics and processing
- Hands-on experience in system design, application development, testing, and operational stability, particularly data pipelines for moving and transforming data
- Experience and proficiency across the data lifecycle, and overall knowledge of the Software Development Life Cycle
- Experience with SQL and understanding of NoSQL databases and their niche in the marketplace
- Experience designing and implementing data pipelines in a cloud environment
- Solid understanding of agile methodologies such as CI/CD, application resiliency, and security
- Hands-on experience building applications using AWS services such as Lambda, EMR, and EKS, as well as REST APIs
- Strong experience migrating and developing data solutions in the AWS cloud, including AWS services and Apache Airflow
- Experience with database back-up, recovery, and archiving strategy
- Proficiency in linear algebra, statistics, and geometrical algorithms
Preferred qualifications, capabilities, and skills
- 1+ years of experience building and implementing data pipelines using Databricks features such as Unity Catalog, Databricks Workflows, and Delta Live Tables
- AWS Data Platform experience
- Experience with data orchestration tools such as Airflow
- Familiarity with data governance and metadata management.
- Spark programming and advanced SQL (e.g., joins and aggregations)