What you'll be doing
- Develop and maintain scalable data pipelines to ensure data integrity, monitoring, and lineage, handling large-scale daily event processing in the data lake and data warehouse.
- Design and implement reusable systems for extracting, transforming, and loading data from diverse sources into centralized data lakes and reporting warehouses.
- Continuously improve data quality, workflow reliability, and scalability by exploring and implementing pioneering technologies and solutions.
- Collaborate with multiple teams and stakeholders to drive key business initiatives and deliver data-driven solutions.
What we're looking for
- Bachelor’s degree, or foreign equivalent degree, in Computer Science, Engineering, or a related field, and three (3) years of related work experience.
- Must have two (2) years of experience with time-series data storage solutions such as Druid or Imply.
- Must have three (3) years of work experience with/in:
- Using at least three of the following modern data stack tools for data ingestion, transformation, quality and orchestration: Airbyte, DBT, Soda, BigEye, or Airflow;
- BigQuery or Snowflake cloud data warehouses;
- Building high-performance, distributed systems using Kafka, Spark or Flink; and
- End-to-end solutions involving service integration, data preprocessing, machine learning models, and data presentation.
- 40 hrs/week, Mon-Fri, 8:30 a.m. - 5:30 p.m.
- Telecommuting permitted on a hybrid schedule as determined by the employer.
Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
This position requires sufficient knowledge of English for professional verbal and written exchanges, since the duties involve frequent and regular communication with colleagues and partners located worldwide whose common language is English.
Gross pay salary range: $158,530 - $168,530 USD