What you'll be doing
- Design and optimize large-scale, complex next-generation data pipeline systems for efficiency, reliability, and cost-effectiveness.
- Work closely with product, engineering, data science, and analytics teams to understand their use cases and help design and implement data initiatives.
- Build highly efficient end-to-end data pipelines that transform complex data into a form that is easy to analyze, ensuring data accuracy and reliability.
- Implement data governance and modeling practices and optimize systems for speed and scalability.
- Engage in on-call rotations, applying your deep understanding of infrastructure to quickly identify and resolve complex issues.
- Work collaboratively with cross-functional teams, including ML engineers, data scientists, and analysts, to understand business and data needs and translate them into data-driven solutions that focus on scalability and cost-efficiency.
- Contribute to data optimization projects across the company, utilizing your expertise to drive significant cost savings without sacrificing service performance.
What we're looking for
- A background in Computer Science or Engineering, or equivalent data engineering experience with complex data systems.
- Proficiency in cloud platforms such as GCP, AWS, or Azure, and expertise in databases, data storage concepts, and data infrastructure tools like Hadoop or Kafka.
- Experience with batch, streaming, and real-time data processing using frameworks such as Spark and Flink.
- A deep understanding of data warehousing principles, encompassing ETL/ELT, data modeling, and patterns such as slowly changing dimensions (SCD), change data capture (CDC), snapshots, and partitioning.
- Proficiency in programming languages such as Python, Go, or Java.
- Excellent problem-solving skills and the ability to work in a fast-paced, dynamic environment.
- Outstanding interpersonal skills and the ability to work collaboratively with global teams across time zones.
You might also have
- A mindset that appreciates engineering as a collective effort.
- Hands-on experience with Protobuf, Kafka, Flink, BigQuery, Druid, or Kubeflow.
- Experience with Terraform, ArgoCD, GitHub Actions, and similar CI/CD tools.
- Operational experience in the ML or data domain and its associated technologies is a huge plus.
Additional information
International relocation support is not available for this position.
Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
This position requires sufficient knowledge of English to conduct professional verbal and written exchanges, since performing the duties of this role involves frequent and regular communication with colleagues and partners located worldwide whose common language is English.
Gross pay salary
$135,800–$258,600 USD