The Difference You Will Make:
As part of our team, you'll build the essential AI/ML data foundations powering all AI/ML use cases across Airbnb, spanning Trust & Safety, Payments, Customer Support, Marketing, Search Ranking, and more. You'll define and implement industry-leading best practices, pipelines, and tools to streamline data creation and consumption, ensuring efficiency, consistency, and compliance. Your contributions will significantly accelerate AI/ML innovation, enabling rapid development and deployment of high-quality, impactful AI/ML solutions company-wide. Additionally, you'll play a pivotal role in shaping our cutting-edge Generative AI infrastructure.
A Typical Day:
- Design, build, automate, and maintain robust, scalable data pipelines using SparkSQL, Scala, and Airflow.
- Develop and optimize data models ensuring high-quality, consistent, and accurate data to support broad AI/ML product feature decisions.
- Collaborate closely with peer ML Infra teams to deliver automated data solutions driving AI/ML acceleration.
- Contribute to scalable GenAI infrastructure by leveraging foundational language and vision models to create high-quality datasets that power cutting-edge GenAI applications.
- Partner with key customer teams to deliver high-impact, high-quality datasets core to Airbnb's roadmap.
- Utilize leading open-source technologies including Spark, Airflow, Ray, MLflow, TensorFlow, PyTorch, Docker, Kubernetes, and more.
Your Expertise:
- 5+ years of relevant industry experience with a BS/Master's, or 2+ years with a PhD.
- Strong coding skills in Python, Java, or equivalent languages.
- Hands-on experience with distributed processing technologies (Spark, Kafka, Flink, Hadoop) and distributed storage (HDFS, S3).
- Solid knowledge of data warehousing concepts and databases (e.g. PostgreSQL, MySQL, Redshift, BigQuery, ClickHouse).
- Expertise building scalable ETL pipelines using schedulers like Airflow, Luigi, Oozie, or AWS Glue.
- Proven ability to analyze large datasets, identify insights, and drive impactful product solutions.
- Excellent written and verbal communication skills; comfortable collaborating cross-functionally.
- Experience building end-to-end Machine Learning platforms and deploying ML models.
- Familiarity with Kubernetes, Docker, and modern infrastructure tools.
- Deep understanding of distributed systems and engineering best practices.
How We'll Take Care of You:
Pay Range
$223,000 USD
Offices: United States