Role Description:
As a Data Engineer, you’ll collaborate with top-notch engineers and data scientists to elevate our platform to the next level and deliver exceptional user experiences. Your primary focus will be on the data engineering aspects—ensuring the seamless flow of high-quality, relevant data to train and optimize content models, including GenAI foundation models, supervised fine-tuning, and more.
Key Job Responsibilities and Duties:
Rapidly developing next-generation scalable, flexible, and high-performance data pipelines.
Dealing with massive textual sources to train GenAI foundation models.
Solving issues with data and data pipelines, prioritizing based on customer impact.
End-to-end ownership of data quality in our core datasets and data pipelines.
Experimenting with new tools and technologies to meet business requirements regarding performance, scaling, and data quality.
Providing tools that improve Data Quality company-wide, specifically for ML scientists.
Providing self-organizing tools that help the analytics community discover data, assess quality, explore usage, and find peers with relevant expertise.
Acting as an intermediary for problems, with both technical and non-technical audiences.
Promote and drive impactful and innovative engineering solutions
Technical, behavioral and interpersonal competence advancement via on-the-job opportunities, experimental projects, hackathons, conferences, and active community participation
Collaborate with multidisciplinary teams: Collaborate with product managers, data scientists, and analysts to understand business requirements and translate them into machine learning solutions. Provide technical guidance and mentorship to junior team members.
Qualifications & Skills:
Bachelor’s or master’s degree in computer science, Engineering, Statistics, or a related field.
Minimum of 3 years of experience as a Data Engineer or a similar role, with a consistent record of successfully delivering ML/Data solutions.
You have built production data pipelines in the cloud, setting up data-lake and server-less solutions; you have hands-on experience with schema design and data modeling and working with ML scientists and ML engineers to provide production level ML solutions.
You have experience designing systems E2E and knowledge of basic concepts (lb, db, caching, NoSQL, etc)
Strong programming skills in languages such as Python and Java.
Experience with big data processing frameworks such, Pyspark, Apache Flink, Snowflake or similar frameworks.
Demonstrable experience with MySQL, Cassandra, DynamoDB or similar relational/NoSQL database systems.
Experience with Data Warehousing and ETL/ELT pipelines
Experience in data processing for large-scale language models like GPT, BERT, or similar architectures - an advantage.
Proficiency in data manipulation, analysis, and visualization using tools like NumPy, pandas, and matplotlib - an advantage.
Experience with experimental design, A/B testing, and evaluation metrics for ML models - an advantage.
Experience of working on products that impact a large customer base - an advantage.
Excellent communication in English; written and spoken.
Booking.com’s Total Rewards Philosophy is not only about compensation but also about benefits. We offer a competitive , as well unique-to-Booking.com benefits which include:
Annual paid time off and generous paid leave scheme including: parent, grandparent, bereavement, and care leave
Hybrid working including flexible working arrangements, and up to 20 days per year working from abroad (home country)
Industry leading product discounts - up to 1400 per year - for yourself, including automatic Genius Level 3 status and Booking.com wallet credit
Application Process:
Let’s go places together:
This role does not come with relocation assistance.
Pre-Employment Screening
If your application is successful, your personal data may be used for a pre-employment screening check by a third party as permitted by applicable law. Depending on the vacancy and applicable law, a pre-employment screening may include employment history, education and other information (such as media information) that may be necessary for determining your qualifications and suitability for the position.