Expoint - all jobs in one place

Amazon Sr. Language Data Scientist, Alexa Customer Journeys
Canada, British Columbia, Vancouver 
653853841

08.07.2024
DESCRIPTION

As a Language Data Scientist, you will start by diving deep into a couple of critical projects across Alexa experiences. You will collaborate with fellow language data scientists, program managers, and stakeholders in science, engineering, and product teams to understand the role data plays in developing data sets and exemplars that meet customer needs. You will analyze and automate processes for collecting and annotating LLM inputs and outputs to assess data quality and measurement.
You will apply state-of-the-art Generative AI techniques to analyze how well our data represents human language and run experiments to gauge downstream interactions. You will work collaboratively with other language data scientists and scientists to design and implement principled strategies for data optimization.

BASIC QUALIFICATIONS

- 3+ years of data querying languages (e.g. SQL), scripting languages (e.g. Python) or statistical/mathematical software (e.g. R, SAS, Matlab, etc.) experience
- PhD in Computational Linguistics, Linguistics with a computational component, or an equivalent field; alternatively, an MA/MS with 3+ years of experience or a Bachelor's with 5+ years of experience
- Excellent knowledge of semantics, pragmatics, conversation analysis, and/or discourse analysis
- Experience designing and executing data collection projects, including guidelines, label set, and annotation workflow development
- Experience developing and evaluating data annotation and data quality metrics
- Experience designing and executing psychology/linguistic/cognitive science surveys or experiments with human participants


PREFERRED QUALIFICATIONS

- Voice assistants experience
- Working with LLMs
- Experience with synthetic dataset creation
- Experience with surveying
- Experience working with a diverse array of languages or language varieties