As a Language Data Scientist, you will start by diving deep into a couple of critical projects across Alexa experiences. You will collaborate with fellow language data scientists, program managers, and stakeholders in science, engineering, and product teams to understand the role data plays in developing data sets and exemplars that meet customer needs. You will analyze and automate processes for collecting and annotating LLM inputs and outputs to assess data quality and measurement.
You will apply state-of-the-art Generative AI techniques to analyze how well our data represents human language and run experiments to gauge downstream interactions. You will work collaboratively with other language data scientists and other scientists to design and implement principled strategies for data optimization.
- 3+ years of experience with data querying languages (e.g. SQL), scripting languages (e.g. Python), or statistical/mathematical software (e.g. R, SAS, Matlab)
- PhD in Computational Linguistics, Linguistics with a computational component, or an equivalent field; alternatively, MA/MS with 3+ years of experience or Bachelor's with 5+ years of experience
- Excellent knowledge of semantics, pragmatics, conversation analysis, and/or discourse analysis
- Experience designing and executing data collection projects, including guidelines, label set, and annotation workflow development
- Experience developing and evaluating data annotation and data quality metrics
- Experience designing and executing psychology/linguistic/cognitive science surveys or experiments with human participants
- Experience with voice assistants
- Experience working with LLMs
- Experience with synthetic dataset creation
- Experience with surveying
- Experience working with a diverse array of languages or language varieties