The postdoc is expected to develop machine learning techniques for Large Language Model (LLM) Alignment. Currently, Alignment techniques for LLMs rely primarily on the annotation of preference between alternative responses followed by policy (LLM) optimization either directly (DPO-style) or after learning one or multiple reward functions (PPO-style). However, the data from such annotation is often noisy and confounded by ambiguity, subjectivity, and multi-dimensionality of preference. The goal of the project is to develop 1) high-quality data annotation procedures with clear instructions, 2) preference models which account for noisiness, ambiguity, subjectivity, and multi-dimensionality of preference annotation, and 3) appropriate algorithms for directly optimizing the policy (LLM) or appropriate loss functions for learning rewards using such preference models.
Key job responsibilities
• Publish your innovation in top-tier academic venues and hone your presentation skills.
• Be inspired by challenges and opportunities to invent cutting-edge techniques in your area(s) of expertise.
• PhD in a relevant field, received within 2 years of starting the program
• Proven publication record in Machine Learning, LLM, Optimization, Reinforcement Learning or other related technical fields
• Experience in data science and quantitative research
• Proficiency in technologies relevant to the subfield
• Ability to independently deliver results in a fast-paced environment
• Publications at top-tier, peer-reviewed conferences and/or journals
• Exceptional verbal and written communication skills
• Expert knowledge in modeling and performance, operationalization, and scalability of scientific techniques and establishing decision strategies
Required application materials:
• CV, which lists all peer-reviewed publications and conferences,
• Research statement that outlines your research achievements and future research interests, and
• A journal article or book chapter that demonstrates your domain expertise.