In this role, you will be at the forefront of evaluating multimodal and generative models for real-world health and wellbeing applications, assessing both their objective quality and their alignment with human intent and perception, including truthfulness, adaptability, and generalizability. You will work on data and evaluation pipelines that use both human and synthetic data for model evaluation, and leverage ML techniques such as reinforcement learning from human feedback and adversarial models.

Responsibilities:
- Build the back-end systems that generate and load data from a variety of endpoints (e.g. health databases, human annotations, synthetic generations)
- Build quality and evaluation pipelines, and run model experimentation such as adversarial testing
- Build insights and interpretability tools; explore methods to understand and predict failure modes