KEY RESPONSIBILITIES INCLUDE:
- Develop tools to improve human-in-the-loop evaluation processes.
- Develop automatic evaluation systems that systematically assess model performance and detect areas of weakness.
- Automate data scientists’ performance analysis workflows to improve their efficiency and accuracy.
- Stay at the forefront of ML and GenAI advancements, quickly adopting and applying the latest techniques to improve evaluation methodologies.