Key responsibilities include:
* Develop tools to improve human-in-the-loop evaluation processes.
* Develop automatic evaluation systems that systematically assess model performance and detect areas of weakness.
* Automate data scientists’ performance analysis workflows to improve their efficiency and accuracy.
* Stay at the forefront of ML and GenAI advancements, quickly adopting and applying the latest techniques to improve evaluation methodologies.