What the Candidate Will Do:
- Program Strategy & Planning:
  - Define the roadmap and key objectives for data labeling projects to support generative AI initiatives.
  - Partner with stakeholders (Data Scientists, Machine Learning Engineers, and Product Managers) to identify data requirements and success criteria.
- Pipeline Development & Management:
  - Design scalable data labeling workflows that leverage internal tools, external vendors, and automation.
  - Optimize workflows for efficiency, accuracy, and cost-effectiveness, incorporating active learning and pre-labeling techniques where appropriate.
- Vendor & Stakeholder Management:
  - Engage and manage relationships with data labeling vendors, ensuring timely delivery and adherence to quality standards.
  - Collaborate with cross-functional teams to align labeling efforts with broader AI model development timelines.
- Quality Assurance:
  - Implement robust quality assurance processes to validate labeled datasets against gold standards.
  - Use metrics such as inter-annotator agreement, precision/recall, and throughput to monitor quality and make improvements.
- Budget & Resource Allocation:
  - Manage program budgets, including vendor costs and internal resources.
  - Forecast resource requirements and ensure efficient allocation to meet deadlines.
- Compliance & Ethics:
  - Ensure compliance with data privacy regulations (e.g., GDPR, CCPA) and ethical guidelines in dataset creation.
  - Advocate for inclusive and unbiased labeling practices to mitigate bias in AI models.
Basic Qualifications:
- Bachelor's degree in Computer Science, Data Science, or related field; or equivalent practical experience.
- 5+ years of program/project management experience, preferably in data labeling, AI/ML, or related fields.
- Strong knowledge of machine learning concepts, particularly around supervised learning and training data needs.
- Experience working with data annotation platforms and tools (e.g., Scale AI, Labelbox, Appen).
- Proven track record of managing large-scale projects with cross-functional teams and external vendors.
Preferred Qualifications:
- Master's degree in a technical field or MBA.
- Experience in Generative AI, including text, image, or audio data labeling.
- Familiarity with active learning, semi-supervised labeling, and human-in-the-loop systems.
- Proficiency in data annotation tools and in scripting and query languages (e.g., Python, SQL) to analyze datasets and processes.
- Strong understanding of ethical AI and best practices for minimizing dataset bias.
- Excellent written and verbal communication skills, with the ability to influence technical and non-technical stakeholders.
For San Francisco, CA-based roles: The base salary range for this role is USD$155,000 per year - USD$172,000 per year.
You will be eligible to participate in Uber's bonus program, and may be offered an equity award & other types of comp. You will also be eligible for various benefits. More details can be found at the following link.