What you’ll be doing:
Lead the generative data roadmap—define and evolve our strategy for synthetic, curated, and real-world data used to train, fine-tune, and evaluate large-scale AI systems for a specific modality or model type.
Craft and lead data pipelines that power model customization, alignment, and evaluation—including synthetic data generation, human-in-the-loop (HITL) workflows, and RL loops.
Establish thorough data quality frameworks, including coverage analysis, bias detection, adversarial testing, and ethical filtering.
Partner with AI researchers and engineers to identify data bottlenecks in model development, then deliver data, tools, or workflows to unblock them.
Define product requirements for internal tools supporting data collection, annotation, and augmentation at scale.
Collaborate with enterprise customers to design workflows for domain-specific LLM fine-tuning using synthetic or proprietary datasets.
Drive initiatives on responsible data use, including fairness, visibility, and privacy, in close alignment with legal, compliance, and AI ethics teams.
Lead cross-functional efforts with infrastructure, model training, and deployment teams to ensure that data is usable, scalable, and aligned with research goals.
Forge strategic partnerships with academic labs, data vendors, and open-source communities to expand access to high-quality and diverse datasets.
What we need to see:
Bachelor’s or Master’s in Computer Science, Data Science, AI/ML, or a related technical field (or equivalent experience).
8+ years of experience
Demonstrated ability in product management, data platform leadership, or ML/AI-focused roles at a technology company
Strong understanding of LLM architecture, training regimes, and alignment methods (e.g. fine-tuning, RL, retrieval-augmented generation)
Consistent track record of running large-scale data projects or pipelines for machine learning applications
Deep familiarity with data-centric AI development, from collection to evaluation
Outstanding communication skills—ability to translate between research, engineering, and business collaborators
Proven ability to prioritize effectively and ship complex cross-functional projects
Ways to stand out from the crowd:
Experience with GenAI data workflows: prompt engineering, synthetic data generation, human feedback systems
Prior work on LLM evaluation frameworks or benchmarks
Experience developing data quality, labeling, or annotation tools
Knowledge of data governance, privacy, or AI ethics best practices
Exposure to enterprise ML use cases and multi-modal model development
You will also be eligible for equity and .