Job Responsibilities
Architect and implement comprehensive safety systems for LLM deployments, including content filtering, output validation, and alignment techniques
Design and develop robust guardrailing frameworks that enforce model behavioral boundaries while maintaining performance and user experience
Lead the development of monitoring systems to detect and mitigate potential model hallucinations, harmful outputs, and alignment drift in production
Build and maintain evaluation frameworks for assessing model safety, including automated testing pipelines for toxicity, bias, and harmful behavior
Develop prompt engineering systems and safety layers that ensure reliable and controlled LLM outputs across different use cases and deployment scenarios
Implement fine-tuning and human preference alignment pipelines, with a focus on preserving alignment and improving model safety characteristics
Design and deploy systems for LLM output validation, including fact-checking mechanisms and source attribution capabilities
Lead technical initiatives around model interpretability and transparency, including debugging tools for understanding model decisions
Collaborate with policy and safety teams to translate safety requirements into technical implementations and measurable metrics
Requirements
5+ years of ML engineering experience, with 3+ years specifically working with transformer-based models and LLMs
Deep expertise in prompt engineering, instruction tuning, or human preference alignment techniques
Strong background in implementing AI safety mechanisms and guardrails for production LLM systems
Experience with LLM evaluation frameworks and safety metrics
Proven track record of building production-grade systems for model monitoring and safety enforcement
Strong programming skills in Python and experience with modern LLM frameworks (PyTorch, Transformers, etc.)
Experience implementing content filtering and output validation systems
Understanding of AI alignment principles and practical safety techniques
The following will be considered a plus:
Experience with constitutional AI and alignment techniques
Background in implementing human preference alignment pipelines and fine-tuning large language models
Familiarity with LLM deployment platforms and serving infrastructures
Experience with model interpretation techniques and debugging tools for LLMs
Knowledge of AI safety research and current best practices
Understanding of adversarial attacks and defense mechanisms for language models
Experience with prompt injection prevention and input sanitization techniques
Background in implementing automated testing systems for model safety
Advanced degree in Computer Science, ML, or related field with focus on AI safety