We are looking for a Senior Machine Learning Research Engineer with a strong research background and hands-on experience in building and optimizing deep learning models. In this role, you will explore and develop cutting-edge techniques in model compression, including pruning, quantization, knowledge distillation, and speculative decoding. You will help design and evaluate novel algorithms that bridge theory and real-world deployment.
Your Role and Responsibilities
As a core member of our ML research team, you will:
Design and conduct experiments to evaluate model compression strategies for large-scale deep learning models.
Develop scalable and modular research code in Python.
Work closely with software engineers and product teams to translate research into deployable systems.
Explore emerging techniques in efficient inference and help define future directions for model optimization.
Collaborate on publications in top-tier ML/AI conferences and contribute to open-source initiatives.
Benchmark models across hardware configurations, contributing to the broader understanding of how model optimizations affect performance in real-world deployment scenarios.
Participate in reading groups, internal workshops, and mentoring activities.
Required Qualifications
PhD in Machine Learning, Computer Science, Electrical Engineering, Applied Mathematics, or a related field.
Strong foundation in machine learning algorithms and numerical optimization.
Proficiency in Python and deep learning frameworks such as PyTorch, TensorFlow, or JAX.
Strong analytical and problem-solving skills.
Experience with experimental design and empirical research, including model evaluation and benchmarking.
Excellent written and verbal communication skills, including the ability to explain complex ideas to a technical audience.
Preferred Qualifications
Familiarity with model compression techniques such as quantization, pruning, knowledge distillation, or speculative decoding.
Experience contributing to open-source machine learning projects.
Experience optimizing model performance for inference efficiency, particularly on GPUs or specialized accelerators.
Publication record in top-tier conferences (e.g., NeurIPS, ICML, ICLR, CVPR).
Ability to navigate large codebases and collaborate in a research-oriented engineering team.
What We Offer
A dynamic and intellectually stimulating environment with opportunities to shape the future of efficient ML systems.
A collaborative team that values curiosity, creativity, and impact.
Support for academic engagement (publishing, conference travel, workshops).
Access to high-performance computing resources and state-of-the-art ML infrastructure.
Comprehensive benefits, flexible work arrangements, and opportunities for career growth.
Pay Transparency
The salary range for this position is $170,770.00 - $281,770.00. Actual offer will be based on your qualifications.
● Comprehensive medical, dental, and vision coverage
● Flexible Spending Account - healthcare and dependent care
● Health Savings Account - high-deductible medical plan
● Retirement 401(k) with employer match
● Paid time off and holidays
● Paid parental leave plans for all new parents
● Leave benefits including disability, paid family medical leave, and paid military leave