What you’ll do:
- Optimize and fine-tune LLMs or embedding models.
- Develop and enhance tools and frameworks built around model orchestration.
- Research, design, and implement state-of-the-art algorithms to optimize models at scale for latency and efficiency.
- Investigate complex NLP problems driven by customer use cases.
- Implement hardware-specific optimizations (e.g., for GPUs and TPUs, using CUDA).
What you’ll need to have:
- Experience in designing and developing machine learning models, especially in natural language processing.
- Deep understanding of transformer architecture, LLMs, and embedding models.
- Strong proficiency in Python, including experience with ML/NLP libraries (e.g., TensorFlow, PyTorch, Hugging Face Transformers).
- Experience with at least one low-level systems language (C, C++, or Rust).
- Experience with CUDA and high-performance computing.
- Self-directed, ambitious, and eager to learn new skills and technologies.
- Master’s or PhD in Computer Science, Artificial Intelligence, Machine Learning, or a related field.
Extra great if you have:
- Publications or contributions to projects in AI/ML, NLP, or related areas.
- Experience with large-scale distributed training and optimization techniques.
- Contributions to open-source AI projects.
Our culture is what makes Redis a fun and rewarding place to work. To support you at work and beyond, we offer all our US team members fantastic benefits and perks:
- Competitive salaries and equity grants
- Unlimited time off to promote a healthy work-life balance
- Health, dental, and vision coverage, along with a 401(k), FSA, and commuter benefits
- Frequent team celebrations and recreation events
- Learning and development opportunities
- Ability to influence a high-performance company on its way to IPO
The estimated gross base annual salary range for this role is $179,500–$269,825.