Expoint - all jobs in one place

Red Hat Senior Principal Software Engineer
United States, California, Sacramento 
647392621

18.09.2024

Primary Job Responsibilities (what you’ll do)

  • Drive and coordinate contributions to vLLM as Red Hat’s strategic inferencing server.

  • Act as a bridge between platform research and product teams, supporting the delivery of platform components to IBM watsonx.ai, Red Hat OpenShift AI, and others.

  • Connect related work being done across different IBM research teams, helping to channel research into core components of Red Hat’s AI products.

  • Design and develop scalable and efficient architectures for training and deploying large language models.

  • Participate in and lead upstream data science and machine learning project communities, building Red Hat’s credibility and influence in these communities.

  • Identify and resolve complex issues related to model training, inference, and system performance.

  • Collaborate with other engineers, data scientists, and researchers to align on goals and product requirements.

  • Implement optimizations for model training and inference to improve performance, efficiency, and resource utilization.

  • Run experiments, tests, and large-scale distributed jobs in support of AI product features.

  • Mentor junior engineers and provide guidance on complex technical issues and best practices.

Required Skills (what you’ll bring)

  • Bachelor's degree in computer science or equivalent.

  • Advanced programming skills in Python and SQL.

  • Expertise in deep learning frameworks (e.g., TensorFlow, PyTorch) and architectures, particularly for natural language processing (NLP).

  • Expertise with vLLM and other inference runtimes.

  • In-depth understanding of the design, training, and deployment of LLMs.

  • Experience with the design and implementation of scalable and efficient system architectures for training and deploying large models.

  • Knowledge of microservices and containerization technologies (e.g., Kubernetes) for deployment.

  • Strong self-motivation and organizational skills.

  • Demonstrated ability to switch context between multiple concurrent projects.

  • Excellent written and verbal communication skills.

  • Positive attitude and willingness to share ideas openly.

Bonus qualifications

  • Master's or PhD in Machine Learning (ML) / Natural Language Processing (NLP).

  • Experience with unit testing, integration testing, and performance testing.

  • Experience working in an agile development team.

The salary range for this position is $189,250.00 - $312,230.00. The actual offer will be based on your qualifications.

Pay Transparency

  • Comprehensive medical, dental, and vision coverage

  • Flexible Spending Account - healthcare and dependent care

  • Health Savings Account - high deductible medical plan

  • Retirement 401(k) with employer match

  • Paid time off and holidays

  • Paid parental leave plans for all new parents

  • Leave benefits including disability, paid family medical leave, and paid military leave