This task integrates LLM inference performance benchmarking tools (llm-load-test, GuideLLM, and lm-eval) into Crucible, our benchmarking framework. The work includes documenting the installation and basic test execution procedures, automating those procedures with scripts, and setting up the test scenario. The goal is a clearly documented workload setup, automated installation and basic test execution, and a directory referenced in the relevant story. This integration strengthens our benchmarking capabilities, enabling more accurate and efficient evaluation of different language models.
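The install-and-smoke-test automation described above could start as a small wrapper like the sketch below. This is a hedged illustration, not the actual Crucible integration: the pip package names and the `--target` flag are assumptions, and real invocations for llm-load-test, GuideLLM, and lm-eval will differ.

```python
import subprocess

# Tools to set up; these package names are assumptions and may not match
# the actual distribution names on PyPI.
TOOLS = ["llm-load-test", "guidellm", "lm-eval"]

def install_cmd(tool: str) -> list[str]:
    """Build the pip install command for one benchmarking tool."""
    return ["pip", "install", tool]

def smoke_test_cmd(tool: str, target: str) -> list[str]:
    """Build a hypothetical basic-test invocation against an inference
    endpoint. Real flags vary per tool; '--target' is illustrative only."""
    return [tool, "--target", target]

def run(cmd: list[str], dry_run: bool = True) -> int:
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if dry_run:
        return 0
    return subprocess.run(cmd, check=True).returncode

if __name__ == "__main__":
    # Dry-run the full install + smoke-test sequence for review.
    for tool in TOOLS:
        run(install_cmd(tool))
        run(smoke_test_cmd(tool, "http://localhost:8000/v1"))
```

Keeping command construction separate from execution makes the sequence easy to review (via dry run) and to document alongside the workload setup.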
Primary Job Responsibilities:
Develop and maintain tools and automation software that support the team's performance and benchmarking work, such as Crucible and GuideLLM
Formulate test plans and execute performance and evaluation benchmarks against various InstructLab workloads and models to characterize performance, drive performance improvements, and detect performance regressions through data analysis and visualization
Collaborate with other engineering teams to resolve performance issues
Triage, debug, and resolve customer cases related to InstructLab performance
Submit performance benchmarking results to industry consortia
Publish results, conclusions, recommendations, and best practices via internal test reports, presentations, external blogs, and official documentation to support our partners and customers
Participate in internal and external conferences about your work and results
Required Skills:
Programming experience in Python and various system scripting languages
Familiarity with various public clouds and how to provision new systems on them
Knowledge of popular databases, including NoSQL databases like Elasticsearch and SQL databases like PostgreSQL
Performance benchmarking, data capture, data analysis, and data visualization skills
Experience with performance data collection and analysis tools
Knowledge of popular AI technologies and frameworks
Knowledge of large language model inference and finetuning internals
Experience working with the Linux operating system
Excellent written and verbal language skills in English
Nice to Haves:
Master's degree or PhD in Computer Science or a related field
Knowledge of AI performance benchmarking suites such as MLPerf
Direct experience working with AI performance profiling tools like GuideLLM
Experience in the k8s ecosystem, preferably the OpenShift ecosystem
Familiarity with the modern "accelerator" ecosystem, models, drivers, interconnects
Pay Transparency
The salary range for this position is $108,760.00 - $173,800.00. The actual offer will be based on your qualifications.
Benefits:
● Comprehensive medical, dental, and vision coverage
● Flexible Spending Account - healthcare and dependent care
● Health Savings Account - high deductible medical plan
● Retirement 401(k) with employer match
● Paid time off and holidays
● Paid parental leave plans for all new parents
● Leave benefits including disability, paid family medical leave, and paid military leave