מציאת משרת הייטק בחברות הטובות ביותר מעולם לא הייתה קלה יותר
What you'll be doing:
Architecting scalable and reliable software components to enable the core platform to maintain an inventory of resources including hosts, GPUs, and switches; to automate actions to diagnose failures and to repair
Influencing the product roadmap in collaboration with teams across various departments with the goal of reducing SRE toil and improving hardware utilization
Collaborating with various organizations across Nvidia to drive adoption of the platform in order to improve GPU utilization
Defining and running benchmarks for various subsystems
Leading and delivering high impact projects with high quality, performance and stability with the lowest resource consumption
Developing a robust feedback control system that analyzes signals about system health and automatically runs commands to fix discovered issues
Programming in modern languages like Go and Rust
What we need to see:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience)
15 years of equivalent experience
Demonstrated ability in building scalable and robust distributed systems
Proven record of product rollouts and collaborating with early adopters
Proficiency in programming in C/C++, Java, Rust or Go.
Technical stewardship of projects across the organization
Ways to stand out from the crowd:
Deep understanding of multi-threading and distributed systems concepts
Excellent track record of delivering projects
Expertise in optimizing SQL queries
Expert level knowledge of Rust programming
You will also be eligible for equity and .
משרות נוספות שיכולות לעניין אותך