What you'll be doing:
Lead the design and deployment of scalable storage systems optimized for AI workloads and high-performance compute clusters.
Drive readiness and operational enablement for upcoming hardware platforms, ensuring seamless integration and performance.
Coordinate the development of internal tools to enhance storage provisioning, usage traceability, and user self-service.
Guide the evaluation and implementation of new technologies to improve efficiency, reliability, and observability.
Collaborate with cross-functional teams to align storage architecture with GPU cluster requirements and evolving research needs.
Improve storage monitoring and metrics infrastructure to surface key insights and enable proactive management.
Find opportunities to modernize existing storage systems for improved quota management, compression, and automation.
What we need to see:
BS or equivalent experience.
12+ years of overall relevant technical experience.
5+ years of leadership experience.
Proven ability to lead engineering teams building infrastructure at scale, especially in environments combining storage and high-performance computing.
Deep technical knowledge in distributed storage systems, with experience improving data access patterns and platform observability.
Familiarity with the infrastructure deployment lifecycle, from planning and vendor engagement to rollout and operational readiness.
Strong understanding of aligning storage performance with compute needs, and measuring system behavior based on real-world metrics.
Ability to guide teams through technology evaluations, balancing technical rigor with speed and pragmatism.
Ways to stand out from the crowd:
Experience with large-scale storage and networking systems in performance-sensitive environments such as HPC, AI, or scientific computing.
Success in building tools or automation for self-service, visibility, and governance in complex infrastructure environments.
Background in data observability and metrics correlation for infrastructure performance, cost efficiency, or capacity forecasting.
Experience leading teams through cross-functional technical evaluations or RFPs and turning them into successful infrastructure deployments.
Contributions to storage architecture improvements, including filesystem tuning, resource quota management, or data compression strategies.
You will also be eligible for equity and .