Nvidia Senior Storage Product Engineer 
United States, Texas 
908691665

Yesterday
US, CA, Santa Clara
US, TX, Austin
Time type: Full time
Posted: 19 Days Ago

What You Will Be Doing:

  • Architect, deploy, and operate large-scale storage clusters with a focus on scalability, high availability, and data durability.

  • Develop proactive monitoring and alerting frameworks for early detection and remediation of performance and reliability issues.

  • Optimize AI/ML and HPC workloads through intelligent caching, low-latency storage design, and high-throughput tuning.

  • Own the full lifecycle of storage services—from building and deploying to continuous improvement and scaling.

  • Partner with development teams to deliver automation frameworks, capacity management strategies, and launch readiness reviews.

  • Maintain production storage health by monitoring latency, efficiency, and availability, using predictive analytics and automation.

  • Improve efficiency with compression, deduplication, tiering strategies, and dynamic data placement.

  • Maintain data security and compliance by implementing encryption, access controls, auditing, and policy-based governance.

  • Automate and scale operations using infrastructure-as-code, orchestration, and AI/ML workflows.


What We Need To See:

  • BS degree or equivalent experience in Computer Science, Storage Systems, or a related technical field with 12+ years of practical experience.

  • Experience with distributed and high-performance storage solutions, including clustered and parallel file systems, distributed object storage, and enterprise-grade storage systems.

  • Proven understanding of block, file, and object storage technologies, including their scalability, reliability, and performance characteristics, and the best practices for each.

  • Experience with storage networking protocols such as NFS, SMB, iSCSI, S3, Fibre Channel, RDMA, and NVMe over Fabrics.

  • Expertise in algorithms, data structures, complexity analysis, software development, and automating maintenance of large-scale Linux-based storage systems.

  • Experience in one or more of the following: C/C++, Java, Python, Go, NodeJS, or Bash for storage automation, monitoring, and performance tuning.

  • Hands-on experience with infrastructure configuration management tools like Ansible, Chef, Puppet, and Terraform for automating storage deployments.

  • Experience with observability and tracing tools like InfluxDB, Prometheus, Grafana, and the Elastic stack for monitoring storage system health.

  • Strong communication skills, a solid work ethic, teamwork, a commitment to quality, and day-to-day dedication are required.

Ways to stand out from the crowd:

  • Deep understanding of large-scale distributed storage architectures, replication strategies, and erasure coding techniques.

  • Proficiency in optimizing performance, fine-tuning, and resolving issues with high-throughput storage systems.

  • Experience in analyzing and improving distributed storage system performance at scale.

  • Proven understanding of network protocols, architectures, and troubleshooting techniques, particularly as they relate to storage performance, stability, and availability.

  • Experience using or operating private and public cloud storage solutions based on Kubernetes, OpenStack, or hybrid cloud architectures.

You will also be eligible for equity and benefits.