Nvidia Senior System Software Engineer - Infrastructure 
India, Karnataka, Bengaluru 
Job ID: 154945716

Location: India, Bengaluru
Time type: Full time
Posted: 3 days ago

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and amazing people.

What you'll be doing:

  • Develop and manage enterprise-scale platforms that unify storage infrastructure and services, integrating enterprise appliances, networks, and open-source technologies.

  • Develop and scale REST APIs in Python/Go, enabling thousands of engineers to seamlessly run on-demand storage workflows.

  • Automate storage operations (provisioning, monitoring, metrics, telemetry, and troubleshooting) to ensure high reliability and performance.

  • Integrate intelligent observability and tracing into workflows to improve accuracy, reduce latency, and optimize efficiency across infrastructure services.

  • Implement agentic workflows that empower self-healing, proactive remediation, and automation to decrease operational overhead.

  • Build proof-of-concept integrations between infrastructure services and emerging agentic AI frameworks, laying the foundation for intelligent infrastructure platforms.

  • Document practices and procedures, evaluate new technologies, and drive adoption of next-gen automation in enterprise storage services.

What we need to see:

  • BS in Computer Science (or equivalent experience) with 12+ years of relevant experience, MS with 10+ years, or Ph.D. with 8+ years.

  • Extensive expertise building large-scale, multi-threaded, distributed backend systems.

  • Experience designing and building RESTful APIs using Python or Go.

  • Familiarity with containerization & orchestration (Docker, Kubernetes).

  • Exposure to cloud platforms (AWS, Azure, GCP).

  • Experience with telemetry stacks (Prometheus, Grafana, Alertmanager, ELK/Kibana).

  • Ability to collaborate across teams and communicate technical solutions effectively.

  • Growth mindset to quickly adopt new frameworks in observability, AI automation, and infrastructure management.

Ways to stand out from the crowd:

  • Contributions to open-source projects (infrastructure, storage, or Python-based libraries).

  • Strong background in Linux storage systems and troubleshooting at enterprise scale.

  • Experience with Enterprise NAS (NetApp, Pure Storage), distributed filesystems (Lustre, GPFS, Ceph), or S3-compatible object storage.

  • Experience with GenAI/agentic application frameworks (LangChain, LlamaIndex, AutoGen) or observability platforms (LangSmith, Arize Phoenix, W&B Weave).

  • Proven track record of prototyping and productionizing intelligent automation workflows, especially for self-service, large-scale infrastructure.