
NVIDIA Principal Staff SRE - Core Infrastructure
India, Karnataka, Bengaluru 
870503322

26.08.2025
India, Bengaluru
Time type: Full time
Posted on: Today

What You Will Be Doing:

  • Lead initiatives to transform the IT Compute Core team's architecture and build new service offerings across on-prem and cloud.

  • Design, scale, and deploy core infrastructure services including DNS, NTP/PTP, DHCP, and LDAP. This includes building for performance and reliability at global scale, covering automation, monitoring, high availability, capacity planning, and lifecycle management.

  • Define and implement metrics to measure the efficiency of services, and drive efficiency with software and hardware optimizations (SR-IOV/DPU).

  • Experience with technologies like eBPF and XDP for observability and DDoS mitigation.

  • Collect and review system data for capacity planning, analyze capacity data and develop plans for appropriately sized enterprise-wide systems, and coordinate with management in implementing changes.

  • Develop and maintain tools for collecting, analyzing, and visualizing data for reporting, alerting, and monitoring.

  • Collaborate with NVIDIA leadership, senior engineers, program managers, and product managers to develop compelling IT products and services that meet customer needs.

What We Need To See:

  • Bachelor’s degree in Engineering, Computer Science, Mathematics, or related field, or equivalent experience

  • 12+ years of proven experience in compute platform engineering with a focus on automation.

  • Experience in designing and deploying containerization architectures and distributed systems infrastructure.

  • Proven experience evaluating existing application architectures and identifying opportunities for containerization to improve scalability, reliability, and efficiency.

  • Strong analytical skills with the ability to define and track key performance metrics.

  • Experience developing tools for data analysis and performance profiling, and development with Terraform and configuration management tools.

  • Proficiency in programming languages such as Go and/or Python.

  • Linux OS proficiency, including kernel internals.

  • Experience running large environments consisting of bare-metal build infrastructure.

  • Understanding of network protocols and architectures (VLAN/VXLAN/SDN/BGP/Anycast).


Ways To Stand Out From The Crowd:

  • Deep understanding of other infrastructure components such as DNS, LDAP, and security tools.

  • Hands-on experience with containers and their implementation.

  • Deploying and managing services like DNS and LDAP at scale.

  • Solid understanding of microservices architecture, infrastructure as code (IaC) and configuration management tools.