Nvidia Senior Site Reliability Engineer
India, Maharashtra, Pune
396977805

20.05.2025

India, Pune

India, Bengaluru

time type: Full time

posted on: 5 Days Ago

NVIDIA is looking for a world-class engineer to join its multifaceted and fast-paced Infrastructure, Planning and Processes organization, where you will be working as a Senior DevOps and SRE Engineer. The position will be part of a fast-paced crew that develops and maintains sophisticated build & test environments for a multitude of hardware platforms, both NVIDIA GPUs and Tegra processors, along with various operating systems (Windows/Linux/Android).

What you’ll be doing:

End-to-end implementation of the Kubernetes architecture: design, deployment, hardening, networking, sizing, scaling, etc.
Implementing high availability clusters and disaster recovery solutions
Strong system administration experience using configuration as code and infrastructure as code with tools such as Ansible, Puppet, Chef & Terraform.
Design and implement logging & monitoring solutions to gain more insight into application and system health. Implement critical metrics using various analytics methods and dashboards.
Craft and develop tools needed for automating workflows. Reuse AI techniques to extract useful signals about machines and jobs from the data generated.
Take part in prototyping, crafting and developing cloud infrastructure for NVIDIA.
Participate in on-call support and critical issue coverage as an SRE engineer.

What we need to see:

Solid programming background in Python/Go and/or similar scripting languages.
Excellent debugging, problem-solving and analytical skills.
Strong understanding of architectural requirements and development processes involved in building reliable, robust, scalable data products and pipelines.
Proficient in configuration management & IaC tools like Ansible, Puppet, Chef, Terraform.
Strong background with GitLab, Jenkins, Flux, Argo CD and/or other tools to build secure CI/CD systems.
Strong expertise in Kubernetes architecture, networking, RBAC, and persistent storage solutions like Trident, Ceph, EBS, Longhorn, etc.
Proficient in secret management tools like HashiCorp Vault, AWS Secrets Manager, etc.
Proficient in data analytics/visualization & monitoring tools like Kibana, Grafana, Splunk, Zabbix, Prometheus and/or similar systems.
5+ years of proven experience.
Bachelor’s or master’s degree in Computer Science, Software Engineering, or equivalent experience.

Ways to stand out from the crowd:

Thrives in a multi-tasking environment with constantly evolving priorities.
Prior experience with large-scale operations teams. Experience with using and improving data centers. Expertise with Windows Server infrastructure.
Outstanding interpersonal skills and communication with all levels of management.
Ability to break complex problems down into simple subproblems and then reuse available solutions to implement most of those. Ability to design simple systems that can work efficiently without needing much support.
Ability to leverage AI/ML to proactively detect & resolve incidents, automate alert triage and log analysis, and automate repetitive workflows.
