Expoint - all jobs in one place

Nvidia Site Reliability Engineer - GPU Cloud
India, Karnataka, Bengaluru
446569928

01.12.2024

What you’ll be doing:

As an SRE, you are responsible for:

Providing scalable and robust service-oriented infrastructure automation, monitoring, and analytics solutions for NVIDIA's on-prem and cloud-based GPU infrastructure.
Owning the whole life cycle of new tools and services, from requirements gathering and design documentation to validation and deployment.
Providing customer support on a rotation basis.

What we need to see:

Minimum of 3 years of experience automating and handling large-scale distributed system software deployments in on-prem/cloud environments.
Proficiency in any one of these languages: Go, Python, Perl, C++, Java, or C.
Strong command of Terraform, Kubernetes, and cloud infrastructure administration.
Excellent debugging and troubleshooting skills.
Excellent interpersonal and written communication skills.
B.E. in Computer Science or a related technical field involving coding (e.g., physics or mathematics).

Ways to stand out from the crowd:

Ability to decompose complex requirements into simple tasks and reuse available solutions to implement most of them.
Unit testing and benchmarking are an integral part of your code.
Ability to reason about and choose the best possible algorithm to meet scaling and availability challenges.
