Expoint - all jobs in one place

Nvidia Senior Software Engineer 
India, Maharashtra, Pune 
127145873

01.12.2024

What you'll be doing:

  • Architect the scaling operation in our data centers. Deploy and support an end-to-end container management solution with Kubernetes, Docker, and containerd.

  • Design solutions for service discovery, networking, monitoring, logging, and scheduling in Kubernetes.

  • Work on challenging infrastructure problems such as job scheduling, resource management, and automated recovery.

  • Apply your depth in algorithms and your system software background!

  • Work in teams to deploy new data center infrastructure.

  • Plan and implement tracking of critical metrics using various data analytics and mining methods and dashboards.

  • Reuse AI techniques to extract useful signals about machines and jobs from the data generated!

  • Take part in prototyping, crafting and developing cloud infrastructure for Nvidia.

  • Develop various device plugins and Operators on Kubernetes.

  • Build complete solutions including metrics, alerting, and storage services.

  • Dig into more data, analyze much more, and apply deep learning and machine learning algorithms to improve the performance and predictability of the system.


What we need to see:

  • Strong object-oriented programming background in Python/Golang/Java and/or relevant scripting languages

  • Background in developing large scale cloud infrastructure applications

  • Knowledge of various technologies (Kubernetes, message brokers)

  • Experience with relational databases such as MySQL and NoSQL databases such as Elasticsearch

  • Proficient with configuration management tools like Ansible, Chef, Puppet and strong experience with Jenkins and/or other CI systems.

  • Ability to collaborate across multiple teams and with people working in different time zones.

  • Experience with analytics/visualization tools like Kibana, Grafana, Splunk, etc. and with monitoring systems such as Zabbix and/or Nagios is nice to have

  • BS/MS in Computer Science or Computer Engineering or equivalent experience

  • 5+ years of proven experience.

Ways to stand out from the crowd:

  • Real-world experience with distributed systems, containers, and the Kubernetes API.

  • Previous experience with DevOps teams

  • You have worked on computer algorithms and demonstrated the ability to choose the best possible algorithms to solve sophisticated problems

  • Able to divide sophisticated problems into simple sub-problems and then reuse available solutions to solve them.

  • Experience in the design, implementation, and deployment of major infrastructure features across multiple servers using incremental rollouts