Expoint - all jobs in one place

Dell Principal Site Reliability Engineer - Red Hat OpenShift, Go, K8s & NVIDIA AI
India, Karnataka, Bengaluru 
951311877

27.06.2024

You will:

  • Participate in 24x7x365 shift coverage providing infrastructure support for Red Hat OpenShift, drawing on specialist NVIDIA, container, and Kubernetes skills.
  • Perform daily system monitoring, verifying the integrity and availability of all hardware, resources, systems, and key processes, reviewing system and application logs, and verifying completion of scheduled jobs.
  • Provide support per request from various constituencies. Investigate and troubleshoot issues.
  • Repair and recover from hardware or software failures. Coordinate and communicate with the teams responsible for impacted infrastructure.
  • Perform preventative maintenance (and upgrades, as required) on devices and related peripherals to meet IT specifications.

Essential Requirements:

  • 8+ years of strong experience with Red Hat OpenShift and Go programming.
  • Experience in container and Kubernetes administration.
  • Hands-on knowledge of NVIDIA AI Enterprise and NVIDIA GPU & Network Operations.
  • Knowledge of NVIDIA Base Command Manager and Cluster Manager.
  • Knowledge of network administration with the NVIDIA ONYX switch system.

Desirable Requirements:

  • Knowledge of observability and log collection (Prometheus and Grafana).
  • OpenShift Administrator certification preferred.