Expoint – all jobs in one place

Nvidia Senior Site Reliability Engineer - Enterprise Identity Access 
India, Maharashtra, Pune 
695124306

Today
India, Pune
Time type: Full time
Posted on: 7 days ago

What You’ll Be Doing:

  • Architect, operationalize, and scale zero trust identity and access platforms—driving reliability, automation, and secure credential and policy management across on-premise and cloud environments.

  • Integrate and automate the deployment, monitoring, and lifecycle management of existing commercial and open-source products (SPIRE, Teleport, etc.), emphasizing ephemeral certificate-based authentication, mTLS, and SPIFFE protocols.

  • Advocate for operational guidelines for CI/CD, infrastructure as code (IaC), policy as code, and security observability, using tools like Kubernetes, Argo CD, GitLab CI, Terraform, Vault, Prometheus, and Grafana.

  • Apply AI-assisted and data-driven approaches to automate anomaly detection, incident response, and compliance reporting, driving continuous improvement in system uptime and threat mitigation.

  • Collaborate with engineering, DevSecOps, and security teams to minimize manual intervention, limit privileged access, and enforce policy compliance through scalable automation.

  • Lead incident management, triaging, and blameless postmortems with security context, ensuring rapid root-cause analysis and recovery.

  • Conduct ongoing risk assessments, proactively address emerging threats and vulnerabilities, and contribute to post-incident reviews focused on reliability and trust boundary breaches.

What We Need to See:

  • Bachelor’s or Master’s degree in Computer Science or related field, or proven experience.

  • 10+ years of software engineering/DevOps/SRE experience, with a significant focus on operational security, automation, and identity management.

  • Proficiency in Linux administration, networking concepts, and security protocols.

  • Proven track record integrating and operating container platforms (Kubernetes, OpenShift, Nomad), with strong emphasis on automation and CI/CD (Argo CD, GitLab CI, Jenkins, Spinnaker, etc.).

  • Hands-on knowledge of zero trust security principles, including SPIFFE/SPIRE, mTLS, X.509 rotation, SSO, OAuth2/OIDC, LDAP, and cloud IAM services.

  • Experience with secrets management (Vault, AWS/Azure/Google Secret Manager, K8s Secrets) and infrastructure as code (Terraform, Pulumi, Ansible, CloudFormation).

  • Proficiency with observability and monitoring tools (Prometheus, Grafana, ELK Stack, OpenTelemetry, or equivalent) and policy automation frameworks.

  • Proficient in automation using Python, Go, or similar languages.

  • Demonstrated ability to lead operational and incident response efforts at scale, developing runbooks and playbooks that leverage both automation and AI tools.

Ways to Stand Out from the Crowd:

  • Direct experience operationalizing service mesh, identity federation, or policy engines in reliability-focused environments (Istio, Linkerd, Consul Connect).

  • Track record advancing zero trust architecture through automation and minimized human access, including ephemeral credentials and policy enforcement.

  • Background in integrating AI/ML-assisted tools for operational intelligence, anomaly detection, and reliability improvements.

  • Experience driving compliance, audit readiness, and operational security in cloud (AWS/GCP/Azure) and hybrid environments.

  • Relevant security/DevOps/SRE certifications and open-source contributions.