Expoint – all jobs in one place
Where the best experts and companies meet

Network Site Reliability Engineer jobs at Nvidia in United Kingdom, Southampton

Find your perfect match with Expoint! Search for job opportunities as a Network Site Reliability Engineer in United Kingdom, Southampton and join the network of leading companies in the high-tech industry, such as Nvidia. Sign up now and find your dream job with Expoint!
8 jobs found
24.11.2025

Nvidia Senior AI Engineer Security Architect United Kingdom, England, Southampton

Limitless High-tech career opportunities - Expoint
Description:
UK, Cambridge
UK, Remote
Germany, Remote
Time type: Full time
Posted on: 6 Days Ago
job requisition id

What you'll be doing:

  • Research novel AI techniques to secure next-generation networks and apply them to existing NVIDIA products.

  • Investigate and analyze network telemetry for indicators of vulnerabilities in secure environments.

  • Collaborate with NVIDIA researchers to explore innovative ways to improve security in networking products.

  • Contribute to projects that have potential real-world impact on NVIDIA's product portfolio.

What we need to see:

  • Holding a PhD or MSc or equivalent experience in Electrical Engineering, Computer Science, or a related field with a focus on AI.

  • 5+ years of relevant experience.

  • Experience with innovative AI tools, frameworks, and methods related to cybersecurity incident detection and prevention.

  • Background in cybersecurity, networking (TCP/IP), and network security (TLS/IPSec).

  • Solid programming skills and a deep understanding of secure system design.

Ways to stand out from the crowd:

  • PhD with a track record of publication in top peer-reviewed AI conferences or equivalent experience leading AI projects.

  • Expertise with LLMs and recent advancements in neural networks.

  • Architectural knowledge of system security.

  • Understanding of common attack vectors targeting network devices and methods to mitigate them.

  • A proven track record of translating sophisticated research into practical solutions.

09.11.2025

Nvidia Senior Systems Engineer Artificial Intelligence Operations United Kingdom, England, Southampton

Description:
UK, Remote
Finland, Remote
France, Remote
Spain, Remote
Sweden, Remote
Time type: Full time
Posted on: 13 Days Ago
job requisition id

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people.

What you will be doing:

  • You will bring together and understand internal and external customer requirements to improve AI cluster resiliency and design AIOps-based solutions that address these needs.

  • Develop automated workflows for issue detection and root cause analysis, and collaborate closely with operators to debug sophisticated, full-stack AI cluster problems. The findings will be brought to bear on product improvements.

  • Deliver compelling technical presentations and lead hands-on demos or training. You'll also handle evaluation deployments (POC/POV) and ensure smooth, reliable installations by staying engaged and encouraging throughout the customer journey.

What we need to see:

  • Bachelor of Science or equivalent experience

  • 12+ years of networking experience in enterprise or service provider environments, with strong hands-on expertise in routing and switching.

  • Proficient in scripting and automation using Python or similar languages, with strong Linux expertise.

  • Proven experience working directly with customers to resolve issues and ensure success in Systems Engineer or SRE roles.

  • Exceptional oral, written, and presentation skills for clearly communicating complex technical topics.

  • Demonstrated ability to collaborate effectively across teams, partnering with operations, engineering, and product development

Ways to stand out from the crowd:

  • Experience with data center infrastructure and cloud architectures

  • Background in network performance monitoring or observability

  • Previous experience working at a technological start-up


08.11.2025

Nvidia Software Configuration Management Engineer – Hardware United Kingdom, England, Southampton

Description:
UK, Cambridge
UK, Remote
Time type: Full time
Posted on: 11 Days Ago
job requisition id

For two decades, we have pioneered visual computing, the art and science of computer graphics. With our invention of the GPU - the engine of modern visual computing - the field has expanded to encompass video games, movie production, product design, medical diagnosis and scientific research. Today, we stand at the beginning of the next era, the AI computing era, ignited by a new computing model, GPU deep learning. This new model - where deep neural networks are trained to recognize patterns from massive amounts of data - has shown to be deeply effective at solving some of the most complex problems in everyday life.

NVIDIA runs one of the largest Perforce installations in the world, and a very large Git installation as well. Our Software Configuration Management (SCM) Tools and Infrastructure group is looking for a top SCM architect. You will tackle the challenges that we face with operating at scale to produce a best-in-industry solution and enable us to continue to provide unprecedented performance and reliability for our users. You will work in our team to engineer new solutions to scale our Perforce and Git infrastructure to handle large and ever-growing load and data volume. You will design and code processes and automation tools to improve productivity managing and administering the SCM systems and applications used by our globally distributed engineering teams.

What you'll be doing:

  • Responsible for the full SCM environment including application, OS, and server hardware components, developing the continued automation and innovation needed for our large environment

  • Create new solutions to improve the reliability and performance of our ever-growing infrastructure, and work with automated orchestration tools to deploy those improvements to hundreds of systems worldwide

  • Be part of a global team: evaluate technology alternatives, work closely with other project members to specify solutions, craft schedules, and lead ongoing enhancements and support

  • Learn and greatly improve the daily productivity of the world’s top chip designers and software engineers

What we need to see:

  • MS (preferred) or BS in Computer Science or a related field (or equivalent experience), with at least 3 years of experience

  • Deep understanding of Software Configuration Management (SCM) processes and tools such as Perforce, Git, Subversion, or ClearCase for large, multi-site development

  • You've configured/deployed Continuous Integration (CI) and Continuous Deployment (CD) systems in your past experience

  • Excellent interpreted-language skills highly desired (object-oriented Perl or Python preferred); strong software engineering process skills required

  • Strong object-oriented programming and design pattern knowledge and background - Object Oriented Perl, Python, C++, or Java preferred

  • Experience with databases, MySQL or Postgres preferred, experience with NoSQL databases a plus

  • Experience with DevOps or system administration with Linux systems required (CentOS/RHEL and Ubuntu preferred)

  • Strong experience with automation required (Ansible or Puppet preferred); excellent interpersonal skills, including written and verbal communication

  • You are comfortable with, and enjoy working in, dynamic and ever-evolving environments

Ways to stand out from the crowd:

  • Meticulous organizer with an ever positive, can-do attitude

  • Demonstrate use of out-of-box thinking for creative solutions to highly sticky problems

  • Fun and enthusiastic teammate who enjoys a challenge and celebrates success


25.10.2025

Nvidia Senior System Software Engineer Platform Operations United Kingdom, England, Southampton

Description:
UK, Remote
France, Remote
Germany, Remote
Time type: Full time
Posted on: 4 Days Ago
job requisition id

What you’ll be doing:

  • Architect, build, and evolve the scalable technology stack for global learner and instructor technical support.

  • Lead the global operationalization of support systems, to ensure high availability, performance, and efficient resource utilization.

  • Provide technical leadership and mentorship to a distributed operations team, driving excellence in the use of support technologies and processes.

  • Collaborate cross-functionally to translate support insights and user feedback into systemic improvements to shared NVIDIA services, the DLI platform, and overall experience for enterprises, learners, and instructors.

What we need to see:

  • Bachelor’s degree in Computer Science, a related technical field, or equivalent experience

  • Over 6 years of DevOps experience optimizing, deploying and running containerized applications (Docker, Kubernetes) across AWS, Azure, and GCP, including hands-on work with EKS, AKS, and GKE.

  • Proficient in Python and Linux shell scripting for automation, application development, system administration, and troubleshooting.

  • Validated experience architecting, implementing, and managing cloud infrastructure using Terraform.

  • Demonstrated ability as a meticulous problem-solver with strong analytical skills, capable of diagnosing and resolving complex technical challenges under pressure.

  • Excellent communication, teamwork, and collaboration skills, with an ability to articulate technical concepts clearly to diverse audiences and lead technical responses during incidents.

Ways to stand out from the crowd:

  • Proven experience designing and implementing event-driven architectures using pub/sub patterns with platforms like AWS SNS/SQS, Google Pub/Sub, or Azure Service Bus.

  • Knowledge of generative AI architectures (LLMs, diffusion models) and concepts such as Retrieval Augmented Generation (RAG) and vector databases.

  • Hands-on experience with the NVIDIA AI stack (NeMo, Triton Inference Server, TensorRT) for model development, serving, and optimization. Production experience with NVIDIA NIM is a strong plus.

  • Experienced in building and running CI/CD pipelines (Jenkins, GitLab CI) and managed software development environments, applying SRE principles to automate, enhance reliability, and improve performance.

  • Familiarity with Python-based Learning Management Systems (LMS) such as Open edX.



14.10.2025

Nvidia Network Site Reliability Engineer United Kingdom, England, Southampton

Description:
UK, Reading
UK, Remote
Time type: Full time
Posted on: 25 Days Ago
job requisition id

This crucial role will be focused on user satisfaction and brilliance in Network Operations. This SRE engineer will focus on tackling significant projects and is committed to fostering a supportive atmosphere that offers the mentorship necessary for professional development and growth. They will bring a wealth of skills and experience to be a sought after mentor, who leads by example.

What you'll be doing:

  • Owning the operational aspect of the network infrastructure, ensuring its high availability and reliability.

  • Partnering with architecture and deployment teams to guarantee that new implementations are supportable and align with production standards.

  • Advocating for and implementing automation to reduce toil and enhance operational efficiency.

  • Monitoring network performance, identifying areas for improvement, and coordinating with relevant teams to execute enhancements.

  • Collaborating with SMEs to resolve production issues swiftly and effectively, maintaining customer satisfaction.

  • Identifying opportunities for operational improvements and partnering with teams to develop solutions that drive excellence and sustainability in network operations.

What we need to see:

  • BS degree in Computer Science, Electrical Engineering, or a related technical field, or equivalent experience.

  • Minimum of 8 years of industry experience in network site reliability engineering, network automation, network operations, or related areas. Experience on both campus and data center networks.

  • Familiarity with network management tools such as Prometheus, Grafana, Alert Manager, Nautobot/Netbox, BigPanda

  • Expertise in automating networks using frameworks such as Salt, Ansible, or similar.

  • In depth experience in one or more of the following: Python, Go.

  • Knowledge of network technologies such as TCP/UDP, IPv4/IPv6, Wireless, BGP, VPN, L2 switching, Firewalls, Load Balancers, EVPN, VXLAN, Segment Routing. Proven track record in network operations.

  • Skills with ServiceNow and Jira

  • Knowledge of Linux system fundamentals is a plus.

  • Systematic problem-solving approach, coupled with excellent communication skills and a sense of ownership and drive.

Ways to stand out from the crowd:

  • Track record of taking operational signals through means such as SNMP, Syslog, Streaming Telemetry to solve operational challenges

  • History of debugging and optimizing code; automating routine tasks.

  • Experience with Mellanox/Cumulus Linux, Palo Alto firewalls, Netscalers and F5 load balancers

  • Previous SRE experience


26.08.2025

Nvidia Senior HPC AI Cluster Engineer United Kingdom, England, Southampton

Description:
UK, Remote
Poland, Remote
France, Remote
Spain, Remote
Germany, Remote
Time type: Full time
Posted on: 14 Days Ago
job requisition id
What you will be doing:
  • Designing, implementing and maintaining large scale HPC/AI clusters with monitoring, logging and alerting

  • Managing Linux job/workload schedules and orchestration tools

  • Developing and maintaining continuous integration and delivery pipelines

  • Developing tooling to automate deployment and management of large-scale infrastructure environments, to automate operational monitoring and alerting, and to enable self-service consumption of resources

  • Deploying monitoring solutions for the servers, network and storage

  • Troubleshooting and fixing, bottom up from bare metal, operating system, software stack and application level

  • Being a technical resource, developing, re-defining and documenting standard methodologies to share with internal teams

  • Supporting Research & Development activities and engaging in POCs/POVs for future improvements

What we need to see:
  • Bachelor's Degree in Computer Science, Engineering, or a related field; or equivalent experience

  • 5+ years of experience

  • Knowledge of HPC and AI solution technologies, from CPUs and GPUs to high-speed interconnects and supporting software

  • Experience with job scheduling workloads and orchestration tools such as Slurm, K8s

  • Excellent knowledge of Windows and Linux (Redhat/CentOS and Ubuntu) networking (sockets, firewalls, iptables, wireshark, etc.) and internals, ACLs and OS level security protection and common protocols e.g. TCP, DHCP, DNS, etc.

  • Experience with multiple storage solutions such as Lustre, GPFS, ZFS and XFS. Familiarity with newer and emerging storage technologies.

  • Python programming and bash scripting experience.

  • Comfortable with automation and configuration management tools such as Jenkins, Ansible, Puppet/Chef

  • Deep knowledge of Networking Protocols like InfiniBand, Ethernet

  • Deep understanding and experience with virtual systems (for example VMware, Hyper-V, KVM, or Citrix)

  • Familiarity with cloud computing platforms (e.g. AWS, Azure, Google Cloud)

Ways to stand out from the crowd:
  • Knowledge of CPU and/or GPU architecture

  • Knowledge of Kubernetes, container related microservice technologies

  • Experience with GPU-focused hardware/software (DGX, CUDA)

  • Background with RDMA (InfiniBand or RoCE) fabrics


26.07.2025

Nvidia Senior System Software Engineer Defined Networking United Kingdom, England, Southampton

Description:
UK, Remote
Poland, Remote
Time type: Full time
Posted on: 2 Days Ago
job requisition id
What you’ll be doing:
  • Design, develop, deploy and operate next generation multi-tenant cloud SDN control and data planes software

  • DevOps automation tasks for SDN stack - CI/CD, GitOps for secure and seamless integration with cloud infrastructure components.

  • To complement the efficient networking architecture, you will help design Infrastructure-as-a-Service virtual network orchestration API-driven services to support tenant workload security and performance SLAs for BMaaS, VMaaS and Kubernetes.

  • You will also develop software for network observability (monitoring and telemetry) to enable intelligent metering and performance analysis for KPI enforcement on tenant workloads

What we need to see:

  • BA/BS degree in Computer Science, related technical discipline (or equivalent experience), MS preferred

  • 10+ years of experience developing software for large scale distributed environments according to industry standard best DevOps practices

  • Deep understanding of the modern network stack and protocols

  • Hands-on experience developing secure and performant API-driven services (gRPC, REST with transport encryption and strong authentication)

  • Background in private cloud/large distributed systems architecture design.

  • Experience with modern data center servers and network equipment (out-of-band management, provisioning, monitoring - IPMI, Redfish, zero-touch provisioning)

  • Hands-on experience with SDN - OpenFlow, Open vSwitch or equivalent solutions

  • Hands-on experience with one or more SDN solutions (control and data planes)

Ways to stand out from the crowd:

  • Experience with RDMA (InfiniBand or RoCE) protocols and with fabric design and deployment

  • Understanding of container networking (CNI) APIs and implementations

  • SRE/DevOps: top-level expertise

  • Hands-on background with Tier 1 CSPs (AWS, Azure and others) services and tools

  • Hands-on experience with networking hardware acceleration

Find your dream high-tech job with Expoint. Using our platform, you can easily search for Network Site Reliability Engineer opportunities at Nvidia in United Kingdom, Southampton. Whether you are looking for a new challenge or want to work with a specific organization in a particular role, Expoint makes it easy to find the perfect job match for you. Connect with leading companies in your area today and advance your high-tech career! Sign up today and take the next step in your career journey with Expoint.