NVIDIA System Software Engineer Intern, Apache Spark Solutions - Shanghai, China
111218319

01.12.2024

We are seeking System Software Engineer interns to accelerate Apache Spark and related distributed frameworks on GPUs. Apache Spark is the most popular data processing engine in data centers for data science. It is used for a wide variety of data workloads, from data preparation, to running ML experiments, and all the way to deployment of ML applications. Data scientists spend a considerable amount of time exploring data and iterating over machine learning (ML) experiments. Every hour of compute required to sort through datasets, extract features and fit ML algorithms impedes an efficient business workflow.

At NVIDIA, we are passionate about working on hard problems that have an impact. You will work with the open source community to enable Apache Spark data processing with GPUs. As an intern, you will be paired with a senior engineer and assigned one or more projects to further the performance and functionality of our software. The projects will give you an opportunity to work with modern functional programming and modern C++. We will benchmark solutions and measure performance against theoretically optimal results. Data workflows can benefit tremendously from being accelerated, enabling data scientists to explore many more and larger datasets to achieve their business goals faster and more efficiently.

What you'll be doing:

  • Creating a collection of GPU-accelerated libraries for data processing, data analytics, and ML

  • Designing and implementing solutions to enhance Apache Spark for GPU-aware scheduling, distributed ML execution, and beyond

  • Engaging with open source communities, including Apache Spark, for technical discussions and contributions

  • Working with NVIDIA strategic partners to deploy sophisticated machine learning and data analytics solutions in public cloud or on-premises clusters

  • Presenting technical solutions at industry conferences and meetups

What we need to see:

  • Pursuing a BS, MS, or PhD in Computer Science, Computer Engineering, or a closely related field

  • Outstanding problem-solving skills

  • Excellent programming skills in C++, Java, and/or Scala

  • Working experience with key open source big-data projects including Apache Spark, Apache Hadoop, Apache Flink, and Apache Kafka

  • Able to work successfully with multi-functional teams across organizational boundaries and geographies

  • Highly motivated with strong communication skills

Ways to stand out from the crowd:

  • Prior contributions to major open source distributed systems projects would be a huge plus

  • Working experience with acceleration libraries (CUDA, RAPIDS, UCX) is helpful