Develop deeply optimized deep learning kernels for inference.
Conduct performance analysis and modeling to identify the performance limiters of the current software stack and the underlying hardware architecture.
Collaborate with different teams to improve both the software and architectures to extend the state of the art in performance, efficiency, reliability, and programmability.
Work with cross-functional teams across automotive, image understanding, speech understanding, and large language models to develop creative solutions.
Pursuing a BS or higher degree in Computer Engineering, Computer Science, Electrical Engineering, or a related computing-focused field.
Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design.
Python experience is a plus.
Experience with performance modeling, profiling, debugging, and code optimization, or architectural knowledge of CPUs and GPUs.
GPU programming experience (CUDA or OpenCL).
Experience working with deep learning frameworks like TensorFlow or PyTorch.
Strong curiosity about artificial intelligence and awareness of the latest developments in LLMs, generative models, and recommender models.