Developing Python libraries to run, monitor, measure, and troubleshoot deep learning workflows running on Dojo
Programming in Python approximately 50% of the time; the remaining time is spent debugging, experimenting, and investigating
Writing tests to guarantee correctness at every level of the Dojo stack, from high-level PyTorch integration, through every stage of the ML compiler stack, down to our custom hardware
Ensuring that neural networks of interest to our users function correctly on Dojo at the expected high performance
Writing tools to run, monitor, measure, and troubleshoot deep learning workflows running on Dojo
Writing benchmarking and reporting tools
Supporting in-house users: triaging errors and performance bottlenecks, driving issues to root cause, and providing workarounds and fixes
Anticipating likely use cases; solving problems before our users hit them
What You’ll Bring
Degree in Engineering, Computer Science, or equivalent experience and evidence of exceptional ability in related fields, with practical software engineering experience
Strong proficiency with Python and comfort with C++
Highly familiar with Linux administration and internals