Maintain large-scale HPC/AI clusters with monitoring, logging, and alerting. Manage Linux job/workload schedulers and orchestration tools. Develop and maintain continuous integration and delivery pipelines. Develop tooling to automate deployment...
Primary responsibilities will include building AI/HPC infrastructure for new and existing customers. Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, real-time monitoring, logging, and...
Develop HPC solutions using NVIDIA SDKs like Modulus and CUDA for physics simulations. Design GPU-accelerated workflows for complex simulations in optics, electromagnetics, and fluid dynamics. Deploy AI models for physics...
Support Higher Education and Research customers, driving research, influence, and new business with those customers and in academic collaborations. Work with professors and researchers on their GPU computing research and assist customers in building innovative solutions. We make heavy...