Required/Minimum Qualifications:
- Bachelor’s degree in computer science or a related technical discipline AND 4+ years of technical engineering experience building web services in languages including, but not limited to, Python, C#, C++, Rust, or Java
- OR equivalent experience.
- 4+ years of experience working with high-scale training clusters (e.g., frameworks and tools such as NVIDIA InfiniBand clusters, SLURM, Kubernetes, or Ray).
- 4+ years of experience building scalable services on top of public cloud infrastructure such as Azure, AWS, or GCP.
Preferred Qualifications:
- Experience with LLM training clusters.
- Experience working with AI platforms, frameworks, and APIs.
- Experience using machine learning frameworks, including using, deploying, and scaling large language models, either personally or professionally.
- Ability to identify, analyze, and resolve complex technical issues, ensuring optimal performance, scalability, and user experience.
- Dedication to writing clean, maintainable, and well-documented code with a focus on application quality, performance, and security.
- Demonstrated interpersonal skills and ability to work closely with cross-functional teams, including product managers, designers, and other engineers.
- Ability to clearly communicate complex technical concepts to both technical and non-technical stakeholders.
- Passion for learning new technologies and staying up to date with industry trends, best practices, and emerging technologies in web development and AI.
- Ability to work in a fast-paced environment, manage multiple priorities, and adapt to changing requirements and deadlines.
- Proven ability to collaborate and contribute to a positive, inclusive work environment, fostering knowledge sharing and growth within the team.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: