
Microsoft Research Intern - AI Systems Architecture 
United States, California, Mountain View 
Job ID: 639438459 | 07.01.2025
Required Qualifications
  • Accepted into or currently enrolled in a PhD program in Computer Science or a related STEM field.
  • At least 1 year of experience with performance analysis for AI accelerators.

Other Requirements

  • Research Interns are expected to be physically located at their manager’s Microsoft worksite for the duration of their internship.
  • In addition to the listed qualifications, you’ll need to submit a minimum of two reference letters, a cover letter, and any relevant work or research samples for this position. After you submit your application, a request for letters may be sent to your list of references on your behalf. Note that reference letters cannot be requested until after you have submitted your application, and they might not be automatically requested for all candidates. You may wish to alert your letter writers in advance so they will be ready to submit your letter.
Preferred Qualifications
  • Ability to collaborate effectively with other researchers and product development teams.
  • Strong interpersonal skills, including cross-group and cross-cultural collaboration.
  • Ability to think unconventionally to derive creative and innovative solutions.

Certain roles may be eligible for benefits and other compensation.

Additional Responsibilities
  • Develop and contribute to an in-house performance modeling tool for large-scale machine learning systems.
  • Evaluate ideas for performance improvement, including bottleneck analysis and feature enhancement.
  • Build a framework for running large-scale parallel performance simulations on cloud-based compute infrastructure.
  • Develop a testing framework and testbenches that enable operator-level unit tests and end-to-end application tests for the performance model.
  • Integrate the performance model with power and TCO models to project application-level Perf/W and Perf/$ metrics across workloads.
  • Develop a cloud-based performance simulation database for storing large-scale data from design-space exploration experiments.
  • Develop a data-analytics framework, along with debug tools and automation, for easier retrieval of performance data based on user queries.
  • Develop and maintain performance dashboards and visualization tools to improve the analysis framework.
  • Formalize and improve general software development practices, including codebase maintenance, code review, feature development, and software design reviews.
  • Integrate a CI/CD pipeline into the Azure DevOps software development process.
  • Troubleshoot and debug common performance bottlenecks and develop performance comparison tools.
  • Collaborate with the larger team to define product requirements, feature improvements, and implementation plans.