Expoint – all jobs in one place

Microsoft Lead AI Software Architect 
Taiwan, Taoyuan City 
446929729

02.09.2025

Required/minimum qualifications

  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • 7+ years of industry experience, with at least 5 years in AI inference software stack development and architecture.
  • 5+ years of experience in designing and optimizing software stacks for specialized AI hardware, including accelerators, GPUs, or custom ASICs.
  • 3+ years of experience building infrastructure and identifying opportunities for end-to-end performance/TCO optimization for business-critical AI workloads.
  • 3+ years of experience with AI inference frameworks and compiler toolchains such as TensorRT, ONNX Runtime, MLIR, or similar.

Other Requirements

  • The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications

  • Familiarity with open-source AI inference software stacks such as vLLM, Dynamo, and SGLang.
  • Experience contributing to open-source AI frameworks or compiler projects.
  • Previous experience in leading the AI software stack for an early-stage hardware startup or novel hardware project.
  • Publications, patents, or other recognized contributions in the field of AI inference software architecture or acceleration.
  • Exceptional leadership, communication, and collaboration skills with a proven track record of guiding technical teams.
  • Excellent understanding of hardware-software interaction, memory hierarchies, compute kernels, and data movement optimization.
  • Proficiency in C++, Python, and experience with low-level programming, performance optimization, and system-level integration.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

Microsoft will accept applications for the role until September 9th, 2025.


Responsibilities

  • Lead the software architectural design, development, and deployment of the future AI inference infrastructure optimized for Microsoft’s AI cloud.
  • Collaborate closely with the hardware architecture, compiler, systems, and simulation/performance optimization teams to ensure seamless integration and optimized performance.
  • Define and execute strategies for inference, cost optimization, workload balancing, and memory optimization.
  • Mentor and guide the software engineering team, setting clear technical directions and providing architectural oversight.
  • Evaluate, select, and integrate third-party libraries and open-source frameworks (e.g., TensorRT, TVM, PyTorch, ONNX) for optimized inference performance.
  • Act as a technical liaison between hardware engineers and software teams to communicate requirements, constraints, and opportunities for co-design.
  • Identify performance bottlenecks and opportunities that intersect with future hardware and system roadmap planning, and use them to influence strategic direction.
  • Ensure robust software quality and implement best practices for software engineering, testing, and continuous integration.