Expoint – all jobs in one place

Microsoft Applied Scientist 
Taiwan, Taoyuan City 
101590605

Qualifications

Basic Qualifications:
Master’s degree or above (or equivalent experience) in Computer Science, Engineering, Mathematics, Physics, or a related field.
Strong programming skills with hands-on experience in managing large-scale data and machine learning pipelines.
Deep understanding of open-source ML frameworks such as PyTorch, vLLM, and TensorRT-LLM (TRT-LLM).
Solid knowledge of model optimization techniques for efficient inference, including quantization and pruning.

Preferred Qualifications:
1+ years of experience optimizing LLM inference using frameworks like vLLM or TRT-LLM.
Practical experience in model compression and deployment within production systems.
Experience designing agentic AI systems, such as multi-agent orchestration, tool usage, planning, and reasoning.


Responsibilities

Model Optimization & Deployment:
Design and implement efficient workflows for training, distillation, and fine-tuning of small and large language models (SLMs and LLMs), leveraging techniques such as LoRA, QLoRA, and instruction tuning.
Apply model compression strategies—including quantization (e.g., GPTQ, AWQ) and pruning—to reduce inference costs and improve latency.
Optimize LLM inference performance using frameworks like vLLM and TensorRT-LLM (TRT-LLM) to enable scalable, low-latency deployment.
Build robust and scalable inference systems tailored to heterogeneous production environments, with a strong focus on performance, cost-efficiency, and stability.