Expoint – all jobs in one place

Microsoft Senior Applied Scientist AI Data Platform CoreAI 
Taiwan, Taoyuan City 
104623287

datasets that power model

is to build a AI model with secure, reusable, and compliant datasets.


Required Qualifications

  • Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 4+ years related experience (e.g., statistics, predictive analytics, research) OR Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 3+ years related experience (e.g., statistics, predictive analytics, research) OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 1+ year(s) related experience (e.g., statistics, predictive analytics, research) OR equivalent experience.
  • 2+ years of experience applying machine learning or data science in practical settings.
  • Programming skills in Python and ML frameworks (e.g., PyTorch, TensorFlow, Scikit-learn).
  • Experience with data analysis, dataset design, or evaluation methodologies.

Other Requirements:

Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include but are not limited to the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years.

Preferred Qualifications

  • Master’s degree or PhD in Computer Science, Machine Learning, Statistics, or related field, or equivalent experience.
  • 4+ years of experience applying machine learning or data science in practical settings.
  • Experience with LLM training pipelines, synthetic data generation, or data-centric AI approaches.
  • Knowledge of PII detection, data privacy, fairness, or compliance in AI systems.
  • Familiarity with distributed data systems (e.g., Spark, Databricks, Azure Data Lake).
  • Strong collaboration skills with engineers, TPMs, and product partners across multiple orgs.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

CoreAI

Responsibilities

  • Advance machine learning and data science to improve data quality, automate dataset generation, and design intelligent agent-driven services that manage the end-to-end data lifecycle.
  • Develop ML-based pipelines for data generation, validation, augmentation, and discovery (e.g., synthetic data, human-in-the-loop workflows).
  • Design and train intelligent agents to automate key parts of the dataset lifecycle, including ingestion, validation, PII detection and handling, governance, discovery, and feedback loops.
  • Build evaluation methods to measure dataset quality, coverage, and usefulness for large-scale model training.
  • Leverage AI/ML techniques (e.g., classification, clustering, anomaly detection, embeddings, LLM-based evaluation) to improve data discovery, curation, and governance.
  • Collaborate with engineers to integrate scientific methods and models into scalable pipelines and platform services.
  • Partner with AI product and research teams (CoreAI, MAI, M365, GitHub, MSR, and more) to align datasets with model training needs and identify new opportunities.
  • Contribute thought leadership