Principal Applied Science Manager

Microsoft
United States, Texas, Irving
7000 State Highway 161
Aug 20, 2025
Overview

Be at the forefront of AI evaluation by joining the Copilot Offline Evaluation Platform team, and help us deliver the platform that makes Copilot innovation fast, reliable, and regression-free.

As a Principal Applied Science Manager, you will lead a high-impact team of Applied Scientists focused on transforming how Copilot features are evaluated and improved at scale. Your team will drive end-to-end experimentation, offline evaluation, and actionable insights that empower Copilot engineers, product managers, and fellow scientists to deliver world-class AI experiences. You will partner closely with engineering and PM leaders to build a robust data generation platform that simulates realistic user behaviors, curates representative datasets, and develops comprehensive query sets and evaluation tooling. Your team will define and implement metrics that go beyond traditional accuracy, capturing nuance, user intent, and satisfaction. This is a unique opportunity to shape how AI quality is measured across Microsoft Copilot and to accelerate your leadership journey in one of the most dynamic areas of applied AI.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

Driving scientific vision and strategy for Copilot offline evaluation, ensuring alignment with product goals such as parity with ChatGPT/Glean, safe deployment, and regression prevention.
Leading a team of scientists in the development of metrics, experiments, and simulation pipelines that reflect user preferences and behavioral fidelity at scale.
Mentoring and growing talent, fostering a culture of scientific rigor, innovation, and impact, and helping scientists balance research, engineering, and business value.
Partnering with engineering and PM leaders to translate business needs into scalable, data-driven evaluation frameworks and tools.
Championing metric quality by establishing evaluation standards that are trusted across Copilot product teams and correlate with real user experience.
Overseeing simulation and synthetic data efforts, ensuring that LLM-generated or agent-driven user activity covers key product scenarios and stress conditions.
Communicating scientific insights and strategy clearly to technical and non-technical stakeholders, including leadership, product teams, and partner orgs.