Inside Microsoft’s Core AI Strategy: Infrastructure, Security, and the Future of AI Workflows
Summary
Overview
Jay Parikh, Executive Vice President of Core AI at Microsoft, discusses how the company is reshaping AI infrastructure, security, and the developer experience. The conversation covers the formation of the Core AI team, the push for in‑person collaboration, role convergence, data‑center constraints, model efficiency, open‑ vs. closed‑source choices, and emerging security threats.
Building the Core AI Team
- Origins: Core AI was formed in early 2025 and publicly showcased at Microsoft Build in May.
- Mission: Provide a unified stack that helps builders, developers, and enterprises create, deploy, and monitor AI agents.
- Key Components:
  - Foundry/Agent Factory – the platform where AI agents are built and observed.
  - Security‑by‑Design – trust and compliance baked into every layer because agents are non‑deterministic.
  - Flexible Deployment – workloads run in the cloud, on the edge, or on‑premises depending on geography and sector.
Why Return to the Office?
- Rapid Innovation: AI tools evolve weekly; face‑to‑face interaction accelerates learning and idea sharing.
- Collaboration on Prompts: Teams can quickly iterate on prompt engineering, context scaffolding, and complex task design.
- Cultural Transformation: The shift to AI‑augmented work requires continuous mentorship, coaching, and knowledge transfer that is most effective in person.
AI‑Powered Role Convergence
- Blurring Boundaries: Engineers, designers, product managers, and even low‑level system staff can now prototype UI, fix bugs, or generate code using AI assistants.
- Two User Archetypes:
  - Amazed Users – low expectations, use AI sparingly, often surprised by results.
  - Frustrated Power Users – high expectations, push AI to complex tasks, iterate on models, context‑engineer, and fine‑tune.
- Outcome: More functions across the company can participate in the full software lifecycle—from concept to deployment.
Data‑Center Constraints & GPU Utilization
- Power vs. GPU: In the U.S., power availability is becoming a tighter bottleneck than GPU supply; some regions face moratoriums on new data‑centers.
- System‑Level Scaling: AI agents generate many non‑GPU calls (storage, networking, CPU), increasing overall infrastructure demand.
- Optimization Efforts:
  - Continuous profiling of CPU, GPU, memory, and bandwidth.
  - Leveraging diverse workloads (Microsoft 365, GitHub, third‑party customers) to improve efficiency.
  - Partnerships to secure additional power and hardware capacity.
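The profiling effort described above boils down to tracking utilization per resource and finding the binding constraint before scaling. A minimal sketch, with made‑up sample values (the numbers and resource names are illustrative, not Microsoft telemetry):

```python
# Toy bottleneck check: utilization is the fraction of each resource in use.
# The sample values below are invented for illustration only.
samples = {"cpu": 0.42, "gpu": 0.91, "memory": 0.55, "network": 0.30}

def bottleneck(utilization: dict) -> str:
    """Return the most-saturated resource, i.e. the one to optimize or scale first."""
    return max(utilization, key=utilization.get)

print(bottleneck(samples))  # gpu
```

In practice this runs continuously over streaming telemetry, but the decision logic is the same: identify the resource closest to saturation and direct optimization there.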
Model Efficiency and Routing
- Cost‑Latency Trade‑offs: Enterprises use large frontier models for high‑value tasks and smaller, fine‑tuned models for routine workloads.
- Model Router: A Microsoft service that selects the optimal model based on cost, speed, or quality preferences, removing the decision burden from customers.
- Enterprise‑Specific Fine‑Tuning: Companies can bring their own data to open‑source or proprietary models, improving ROI and handling domain‑specific challenges.
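The routing idea above can be sketched as a simple preference‑based selector. Everything in this sketch (the model names, prices, latencies, and the `route` function) is a hypothetical illustration of the concept, not Microsoft's Model Router implementation:

```python
from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str
    cost_per_1k_tokens: float  # USD; illustrative numbers
    latency_ms: float          # typical response latency; illustrative
    quality: float             # 0..1 benchmark-style score; illustrative

# Hypothetical catalog: one frontier model, one small fine-tuned model.
CATALOG = [
    ModelProfile("frontier-large", cost_per_1k_tokens=0.0150, latency_ms=900, quality=0.95),
    ModelProfile("small-finetuned", cost_per_1k_tokens=0.0004, latency_ms=120, quality=0.78),
]

def route(preference: str) -> ModelProfile:
    """Pick a model by the caller's stated preference: 'cost', 'speed', or 'quality'."""
    if preference == "cost":
        return min(CATALOG, key=lambda m: m.cost_per_1k_tokens)
    if preference == "speed":
        return min(CATALOG, key=lambda m: m.latency_ms)
    if preference == "quality":
        return max(CATALOG, key=lambda m: m.quality)
    raise ValueError(f"unknown preference: {preference}")

print(route("cost").name)     # small-finetuned
print(route("quality").name)  # frontier-large
```

The point of such a router is that the customer states an intent (cheap, fast, or best) and the platform absorbs the model‑selection decision.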
Open vs. Closed Source Models
- Choice Over Dogma: Microsoft supports >11,000 models in its Foundry platform, allowing customers to pick open‑source, closed‑source, or custom models.
- Advisory Approach:
  - Assess the customer’s current AI maturity and business goals.
  - Provide proof‑points from similar deployments.
  - Offer packaged solutions and partner‑driven guidance.
- Future Outlook: No single “omni‑model” will dominate soon; diverse problem domains (healthcare, finance, climate) demand specialized models.
Security, Trust, and Attack Vectors
- Unknown Threats: The biggest concern is attacks that have not yet been identified; mitigation focuses on rapid detection and response.
- Built‑In Controls:
  - Every AI agent receives an Entra ID for policy enforcement and auditability.
  - Fine‑grained access controls, compliance tracking, and the ability to deactivate rogue agents.
  - End‑to‑end observability of tool calls, data accessed, and human‑in‑the‑loop approvals.
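The control pattern above (per‑agent identity, explicit grants, auditable decisions, deactivation) can be illustrated with a generic sketch. The field names and logic here are hypothetical stand‑ins, not the Entra ID schema or API:

```python
from dataclasses import dataclass, field

@dataclass
class AgentIdentity:
    # Stand-in for a directory-issued agent identity (e.g. an Entra ID object);
    # the fields below are illustrative, not the real schema.
    agent_id: str
    allowed_tools: set = field(default_factory=set)
    active: bool = True
    audit_log: list = field(default_factory=list)

def authorize_tool_call(agent: AgentIdentity, tool: str) -> bool:
    """Allow a tool call only for active agents with an explicit grant; log every decision."""
    allowed = agent.active and tool in agent.allowed_tools
    agent.audit_log.append((agent.agent_id, tool, "allow" if allowed else "deny"))
    return allowed

agent = AgentIdentity("agent-001", allowed_tools={"search_docs"})
print(authorize_tool_call(agent, "search_docs"))  # True
agent.active = False  # deactivating a rogue agent cuts off all tool access
print(authorize_tool_call(agent, "search_docs"))  # False
```

Note that denied calls are logged as well as allowed ones; that audit trail is what makes non‑deterministic agents observable after the fact.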
- AI‑Assisted Hacking: Parikh acknowledges the risk of AI being used for model‑weight theft, data poisoning, or autonomous hacking, reinforcing the need for proactive security design.
A Contrarian Take on AI Metrics
- Lines of Code Myth: Measuring AI impact by the number of generated code lines is meaningless.
- Focus on Outcomes: Real value lies in reducing technical debt, accelerating product cycles, and enabling tasks that were previously infeasible.
Looking Ahead
Microsoft’s Core AI team is positioning the company to:
- Deliver a vertically integrated stack that abstracts complexity for developers.
- Enable rapid, secure, and cost‑effective AI deployment across cloud, edge, and on‑premises.
- Foster a culture of continuous learning and collaboration, both in‑person and through AI‑augmented tools.
Microsoft’s Core AI strategy combines a unified, security‑first platform, flexible deployment models, and a strong emphasis on collaboration to accelerate AI adoption while managing power constraints and evolving security threats.