⚠️ This blog post was created with the help of AI tools. Yes, I used a bit of magic from language models to organize my thoughts and automate the boring parts, but the geeky fun and the 🤖 in C# are 100% mine.

Hi!

There’s a recurring debate every time we talk about AI agents:

Should I use Python or .NET?

After one too many conversations full of strong opinions and zero data, I decided to stop guessing and start measuring. And hey, this all started as a fun conversation with friends, and in the end… it was the perfect excuse to put GitHub Copilot to work on a Saturday morning.

Beyond the 8-minute video, here's more data:


Why this repo exists

This project started after a casual chat with friends.
The usual arguments came up:

  • “Python is faster for AI.”
  • “.NET scales better.”
  • “It depends.”

All of those statements can be true — depending on the workload.

So instead of debating, I put Copilot to work and built MAF-PerformanceComparison: a small, reproducible way to compare Microsoft Agent Framework implementations in Python and .NET, using the same model, the same workload, and the same metrics.


What is being measured

Each test run executes the same agent workflow and captures:

  • Average time per iteration
  • Minimum and maximum iteration time (to see spikes and jitter)
  • Memory usage

The tests are executed with a local Ollama setup using the ministral-3 model, and scaled across different iteration counts:

  • 100
  • 500
  • 1,000
  • 5,000
  • 10,000

This makes it easy to observe how behavior changes as workloads grow.
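
If you're curious what that timing loop looks like, here's a minimal C# sketch of the kind of harness involved. To be clear, this is not the repo's actual code: CallAgentAsync is a placeholder for a real Microsoft Agent Framework call against local Ollama, and the snippet assumes a .NET 6+ console app with implicit usings.

```csharp
// Minimal benchmark sketch: time each iteration, then report avg/min/max and memory.
using System.Diagnostics;

const int iterations = 100; // scale to 500, 1_000, 5_000, 10_000...
var timings = new List<double>(iterations);

for (var i = 0; i < iterations; i++)
{
    var sw = Stopwatch.StartNew();
    await CallAgentAsync();
    sw.Stop();
    timings.Add(sw.Elapsed.TotalMilliseconds);
}

Console.WriteLine($"avg: {timings.Average():F2} ms");
Console.WriteLine($"min: {timings.Min():F2} ms | max: {timings.Max():F2} ms");
Console.WriteLine($"managed memory: {GC.GetTotalMemory(forceFullCollection: true) / 1024.0 / 1024.0:F1} MB");

// Stand-in so the sketch runs on its own; swap in the real agent call here.
static Task CallAgentAsync() => Task.Delay(10);
```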


What you’ll find in the repo

The repo is structured so you can quickly explore or reproduce the results:

  • tests_results/
    • One folder per iteration size
    • Raw metrics in JSON format
    • A side-by-side comparison report
    • A short analysis report with insights

You can start small (10 iterations for a quick demo) and scale up without changing the code.
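
As a rough idea of what one of those raw metrics records could look like, here's a hypothetical C# shape. The field names are mine and purely illustrative; the actual JSON schema is in the repo's tests_results/ folders.

```csharp
// Hypothetical shape of one benchmark run's metrics (illustrative only;
// check the raw JSON files in tests_results/ for the real field names).
public record RunMetrics(
    string Runtime,      // "dotnet" or "python"
    string Model,        // e.g. "ministral-3"
    int Iterations,      // 10, 100, 500, 1000, 5000, 10000
    double AvgMs,        // average time per iteration
    double MinMs,        // fastest iteration
    double MaxMs,        // slowest iteration (spikes / jitter)
    double MemoryMb);    // memory usage for the run
```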

👉 GitHub repository:
https://github.com/elbruno/MAF-PerformanceComparison/


Key takeaways

From these runs, a few patterns emerge:

  • Python performs very well for short runs and prototyping, with low latency and smooth behavior.
  • .NET tends to perform better for long-running workloads, especially when looking at average latency and memory usage at scale.
  • There is no universal winner — context matters.

Your hardware, your model, and your concurrency level will influence the outcome more than language loyalty.
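
If you want to poke at the concurrency point yourself, here's a tiny sketch that pushes several requests through at once instead of one at a time. Same caveats as before: RunAgentAsync is a stand-in for a real agent call, and .NET 6+ implicit usings are assumed.

```csharp
// Fire off several agent calls at once and time the batch as a whole.
using System.Diagnostics;

var concurrency = 8; // try different values to see how your setup copes
var sw = Stopwatch.StartNew();

var tasks = Enumerable.Range(0, concurrency).Select(_ => RunAgentAsync());
await Task.WhenAll(tasks);

sw.Stop();
Console.WriteLine($"{concurrency} concurrent calls: {sw.ElapsedMilliseconds} ms total");

// Stand-in for a real Microsoft Agent Framework call so the sketch runs on its own.
static Task RunAgentAsync() => Task.Delay(50);
```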


Measure first, then decide

This repo is not about proving one runtime “wins”.
It’s about giving you a simple way to measure performance on your own setup and make informed decisions. And, of course, it’s a starting point for evolving this into proper performance measurement on both platforms.

And hey, this triggered a personal note to self: learn more about performance metrics in a scenario like this one. A fun challenge for 2026!

Happy coding!

Greetings

El Bruno

More posts on my blog: ElBruno.com.

More info at https://beacons.ai/elbruno

