⚠️ This blog post was created with the help of AI tools. Yes, I used a bit of magic from language models to organize my thoughts and automate the boring parts, but the geeky fun and the 🤖 in C# are 100% mine.

Hi!

If you’re running local LLMs using:

  • Foundry Local
  • Ollama
  • Models like Phi, Qwen, or Llama

At some point you’ll want to experiment with RAG (Retrieval-Augmented Generation). And that means one thing: 👉 You need embeddings.

In production, you’ll probably rely on managed services and vector databases.
But during experimentation?

You just want something simple. That’s why I created:

📦 ElBruno.LocalEmbeddings
NuGet: https://www.nuget.org/packages/ElBruno.LocalEmbeddings
Repo: https://github.com/elbruno/elbruno.localembeddings/

It allows you to generate embeddings locally in .NET with minimal setup.

Let’s take a look.


🚀 1️⃣ Generate a Single Embedding

Install the package:

dotnet add package ElBruno.LocalEmbeddings

Then use it:

using ElBruno.LocalEmbeddings;

// Create the generator with default settings
var generator = new LocalEmbeddingGenerator(
    new ElBruno.LocalEmbeddings.Options.LocalEmbeddingsOptions());

// Generate a single embedding
var embedding = await generator.GenerateEmbeddingAsync("Hello, world!");
Console.WriteLine($"Dimensions: {embedding.Vector.Length}"); // 384

// Don't forget to dispose when done
generator.Dispose();

That’s it. The first time it runs, it downloads the model locally and caches it.
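Since the sample above disposes the generator explicitly, it should also work with a using declaration so cleanup happens automatically when the scope ends. A small sketch of the same flow under that assumption:

using ElBruno.LocalEmbeddings;

// Same flow, but the generator is disposed automatically at the end of the scope
using var generator = new LocalEmbeddingGenerator(
    new ElBruno.LocalEmbeddings.Options.LocalEmbeddingsOptions());

var embedding = await generator.GenerateEmbeddingAsync("Hello, world!");
Console.WriteLine($"Dimensions: {embedding.Vector.Length}");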


📚 2️⃣ Generate Multiple Embeddings (Batch Mode)

You can also generate embeddings in batch:

var texts = new[]
{
    "First document",
    "Second document",
    "Third document"
};

var embeddings = await generator.GenerateAsync(texts);

for (int i = 0; i < texts.Length; i++)
{
    Console.WriteLine($"Embedding for '{texts[i]}': [{string.Join(", ", embeddings[i].Vector.ToArray().Take(3))}]");
}

Each text gets its own vector.

This is the foundation for:

  • Semantic search
  • Similarity comparison
  • Document ranking
  • RAG pipelines

🧠 Bonus: Cosine Similarity Example

Once you have vectors, you can compare them with cosine similarity. Check the main console sample in the repo to learn more, or see the small sketch below.
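Here is a minimal cosine similarity sketch in plain C#, assuming Vector is a ReadOnlyMemory<float> as the snippets above suggest (this is helper code, not part of the package API):

// Cosine similarity: dot(a, b) / (|a| * |b|). Values closer to 1 mean more similar.
static float CosineSimilarity(ReadOnlyMemory<float> a, ReadOnlyMemory<float> b)
{
    var sa = a.Span;
    var sb = b.Span;
    float dot = 0f, magA = 0f, magB = 0f;
    for (int i = 0; i < sa.Length; i++)
    {
        dot += sa[i] * sb[i];
        magA += sa[i] * sa[i];
        magB += sb[i] * sb[i];
    }
    return dot / (MathF.Sqrt(magA) * MathF.Sqrt(magB));
}

// Compare the first two documents from the batch example
var score = CosineSimilarity(embeddings[0].Vector, embeddings[1].Vector);
Console.WriteLine($"Similarity: {score:F4}");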

This is exactly what you need to start building a minimal local RAG system.
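To make that concrete, here is a rough retrieval step built only on the calls shown above plus the CosineSimilarity helper (the document texts and the query are just made-up examples): embed the documents once, embed the question, and keep the closest matches.

// Hypothetical retrieval step for a tiny local RAG pipeline
var documents = new[]
{
    "Ollama runs LLMs locally",
    "Foundry Local runs models on your machine",
    "Bananas are rich in potassium"
};

var docEmbeddings = await generator.GenerateAsync(documents);
var queryEmbedding = await generator.GenerateEmbeddingAsync("How do I run a model locally?");

// Rank the documents by similarity to the query
var ranked = documents
    .Select((text, i) => (Text: text, Score: CosineSimilarity(queryEmbedding.Vector, docEmbeddings[i].Vector)))
    .OrderByDescending(x => x.Score);

foreach (var (text, similarity) in ranked)
{
    Console.WriteLine($"{similarity:F4}  {text}");
}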


💡 Why This Exists

This package is not trying to replace production infrastructure like Microsoft Foundry managed services.

It’s about:

  • Learning faster
  • Prototyping faster
  • Experimenting locally
  • Removing friction

If you’re running local LLMs with Foundry Local or Ollama and just want to test RAG — this helps.


📦 Try It Today

Install it from NuGet and start experimenting:

dotnet add package ElBruno.LocalEmbeddings

NuGet: https://www.nuget.org/packages/ElBruno.LocalEmbeddings
Repo: https://github.com/elbruno/elbruno.localembeddings/

🎥 Want to See It in Action?

I recorded a 10-minute video walkthrough where I:

  • Explain embeddings in simple terms
  • Show both code demos
  • Compare similarity results
  • Talk about using it with Foundry Local and Ollama
  • Position it for experimentation vs production

👉 Watch the video here:

Let me know what you’d like next:

  • A minimal local RAG starter template
  • Full Foundry Local + Embeddings demo
  • Ollama + RAG end-to-end sample
  • This running in a Raspberry Pi (why not?)

Let’s build smarter AI in .NET — with less friction 💙

Happy coding!

Greetings

El Bruno

More posts on my blog: ElBruno.com.

More info at https://beacons.ai/elbruno

