Practical AI in SQL Server 2025: Ollama Quick Start

With the announcement of SQL Server 2025 Public Preview, hopefully you are interested in test driving Vector Search.

Microsoft has already posted demo code, but it’s only for Azure OpenAI, and many of us are wondering about running things locally. So I thought I’d share a step-by-step guide to getting Ollama set up and running locally on my laptop. End-to-end, these instructions should take less than 30 minutes to complete.

If you’re on the Docker bandwagon, jump over to Anthony Nocentino’s fast start with Docker Compose instead!

My Setup

My laptop is a ThinkPad P14s Gen 5 AMD with a Ryzen 7 PRO 8840HS, 64GB of RAM, and 4TB of storage, running Windows 11 Pro. I also have a Razer Core X eGPU with an NVIDIA RTX 3070 connected to my laptop.

I run VMware Workstation on my laptop, so I have multiple SQL Server VMs, including a fresh one for SQL Server 2025. I want Ollama to use my eGPU, but because VMware Workstation does not support GPU passthrough (to my knowledge), I have to install Ollama on my base OS. As such, you’ll see some extra steps here that can be skipped if you’re just doing everything either locally or within a VM.

You’ll need:

* The Ollama for Windows installer
* nginx for Windows (the zip download)
* mkcert
* A SQL Server 2025 instance to test from

Ollama Setup

First, just run the Ollama installer exe and click through the wizard. That’s easy!

Open a PowerShell prompt and run this:

ollama run llama3.2

This will download the llama3.2 model and begin an interactive session after it completes setup of the model. You can start typing and have a conversation with the model!


Once you’re done, you can exit interactive mode (type /bye), but Ollama should still remain running in the background. Verify this by checking your System Tray for Ollama.

Now let’s confirm connectivity. Run this in the same PowerShell prompt:

curl http://localhost:11434
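If Ollama is up, the response body is the plain-text message “Ollama is running”. In Windows PowerShell, where curl is an alias for Invoke-WebRequest, you can pull the body out directly as a quick sanity check:

(curl http://localhost:11434).Content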

Next, you’ll want to pull down an embedding model. There are several to choose from; I’m choosing nomic-embed-text.

ollama pull nomic-embed-text
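You can confirm the model downloaded with a quick listing of your local models:

ollama list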

Now let’s test again. Run this in your PowerShell Prompt:

Invoke-RestMethod -Uri http://localhost:11434/api/embeddings `
    -Method Post `
    -Body '{"model": "nomic-embed-text", "prompt": "This is a test sentence to embed."}' `
    -ContentType 'application/json'
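If everything is wired up, the response contains an embedding property holding the vector. You can sanity-check its size too; nomic-embed-text should return 768 dimensions (a quick check, capturing the same call into a variable):

$resp = Invoke-RestMethod -Uri http://localhost:11434/api/embeddings `
    -Method Post `
    -Body '{"model": "nomic-embed-text", "prompt": "This is a test sentence to embed."}' `
    -ContentType 'application/json'
$resp.embedding.Length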

NOTE: If you’re browsing other embedding models like mxbai-embed-large, you may see that there’s a curl REST API example command. When I was first playing around, I kept getting a “405 Method Not Allowed” error with that example command. This is because curl in PowerShell is an alias for Invoke-WebRequest, not the Linux curl that the doc page assumes, so the arguments don’t translate. I just used Invoke-RestMethod instead.

Next, we must address one annoyance that I have with Ollama – it only listens on HTTP, not HTTPS. SQL Server will only communicate with external endpoints over HTTPS (and rightfully so). To resolve this, we must set up a reverse proxy that terminates TLS and forwards traffic to Ollama. This is where nginx and mkcert come into play.

mkcert Setup

Again in your PowerShell Prompt, install mkcert’s local root certificate authority, then confirm the root certificate exists:

mkcert -install

dir C:\Users\<YOUR USERNAME>\AppData\Local\mkcert\rootCA.pem


Next we’ll create a certificate for our Windows machine. My IP address is 192.168.1.160.

mkcert <YOUR IP ADDRESS> localhost 127.0.0.1 ::1


Go to your nginx folder, create a new subfolder called certs, and copy the two new .pem files there. (mkcert names the files after the first name you passed on the command line, so expect something like 192.168.1.160+3.pem and 192.168.1.160+3-key.pem.)


Next we’ll set up nginx. Just unzip the download and switch to that folder. We’ll now modify the \conf\nginx.conf file in the nginx folder.

In the default nginx.conf file, there will be a large segment that starts with http and is enclosed by { } braces. Replace all of that with the below, leaving the events block above it intact, since nginx requires it (BE SURE TO MAKE CHANGES FOR YOUR ENV):

http {
    server {
        listen 443 ssl;
        server_name <YOUR IP ADDRESS HERE>;

        ssl_certificate      ../certs/<YOUR CERT FILENAME HERE>.pem;
        ssl_certificate_key  ../certs/<YOUR CERT FILENAME HERE>-key.pem;

        location / {
            proxy_pass http://localhost:11434;
            proxy_http_version 1.1;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
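Save the file. Before starting nginx, it’s worth validating your edits (run from the nginx folder):

nginx -t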

Now start nginx.

nginx

NOTE: This will run nginx actively within the prompt session you have open. To kill it, you’ll want to run the following command in another session.

nginx -s stop

Later, if you want to start nginx and run it in the background:

start nginx
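You can verify that nginx is running in the background:

tasklist /fi "imagename eq nginx.exe"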

Now let’s quickly test again, first directly against Ollama and then through nginx:

curl http://localhost:11434

curl https://localhost

Invoke-RestMethod -Uri https://localhost/api/embeddings `
   -Method Post `
   -Body '{"model": "nomic-embed-text", "prompt": "This is a test sentence to embed."}' `
   -ContentType 'application/json'

Note the differences:

* The first goes to the original HTTP URI with the Ollama port specified.
* The second goes to HTTPS but does not need the Ollama port specified, because the nginx reverse proxy is listening on 443 and passing traffic to 11434.
* And finally, we hit the REST API via HTTPS and create a vector embedding.

VMware Workstation Setup

Next, let’s switch to our VM and get remote access going.

First, we must import the root certificate that we created earlier to our VM.

  1. Copy C:\Users\<YOUR USERNAME>\AppData\Local\mkcert\rootCA.pem to somewhere on your VM.
  2. Open the Windows Certificate Manager (certmgr.msc)
    • Start -> “Manage computer certificates”
  3. Navigate to Trusted Root Certification Authorities -> Certificates
  4. Right click Certificates -> Import
  5. Navigate to your copied rootCA.pem (you’ll want to change the file-type filter next to “File name:” to *.pem or All Files so the file shows up)
  6. Complete the wizard
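One host-side note: if the tests below hang or time out, Windows Firewall on the host may be blocking inbound traffic on port 443 from the VM. A minimal rule sketch (run in an elevated PowerShell prompt on the host; the display name is just a label I picked):

New-NetFirewallRule -DisplayName "nginx HTTPS" -Direction Inbound -Protocol TCP -LocalPort 443 -Action Allow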


Now let’s test from within the VM:

curl https://192.168.1.160

Invoke-RestMethod -Uri https://192.168.1.160/api/embeddings `
  -Method Post `
  -Body '{"model": "nomic-embed-text", "prompt": "This is a test sentence to embed."}' `
  -ContentType 'application/json'

Test within SQL Server 2025

Finally, let’s confirm that this all works within SQL Server!

First, some setup stuff.

-----
-- SETUP
USE master;
GO

-- Enable rest endpoint
sp_configure 'external rest endpoint enabled', 1
GO
RECONFIGURE WITH OVERRIDE;
GO

-- Turn on Trace Flags for Vector Search
DBCC TRACEON (466, 474, 13981, -1) 
GO

-- Check trace flags status
DBCC TraceStatus
GO
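You can also confirm the REST endpoint setting took effect (value_in_use should come back as 1):

-- Verify the configuration value
SELECT name, value_in_use
FROM sys.configurations
WHERE name = 'external rest endpoint enabled';
GO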

Now we’ll use some new T-SQL to create an External Model. This will communicate with the HTTPS endpoint that we took the time to put in front of Ollama earlier. Note that the LOCATION below points at Ollama’s newer /api/embed endpoint rather than the /api/embeddings endpoint we tested with; both exist.

-----
-- Create external model
USE CookbookDemo
GO

-- Drop any existing copy first (this errors harmlessly on a first run)
DROP EXTERNAL MODEL ollama_nomic_embed_text
GO

CREATE EXTERNAL MODEL ollama_nomic_embed_text
WITH (
    LOCATION = 'https://192.168.1.160:443/api/embed',
    API_FORMAT = 'Ollama',
    MODEL_TYPE = EMBEDDINGS,
    MODEL = 'nomic-embed-text'
);
GO

-- Verify
SELECT *
FROM sys.external_models;
GO

Now we will create some vector embeddings using the new T-SQL function AI_GENERATE_EMBEDDINGS().

The code below uses my CookbookDemo database. Of course, you’re welcome to use AdventureWorks or any other database of your choosing; just make sure to change the final example SELECT statement.

-----
-- Test!
SELECT AI_GENERATE_EMBEDDINGS(
        N'this is some text that I want to turn into a vector embedding' USE MODEL ollama_nomic_embed_text
    ) AS GeneratedEmbedding;
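AI_GENERATE_EMBEDDINGS() returns SQL Server’s new vector type. To eyeball the result, you can cast it to text; it renders as a JSON-style array, and nomic-embed-text should produce 768 dimensions (a quick inspection sketch):

-- Cast the vector to text to inspect it
SELECT CAST(
        AI_GENERATE_EMBEDDINGS(N'quick sanity check' USE MODEL ollama_nomic_embed_text)
        AS NVARCHAR(MAX)
    ) AS EmbeddingText;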


-----
-- Run against a table with some data
SELECT 
    TOP 5000
    RecipeName,
    Ingredients,
    AI_GENERATE_EMBEDDINGS(
            Ingredients USE MODEL ollama_nomic_embed_text
        ) AS Ingredients_Embeddings
FROM dbo.Recipes_Flat;


Note that Ollama is using my external GPU (GPU 1 – 3D) to process the Ingredients records into Vector Embeddings! It took about 3 minutes to generate 5000 embeddings with my hardware. I do plan on looking into tweaking Ollama and my setup to improve performance. But hey, this works!

What Next?

Hopefully this quick start guide helped you get Ollama up and running, so you can start playing around now.

If you want to keep diving deeper, you’ll want to create another table with a Vector datatype, and store a foreign key plus the embedding value. Then you can start using vector search T-SQL to do some KNN searching, or create a DiskANN Index to do some ANN search. Have no idea what I just said?!? Don’t worry – I’ll be writing more blog posts to walk you through all of this!
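If you want a head start on that table, here’s a minimal sketch (the names and key column are illustrative; I’m assuming a RecipeID key on dbo.Recipes_Flat and nomic-embed-text’s 768 dimensions):

-- Hypothetical table for storing embeddings alongside a foreign key
CREATE TABLE dbo.Recipe_Embeddings (
    RecipeID INT NOT NULL PRIMARY KEY,          -- FK back to dbo.Recipes_Flat (assumed key)
    Ingredients_Embedding VECTOR(768) NOT NULL  -- nomic-embed-text dimension count
);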

And don’t forget to check out my other blogs on Vector Search too!

Thanks for reading!