Hosting your own local LLM can be a fast, secure, and private alternative to online services.
Before getting started
- You’ll need a fairly decent machine if you want quicker responses, but as long as you’re the only user you can get by with an old machine too.
- We’ll use Docker to host the LLM and the web UI, so make sure Docker is installed and running.
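If you want to double-check your Docker setup first, the following commands should print version information (the exact versions will differ on your machine):
# Verify Docker and the Compose plugin are installed
docker --version
docker compose version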
Installing Ollama
In this tutorial we’ll use Ollama to host the LLM and later connect a UI to it. Create a directory called chat and create a compose.yml file inside it.
mkdir chat
cd chat
touch compose.yml
Add the following to the compose.yml file.
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    restart: unless-stopped
    ports:
      - 11434:11434
    volumes:
      - ./ollama-data:/root/.ollama
Now, run the container.
docker compose up -d
And go to http://localhost:11434 in your browser; you should see a message that Ollama is running.
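If you prefer the terminal, a couple of quick checks work too; this assumes the container name ollama from the compose file above:
# The Ollama API answers on port 11434 when the container is up
curl http://localhost:11434
# Tail the container logs if something looks off
docker logs ollama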
Adding the UI for chat
For the UI we’ll use the Open WebUI project. To add it, we’ll edit the compose.yml file.
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    restart: unless-stopped
    ports:
      - 11434:11434
    volumes:
      - ./ollama-data:/root/.ollama
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    restart: unless-stopped
    container_name: ollama-webui
    volumes:
      - ./open-webui-data:/app/backend/data
    environment:
      - "OLLAMA_BASE_URL=http://ollama:11434"
    ports:
      - 3000:8080
Then re-run the containers.
docker compose down
docker compose up -d
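To confirm both containers came up, a quick status check helps (the service names come from the compose file above):
# List the services defined in compose.yml and their current state
docker compose ps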
Now, navigate to http://localhost:3000 to open the chat interface.
Downloading an LLM
Now we have all the essentials set up and running to download and use an LLM. You can use any model from the Ollama website. For this tutorial we’ll use the deepseek-r1 model with 1.5 billion parameters.
On the web interface, open the model selector, search for deepseek-r1:1.5b, and click pull to begin downloading. Wait for the download to finish.
Once the download finishes, select the model.
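If you’d rather manage models from the terminal instead of the web interface, the Ollama CLI inside the container can pull them as well (the container name ollama is the one set in compose.yml):
# Pull the model using the Ollama CLI inside the running container
docker exec -it ollama ollama pull deepseek-r1:1.5b
# List downloaded models to confirm it is available
docker exec -it ollama ollama list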
Try asking it for a joke 😂
Okay, the jokes might not be that funny 😅 but you now have your own private chatbot.