Running LLMs Locally with Ollama: A Practical Workflow
Local LLMs went from "toy" to "genuinely useful" in 2024. Here's the setup I use daily for prototyping without burning OpenAI credits.
I run a local LLM almost every day now. Not for production — for the loop where I'd otherwise spam GPT-4 with throwaway prompts while iterating.
The setup
Install Ollama (brew install ollama on macOS), pull a model (ollama pull llama3.1:8b), start chatting (ollama run llama3.1:8b), and you're in your terminal talking to a model in about two minutes.
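Once the daemon is up, you can sanity-check the install from code too. Here's a minimal sketch using only the Python standard library; it assumes the default port 11434 and just returns an empty list if the server isn't running:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Return names of locally pulled models, or [] if Ollama isn't reachable."""
    try:
        # /api/tags lists every model you've pulled onto this machine.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=2) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except OSError:
        # Daemon not running (connection refused) or timed out.
        return []

print(list_local_models())
```

If that prints an empty list, check that `ollama serve` (or the desktop app) is actually running.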
The models I actually use
For coding: qwen2.5-coder:7b. Surprisingly close to GPT-4o-mini for autocomplete and small refactors. Runs on a laptop.
For general chat: llama3.1:8b or mistral-nemo:12b if your machine can handle it.
Wire it into your editor
Ollama exposes an OpenAI-compatible HTTP API on localhost:11434. That means Continue, Cline, Aider, and friends all work with a one-line config change: point the base URL at http://localhost:11434/v1.
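The same endpoint works from plain code, no SDK required. A sketch of a chat call against the OpenAI-compatible route, again stdlib-only; the payload builder is split out so you can see exactly what goes over the wire (the model name and prompt are just examples):

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    # Same shape as the OpenAI chat-completions schema that Ollama mirrors.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(prompt: str, model: str = "llama3.1:8b") -> str:
    body = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        "http://localhost:11434/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (needs the daemon running and the model pulled):
#   chat("Explain a mutex in one sentence.")
```

Because the request and response shapes match OpenAI's, swapping an existing script between local and hosted models is mostly a base-URL change.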
The honest limits
Local 8B models are not GPT-4. They forget context, refuse weird things, and make up library APIs. But they're free, private, and fast enough for the prototype-and-iterate loop.