Run Claude Code 100% Free & Locally

2025 Updated Guide


No API bill. No usage cap. No cloud. Your code never leaves your machine.

🔒 Fully Private ⚡ Runs Offline $0 Forever 🧠 Ollama Powered


The Problem With Cloud-Dependent AI

Claude Code is one of the most capable AI coding assistants available — but the default setup comes with a recurring API cost and a hard dependency on Anthropic’s servers. Every autocomplete, every refactor, every “explain this function” call racks up tokens.

What if you could flip that model entirely? Run the AI brain locally. Keep your proprietary code off external servers. Pay nothing. That’s exactly what this guide shows you.

ℹ️

What We’re Building

A local pipeline where Claude Code’s interface talks to a locally-running Ollama server instead of Anthropic’s cloud. You get the familiar Claude Code UX powered by a local model you control.

Local vs Cloud: The Real Difference

Feature           | Cloud (Default)       | Local (This Guide)
Cost              | Pay-per-token         | $0
Privacy           | Code sent to servers  | Never leaves your machine
Usage Limits      | Rate limited          | Unlimited
Internet Required | Always                | Never
Model Quality     | Claude 3.5+           | Depends on your GPU
Figure 1 — Claude Code running locally with qwen3-coder model


01

Install Ollama — Your Local AI Engine

// The runtime that runs open-source models on your hardware

Ollama is a lightweight, open-source application that lets your computer run large language models locally — no Docker, no Python environment complexity, no manual CUDA configuration. It abstracts all of that away into a single install.

Think of Ollama as the “server” that Claude Code will talk to. Instead of sending requests to Anthropic’s API, Claude Code will send them to http://localhost:11434 — a private server running entirely on your machine.

Download & Install

Head to ollama.com and download the installer for your platform. The installer handles everything automatically, including adding Ollama to your system PATH.

// terminal — linux one-liner
# Linux alternative
curl -fsSL https://ollama.com/install.sh | sh
Figure 2 — Download Ollama from ollama.com for macOS or Windows


Verify the Installation

Once installed, Ollama runs as a background service. Verify it’s alive by navigating to http://localhost:11434 — you should see a plain-text confirmation that Ollama is running.

// terminal — verify ollama
ollama --version
# → ollama version 0.x.x
 
curl http://localhost:11434
# → Ollama is running

Pro Tip

Ollama starts automatically on login and sits quietly in your system tray (macOS) or taskbar (Windows). You never need to manually start it again.
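If the background service ever stops, or you are on a headless Linux machine without the tray app, you can start the server by hand. A quick recovery sketch using standard Ollama commands:

```shell
# Start the Ollama server manually in the foreground
ollama serve

# In another terminal, confirm it answers
curl http://localhost:11434
# → Ollama is running
```

Leave `ollama serve` running in its own terminal (or under your init system) while you work.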

02

Pull a Coding Model

// Choose the right brain for your hardware

For coding tasks you want a model purpose-built for code generation and instruction-following. The right choice depends entirely on your hardware.

High-End GPU

qwen3-coder:30b

Excellent reasoning, large context. Needs 16GB+ VRAM or M2 Pro+.

Mid-Range

qwen2.5-coder:7b

Great balance of speed vs quality. Runs well on 8GB RAM/VRAM.

Low-End / CPU

gemma:2b

Smallest and fastest option; runs on CPU alone and stays responsive. Verify tool-call support before relying on it (see the warning below).

⚠️

Tool/Function Call Support Required

Claude Code needs a model that supports tool calls. Always verify this before committing to a model. The qwen2.5-coder and qwen3-coder families both handle this reliably.
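One way to verify this, on recent Ollama releases, is `ollama show`, which prints a model's metadata including its capabilities. The exact output layout is version-dependent; this is a sketch of what to look for:

```shell
# Inspect a model's metadata; recent Ollama versions list capabilities
ollama show qwen2.5-coder:7b

# Look for a section like:
#   Capabilities
#     completion
#     tools
```

If `tools` is absent, pick a different model before proceeding.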

Download Your Model

// terminal — download model
# Recommended for most users
ollama pull qwen2.5-coder:7b
 
# High-end machines (16GB+ VRAM)
ollama pull qwen3-coder:30b
 
# CPU-only machines
ollama pull gemma:2b
Figure 3 — Running ollama pull to download a coding model


03

Point Claude at Localhost

// Redirect Claude Code away from Anthropic’s servers

This is the most important step. By default, Claude Code is hard-wired to talk to Anthropic’s API. We override that by setting environment variables that tell it to use your local Ollama server instead.

Ollama exposes an OpenAI-compatible API endpoint, so Claude Code can communicate with it seamlessly — you just need to set the correct base URL.
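Before touching Claude Code's configuration, you can sanity-check that endpoint directly with curl. A minimal sketch, assuming you have already pulled qwen2.5-coder:7b:

```shell
# Hit Ollama's OpenAI-compatible chat endpoint directly
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b",
    "messages": [{"role": "user", "content": "Write a one-line hello world in Python"}]
  }'
# A JSON response containing a "choices" array confirms the endpoint works
```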

Set the Environment Variable

// bash / zsh — set base URL
# Redirect Claude Code to your local Ollama instance
export ANTHROPIC_BASE_URL=http://localhost:11434/v1
export ANTHROPIC_API_KEY=ollama
 
# To make permanent, add both lines to ~/.zshrc or ~/.bashrc
💡

Why “ollama” as the API key?

Claude Code requires an API key field to be non-empty, but a local Ollama server has no real authentication. Setting it to any string (like “ollama”) satisfies the requirement without sending credentials anywhere.

Figure 4 — Setting the base URL to redirect Claude Code to your local Ollama server


Persist the Configuration

To avoid re-exporting these variables every session, add them permanently to your shell config file:

// bash — persist to shell config
# For zsh (macOS default)
echo 'export ANTHROPIC_BASE_URL=http://localhost:11434/v1' >> ~/.zshrc
echo 'export ANTHROPIC_API_KEY=ollama' >> ~/.zshrc
source ~/.zshrc
 
# For bash (Linux / Windows WSL)
# Replace ~/.zshrc with ~/.bashrc above
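On native Windows (outside WSL) there is no ~/.bashrc; a rough equivalent, assuming a cmd or PowerShell session, is `setx`, which persists user environment variables (they take effect only in newly opened terminals):

```shell
:: Windows — persist for the current user (reopen your terminal afterwards)
setx ANTHROPIC_BASE_URL "http://localhost:11434/v1"
setx ANTHROPIC_API_KEY "ollama"
```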
04

Launch Claude Code & Start Building

// Navigate to your project and run

Everything is in place. Navigate to any project directory and start Claude Code, passing the local model name with the --model flag.

// terminal — launch claude code
# Navigate to your project
cd ~/your-project
 
# Start Claude Code with your chosen local model
claude --model qwen2.5-coder:7b
 
# Or the 30B model for more powerful machines
claude --model qwen3-coder:30b
Figure 5 — Claude Code initialised and running with a local model


Try Your First Prompt

Once Claude Code starts, try these to verify everything is wired up correctly:

// claude code — test prompts
# Basic smoke test
"Make a hello world website"
 
# Real-world test
"Create a REST API endpoint for user authentication in Node.js"
 
# Refactor test
"Refactor this function to handle edge cases and add JSDoc"

You’ll see Claude Code read your files, propose edits, execute terminal commands — all the usual behaviour. The difference: every inference call hits localhost, not the cloud.
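To prove to yourself that inference really is local, check the Ollama server while Claude Code is working: `ollama ps` lists the models currently loaded in memory. Column layout varies by version; a sketch of the expected shape:

```shell
# List models currently loaded by the Ollama server
ollama ps

# Expect your coding model with its size and CPU/GPU split, e.g.:
# NAME                SIZE      PROCESSOR    UNTIL
# qwen2.5-coder:7b    6.0 GB    100% GPU     4 minutes from now
```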

Figure 6 — Claude Code autonomously editing files and running commands, entirely offline


🚀

What Claude Code Can Do Locally

Browse your file tree, read source files, write and edit code, run terminal commands, install packages, and iterate on your requests — all without touching the internet or incurring a single API charge.

You’re Now Running
AI at Zero Cost

No subscription. No usage anxiety. No proprietary code leaving your machine. A genuine alternative to cloud-dependent AI tooling.

ollama.com · github.com/anthropics/claude-code
