Running LLMs Locally Just Got Way Better - Ollama + MCP

0h 21m video Published Apr 12, 2026 Transcribed Jul 28, 2026 Tech With Tim

Tech With Tim

Intermediate 12 min read For: Developers and tech enthusiasts interested in running local LLMs with tool integration.

AI Trust Score 85/100

✅ Highly Legit

"Title accurately reflects content: local LLM with tool integration is demonstrated, though performance is slower than cloud models."

AI Summary

This video demonstrates how to run a local LLM on your own machine and connect it to external tools like Google, Notion, and Facebook Ads using Ollama and the Zapier MCP server. The goal is to achieve tool-use capabilities similar to Claude or OpenAI, but completely free, private, and secure.

Chapters

1 Introduction and Overview 00:00 2 Understanding LLMs vs AI Agents 01:13 3 Installing Ollama and Choosing a Model 02:26 4 Setting Up the MCP Client and Zapier Server 08:21 5 Running the Model with Tool Integration 15:55 6 Code Integration and Conclusion 19:42

[00:00]

Introduction to local LLM with tool use

The video promises to show how to run a capable local model and connect it to external services for free, private, and secure tool use.

[00:35]

Using Ollama and Zapier MCP

Ollama is used to run local models, and the Zapier MCP server connects to over 8,000 integrations, allowing the LLM to access external tools.

[01:13]

LLM vs AI agent explained

An LLM is a chatbot that predicts text; an AI agent can take actions by calling tools. Connecting an LLM to tools turns it into an agent.

[02:26]

Installing Ollama

Download Ollama from ollama.com and install it. Update with 'ollama update' command if already installed.

[03:45]

Choosing a model based on hardware

Model selection depends on GPU and RAM. Macs with unified memory can use RAM for models; Windows relies on VRAM. Newer devices perform better.

[06:00]

Selecting a tool-calling model

Models must support tool calling. Example: Qwen 3.5. Parameter count affects performance and RAM usage; choose based on available RAM.

[08:21]

Pulling and testing the model

Use 'ollama pull <model>' to download. Test with 'ollama run <model>'. Larger models may be slow; adjust parameter size as needed.

[10:58]

Setting up MCP client for Ollama

Ollama doesn't natively support MCP; use 'ollama-mcp' bridge. Install via pip: 'pip install ollama-mcp'.

[11:37]

Configuring Zapier MCP server

Create a Zapier MCP server, connect tools (e.g., Notion, Google Calendar), generate a token, and copy the URL with token.

[15:55]

Running the model with MCP integration

Run 'ollama-mcp --mcp-server <URL> --model <model>' to connect. Test tool calls like reading Notion or creating calendar events.

[19:42]

Using local model in code

Ollama exposes a REST API. Use LangChain with MCP adapters to build agents in code. Example: query calendar events.

Running a local LLM with tool integration is powerful and relatively easy to set up. The trade-off is speed vs. accuracy, but it's a huge unlock for privacy and control.

Mentioned in this Video

Ollama

tool

Zapier MCP server

tool

ollama-mcp

tool

LangChain

tool

Cursor

tool

Qwen 3.5

model

Tutorial Checklist

1 02:26 Download and install Ollama from ollama.com.

2 03:45 Open terminal and run 'ollama' to verify installation.

3 06:00 Choose a model that supports tool calling (e.g., Qwen 3.5) based on your hardware (RAM/VRAM).

4 08:21 Pull the model: 'ollama pull qwen3.5:27b'.

5 10:58 Install ollama-mcp: 'pip install ollama-mcp'.

6 11:37 Go to Zapier MCP server, create a new server, connect tools (e.g., Notion, Google Calendar), and generate a token.

7 15:55 Run the model with MCP: 'ollama-mcp --mcp-server "<URL with token>" --model qwen3.5:27b'.

8 19:42 Optionally, use Ollama's REST API with LangChain to integrate in code.

Study Flashcards (12)

What is the difference between an LLM and an AI agent?

easy Click to reveal answer

An LLM is a chatbot that predicts text; an AI agent can take actions by calling tools.