Ollama Cookbook
Setting up VS Code Environment
Debugging
Create a .vscode/launch.json file with the following configuration:
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Streamlit",
            "type": "debugpy",
            "request": "launch",
            "module": "streamlit",
            "args": [
                "run",
                "${file}",
                "--server.port",
                "2000"
            ]
        }
    ]
}
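With this configuration, starting the "Streamlit" launch target runs the currently open file under the debugger on port 2000. A minimal app to test it with (a sketch; the file name app.py is just an example):
# app.py - minimal Streamlit app to exercise the debug configuration
import streamlit as st
st.title("Debugger test")
name = st.text_input("Your name")
if name:
    st.write(f"Hello, {name}!")  # set a breakpoint on this line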
Ollama and Python
Examples
Getting Started
Python
pip install ollama
import ollama
response = ollama.chat(model='llama2', messages=[
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
])
print(response['message']['content'])
JavaScript
npm install ollama
import ollama from 'ollama'
const response = await ollama.chat({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)
Use cases
Both libraries support Ollama’s full set of features. Here are some examples in Python:
Streaming
from ollama import chat
messages = [{'role': 'user', 'content': 'Why is the sky blue?'}]
for chunk in chat('mistral', messages=messages, stream=True):
    print(chunk['message']['content'], end='', flush=True)
Multi-modal
with open('image.png', 'rb') as file:
    response = ollama.chat(
        model='llava',
        messages=[
            {
                'role': 'user',
                'content': 'What is strange about this image?',
                'images': [file.read()],
            },
        ],
    )
print(response['message']['content'])
Text Completion
result = ollama.generate(
    model='stable-code',
    prompt='// A c function to reverse a string\n',
)
print(result['response'])
Creating custom models
modelfile = '''
FROM llama2
SYSTEM You are mario from super mario bros.
'''
ollama.create(model='example', modelfile=modelfile)
Custom client
from ollama import Client
ollama = Client(host='my.ollama.host')
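The client object exposes the same methods as the module-level API, so requests go through the configured host (a sketch; 'my.ollama.host' must point at a reachable Ollama server):
response = ollama.chat(model='llama2', messages=[
    {'role': 'user', 'content': 'Why is the sky blue?'},
])
print(response['message']['content'])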
More examples are available in the GitHub repositories for the Python and JavaScript libraries.
Tips and Tricks
Start the Ollama server manually:
ollama serve
To bind the server to a different address and allow cross-origin requests from a specific web app, set OLLAMA_HOST and OLLAMA_ORIGINS before starting it:
OLLAMA_ORIGINS=https://webml-demo.vercel.app OLLAMA_HOST=127.0.0.1:11435 ollama serve
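A client then has to target that address explicitly (a sketch, assuming the server above is running locally on port 11435):
from ollama import Client
client = Client(host='http://127.0.0.1:11435')
print(client.list())  # list the models available on that server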
Docker
Run Ollama in a Docker container
CPU only
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Nvidia GPU
Install the NVIDIA Container Toolkit, then run Ollama inside a Docker container with access to all GPUs:
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Run a model
Now you can run a model like Llama 2 inside the container.
docker exec -it ollama ollama run llama2
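Because the container publishes port 11434, the Python client on the host reaches it just like a local install (a sketch, assuming the container above is running and llama2 has been pulled):
import ollama
# The default client targets http://localhost:11434, which the container publishes
response = ollama.chat(model='llama2', messages=[{'role': 'user', 'content': 'Hello from the host'}])
print(response['message']['content'])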
OpenAI
OpenAI Compatibility
Ollama exposes an OpenAI-compatible API under /v1, so the official openai Python client can talk to a local Ollama server:
from openai import OpenAI
client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',  # required, but unused
)
response = client.chat.completions.create(
    model="llama2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The LA Dodgers won in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)
print(response.choices[0].message.content)
Using Streamlit
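A minimal chat front end built with Streamlit and the Ollama Python library might look like this (a sketch; the model and file names are placeholders):
# streamlit_chat.py - run with: streamlit run streamlit_chat.py
import streamlit as st
import ollama

st.title("Ollama Chat")
prompt = st.text_input("Ask a question")
if prompt:
    response = ollama.chat(model='llama2', messages=[
        {'role': 'user', 'content': prompt},
    ])
    st.write(response['message']['content'])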
LangChain
from langchain_community.llms import Ollama
llm = Ollama(model="gemma2")
llm.invoke("Why is the sky blue?")
LlamaIndex
from llama_index.llms.ollama import Ollama
llm = Ollama(model="gemma2")
llm.complete("Why is the sky blue?")
LLMs and Models by Example
wizard-math
ollama run wizard-math 'Expand the following expression: $7(3y+2)$'
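The same prompt can also be sent through the Python library instead of the CLI (a sketch; the wizard-math model must already be pulled):
import ollama
result = ollama.generate(model='wizard-math', prompt='Expand the following expression: $7(3y+2)$')
print(result['response'])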