Releases · ollama/ollama
v0.1.11
New Models
- Orca 2: A fine-tuned version of Meta's Llama 2 model, designed to excel particularly in reasoning.
- DeepSeek Coder: A capable coding model trained from scratch. Available in 1.3B, 6.7B and 33B parameter counts.
- Alfred: A robust conversational model designed to be used for both chat and instruct use cases.
What's Changed
- Improved progress bar design
- Fixed issue where `ollama create` would error with `invalid cross-device link`
- Fixed issue where `ollama run` would exit with an error on macOS Big Sur and Monterey
- `q5_0` and `q5_1` models will now use GPU
- Fixed several `max retries exceeded` errors when running `ollama pull` or `ollama push`
- Fixed issue where `ollama create` would result in a "file not found" error when `FROM` referred to a local file
- Fixed issue where resizing the terminal while running `ollama pull` would cause repeated progress bar messages
- Minor performance improvements on Intel Macs
- Improved error messages on Linux when using Nvidia GPUs
Full Changelog: v0.1.10...v0.1.11
v0.1.10
New models
- OpenChat: An open-source chat model trained on a wide variety of data, surpassing ChatGPT on various benchmarks.
- Neural-chat: New chat model by Intel
- Goliath: A large chat model created by combining two fine-tuned versions of Llama 2 70B
What's Changed
- JSON mode can now be used with `ollama run` (see the example after this list):
  - Pass the `--format json` flag, or
  - Use `/set format json` to change the current chat session to use JSON mode
- Prompts can now be passed in via standard input to `ollama run`. For example: `head -30 README.md | ollama run codellama "how do I install Ollama on Linux?"`
- `ollama create` now works with `OLLAMA_HOST` to build models using Ollama running on a remote machine (see the example after this list)
- Fixed crashes on Intel Macs
- Fixed issue where `ollama pull` progress would reverse when re-trying a failed connection
- Fixed issue where `ollama show --modelfile` would show an incorrect `FROM` command
- Fixed issue where word wrap wouldn't work when piping data into `ollama run` via standard input
- Fixed permission denied issues when running `ollama create` on Linux
- Added FAQ entry for proxy support on Linux
- Fixed installer error on Debian 12
- Fixed issue where `ollama push` would result in a 405 error
- `ollama push` will now return a better error when trying to push to a namespace the current user does not have access to
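For example, JSON mode can be turned on either for a single run or inside an interactive session. A minimal sketch, where `llama2` and the prompts are placeholder choices:

```shell
# One-off generation in JSON mode (model name and prompt are examples)
ollama run llama2 --format json "List three primary colors as a JSON array"

# Inside an interactive session, switch the current chat to JSON mode
ollama run llama2
>>> /set format json
```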
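And building a model on a remote Ollama instance only requires pointing `OLLAMA_HOST` at it; a sketch, assuming the usual `-f` Modelfile flag and a placeholder host and model name:

```shell
# Build "mymodel" from a local Modelfile against a remote Ollama server
# (remote.example.com:11434 and "mymodel" are placeholders)
OLLAMA_HOST=remote.example.com:11434 ollama create mymodel -f ./Modelfile
```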
New Contributors
- @dhiltgen made their first contribution in #1075
- @dansreis made their first contribution in #1055
- @breitburg made their first contribution in #1106
- @enricoros made their first contribution in #1078
- @huynle made their first contribution in #1115
- @bnodnarb made their first contribution in #1098
- @danemadsen made their first contribution in #1120
- @pieroit made their first contribution in #1124
- @yanndegat made their first contribution in #1151
Full Changelog: v0.1.9...v0.1.10
v0.1.9
New models
- Yi: a high-performing, bilingual model supporting both English and Chinese.
What's Changed
- JSON mode: instruct models to always return valid JSON when calling `/api/generate` by setting the `format` parameter to `json` (example below)
- Raw mode: bypass any templating done by Ollama by passing `{"raw": true}` to `/api/generate` (example below)
- Better error descriptions when downloading and uploading models with `ollama pull` and `ollama push`
- Fixed issue where the Linux installer would encounter an error when running as the `root` user
- Improved progress bar design when running `ollama pull` and `ollama push`
- Fixed issue where running on a machine with less than 2GB of VRAM would be slow
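For example, JSON mode is requested per call by adding the `format` field to the request body; a minimal sketch, where the model and prompt are placeholders:

```shell
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "List three fruits as a JSON array",
  "format": "json"
}'
```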
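Raw mode works the same way: pass `"raw": true` and Ollama sends the prompt through without templating, which is useful when the full prompt is constructed client-side. A sketch, where the Mistral-style `[INST]` wrapping is only an illustration:

```shell
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "[INST] Why is the sky blue? [/INST]",
  "raw": true
}'
```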
New Contributors
- @pepperoni21 made their first contribution in #995
- @lgrammel made their first contribution in #1020
- @ej52 made their first contribution in #999
- @David-Kunz made their first contribution in #996
- @tjbck made their first contribution in #943
- @omagdy7 made their first contribution in #1029
- @upchui made their first contribution in #1034
- @kevinhermawan made their first contribution in #1043
- @amithkoujalgi made their first contribution in #1044
- @mpldr made their first contribution in #1042
- @aashish2057 made their first contribution in #992
- @nickanderson made their first contribution in #1062
Full Changelog: v0.1.8...v0.1.9
v0.1.8
New Models
- CodeBooga: A high-performing code instruct model created by merging two existing code models.
- Dolphin 2.2 Mistral: An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.
- MistralLite: a fine-tuned model based on Mistral with enhanced capabilities for processing long contexts.
- Yarn Mistral: an extension of Mistral to support a context window of up to 128K tokens
- Yarn Llama 2: an extension of Llama 2 to support a context window of up to 128K tokens
What's Changed
- Ollama will now honour large context sizes on models such as `codellama` and `mistrallite` (example below)
- Fixed issue where repeated characters would be output on long contexts
- `ollama push` is now much faster. 7B models will push at up to ~100MB/s and large models (70B+) at up to 1GB/s if network speeds permit
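As a rough sketch of using the larger context sizes, the window can be raised per request through the `num_ctx` option; the model, prompt, and value below are illustrative, and what actually fits depends on the model and available memory:

```shell
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "mistrallite",
  "prompt": "Summarize the following document: ...",
  "options": {"num_ctx": 16384}
}'
```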
New Contributors
- @dloss made their first contribution in #948
- @noahgitsham made their first contribution in #983
Full Changelog: v0.1.7...v0.1.8
v0.1.7
What's Changed
- Fixed an issue when running `ollama run` where certain key combinations such as Ctrl+Space would lead to an unresponsive prompt
- Fixed issue in `ollama run` where retrieving the previous prompt from history would require two presses of the up arrow key instead of one
- Exiting `ollama run` with Ctrl+D will now put the cursor on the next line
Full Changelog: v0.1.6...v0.1.7
v0.1.6
New models
- Dolphin 2.1 Mistral: an instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.
- Zephyr Beta: the second model in the series, based on Mistral, with strong performance that matches and even exceeds Llama 2 70B in several categories. It's trained on a distilled dataset, improving grammar and yielding even better chat results.
What's Changed
- Pasting multi-line strings in `ollama run` is now possible
- Fixed various issues when writing prompts in `ollama run`
- The library models have been refreshed and revamped, including `llama2`, `codellama`, and more:
  - All `chat` or `instruct` models now support setting the `system` parameter, or the `SYSTEM` command in the `Modelfile` (example below)
  - Parameters (`num_ctx`, etc.) have been updated for library models
  - Slight performance improvements for all models
- Model storage can now be configured with `OLLAMA_MODELS` (example below). See the FAQ for more info on how to configure this.
- `OLLAMA_HOST` will now default to port `443` when `https://` is specified, and port `80` when `http://` is specified
- Fixed trailing slashes causing an error when using `OLLAMA_HOST`
- Fixed issue where `ollama pull` would retry multiple times when out of space
- Fixed various `out of memory` issues when using Nvidia GPUs
- Fixed a performance issue previously introduced on AMD CPUs
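For example, a system prompt can now be baked into a model via the `SYSTEM` command in a `Modelfile`; a minimal sketch, where the base model, new model name, and prompt are placeholders:

```shell
# Write a Modelfile that layers a system prompt on top of llama2
cat > Modelfile <<'EOF'
FROM llama2
SYSTEM You are a concise assistant that answers in one sentence.
EOF

# Build and run the customized model
ollama create concise-llama -f ./Modelfile
ollama run concise-llama "What is Ollama?"
```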
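Likewise, the storage location and server address are plain environment variables; a sketch, assuming a placeholder directory and host:

```shell
# Store models under a custom directory (path is a placeholder)
OLLAMA_MODELS=/data/ollama/models ollama serve

# Point the CLI at a remote server; https:// now implies port 443
OLLAMA_HOST=https://ollama.example.com ollama list
```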
New Contributors
Full Changelog: v0.1.5...v0.1.6
v0.1.5
What's Changed
- Fixed an issue where an error would occur when running `falcon` or `starcoder` models
Full Changelog: v0.1.4...v0.1.5
v0.1.4
New models
- OpenHermes 2 Mistral: a new fine-tuned model based on Mistral, trained on open datasets totalling over 900,000 instructions. This model has strong multi-turn chat skills, surpassing previous Hermes 13B models and even matching 70B models on some benchmarks.
What's Changed
- Faster model switching: models will now stay loaded between requests when using different parameters (e.g. `temperature`) or system prompts
- `starcoder`, `sqlcoder` and `falcon` models now have unicode support. Note: they will need to be re-pulled (e.g. `ollama pull starcoder`)
- New documentation guide on importing existing models to Ollama (GGUF, PyTorch, etc.)
- `ollama serve` will now print the current version of Ollama on start
- `ollama run` will now show more descriptive errors when encountering runtime issues (such as insufficient memory)
- Fixed an issue where Ollama on Linux would use the CPU alone instead of both the CPU and GPU on GPUs with less memory
- Fixed architecture check in the Linux install script
- Fixed issue where leading whitespace would be returned in responses
- Fixed issue where `ollama show` would show an empty `SYSTEM` prompt (instead of omitting it)
- Fixed issue where the `/api/tags` endpoint would return `null` instead of `[]` if no models were found (example below)
- Fixed an issue where `ollama show` wouldn't work when connecting remotely using `OLLAMA_HOST`
- Fixed issue where GPU/Metal would be used on macOS even with `num_gpu` set to `0` (example below)
- Fixed issue where certain characters would be escaped in responses
- Fixed `ollama serve` logs to report the proper amount of GPU memory (VRAM) being used
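For reference, the `/api/tags` behaviour can be checked with a plain GET, which now reports an empty list rather than `null` when no models are installed:

```shell
# List models known to the local Ollama server
curl http://localhost:11434/api/tags
```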
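And GPU/Metal offload can be disabled for a single request by setting `num_gpu` to `0` in the request options; a minimal sketch with a placeholder model and prompt:

```shell
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "options": {"num_gpu": 0}
}'
```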
Note: the `EMBED` keyword in the `Modelfile` is being revisited until a future version of Ollama. Join the discussion on how we can make it better.
New Contributors
- @vieux made their first contribution in #810
- @s-kostyaev made their first contribution in #801
- @ggozad made their first contribution in #794
- @awaescher made their first contribution in #811
- @deichbewohner made their first contribution in #799
Full Changelog: v0.1.3...v0.1.4
v0.1.3
What's Changed
- Improved various API error messages to be easier to read
- Improved GPU allocation for older GPUs to fix "out of memory" errors
- Fixed issue where setting `num_gpu` to `0` would result in an error
- Ollama for macOS will now always update to the latest version, even if earlier updates had also been downloaded beforehand
Full Changelog: v0.1.2...v0.1.3
v0.1.2
New Models
- Zephyr: a fine-tuned 7B version of Mistral that was trained on a mix of publicly available, synthetic datasets and performs as well as Llama 2 70B in many benchmarks
- Mistral OpenOrca: a 7 billion parameter model fine-tuned on top of the Mistral 7B model using the OpenOrca dataset
Examples
Ollama's examples have been updated with some new examples:
- Ask the mentors: a TypeScript, multi-user conversation app
- TypeScript LangChain: a simple example of using Ollama with LangChainJS and TypeScript.
What's Changed
- Download speeds for `ollama pull` have been significantly improved, from 60MB/s to over 1.5GB/s (25x faster) on fast network connections
- The API now supports non-streaming responses. Set the `stream` parameter to `false` and endpoints will return data in one single response: `curl -X POST http://localhost:11434/api/generate -d '{ "model": "llama2", "prompt": "Why is the sky blue?", "stream": false }'`
- Ollama can now be used with HTTP proxies (using `HTTP_PROXY=http://<proxy>`) and HTTPS proxies (using `HTTPS_PROXY=https://<proxy>`) (example below)
- Fixed `token too long` error when generating a response
- `q8_0`, `q5_0`, `q5_1`, and `f32` models will now use GPU on Linux
- Revised help text in `ollama run` to be easier to read
- Renamed the runner subprocess to `ollama-runner`
- `ollama create` will now show feedback when reading model metadata
- Fixed `not found error` showing when running `ollama pull`
- Improved video memory allocation on Linux to fix errors when using Nvidia GPUs
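For example, proxy support uses the standard environment variables; a sketch, assuming a placeholder proxy address and that the variable is set for the server process, which performs the downloads:

```shell
# Start the server behind an HTTPS proxy (address is a placeholder);
# model downloads via `ollama pull` will then go through the proxy
HTTPS_PROXY=https://proxy.example.com:3128 ollama serve
```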
New Contributors
Full Changelog: v0.1.1...v0.1.2