What is Ollama?
Ollama is a framework for deploying and managing open source large language models locally. Because it greatly simplifies the installation and configuration of these models, it has been widely praised since its launch and currently has 46k stars on GitHub.
Whether it is the famous Llama series, the AI upstart Mistral, or many other open source large language models, you can install and run them with a single Ollama command. For the full list of supported models, please check the Ollama official website.
| Model | Parameters | Size | Download |
|---|---|---|---|
| Llama 2 | 7B | 3.8GB | ollama run llama2 |
| Mistral | 7B | 4.1GB | ollama run mistral |
In this article, let us get started with Ollama.
How to install the Ollama framework?
Ollama supports Mac, Windows, and Linux, and also provides Docker images. You can download it from the Ollama official website or GitHub and install the framework in one step.
Since Windows support was only added recently, the configuration on Windows is not yet fully polished, so below I will mainly use Ollama on Linux as an example.
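On Linux, for example, installation boils down to running the official install script from the Ollama website (shown here as a sketch; check the official instructions for the current command):
curl -fsSL https://ollama.com/install.sh | sh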
Run the Ollama service
After the installation completes, the Ollama service is usually started automatically and configured to start at boot. You can use the following command to check whether Ollama is running normally; in the example below, "Active: active (running)" indicates that Ollama has started normally.
$ systemctl status ollama
● ollama.service - Ollama Service
Loaded: loaded (/etc/systemd/system/ollama.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/ollama.service.d
└─environment.conf
Active: active (running) since Thu 2024-03-07 09:09:39 HKT; 4 days ago
Main PID: 19975 (ollama)
Tasks: 29 (limit: 69456)
Memory: 1.1G
CPU: 14min 44.702s
CGroup: /system.slice/ollama.service
└─19975 /usr/local/bin/ollama serve
On Linux, if Ollama is not running, you can start the Ollama service with ollama serve or with sudo systemctl start ollama.
Looking at the Linux installation script install.sh, you can see that ollama serve is configured as a systemd service, so systemctl can be used to start and stop the ollama process (a few examples follow the excerpt below):
status "Creating ollama systemd service..."
cat <<EOF | $SUDO tee /etc/systemd/system/ollama.service >/dev/null
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
ExecStart=$BINDIR/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"
After starting the Ollama service, you can check the current Ollama version and the commonly used commands:
~$ ollama -v
ollama version is 0.1.20
~$ ollama --help
Large language model runner
Usage:
ollama [flags]
ollama [command]
Available Commands:
serve Start ollama
create Create a model from a Modelfile
show Show information for a model
run Run a model
pull Pull a model from a registry
push Push a model to a registry
list List models
cp Copy a model
rm Remove a model
help Help about any command
Flags:
-h, --help help for ollama
-v, --version Show version information
Use "ollama [command] --help" for more information about a command.
How to download and run a large language model?
At this point, the installation of the Ollama framework is complete. Next, you can run a large language model locally with a single command. Take the famous Llama 2 as an example: ollama run llama2.
If the specified model has not been downloaded yet, this command first runs ollama pull llama2 to download the model locally and then runs it.
After the download completes, it runs as follows:
:~$ ollama run llama2
>>> who are you?
I am LLaMA, an AI assistant developed by Meta AI that can understand and respond to human input in a conversational manner. I am trained on a massive dataset of text from the internet and can
generate human-like responses to a wide range of topics and questions. I can be used to create chatbots, virtual assistants, and other applications that require natural language understanding and
generation capabilities.
>>> Send a message (/? for help)
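To check which models have already been downloaded locally, or to free up disk space, you can use the commands from the help output above:
~$ ollama list        # show locally downloaded models
~$ ollama rm llama2   # remove a model you no longer need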
REST API
Ollama also provides a REST API:
curl http://localhost:11434/api/generate -d '{
"model": "llama2",
"prompt":"Why is the sky blue?",
"stream": false
}'
The response is as follows:
{
"model": "llama2",
"created_at": "2024-02-26T04:35:10.787352404Z",
"response": "The sky appears blue because of a phenomenon called Rayleigh scattering, which occurs when sunlight enters Earth's atmosphere. The sunlight encounters tiny molecules of gases such as nitrogen and oxygen, which scatter the light in all directions. The shorter wavelengths of light, such as blue and violet, are scattered more than the longer wavelengths, such as red and orange. This is known as Rayleigh scattering, named after Lord Rayleigh, who first described the phenomenon in the late 19th century. As a result of this scattering, the light that reaches our eyes from the sun appears blue, especially when viewed from a distance. The closer we get to the horizon, the more the blue color appears to fade, as the light has to travel through more of the atmosphere, which scatters the shorter wavelengths even more. It's worth noting that the exact shade of blue can vary depending on the time of day and atmospheric conditions. For example, during sunrise and sunset, when the sun is low in the sky, the sky can take on a more orange or red hue due to the scattering of light by atmospheric particles. So, to summarize, the sky appears blue because of the way light interacts with the tiny molecules of gases in Earth's atmosphere, particularly nitrogen and oxygen.",
"done": true,
"total_duration": 7001870820,
"load_duration": 4930376,
"prompt_eval_duration": 60907000,
"eval_count": 309,
"eval_duration": 6931593000
}
Using the API, you can build more flexible functionality, such as integrating with IDE plug-ins to implement a local programming assistant.
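For multi-turn conversations, recent Ollama versions also expose a chat endpoint that accepts a list of messages (a sketch; check the API documentation of your installed version for the exact fields):
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ],
  "stream": false
}'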
FAQ
How to view running logs?
On Linux, run the command journalctl -u ollama to view the running log.
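For example, to follow the log in real time while debugging:
journalctl -u ollama -f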
How to make a local large model serve the LAN?
On Linux, create the following configuration file, set the environment variable OLLAMA_HOST to the address on which to serve the LAN, and then restart the Ollama service.
:~$ cat /etc/systemd/system/ollama.service.d/environment.conf
[Service]
Environment=OLLAMA_HOST=0.0.0.0:11434
After this configuration, a GPU server can provide large language model services for the local area network.
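After adding or editing such a drop-in file, reload systemd and restart the service so the new environment takes effect, for example:
sudo systemctl daemon-reload
sudo systemctl restart ollama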
There are multiple local GPUs. How to run Ollama on a specific GPU?
On Linux, create the following configuration file, set the environment variable CUDA_VISIBLE_DEVICES to the GPU(s) Ollama should use, and then restart the Ollama service.
:~$ cat /etc/systemd/system/ollama.service.d/environment.conf
[Service]
Environment=CUDA_VISIBLE_DEVICES=1,2
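If you need several of these settings at once, multiple Environment lines can live in the same drop-in file (a sketch combining the two variables shown above):
:~$ cat /etc/systemd/system/ollama.service.d/environment.conf
[Service]
Environment=OLLAMA_HOST=0.0.0.0:11434
Environment=CUDA_VISIBLE_DEVICES=1,2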
In which path are the downloaded large models stored?
By default, the paths stored by different operating systems are as follows:
- macOS: ~/.ollama/models
- Linux: /usr/share/ollama/.ollama/models
- Windows: C:\Users\<username>\.ollama\models
How to modify the model storage path?
When installing Ollama on Linux, the user ollama is created by default, and model files are stored under that user's directory /usr/share/ollama/.ollama/models. However, because model files are often very large, they sometimes need to be stored on a dedicated data disk; in that case, the storage path needs to be changed.
The official method is to set the environment variable OLLAMA_MODELS, but when I tried it on Linux it did not work.
Analyzing the Linux installation script install.sh, I found that it creates the ollama user and the ollama group and stores the models under that user's directory /usr/share/ollama/.ollama/models, and changes made from my own account to the ollama account's files did not take effect. Even after manually adding my account to the ollama group, permission issues remained, and my operations on the ollama account's directory still had no effect.
Since the newly created ollama account did not bring me any extra convenience, I finally used the following steps to change the model storage path:
- Modify the installation script install.sh and remove the step that creates the ollama user, as follows:
# if ! id ollama >/dev/null 2>&1; then
#     status "Creating ollama user..."
#     $SUDO useradd -r -s /bin/false -m -d /usr/share/ollama ollama
# fi
# status "Adding current user to ollama group..."
# $SUDO usermod -a -G ollama $(whoami)
- Modify install.sh so that the ollama service runs under my own account, as follows:
status "Creating ollama systemd service..." cat <<EOF | $SUDO tee /etc/systemd/system/ollama.service >/dev/null [Unit] Description=Ollama Service After=network-online.target [Service] ExecStart=$BINDIR/ollama serve User=<myusername> Group=<myusername>
- Modify install.sh to add the environment variable OLLAMA_MODELS, which specifies the storage path, and then install ollama with this modified script:
Environment="OLLAMA_MODELS=/home/paco/lab/LLM/ollama/OLLAMA_MODELS"
Alternatively, after the installation is complete, create the following configuration file, set the environment variable OLLAMA_MODELS to the storage path, and then restart the Ollama service.
:~$ cat /etc/systemd/system/ollama.service.d/environment.conf
[Service]
Environment=OLLAMA_MODELS=<path>/OLLAMA_MODELS
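Note that if you keep the default ollama service account instead of following the steps above, the new storage directory must be readable and writable by that account; a common fix (the path here is a placeholder for your own) is:
sudo chown -R ollama:ollama <path>/OLLAMA_MODELS
sudo systemctl restart ollama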