What is Ollama?
Ollama is a framework for deploying and managing open source large language models locally. Because it greatly simplifies the installation and configuration of these models, it has been widely praised since its launch and currently has 46k stars on GitHub.
Whether it is the famous Llama series, the AI upstart Mistral, or many other open source large language models, you can install and run them with a single Ollama command. For the full list of supported models, please check the Ollama official website.
| Model | Parameters | Size | Download |
|---|---|---|---|
| Llama 2 | 7B | 3.8GB | ollama run llama2 |
| Mistral | 7B | 4.1GB | ollama run mistral |
In this article, let us get started with Ollama.
How to install the Ollama framework?
Ollama supports Mac, Windows, and Linux, and also provides Docker images. You can download it from the Ollama official website or GitHub and install the framework in one step.
Since Windows support was only added recently, the configuration on Windows is not yet fully polished, so below I will mainly use Ollama on Linux as an example.
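On Linux, for example, installation boils down to running the official install script from the Ollama website (shown here as a sketch; check the official instructions for the current command):
curl -fsSL https://ollama.com/install.sh | sh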
Run the Ollama service
After the installation completes, the Ollama service is usually started automatically and configured to start at boot. You can use the following command to check whether Ollama is running normally; in the example below, "Active: active (running)" indicates that Ollama has started normally.
$ systemctl status ollama
● ollama.service - Ollama Service
Loaded: loaded (/etc/systemd/system/ollama.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/ollama.service.d
└─environment.conf
Active: active (running) since Thu 2024-03-07 09:09:39 HKT; 4 days ago
Main PID: 19975 (ollama)
Tasks: 29 (limit: 69456)
Memory: 1.1G
CPU: 14min 44.702s
CGroup: /system.slice/ollama.service
└─19975 /usr/local/bin/ollama serve
On Linux, if Ollama is not running, you can start the Ollama service with ollama serve or with sudo systemctl start ollama.
Looking at the Linux installation script install.sh, you can see that ollama serve is configured as a systemd service, so systemctl can be used to start and stop the ollama process (a few examples follow the excerpt below):
status "Creating ollama systemd service..."
cat <<EOF | $SUDO tee /etc/systemd/system/ollama.service >/dev/null
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
ExecStart=$BINDIR/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"
After starting the Ollama service, you can check the current Ollama version and the commonly used commands:
~$ ollama -v
ollama version is 0.1.20
~$ ollama --help
Large language model runner
Usage:
ollama [flags]
ollama [command]
Available Commands:
serve Start ollama
create Create a model from a Modelfile
show Show information for a model
run Run a model
pull Pull a model from a registry
push Push a model to a registry
list List models
cp Copy a model
rm Remove a model
help Help about any command
Flags:
-h, --help help for ollama
-v, --version Show version information
Use "ollama [command] --help" for more information about a command.
How to download and run a large language model?
At this point, the installation of the Ollama framework is complete. Next, you can run a large language model locally with a single command. Take the famous Llama 2 as an example: ollama run llama2.
If the specified model has not been downloaded yet, this command first runs ollama pull llama2 to download the model locally and then runs it.
After the download completes, it runs as follows:
:~$ ollama run llama2
>>> who are you?
I am LLaMA, an AI assistant developed by Meta AI that can understand and respond to human input in a conversational manner. I am trained on a massive dataset of text from the internet and can
generate human-like responses to a wide range of topics and questions. I can be used to create chatbots, virtual assistants, and other applications that require natural language understanding and
generation capabilities.
>>> Send a message (/? for help)
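To check which models have already been downloaded locally, or to free up disk space, you can use the commands from the help output above:
~$ ollama list        # show locally downloaded models
~$ ollama rm llama2   # remove a model you no longer need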
REST API
Ollama also provides a REST API:
curl http://localhost:11434/api/generate -d '{
"model": "llama2",
"prompt":"Why is the sky blue?",
"stream": false
}'
The response is as follows:
{
"model": "llama2",
"created_at": "2024-02-26T04:35:10.787352404Z",
"response": "The sky appears blue because of a phenomenon called Rayleigh scattering, which occurs when sunlight enters Earth's atmosphere. The sunlight encounters tiny molecules of gases such as nitrogen and oxygen, which scatter the light in all directions. The shorter wavelengths of light, such as blue and violet, are scattered more than the longer wavelengths, such as red and orange. This is known as Rayleigh scattering, named after Lord Rayleigh, who first described the phenomenon in the late 19th century. As a result of this scattering, the light that reaches our eyes from the sun appears blue, especially when viewed from a distance. The closer we get to the horizon, the more the blue color appears to fade, as the light has to travel through more of the atmosphere, which scatters the shorter wavelengths even more. It's worth noting that the exact shade of blue can vary depending on the time of day and atmospheric conditions. For example, during sunrise and sunset, when the sun is low in the sky, the sky can take on a more orange or red hue due to the scattering of light by atmospheric particles. So, to summarize, the sky appears blue because of the way light interacts with the tiny molecules of gases in Earth's atmosphere, particularly nitrogen and oxygen.",
"done": true,
"total_duration": 7001870820,
"load_duration": 4930376,
"prompt_eval_duration": 60907000,
"eval_count": 309,
"eval_duration": 6931593000
}
Using the API, you can build more flexible functionality, such as integrating with IDE plug-ins to implement a local programming assistant.
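For multi-turn conversations, recent Ollama versions also expose a chat endpoint that accepts a list of messages (a sketch; check the API documentation of your installed version for the exact fields):
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ],
  "stream": false
}'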
FAQ
How to view running logs?
On Linux, run the command journalctl -u ollama to view the running log.
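For example, to follow the log in real time while debugging:
journalctl -u ollama -f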
How to make a local large model serve the LAN?
On Linux, create the following configuration file, set the environment variable OLLAMA_HOST to the address on which to serve the LAN, and then restart the Ollama service.
:~$ cat /etc/systemd/system/ollama.service.d/environment.conf
[Service]
Environment=OLLAMA_HOST=0.0.0.0:11434
After this configuration, a GPU server can provide large language model services for the local area network.
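After adding or editing such a drop-in file, reload systemd and restart the service so the new environment takes effect, for example:
sudo systemctl daemon-reload
sudo systemctl restart ollama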
There are multiple local GPUs. How to run Ollama on a specific GPU?
On Linux, create the following configuration file, set the environment variable CUDA_VISIBLE_DEVICES to the GPU(s) Ollama should use, and then restart the Ollama service.
:~$ cat /etc/systemd/system/ollama.service.d/environment.conf
[Service]
Environment=CUDA_VISIBLE_DEVICES=1,2
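If you need several of these settings at once, multiple Environment lines can live in the same drop-in file (a sketch combining the two variables shown above):
:~$ cat /etc/systemd/system/ollama.service.d/environment.conf
[Service]
Environment=OLLAMA_HOST=0.0.0.0:11434
Environment=CUDA_VISIBLE_DEVICES=1,2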
In which path are the downloaded large models stored?
By default, the paths stored by different operating systems are as follows:
- macOS: ~/.ollama/models
- Linux: /usr/share/ollama/.ollama/models
- Windows: C:\Users\<username>\.ollama\models
How to modify the model storage path?
When installing Ollama on Linux, the user ollama is created by default, and model files are stored under that user's directory /usr/share/ollama/.ollama/models. However, because model files are often very large, they sometimes need to be stored on a dedicated data disk; in that case, the storage path needs to be changed.
The official method is to set the environment variable OLLAMA_MODELS, but when I tried it on Linux it did not work.
Analyzing the Linux installation script install.sh, I found that it creates the ollama user and the ollama group and stores the models under that user's directory /usr/share/ollama/.ollama/models, and changes made from my own account to the ollama account's files did not take effect. Even after manually adding my account to the ollama group, permission issues remained, and my operations on the ollama account's directory still had no effect.
Since the newly created ollama account did not bring me any extra convenience, I finally used the following steps to change the model storage path:
- Modify the installation script install.sh and remove the step that creates the ollama user, as follows:
# if ! id ollama >/dev/null 2>&1; then
#     status "Creating ollama user..."
#     $SUDO useradd -r -s /bin/false -m -d /usr/share/ollama ollama
# fi
# status "Adding current user to ollama group..."
# $SUDO usermod -a -G ollama $(whoami)
- Modify install.sh so that the ollama service runs under my own account, as follows:
status "Creating ollama systemd service..." cat <<EOF | $SUDO tee /etc/systemd/system/ollama.service >/dev/null [Unit] Description=Ollama Service After=network-online.target [Service] ExecStart=$BINDIR/ollama serve User=<myusername> Group=<myusername>
- Modify install.sh to add the environment variable OLLAMA_MODELS, which specifies the storage path, and then install ollama with this modified script:
Environment="OLLAMA_MODELS=/home/paco/lab/LLM/ollama/OLLAMA_MODELS"
Alternatively, after the installation is complete, create the following configuration file, set the environment variable OLLAMA_MODELS to the storage path, and then restart the Ollama service.
:~$ cat /etc/systemd/system/ollama.service.d/environment.conf
[Service]
Environment=OLLAMA_MODELS=<path>/OLLAMA_MODELS
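Note that if you keep the default ollama service account instead of following the steps above, the new storage directory must be readable and writable by that account; a common fix (the path here is a placeholder for your own) is:
sudo chown -R ollama:ollama <path>/OLLAMA_MODELS
sudo systemctl restart ollama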