
How to build pyllamacpp without AVX2 or FMA. #71

Closed
kuvaus opened this issue Apr 20, 2023 · 2 comments

kuvaus commented Apr 20, 2023

How to build pyllamacpp without AVX2 or FMA.

1) Check what features your CPU supports

I have an old Mac; on Linux the same information is available through /proc/cpuinfo (see the sketch below).

The default pyllamacpp and llama.cpp builds require AVX2 support, but there is a way to build both even if you have an old CPU with only AVX1 support. First, check which instruction-set extensions your CPU supports. On a Mac you can do it with:

sysctl -a

In the output I see these options, which means my CPU supports AVX1 but not AVX2 or FMA:
hw.optional.avx1_0: 1
hw.optional.avx2_0: 0
hw.optional.fma: 0
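
On Linux there is no hw.optional sysctl tree, but the same information lives in /proc/cpuinfo. Here is a minimal Python sketch for x86 Linux (avx, avx2, and fma are the standard flag names there):

# Minimal sketch, x86 Linux only: CPU features are listed as
# whitespace-separated tokens on the "flags" line of /proc/cpuinfo.
with open("/proc/cpuinfo") as f:
    flags = next(line for line in f if line.startswith("flags")).split()

for feature in ("avx", "avx2", "fma"):
    print(feature, "supported" if feature in flags else "not supported")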

2) Clone the repository and edit the CMakeLists.txt

So, clone the repository:

git clone --recursive https://github.com/nomic-ai/pyllamacpp && cd pyllamacpp

Edit the CMakeLists.txt and change these two options from ON to OFF:

option(LLAMA_AVX2  "llama: enable AVX2" OFF)
option(LLAMA_FMA   "llama: enable FMA"  OFF)

Run the install:

pip install -e .

It should install the custom pyllamacpp into your Python packages.
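
To confirm that Python actually picks up the editable install, a quick check (the printed path should point inside the cloned pyllamacpp directory):

import pyllamacpp
print(pyllamacpp.__file__)  # should point into the cloned repo for an editable install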

3) Use the built pyllamacpp in code.

Now you can just use

import pyllamacpp
from pyllamacpp.model import Model
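
For reference, here is a minimal end-to-end sketch. The model path is a placeholder, and the exact Model constructor arguments can differ between pyllamacpp versions, so check the README of the version you built:

from pyllamacpp.model import Model

# Placeholder path: point this at your own ggml model file.
model = Model("./models/ggml-model-q4_0.bin")
generated_text = model.generate("Once upon a time, ", n_predict=55, n_threads=1)
print(generated_text)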

Now, for some reason this version only generates correctly with a single thread. So if you write

generated_text = model.generate("Once upon a time, ", n_predict=55, n_threads=1)

everything is fine, but

generated_text = model.generate("Once upon a time, ", n_predict=55, n_threads=2)

seems to generate gibberish.

4) Compare with llama.cpp

For testing purposes I also built the regular llama.cpp.
Here, as they say in their GitHub issues, you have to use plain make instead of cmake to get a build without AVX2. After building the C++ version, it does work with multiple threads. So just run make like this and you should get the main binary:

make

Now, with regular llama.cpp you can write:

./main --threads 4  [+rest of the options]

and that will work.
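
If you want to drive the llama.cpp binary from Python, for example to compare its multi-threaded output against pyllamacpp, here is a small subprocess sketch. The model path and prompt are placeholders; -m, -p, and -n are llama.cpp's flags for the model file, the prompt, and the number of tokens to predict:

import subprocess

# Placeholders: adjust the model path and prompt to your setup.
# ./main is the binary produced by the make step above.
result = subprocess.run(
    ["./main", "--threads", "4",
     "-m", "./models/ggml-model-q4_0.bin",
     "-p", "Once upon a time, ",
     "-n", "55"],
    capture_output=True, text=True,
)
print(result.stdout)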

kuvaus commented Apr 21, 2023

Update: fixed the gibberish issue by updating the llama.cpp submodule to the newest version, replacing the old checkout pinned at commit 3525899. I also changed the CMakeLists.txt inside the llama.cpp folder to disable AVX2 and FMA, but that might not be necessary. All good now.
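
For anyone following along, updating the pinned submodule to the latest upstream commit looks roughly like this (run from the pyllamacpp root, then reinstall):

git submodule update --remote --merge llama.cpp
pip install -e .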

kuvaus closed this as completed on Apr 21, 2023
kuvaus changed the title from "How to build pyllamacpp without AVX2 or FMA. Works only with n_threads=1, n_threads=2 creates gibberish." to "How to build pyllamacpp without AVX2 or FMA." on Apr 21, 2023

absadiki commented May 2, 2023

Thanks so much @kuvaus for the guide.
