It is possible to run a large language model (LLM) on a local machine, although the results are not quite as good as ChatGPT's.
One of the easiest to run is Alpaca, a fine-tuned version of LLaMA.
The following works on an Apple M1 Mac.
Clone and build the repo:
$ git clone https://github.com/antimatter15/alpaca.cpp
$ cd alpaca.cpp/
$ make chat
Download the pre-trained model weights:
$ wget -O ggml-alpaca-7b-q4.bin -c https://gateway.estuary.tech/gw/ipfs/QmQ1bf2BTnYxq73MFJWu1B7bQ2UD6qG7D7YDCxhTndVkPC
(If this fails, see the source repo below for alternative download locations.)
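The weights file is large, so an interrupted download is easy to miss. As a rough sanity check, the sketch below compares the file size against a minimum. The helper name model_looks_complete and the 4000 MB threshold are assumptions for illustration, chosen to sit just under the roughly 4017 MB model size that the loader reports in the output further down; this is not part of alpaca.cpp.

```shell
#!/bin/sh
# Hypothetical helper (not part of alpaca.cpp): succeeds if FILE exists
# and is at least MIN_MB mebibytes in size.
model_looks_complete() {
    file=$1
    min_mb=$2
    [ -f "$file" ] || return 1
    # Arithmetic expansion strips the padding some wc implementations emit.
    size_bytes=$(( $(wc -c < "$file") ))
    [ "$size_bytes" -ge $((min_mb * 1024 * 1024)) ]
}

# 4000 is an assumed threshold, slightly below the size the loader reports.
if model_looks_complete ggml-alpaca-7b-q4.bin 4000; then
    echo "model file looks complete"
else
    echo "model file missing or truncated; re-run the wget -c command to resume"
fi
```

Because the download command already passes -c, re-running it should continue a partial file rather than start over, provided the server supports resuming.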
Run the model:
$ ./chat
Output:
main: seed = 1679968451
llama_model_load: loading model from 'ggml-alpaca-7b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 4529.34 MB
llama_model_load: memory_size = 512.00 MB, n_mem = 16384
llama_model_load: loading model part 1/1 from 'ggml-alpaca-7b-q4.bin'
llama_model_load: .................................... done
llama_model_load: model size = 4017.27 MB / num tensors = 291
== Running in chat mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMa.
- If you want to submit another line, end your input in '\'.
> What is the age of the universe?
The current estimate for when our Universe was created, according to modern cosmology and astronomy, is 13.798 billion years ago (±0.2%).
>
References