Failed to load model.

#2
by kami-kaze - opened

PS D:\wrkspace\data> ./Mistral-7B-Instruct-v0.3.Q6_K_llamafile.exe -p ...
note: if you have an AMD or NVIDIA GPU then you need to pass -ngl 9999 to enable GPU offloading
main: llamafile version 0.8.11
main: seed = 1722194754
/D/wrkspace/JPMC-RAG/data/Mistral-7B-Instruct-v0.3.Q6_K_llamafile.exe: warning: not a pkzip archive
llama_model_load: error loading model: failed to open models/7B/ggml-model-f16.gguf: No such file or directory
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'models/7B/ggml-model-f16.gguf'
main: error: unable to load model

This is the error I encounter.

jartine (Mozilla org)

That llamafile exceeds the Windows 4 GB executable size limit. I'm surprised it runs at all. There's a workaround: download a llamafile release executable without embedded weights, and use that to run your weights file.

curl -L -o llamafile.exe https://github.com/Mozilla-Ocho/llamafile/releases/download/0.8.12/llamafile-0.8.12
./llamafile.exe -m Mistral-7B-Instruct-v0.3.Q6_K_llamafile.exe -p hello
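The size limit behind this advice can be sketched in a few lines of shell. The 4 GiB figure comes from the 32-bit size fields in the Windows PE executable format; the ~6 GB file size used below is an assumption for a Q6_K 7B llamafile, not a measured value.

```shell
# Windows refuses to launch PE executables larger than 4 GiB (2^32 bytes),
# so a multi-gigabyte .exe with embedded Q6_K weights cannot start directly.
limit=$((4 * 1024 * 1024 * 1024))  # 4 GiB in bytes
size=6000000000                    # assumed size of a Q6_K 7B llamafile, in bytes
if [ "$size" -gt "$limit" ]; then
  echo "exceeds Windows exe size limit; run it via a small llamafile binary with -m"
fi
```

This is why the workaround keeps the weights out of the executable: the small release binary stays under the limit, and -m points it at the large weights file, which is only read as data, never executed.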
jartine changed discussion status to closed
