
smol_llama: 220M GQA

A small 220M-parameter (total) decoder-only model. This is the first version of the model.

  • 1024 hidden size, 10 layers
  • GQA (32 attention heads, 8 key-value heads), 2048-token context length (see the loading sketch below)
  • trained from scratch on a single GPU :)
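
With 8 key-value heads shared across 32 query heads, the KV cache is roughly 4x smaller than a full multi-head-attention layout at the same hidden size. Below is a minimal sketch for checking these settings with Hugging Face Transformers; it assumes the checkpoint exposes the standard Llama-style config fields (the attribute names are that assumption, and the repo id is the one this card describes):

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "BEE-spoke-data/smol_llama-220M-GQA"  # repo id from this card

# Inspect the architecture without downloading the weights.
cfg = AutoConfig.from_pretrained(model_id)
print(cfg.hidden_size)              # 1024
print(cfg.num_hidden_layers)        # 10
print(cfg.num_attention_heads)      # 32 query heads
print(cfg.num_key_value_heads)      # 8 key-value heads (GQA)
print(cfg.max_position_embeddings)  # 2048

# Load the weights when you actually want to run the model.
model = AutoModelForCausalLM.from_pretrained(model_id)
```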

Links

Here are some fine-tunes we did, but there are many more possibilities out there!

  • instruct
    • openhermes - link
    • open-instruct - link
  • code
    • python (pypi) - link
  • zephyr DPO tune
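
The base checkpoint can also be tried directly for plain text generation before reaching for one of the fine-tunes. A minimal sketch using the Transformers pipeline API (the prompt and sampling settings below are illustrative only, not recommendations):

```python
from transformers import pipeline

# Simple text-generation pipeline on the base checkpoint.
generator = pipeline(
    "text-generation",
    model="BEE-spoke-data/smol_llama-220M-GQA",
)

out = generator(
    "Once upon a time, in a small village,",  # illustrative prompt
    max_new_tokens=64,   # illustrative length
    do_sample=True,
    temperature=0.8,     # illustrative sampling settings
)
print(out[0]["generated_text"])
```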

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 29.44 |
| AI2 Reasoning Challenge (25-Shot) | 24.83 |
| HellaSwag (10-Shot)               | 29.76 |
| MMLU (5-Shot)                     | 25.85 |
| TruthfulQA (0-shot)               | 44.55 |
| Winogrande (5-shot)               | 50.99 |
| GSM8k (5-shot)                    |  0.68 |

Open LLM Leaderboard v2 Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|------:|
| Avg.                |  6.62 |
| IFEval (0-Shot)     | 23.86 |
| BBH (3-Shot)        |  3.04 |
| MATH Lvl 5 (4-Shot) |  0.00 |
| GPQA (0-shot)       |  0.78 |
| MuSR (0-shot)       |  9.07 |
| MMLU-PRO (5-shot)   |  1.66 |
Weights are provided as BF16 safetensors (218M parameters).
