Bamboo-Nano

Bamboo-Nano is an experiment to test whether a smaller tokenizer vocabulary affects model training, inspired by the "TinyStories" paper.

Training Data

The datasets we trained on are listed in this README.md file. All of them are permissively licensed.

Safetensors

Model size: 113M params
Tensor type: F32
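
As a rough illustration of what 113M parameters stored as F32 means for weight storage, the arithmetic can be sketched as follows. This is a minimal estimate, assuming the rounded figure of 113,000,000 parameters and 4 bytes per F32 value; the helper name weight_bytes is hypothetical, and file metadata and runtime memory (activations, optimizer state) are not counted.

```python
# Back-of-the-envelope weight size for a 113M-parameter model in F32.
PARAMS = 113_000_000   # assumption: "113M params" taken as a round number
BYTES_PER_F32 = 4      # a 32-bit float occupies 4 bytes


def weight_bytes(num_params: int, bytes_per_param: int) -> int:
    """Return the raw storage needed for the weights alone, in bytes."""
    return num_params * bytes_per_param


size = weight_bytes(PARAMS, BYTES_PER_F32)
print(f"{size / 1e6:.0f} MB")     # prints "452 MB"
print(f"{size / 2**20:.0f} MiB")  # prints "431 MiB"
```

Under the same assumptions, storing the weights in a 16-bit type would roughly halve this figure.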
