What is the context size?

#16
by fatshady - opened

What is the context size, and is there any way to extend it?

The paper says "However, we note that the Aya model is finetuned using up to 1024 input tokens as in mT5 pretraining, ...."
Section 5.1.2, Page 17

https://cohere.com/research/aya/aya-model-paper.pdf
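
In case it helps anyone with the same question: one common workaround, not something the paper prescribes, is to split long inputs into windows that each stay within the 1024-token figure quoted above. Below is a minimal sketch of that idea in plain Python. The function name `chunk_token_ids`, the constant `MAX_INPUT_TOKENS`, and the `overlap` parameter are hypothetical names chosen for illustration, and the token IDs are assumed to come from whatever tokenizer you already use.

```python
from typing import List

# Taken from the sentence quoted from the paper; treating it as a
# practical input limit is an assumption, not a documented hard cap.
MAX_INPUT_TOKENS = 1024


def chunk_token_ids(
    token_ids: List[int],
    max_len: int = MAX_INPUT_TOKENS,
    overlap: int = 0,
) -> List[List[int]]:
    """Split token IDs into consecutive windows of at most max_len tokens.

    Neighbouring windows share `overlap` tokens so that context is not
    cut off abruptly at a window boundary.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    step = max_len - overlap
    chunks: List[List[int]] = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_len])
        # Stop once this window already reaches the end of the input.
        if start + max_len >= len(token_ids):
            break
    return chunks


if __name__ == "__main__":
    fake_ids = list(range(2500))  # stand-in for real tokenizer output
    for i, chunk in enumerate(chunk_token_ids(fake_ids, overlap=24)):
        print(i, len(chunk))
```

Each window would then be sent to the model separately and the outputs combined by hand. If a single truncated input is enough, Hugging Face tokenizers also accept `truncation=True` together with `max_length=1024` when encoding. Neither approach actually extends the context size; both only keep inputs within the length used for finetuning.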
