Can we add `use_scaled_rope` in the config.json?

#2
by lanking - opened

This would help determine whether we need to turn the patch on or not. From the current model config.json, there is no way for us to know when to turn it on or off.

Simply:

`use_scaled_rope: true,`
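For illustration, a minimal sketch of how a loader could branch on such a flag, assuming the proposed `use_scaled_rope` key were actually added to config.json (it is not an existing field; the name comes from this request):

```python
import json

def needs_rope_patch(config_path: str) -> bool:
    """Return True if the proposed `use_scaled_rope` flag is set in config.json."""
    with open(config_path) as f:
        config = json.load(f)
    # Default to False so older Llama 3 configs without the key keep the plain RoPE path.
    return bool(config.get("use_scaled_rope", False))
```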

config.json already has a `rope_scaling` attribute that is null by default.
At least vLLM has some logic that can leverage this: https://github.com/vllm-project/vllm/blob/09c2eb85ddd3b2585979f4cd9cc97168d86718b6/vllm/model_executor/layers/rotary_embedding.py#L739

The same is probably true for HF transformers.
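As a rough sketch of that detection logic (not vLLM's actual code; it simply branches on whether `rope_scaling` is null, with field names as they appear in the released Llama 3.1 configs as far as I recall):

```python
import json

def rope_scaling_mode(config_path: str) -> str:
    """Classify which RoPE path a loader should take based on config.json."""
    with open(config_path) as f:
        config = json.load(f)
    scaling = config.get("rope_scaling")
    if scaling is None:
        return "default"  # plain RoPE, e.g. the original Llama 3 checkpoints
    # Llama 3.1 configs carry a dict such as {"rope_type": "llama3", "factor": 8.0, ...}
    return scaling.get("rope_type", scaling.get("type", "unknown"))
```

Under those assumptions, this returns `"llama3"` for a 3.1 config and `"default"` for a 3.0 config, which is exactly the toggle being asked for.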

Understood about the logic. But for the original Llama 3, we don't need to apply

def apply_scaling(freqs: torch.Tensor):
    ...

But now it is a must-have for 3.1. I am just trying to add some toggle at the config.json layer so we can tell when to turn this on; that would be helpful for library builders. I think we need to align on a single standard here with the transformers library and vLLM.
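For context, here is a sketch of the scaling step being discussed, following Meta's published Llama 3.1 reference implementation (constants reproduced from memory, so verify them against the official repo):

```python
import math

import torch

def apply_scaling(freqs: torch.Tensor) -> torch.Tensor:
    # Constants reported by Meta as obtained from a grid search.
    scale_factor = 8
    low_freq_factor = 1
    high_freq_factor = 4
    old_context_len = 8192  # original Llama 3 context length

    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    new_freqs = []
    for freq in freqs:
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # High-frequency components pass through unchanged.
            new_freqs.append(freq)
        elif wavelen > low_freq_wavelen:
            # Low-frequency components are scaled down by the factor.
            new_freqs.append(freq / scale_factor)
        else:
            # Smooth interpolation between the two regimes.
            smooth = (old_context_len / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor
            )
            new_freqs.append((1 - smooth) * freq / scale_factor + smooth * freq)
    return torch.tensor(new_freqs, dtype=freqs.dtype, device=freqs.device)
```

The point stands: nothing in the original config.json tells a library whether rotary frequencies should be routed through this function or not.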

Meta Llama org

We added the `rope_type` field with value `llama3`.
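For reference, the resulting `rope_scaling` block in the Llama 3.1 config.json looks roughly like this (values quoted from memory; check the actual file):

```json
"rope_scaling": {
  "factor": 8.0,
  "low_freq_factor": 1.0,
  "high_freq_factor": 4.0,
  "original_max_position_embeddings": 8192,
  "rope_type": "llama3"
}
```

so libraries can key off `rope_type == "llama3"` instead of a separate `use_scaled_rope` flag.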

lanking changed discussion status to closed
