The Workflow of PEFT

Community Article Published August 14, 2024

I am new to Parameter-Efficient Fine-Tuning (PEFT), both the family of methods and the library (peft). PEFT methods make fine-tuning large models cheaper by training only a small number of parameters, either existing or newly added, while keeping the rest frozen. peft is the Hugging Face library that implements many of these methods.

One of the popular techniques within PEFT is Low-Rank Adaptation (LoRA), which I will demonstrate via the peft library.

After reading the Quicktour of peft, I grasped the basic workflow of the library. In this short essay, I’ll explore how to create a base model and use it to build a LoRA model. Additionally, I'll take a brief look under the hood of the peft library to better understand how things work.

Why Use PEFT and LoRA?

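PEFT methods like LoRA reduce the compute and memory cost of fine-tuning by freezing the pretrained weights and training only a small set of parameters. LoRA in particular adds a pair of small low-rank matrices next to selected layers and trains only those. This often gets close to the quality of full fine-tuning with far fewer trainable parameters, which makes it a good fit for scenarios with limited resources.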
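To make the savings concrete, here is a small back-of-the-envelope sketch in plain Python. It assumes LoRA adds two matrices to a linear layer, one of shape (r, dim_in) and one of shape (dim_out, r), and it ignores the bias. The function name lora_param_counts and the 4096 layer size are mine, chosen only for illustration.

```python
def lora_param_counts(dim_in: int, dim_out: int, r: int) -> tuple[int, int]:
    """Return (full_weight_params, lora_params) for one linear layer, ignoring the bias."""
    full = dim_in * dim_out          # the dense weight matrix W
    lora = r * dim_in + dim_out * r  # A is (r x dim_in), B is (dim_out x r)
    return full, lora


# Hypothetical layer size, chosen only for illustration.
full, lora = lora_param_counts(dim_in=4096, dim_out=4096, r=8)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")

# The toy layer used later in this post (dim_in=2, dim_out=4, r=1).
toy_full, toy_lora = lora_param_counts(dim_in=2, dim_out=4, r=1)
print(f"toy full: {toy_full}  toy lora: {toy_lora}")
```

For a tiny layer the saving is negligible, but the LoRA count grows with dim_in + dim_out while the full count grows with dim_in * dim_out, so the gap widens quickly as layers get larger.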

Create a Base Model

import torch
from torch import nn
from peft import LoraConfig, get_peft_model, PeftType

# create a base model
class BaseModel(nn.Module):
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.linear = nn.Linear(dim_in, dim_out)
    
    def forward(self, x):
        return self.linear(x)

base_model = BaseModel(dim_in=2, dim_out=4)

print("### BASE MODEL")
for name, param in base_model.named_parameters():
    print(name, param)

The BaseModel consists of a single Linear layer. When we print the names and parameters of the model, we observe that the parameters have requires_grad=True, indicating that they are trainable.

### BASE MODEL
linear.weight Parameter containing:
tensor([[-0.4009, -0.6960],
        [-0.1067, -0.3404],
        [-0.1151, -0.5108],
        [ 0.3779, -0.3602]], requires_grad=True)
linear.bias Parameter containing:
tensor([-0.2076, -0.0166, -0.1409,  0.3477], requires_grad=True)
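Instead of reading requires_grad off the printed tensors, you can also count the trainable parameters directly. This is a minimal sketch that rebuilds the same BaseModel so it runs on its own; the variable names trainable and total are mine.

```python
from torch import nn


class BaseModel(nn.Module):
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.linear = nn.Linear(dim_in, dim_out)

    def forward(self, x):
        return self.linear(x)


base_model = BaseModel(dim_in=2, dim_out=4)

# numel() is the number of scalar entries in a parameter tensor.
trainable = sum(p.numel() for p in base_model.parameters() if p.requires_grad)
total = sum(p.numel() for p in base_model.parameters())
print(f"trainable: {trainable} / total: {total}")
```

The weight has 4 x 2 = 8 entries and the bias has 4, so all 12 parameters of the plain model are trainable.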

Create a PEFT Configuration

Each PEFT method requires a configuration that determines how the fine-tuning will be applied. Below is a configuration for LoRA:

# you always start with a configuration
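config = LoraConfig(
    inference_mode=False,        # False: set the adapter up for training
    r=1,                         # rank of the low-rank update
    lora_alpha=4,                # the update is scaled by lora_alpha / r
    lora_dropout=0.1,            # dropout applied to the input of the LoRA branch
    target_modules=["linear"],   # names of the modules to adapt
    peft_type=PeftType.LORA,     # optional here; LoraConfig sets this itself
)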
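
To see what r and lora_alpha control, here is a hand-written sketch of a LoRA-style forward pass in plain PyTorch. It is not the peft implementation, only an illustration under simplifying assumptions: dropout is left out, and the names lora_A, lora_B, scaling, and lora_forward are my own choices that mirror the parameter names peft prints.

```python
import torch
from torch import nn

torch.manual_seed(0)
dim_in, dim_out, r, lora_alpha = 2, 4, 1, 4

# Frozen base layer, standing in for the pretrained weights.
base = nn.Linear(dim_in, dim_out)
base.requires_grad_(False)

# Low-rank branch: A maps dim_in -> r, B maps r -> dim_out.
lora_A = nn.Linear(dim_in, r, bias=False)
lora_B = nn.Linear(r, dim_out, bias=False)
nn.init.zeros_(lora_B.weight)  # B starts at zero, so the branch is a no-op at first
scaling = lora_alpha / r


def lora_forward(x):
    # Base output plus the scaled low-rank update (dropout omitted in this sketch).
    return base(x) + lora_B(lora_A(x)) * scaling


x = torch.randn(3, dim_in)
same_at_init = torch.allclose(lora_forward(x), base(x))

# Simulate a training step by giving B some non-zero values.
with torch.no_grad():
    lora_B.weight.fill_(0.5)
changed_after_update = not torch.allclose(lora_forward(x), base(x))

print(same_at_init, changed_after_update)
```

With r=1 and lora_alpha=4 the low-rank update is multiplied by 4 before it is added to the base output.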

Create the PEFT Model

Using the base model and the configuration, we instantiate the PEFT model via the get_peft_model function.

lora_model = get_peft_model(model=base_model, peft_config=config)
print("\n\n### LORA MODEL")
for name, param in lora_model.named_parameters():
    print(name, param)

### LORA MODEL
base_model.model.linear.base_layer.weight Parameter containing:
tensor([[-0.4009, -0.6960],
        [-0.1067, -0.3404],
        [-0.1151, -0.5108],
        [ 0.3779, -0.3602]])
base_model.model.linear.base_layer.bias Parameter containing:
tensor([-0.2076, -0.0166, -0.1409,  0.3477])
base_model.model.linear.lora_A.default.weight Parameter containing:
tensor([[-0.5649,  0.3173]], requires_grad=True)
base_model.model.linear.lora_B.default.weight Parameter containing:
tensor([[0.],
        [0.],
        [0.],
        [0.]], requires_grad=True)

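Here, the original weight and bias of the linear layer (now under base_layer) are printed without requires_grad=True, which means they are frozen. Only the newly added lora_A and lora_B matrices are trainable, so training lora_model updates only the LoRA parameters. Note that lora_B starts at zero, so right after creation the LoRA model produces the same output as the base model.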

Under the Hood of PEFT

To understand the workflow better, let's explore what happens when you call the get_peft_model function.

  • The get_peft_model function, found in the src/peft/mapping.py file at the time of writing, preprocesses the configuration and creates a PeftModel.
  • The PeftModel class, located in src/peft/peft_model.py, is the wrapper class shared by the various PEFT methods, including LoRA. It looks up the tuner that matches the configuration and instantiates it around the base model, such as LoraModel in this example.
  • Each PEFT tuner is implemented in the src/peft/tuners/ folder. For instance, the LoraModel class is defined in src/peft/tuners/lora/model.py. This class inherits from the BaseTuner class, which provides common methods and attributes for all PEFT tuners.

For more detailed exploration, you can check out the LoRA paper and browse through the PEFT code on GitHub.

Conclusion

I hope this post gives you an overview of what happens when you call the get_peft_model function and how PEFT methods like LoRA can be applied to your models. If you'd like a deeper dive, feel free to reach out.

I would like to extend my gratitude to Benjamin Bossan, a core contributor to PEFT at Hugging Face, for reviewing this post and providing valuable suggestions.