How to Use PEFT

Introduction

This guide provides step-by-step instructions for fine-tuning models with Parameter-Efficient Fine-Tuning (PEFT) using LoRA in GenAI Studio. Instead of updating every weight in the base model, LoRA trains a small set of low-rank adapter matrices, which reduces the memory and compute required for fine-tuning.

This feature supports two model families: the Mistral and Llama 2 series.
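To see why LoRA is parameter-efficient, consider what the low-rank update does: instead of training a full d x d weight update for a layer, LoRA trains two small matrices of rank r. The sketch below is purely illustrative arithmetic (the hidden size is not taken from any particular model); r = 16 matches the rank used in the configuration later in this guide.

```python
# Illustrative parameter count for a single square projection layer.
# LoRA replaces the full weight update dW (d x d) with B @ A,
# where B is d x r and A is r x d, so only 2*d*r values are trained.
d = 4096  # hidden size (illustrative)
r = 16    # LoRA rank, matching peft_args={"r": 16} below

full_update_params = d * d
lora_update_params = 2 * d * r

print(full_update_params)                       # 16777216
print(lora_update_params)                       # 131072
print(lora_update_params / full_update_params)  # 0.0078125, under 1%
```

At rank 16, the trainable update for this layer is less than 1% of the size of a full fine-tuning update.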

Step-by-Step Instructions

Running fine-tuning with PEFT is essentially the same as running standard fine-tuning; the only difference is the additional peft_config argument.

Step 1: Configure PEFT with LoRA

To start, configure PEFT with LoRA. This configuration will enable you to perform efficient fine-tuning using LoRA.

# Run fine-tuning with LoRA
peft_config = bt.PEFTConfig(peft_type=bte.PEFTType.LORA, peft_args={"r": 16})  # r is the LoRA rank
exp = lore.launch_training(
    dataset=dataset,
    base_model=llama_2,
    name="llama2_peft",
    max_steps=100,
    resource_pool="a100",  # This pool has a100-40GB GPUs
    num_train_epochs=None,
    slots_per_trial=2,
    peft_config=peft_config,
    trust_remote_code=True,
    hf_token=os.environ["HF_TOKEN"],
)

This code snippet demonstrates how to set up the PEFT configuration with LoRA for the Llama 2 model. The PEFTConfig object is initialized with the PEFT type set to LoRA and the LoRA rank r set to 16. The launch_training function then starts the fine-tuning process.
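If you want to tune more LoRA hyperparameters, peft_args may accept additional keys. The key names below (lora_alpha, lora_dropout, target_modules) follow Hugging Face's peft LoraConfig; whether GenAI Studio forwards them unchanged is an assumption, so check the documentation for your version before relying on them.

```python
# Hypothetical richer configuration; key names follow Hugging Face's
# LoraConfig and are an assumption about what peft_args accepts.
peft_args = {
    "r": 16,             # LoRA rank (the only key shown in this guide)
    "lora_alpha": 32,    # scaling factor (assumed passthrough)
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],
}
```

This dict would then be passed as peft_args to bt.PEFTConfig exactly as in Step 1.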

Step 2: Run Fine-Tuning

Execute the fine-tuning process with the configured PEFT settings. The lore.launch_training function takes several parameters, including the dataset, base model, training name, maximum steps, resource pool, and PEFT configuration. Ensure you have the Hugging Face token (hf_token) set in your environment variables.
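Because the snippet reads the token with os.environ["HF_TOKEN"], a missing variable surfaces only as a bare KeyError at call time. A small guard (require_hf_token is a hypothetical helper name, not part of the lore API) fails early with a clearer message:

```python
import os

def require_hf_token() -> str:
    # Hypothetical helper: fail early with a clear message
    # instead of a bare KeyError inside launch_training.
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Export your Hugging Face access token "
            "(export HF_TOKEN=...) before launching training."
        )
    return token
```

You could then pass hf_token=require_hf_token() instead of os.environ["HF_TOKEN"].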

Step 3: Inference with the Fine-Tuned Model

After the fine-tuning experiment completes, you can run inference with the newly fine-tuned PEFT model.

Note: There are no default training configurations for PEFT training runs.

model = lore.get_experiment_models(exp.id)[0]
lore.load_model(
    model,
    resource_pool="a100",
    vllm_swap_space=16,
    vllm_tensor_parallel_size=1,
    hf_token=os.environ["HF_TOKEN"],
)
question = """
Given the context below, please answer the question: What river was Petrela located by?

Context:
A few years after the First Crusade, in 1107, the Normans under the command of Bohemond, Robert's son, 
landed in Valona and besieged Dyrrachium using the most sophisticated military equipment of the time, but to no avail. 
Meanwhile, they occupied Petrela, the citadel of Mili at the banks of the river Deabolis, Gllavenica (Ballsh), Kanina and Jericho. 
This time, the Albanians sided with the Normans, dissatisfied by the heavy taxes the Byzantines had imposed upon them. 
With their help, the Normans secured the Arbanon passes and opened their way to Dibra. 
The lack of supplies, disease and Byzantine resistance forced Bohemond to retreat from his campaign and sign a peace treaty with the Byzantines in the city of Deabolis.

Answer:
"""

This code snippet shows how to load and use the fine-tuned model for inference. The lore.get_experiment_models function retrieves the model from the completed experiment. The lore.load_model function loads the model with the specified resource pool and configurations. Finally, you can use the model to answer questions based on the provided context.
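The question string above follows a fixed context/question layout. If you plan to run many queries against the fine-tuned model, a small formatting helper keeps prompts consistent; build_prompt is a hypothetical convenience function, not part of the lore API, and it simply reproduces the layout shown above.

```python
def build_prompt(question: str, context: str) -> str:
    # Hypothetical helper reproducing the prompt layout used above:
    # an instruction line, a Context block, and a trailing "Answer:" cue.
    return (
        "\n"
        f"Given the context below, please answer the question: {question}\n"
        "\n"
        "Context:\n"
        f"{context}\n"
        "\n"
        "Answer:\n"
    )

prompt = build_prompt(
    "What river was Petrela located by?",
    "...",  # the historical passage shown above
)
```

Each prompt produced this way matches the structure of the question string used in Step 3.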