A Simple Guide to Fine-Tuning Llama 2
In this guide, I'll walk through how to fine-tune Llama 2 into a dialog summarizer!
Last weekend, I wanted to fine-tune Llama 2 (which currently reigns supreme on the Open LLM Leaderboard) on a dataset of my own Google Keep notes; each of my notes has a title and a body, so I wanted to train Llama to generate a body from a given title.
The first part of this tutorial covers fine-tuning Llama 2 on the samsum dialog summarization dataset using the Hugging Face libraries. I tend to find that while Hugging Face has built an excellent library in transformers, their guides often overcomplicate things for the average person. Part two, fine-tuning on custom data, is coming this weekend!
To get started, grab yourself an A10, A10G, or A100 (or any GPU with >24GB of GPU memory). If you're not sure where to start, the Brev cloud makes it easy to access any of these GPUs!
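Once your machine is up, here's a quick sanity-check snippet of my own (assuming torch is already installed) to confirm you have enough GPU memory:

import torch

# Print the GPU name and total memory in GB; you want > 24 GB here
print(torch.cuda.get_device_name(0))
print(torch.cuda.get_device_properties(0).total_memory / 1e9, "GB")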
1. Download the model
Clone Meta's Llama inference repository (it contains the download script):
git clone https://github.com/facebookresearch/llama.git
Then run the download script:
bash download.sh
It will prompt you to enter the URL that Meta emails you. If you haven't registered yet, sign up here. They send the email surprisingly quickly!
For this guide, you only need to download the 7B model.
2. Convert the model to Hugging Face format
wget https://raw.githubusercontent.com/huggingface/transformers/main/src/transformers/models/llama/convert_llama_weights_to_hf.py
pip install git+https://github.com/huggingface/transformers
pip install -e .
python convert_llama_weights_to_hf.py \
--input_dir llama-2-7b --model_size 7B --output_dir models_hf/7B
If you only downloaded the 7B model initially, you'll need to make sure the model files are moved into a directory named "7B". Here's my directory structure:
llama-2-7b/
├── 7B
│ ├── checklist.chk
│ ├── consolidated.00.pth
│ └── params.json
├── tokenizer.model
└── tokenizer_checklist.chk
This now gives us a Hugging Face model that we can fine-tune using the Hugging Face libraries!
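As a quick sanity check (my own snippet, not part of the conversion script), you can confirm the converted files are readable without loading all the weights into memory:

from transformers import AutoConfig, LlamaTokenizer

# If both of these load cleanly, the conversion produced a valid checkpoint
tokenizer = LlamaTokenizer.from_pretrained("models_hf/7B")
config = AutoConfig.from_pretrained("models_hf/7B")
print(config)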
3. Run the fine-tuning notebook
Clone the llama-recipes repository:
git clone https://github.com/facebookresearch/llama-recipes.git
Then open the quickstart.ipynb file in your preferred notebook interface.
(I use JupyterLab, like this):
pip install jupyterlab
jupyter lab # in the repo you want to work in
Then just run the whole notebook.
Make sure you change the line:
model_id="./models_hf/7B"
to the actual path of your converted model. That's it! You'll end up with a LoRA fine-tuned model.
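For context, the LoRA setup the notebook performs with peft looks roughly like the sketch below; the rank, alpha, and target modules here are typical values for Llama-style models, not necessarily the exact ones the notebook uses:

from peft import LoraConfig, get_peft_model

# Illustrative LoRA config -- hyperparameters are assumptions, not the notebook's exact values
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the weights are trained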
4. Run inference on the fine-tuned model
The catch here is that Hugging Face only saves the adapter weights, not the full model, so we need to load the adapter weights into the full model. I struggled to find good documentation on how to do this... but eventually figured it out!
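Concretely, the checkpoint directory that peft writes contains only the adapter files, something like:

samsungsumarizercheckpoint/
├── adapter_config.json
└── adapter_model.bin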
Import the libraries:
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel, PeftConfig
Load the tokenizer and model:
model_id="./models_hf/7B"
tokenizer = LlamaTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(model_id, load_in_8bit=True, device_map='auto', torch_dtype=torch.float16)
Load the adapter from wherever it was saved after training:
model = PeftModel.from_pretrained(model, "/root/llama-recipes/samsungsumarizercheckpoint")
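Optionally, if you want a standalone checkpoint with no peft dependency at inference time, you can merge the adapter into the base weights; note that merging generally requires the base model to be loaded in fp16 rather than 8-bit, and the output path below is just an example:

# Merge the LoRA weights into the base model and save the result;
# reload the base model without load_in_8bit=True before doing this
merged = model.merge_and_unload()
merged.save_pretrained("models_hf/7B-samsum-merged")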
Run inference:
eval_prompt = """
Summarize this dialog:
A: Hi Tom, are you busy tomorrow’s afternoon?
B: I’m pretty sure I am. What’s up?
A: Can you go with me to the animal shelter?.
B: What do you want to do?
A: I want to get a puppy for my son.
B: That will make him so happy.
A: Yeah, we’ve discussed it many times. I think he’s ready now.
B: That’s good. Raising a dog is a tough issue. Like having a baby ;-)
A: I'll get him one of those little dogs.
B: One that won't grow up too big;-)
A: And eat too much;-))
B: Do you know which one he would like?
A: Oh, yes, I took him there last Monday. He showed me one that he really liked.
B: I bet you had to drag him away.
A: He wanted to take it home right away ;-).
B: I wonder what he'll name it.
A: He said he’d name it after his dead hamster – Lemmy - he's a great Motorhead fan :-)))
---
Summary:
"""
model_input = tokenizer(eval_prompt, return_tensors="pt").to("cuda")
model.eval()
with torch.no_grad():
    print(tokenizer.decode(model.generate(**model_input, max_new_tokens=100)[0], skip_special_tokens=True))
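If the summaries come out repetitive, you can sample instead of decoding greedily; these are standard transformers generate() arguments, and the values below are just reasonable starting points:

with torch.no_grad():
    output = model.generate(
        **model_input,
        max_new_tokens=100,
        do_sample=True,   # sample instead of greedy decoding
        temperature=0.7,  # lower = more deterministic
        top_p=0.9,        # nucleus sampling
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))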
In the next part of this series, I'll show you how to format your own dataset to train Llama 2 on a custom task! If you want me to hurry up, drop me a message!