📄️ Transformer Architecture
Model architecture
📄️ Fine-Tuning
Definition
📄️ Quantization: Reducing Model Size
Large language models are evolving rapidly, and their complexity and size keep growing. This poses a challenge for fine-tuning: how can we train ever-larger models with limited resources?
📄️ Low-Rank Adaptation (LoRA)
What is LoRA?
📄️ Quantized Low-Rank Adaptation (QLoRA)
Quantized Low-Rank Adaptation (QLoRA), as the name suggests, combines two of the most widely used fine-tuning techniques: LoRA and quantization. Where LoRA uses low-rank matrices to reduce the number of trainable parameters, QLoRA extends it by further reducing memory usage through quantizing the frozen base model's weights.
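The idea above can be sketched in a few lines. This is a minimal toy illustration, not the real QLoRA implementation: it uses a simple symmetric int8 scheme (actual QLoRA uses 4-bit NormalFloat with double quantization), and all shapes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Base weight matrix of a linear layer (frozen during LoRA fine-tuning).
d_out, d_in, r = 64, 64, 4
W = rng.standard_normal((d_out, d_in)).astype(np.float32)

# LoRA: trainable low-rank factors B (d_out x r) and A (r x d_in).
# Full fine-tuning trains d_out*d_in params; LoRA trains only r*(d_out+d_in).
B = np.zeros((d_out, r), dtype=np.float32)  # B starts at zero, so the
A = rng.standard_normal((r, d_in)).astype(np.float32)  # update is zero at init
full_params = d_out * d_in          # 4096
lora_params = r * (d_out + d_in)    # 512

# QLoRA additionally stores the frozen base weights quantized
# (toy symmetric int8 here; real QLoRA uses 4-bit NormalFloat).
scale = np.abs(W).max() / 127.0
W_q = np.round(W / scale).astype(np.int8)

# Forward pass: dequantize the base weights, add the low-rank update.
x = rng.standard_normal(d_in).astype(np.float32)
y = (W_q.astype(np.float32) * scale) @ x + B @ (A @ x)
```

Because `B` is initialized to zero, the adapted model starts out exactly matching the (quantized) base model, and only the small `A`/`B` factors receive gradients during training.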
📄️ Related Questions
What is a linear layer?