---
license: llama2
datasets:
- vicgalle/alpaca-gpt4
pipeline_tag: text-generation
language:
- en
tags:
- llama-2
---

## Fine-tuning

- Base Model: [NousResearch/Llama-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf)
- Dataset for fine-tuning: [vicgalle/alpaca-gpt4](https://huggingface.co/datasets/vicgalle/alpaca-gpt4)
- Training (an end-to-end sketch follows after this list)
  - BitsAndBytesConfig

    ```
    BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
        bnb_4bit_use_double_quant=False,
    )
    ```

  - LoRA Config

    ```
    LoraConfig(
        r=16,
        lora_alpha=8,
        lora_dropout=0.1,
        bias="none",
        task_type="CAUSAL_LM",
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj"],
    )
    ```

  - Training Arguments

    ```
    TrainingArguments(
        output_dir="./results",
        num_train_epochs=1,
        per_device_train_batch_size=8,
        gradient_accumulation_steps=2,
        optim="paged_adamw_8bit",
        save_steps=1000,
        logging_steps=30,
        learning_rate=2e-4,
        weight_decay=0.001,
        fp16=False,
        bf16=False,
        max_grad_norm=0.3,
        max_steps=-1,
        warmup_ratio=0.3,
        group_by_length=True,
        lr_scheduler_type="linear",
        report_to="wandb",
    )
    ```
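
Below is a minimal sketch of how these three configs could be wired together for QLoRA fine-tuning. The actual training script is not included in this card, so this assumes a `trl` release from the Llama-2 era (≈0.7) where `SFTTrainer` still accepts `dataset_text_field`, `tokenizer`, and `max_seq_length` as constructor arguments; the choice of the dataset's `text` column and `max_seq_length=512` are likewise illustrative assumptions.

```
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig
from trl import SFTTrainer

base_model = "NousResearch/Llama-2-7b-hf"

# 4-bit NF4 quantization, as in the config above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Llama-2 has no pad token by default
tokenizer.padding_side = "right"

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters on the attention and MLP projections listed above
peft_config = LoraConfig(
    r=16,
    lora_alpha=8,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj"],
)

# Training arguments, as in the config above
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    optim="paged_adamw_8bit",
    save_steps=1000,
    logging_steps=30,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.3,
    group_by_length=True,
    lr_scheduler_type="linear",
    report_to="wandb",
)

dataset = load_dataset("vicgalle/alpaca-gpt4", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumption: the dataset's pre-built prompt column
    max_seq_length=512,         # illustrative; not stated in the card
    tokenizer=tokenizer,
    args=training_args,
)
trainer.train()
```

Note that with `per_device_train_batch_size=8` and `gradient_accumulation_steps=2`, the effective batch size is 16 per device.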