khongtrunght committed on
Commit
2a4552e
1 Parent(s): 72df182

Model save

README.md ADDED
@@ -0,0 +1,86 @@
+ ---
+ tags:
+ - trl
+ - dpo
+ - generated_from_trainer
+ model-index:
+ - name: Qwen2-7B-Instruct-SPPO-Function-call-v2.11
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # Qwen2-7B-Instruct-SPPO-Function-call-v2.11
+
+ This model was trained from scratch on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.1457
+ - Rewards/chosen: -1.7639
+ - Rewards/rejected: -14.1509
+ - Rewards/accuracies: 0.9364
+ - Rewards/margins: 12.3871
+ - Logps/rejected: -551.2230
+ - Logps/chosen: -189.1563
+ - Logits/rejected: -1.6081
+ - Logits/chosen: -1.5770
+
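A quick sanity check on the evaluation metrics above: in DPO-style training, `Rewards/margins` is the mean gap between the chosen and rejected rewards, so the reported margin should equal the difference of the two reported reward means up to rounding. A minimal sketch using only the values from this card:

```python
# Values reported in the evaluation results above.
rewards_chosen = -1.7639
rewards_rejected = -14.1509

# DPO margin: chosen reward minus rejected reward.
margin = rewards_chosen - rewards_rejected

# Should match the reported Rewards/margins (12.3871) up to rounding.
assert abs(margin - 12.3871) < 1e-3
```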
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-07
+ - train_batch_size: 2
+ - eval_batch_size: 2
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 8
+ - total_train_batch_size: 16
+ - total_eval_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 2
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+ |:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+ | 0.2001 | 0.1145 | 250 | 0.2192 | 0.7210 | -1.8684 | 0.9162 | 2.5895 | -305.5732 | -139.4582 | -1.6566 | -1.7096 |
+ | 0.1246 | 0.2290 | 500 | 0.1662 | 0.6780 | -4.7708 | 0.9277 | 5.4487 | -363.6193 | -140.3193 | -1.6309 | -1.6619 |
+ | 0.0831 | 0.3436 | 750 | 0.1441 | 0.5794 | -6.0728 | 0.9191 | 6.6521 | -389.6595 | -142.2913 | -1.6015 | -1.6194 |
+ | 0.0698 | 0.4581 | 1000 | 0.1458 | -0.1931 | -8.1002 | 0.9335 | 7.9071 | -430.2079 | -157.7405 | -1.6062 | -1.6142 |
+ | 0.0872 | 0.5726 | 1250 | 0.1416 | -0.0252 | -8.5014 | 0.9393 | 8.4762 | -438.2315 | -154.3822 | -1.5572 | -1.5535 |
+ | 0.0547 | 0.6871 | 1500 | 0.1330 | -0.4963 | -9.4547 | 0.9335 | 8.9584 | -457.2992 | -163.8050 | -1.5598 | -1.5574 |
+ | 0.1092 | 0.8016 | 1750 | 0.1337 | -1.2236 | -10.3660 | 0.9277 | 9.1424 | -475.5235 | -178.3509 | -1.5822 | -1.5827 |
+ | 0.1109 | 0.9162 | 2000 | 0.1190 | -0.4262 | -9.6091 | 0.9364 | 9.1829 | -460.3859 | -162.4036 | -1.5682 | -1.5631 |
+ | 0.013 | 1.0307 | 2250 | 0.1355 | -0.4415 | -10.4543 | 0.9393 | 10.0128 | -477.2908 | -162.7087 | -1.5520 | -1.5425 |
+ | 0.0107 | 1.1452 | 2500 | 0.1450 | -1.2114 | -11.9528 | 0.9393 | 10.7414 | -507.2599 | -178.1073 | -1.5666 | -1.5494 |
+ | 0.0203 | 1.2597 | 2750 | 0.1424 | -1.2291 | -12.7381 | 0.9364 | 11.5090 | -522.9661 | -178.4617 | -1.5798 | -1.5536 |
+ | 0.0128 | 1.3743 | 3000 | 0.1428 | -1.5064 | -13.4244 | 0.9393 | 11.9180 | -536.6923 | -184.0067 | -1.5982 | -1.5679 |
+ | 0.0447 | 1.4888 | 3250 | 0.1490 | -1.6333 | -13.8914 | 0.9422 | 12.2581 | -546.0324 | -186.5450 | -1.6084 | -1.5768 |
+ | 0.0114 | 1.6033 | 3500 | 0.1508 | -1.8097 | -14.2168 | 0.9393 | 12.4071 | -552.5399 | -190.0730 | -1.6144 | -1.5842 |
+ | 0.0201 | 1.7178 | 3750 | 0.1447 | -1.7474 | -14.1355 | 0.9393 | 12.3881 | -550.9136 | -188.8267 | -1.6087 | -1.5784 |
+ | 0.0139 | 1.8323 | 4000 | 0.1461 | -1.7396 | -14.1065 | 0.9393 | 12.3669 | -550.3343 | -188.6715 | -1.6088 | -1.5783 |
+ | 0.0038 | 1.9469 | 4250 | 0.1457 | -1.7639 | -14.1509 | 0.9364 | 12.3871 | -551.2230 | -189.1563 | -1.6081 | -1.5770 |
+
+
+ ### Framework versions
+
+ - Transformers 4.44.0
+ - Pytorch 2.3.1+cu121
+ - Datasets 2.20.0
+ - Tokenizers 0.19.1
all_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "epoch": 2.0,
+   "total_flos": 0.0,
+   "train_loss": 0.09263221575655435,
+   "train_runtime": 18962.0322,
+   "train_samples": 34924,
+   "train_samples_per_second": 3.684,
+   "train_steps_per_second": 0.23
+ }
generation_config.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "bos_token_id": 151643,
+   "do_sample": true,
+   "eos_token_id": [
+     151645,
+     151643
+   ],
+   "pad_token_id": 151643,
+   "repetition_penalty": 1.05,
+   "temperature": 0.7,
+   "top_k": 20,
+   "top_p": 0.8,
+   "transformers_version": "4.44.0"
+ }
train_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "epoch": 2.0,
+   "total_flos": 0.0,
+   "train_loss": 0.09263221575655435,
+   "train_runtime": 18962.0322,
+   "train_samples": 34924,
+   "train_samples_per_second": 3.684,
+   "train_steps_per_second": 0.23
+ }
trainer_state.json ADDED
The diff for this file is too large to render. See raw diff