Sultannn committed
Commit deb00a7
1 parent: 9114fd0

Training in progress epoch 0

Files changed (5)
  1. README.md +5 -8
  2. merges.txt +0 -0
  3. tf_model.h5 +1 -1
  4. tokenizer.json +0 -0
  5. vocab.json +0 -0
README.md CHANGED
@@ -13,9 +13,9 @@ probably proofread and complete it, then remove this comment. -->
  
  This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Train Loss: 5.6501
- - Validation Loss: 5.9263
- - Epoch: 3
+ - Train Loss: 8.0106
+ - Validation Loss: 8.1104
+ - Epoch: 0
  
  ## Model description
  
@@ -34,17 +34,14 @@ More information needed
  ### Training hyperparameters
  
  The following hyperparameters were used during training:
- - optimizer: {'inner_optimizer': {'class_name': 'AdamWeightDecay', 'config': {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0006, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0006, 'decay_steps': 28440, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 700, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}}, 'dynamic': True, 'initial_scale': 32768.0, 'dynamic_growth_steps': 2000}
+ - optimizer: {'inner_optimizer': {'class_name': 'Addons>AdamW', 'config': {'name': 'AdamW', 'learning_rate': {'class_name': 'ExponentialDecay', 'config': {'initial_learning_rate': 0.0005, 'decay_steps': 100000, 'decay_rate': 0.96, 'staircase': True, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay': 0.01, 'exclude_from_weight_decay': None}}, 'dynamic': True, 'initial_scale': 32768.0, 'dynamic_growth_steps': 2000}
  - training_precision: mixed_float16
  
  ### Training results
  
  | Train Loss | Validation Loss | Epoch |
  |:----------:|:---------------:|:-----:|
- | 6.9370 | 6.3016 | 0 |
- | 6.1151 | 6.0404 | 1 |
- | 5.8246 | 5.9557 | 2 |
- | 5.6501 | 5.9263 | 3 |
+ | 8.0106 | 8.1104 | 0 |
  
  
  ### Framework versions
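
The README hunk above swaps the learning-rate schedule from a warmup followed by polynomial decay to a staircase exponential decay. The sketch below restates both schedules in plain Python so the two configurations can be compared step by step. It is an approximation written from the serialized config values, not the library code: the function names are hypothetical, and the assumption that the warmup wrapper hands `step - warmup_steps` to the decay function should be checked against the installed `transformers` version.

```python
import math


def exponential_decay_lr(step, initial_learning_rate=0.0005,
                         decay_steps=100000, decay_rate=0.96, staircase=True):
    """Learning rate for the new config (ExponentialDecay values from the diff)."""
    exponent = step / decay_steps
    if staircase:
        # With staircase=True the rate only drops at whole multiples of decay_steps.
        exponent = math.floor(exponent)
    return initial_learning_rate * decay_rate ** exponent


def warmup_polynomial_lr(step, initial_learning_rate=0.0006, warmup_steps=700,
                         decay_steps=28440, end_learning_rate=0.0, power=1.0):
    """Learning rate for the old config (WarmUp wrapping PolynomialDecay).

    Assumption: after warmup, the decay schedule is evaluated at
    step - warmup_steps. Both 'power' values in the config are 1.0,
    so a single parameter is used for warmup and decay here.
    """
    if step < warmup_steps:
        # Ramp from 0 up to the initial rate over warmup_steps (linear for power=1.0).
        return initial_learning_rate * (step / warmup_steps) ** power
    decay_step = min(step - warmup_steps, decay_steps)
    remaining = 1.0 - decay_step / decay_steps
    return ((initial_learning_rate - end_learning_rate) * remaining ** power
            + end_learning_rate)


if __name__ == "__main__":
    for step in (0, 350, 700, 10000, 29140, 100000):
        print(step, warmup_polynomial_lr(step), exponential_decay_lr(step))
```

One thing the sketch makes visible: with `decay_steps` of 100000 and `staircase` enabled, the new schedule holds 0.0005 until step 100000. If this run is about as long as the previous one (an assumption; the diff does not state the total step count), the rate would effectively be constant, with no warmup.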
merges.txt CHANGED
The diff for this file is too large to render. See raw diff
 
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:36f9dc692d32736d9fc66e39124c98eb31296940616cdcf9573534851c4d6110
+ oid sha256:41582e56a407b1d4a0a6d75eb1b83cd5b9f706423ead1f57a2458716533a8ea5
  size 451065960
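
The `tf_model.h5` entry is a Git LFS pointer, not the weights themselves: three `key value` lines giving the spec version, the content hash, and the byte size. As a rough illustration, the sketch below parses the two pointer versions from this diff and compares them. `parse_lfs_pointer` and the constant names are hypothetical helpers written for this example, not part of any Git LFS tooling.

```python
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:36f9dc692d32736d9fc66e39124c98eb31296940616cdcf9573534851c4d6110
size 451065960
"""

NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:41582e56a407b1d4a0a6d75eb1b83cd5b9f706423ead1f57a2458716533a8ea5
size 451065960
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a small dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


if __name__ == "__main__":
    old = parse_lfs_pointer(OLD_POINTER)
    new = parse_lfs_pointer(NEW_POINTER)
    print("same size:", old["size_bytes"] == new["size_bytes"])
    print("same content:", old["digest"] == new["digest"])
```

Here the size is identical and only the digest differs, which is what one would expect when the same architecture is saved again with different weight values.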
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
 
vocab.json CHANGED
The diff for this file is too large to render. See raw diff