---
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: bart-base-spelling-nl-1m
  results: []
---

# bart-base-spelling-nl-1m

This model is a Dutch fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base).

It achieves the following results on the evaluation set:

- Loss: 0.0221
- Cer: 0.0145 (character error rate; see the sketch below)
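
Cer is the character error rate: the character-level edit distance between prediction and reference, divided by the reference length. A minimal sketch of computing it with the Hugging Face `evaluate` library (the example strings are illustrative, not taken from the evaluation set):

```python
import evaluate

cer = evaluate.load("cer")  # character error rate metric

predictions = ["Dit is een zin met een fout."]  # hypothetical model output
references = ["Dit is een zin met één fout."]   # hypothetical gold text
print(cer.compute(predictions=predictions, references=references))
```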

## Model description

This is a text-to-text fine-tuned version of facebook/bart-base trained on spelling correction. It leans on the excellent work by Oliver Guhr (github, huggingface). Training was performed on a single GPU of an AWS EC2 instance (g5.xlarge).
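
As a quick way to try the model, here is a minimal inference sketch using the `transformers` pipeline API. The model id is inferred from the card name and the misspelled example sentence is made up, so treat both as assumptions:

```python
from transformers import pipeline

# Model id assumed from the card name; adjust if the repo path differs.
corrector = pipeline("text2text-generation",
                     model="antalvdb/bart-base-spelling-nl-1m")

# Hypothetical misspelled Dutch input; the model returns a corrected string.
text = "Ik heb dit boek gisteren geleezen."
print(corrector(text, max_length=128)[0]["generated_text"])
```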

## Intended uses & limitations

This model is intended to serve as a component of the Valkuil.net context-sensitive spelling checker. A future version of the model will be trained on more data.

## Training and evaluation data

The model was trained on a Dutch dataset of 2,964,203 lines (nearly 3 million) of text from three public Dutch sources, downloaded from the OPUS corpus:

- nl-europarlv7.1m.txt (1,000,000 lines)
- nl-opensubtitles2016.1m.txt (1,000,000 lines)
- nl-wikipedia.txt (964,203 lines)

Together these texts comprise 45,308,056 tokens.
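
The card does not describe how (misspelled, correct) training pairs were derived from this clean text; in the Oliver Guhr recipe this model builds on, typos are typically injected synthetically. The corruption function below is a purely hypothetical sketch of that idea, not the actual preprocessing used here:

```python
import random

def corrupt(text: str, p: float = 0.05) -> str:
    """Inject simple character-level typos (deletion, duplication,
    adjacent swap) with total probability p per character.
    Illustrative only; not the card's actual preprocessing."""
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        r = random.random()
        if r < p / 3:
            i += 1                        # deletion: drop this character
            continue
        if r < 2 * p / 3:
            out.append(chars[i] * 2)      # duplication
        elif r < p and i + 1 < len(chars):
            out.append(chars[i + 1])      # adjacent swap
            out.append(chars[i])
            i += 1
        else:
            out.append(chars[i])          # keep unchanged
        i += 1
    return "".join(out)

clean = "Dit is een voorbeeldzin uit het corpus."
example = {"input": corrupt(clean), "target": clean}  # one seq2seq pair
print(example)
```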

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):

- learning_rate: 0.0003
- train_batch_size: 2
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2.0
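
For readers who want to reproduce the setup, the list above maps onto the `transformers` Seq2Seq training API roughly as follows. This is a minimal sketch, not the card's actual training script; `output_dir` and `predict_with_generate` are assumptions. Note that the total train batch size of 32 is the per-device batch size of 2 times 16 gradient-accumulation steps on a single GPU.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch reconstructing the listed hyperparameters; values not in the
# list above (output_dir, predict_with_generate) are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-spelling-nl-1m",  # assumed
    learning_rate=3e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=16,  # 2 * 16 = effective batch size 32
    lr_scheduler_type="linear",
    num_train_epochs=2.0,
    predict_with_generate=True,      # assumed, needed to compute CER at eval
)
```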

### Training results

| Training Loss | Epoch | Step | Validation Loss | Cer |
|:-------------:|:-----:|:------:|:---------------:|:------:|
| 0.2824 | 0.01 | 1000 | 0.2129 | 0.9219 |
| 0.1971 | 0.02 | 2000 | 0.1600 | 0.9217 |
| 0.171 | 0.03 | 3000 | 0.1273 | 0.9217 |
| 0.1586 | 0.04 | 4000 | 0.1110 | 0.9216 |
| 0.1288 | 0.05 | 5000 | 0.0991 | 0.9214 |
| 0.1338 | 0.06 | 6000 | 0.0910 | 0.9215 |
| 0.1279 | 0.08 | 7000 | 0.0831 | 0.9215 |
| 0.1147 | 0.09 | 8000 | 0.0789 | 0.9215 |
| 0.1091 | 0.1 | 9000 | 0.0769 | 0.9216 |
| 0.0935 | 0.11 | 10000 | 0.0700 | 0.9214 |
| 0.0963 | 0.12 | 11000 | 0.0678 | 0.9215 |
| 0.0969 | 0.13 | 12000 | 0.0654 | 0.9214 |
| 0.0957 | 0.14 | 13000 | 0.0627 | 0.9215 |
| 0.0886 | 0.15 | 14000 | 0.0644 | 0.9215 |
| 0.0911 | 0.16 | 15000 | 0.0604 | 0.9215 |
| 0.0955 | 0.17 | 16000 | 0.0595 | 0.9215 |
| 0.0875 | 0.18 | 17000 | 0.0587 | 0.9213 |
| 0.0879 | 0.19 | 18000 | 0.0576 | 0.9214 |
| 0.079 | 0.21 | 19000 | 0.0550 | 0.9213 |
| 0.0808 | 0.22 | 20000 | 0.0536 | 0.9215 |
| 0.0684 | 0.23 | 21000 | 0.0536 | 0.9214 |
| 0.0789 | 0.24 | 22000 | 0.0530 | 0.9214 |
| 0.088 | 0.25 | 23000 | 0.0524 | 0.9215 |
| 0.076 | 0.26 | 24000 | 0.0519 | 0.9214 |
| 0.0714 | 0.27 | 25000 | 0.0506 | 0.9213 |
| 0.0664 | 0.28 | 26000 | 0.0495 | 0.9213 |
| 0.0791 | 0.29 | 27000 | 0.0492 | 0.9215 |
| 0.0702 | 0.3 | 28000 | 0.0485 | 0.9215 |
| 0.0709 | 0.31 | 29000 | 0.0493 | 0.9213 |
| 0.0676 | 0.32 | 30000 | 0.0480 | 0.9214 |
| 0.0692 | 0.34 | 31000 | 0.0468 | 0.9215 |
| 0.0633 | 0.35 | 32000 | 0.0473 | 0.9213 |
| 0.0732 | 0.36 | 33000 | 0.0455 | 0.9213 |
| 0.0809 | 0.37 | 34000 | 0.0455 | 0.9214 |
| 0.0562 | 0.38 | 35000 | 0.0451 | 0.9214 |
| 0.0715 | 0.39 | 36000 | 0.0440 | 0.9214 |
| 0.0596 | 0.4 | 37000 | 0.0441 | 0.9214 |
| 0.0534 | 0.41 | 38000 | 0.0430 | 0.9213 |
| 0.0657 | 0.42 | 39000 | 0.0427 | 0.9214 |
| 0.0643 | 0.43 | 40000 | 0.0441 | 0.9212 |
| 0.0579 | 0.44 | 41000 | 0.0414 | 0.9213 |
| 0.0695 | 0.45 | 42000 | 0.0430 | 0.9212 |
| 0.0566 | 0.47 | 43000 | 0.0413 | 0.9212 |
| 0.0646 | 0.48 | 44000 | 0.0415 | 0.9213 |
| 0.0573 | 0.49 | 45000 | 0.0410 | 0.9212 |
| 0.0568 | 0.5 | 46000 | 0.0406 | 0.9213 |
| 0.065 | 0.51 | 47000 | 0.0405 | 0.9213 |
| 0.063 | 0.52 | 48000 | 0.0396 | 0.9213 |
| 0.0654 | 0.53 | 49000 | 0.0397 | 0.9213 |
| 0.0506 | 0.54 | 50000 | 0.0391 | 0.9212 |
| 0.0573 | 0.55 | 51000 | 0.0382 | 0.9213 |
| 0.0569 | 0.56 | 52000 | 0.0381 | 0.9214 |
| 0.0597 | 0.57 | 53000 | 0.0381 | 0.9212 |
| 0.0543 | 0.58 | 54000 | 0.0374 | 0.9213 |
| 0.057 | 0.59 | 55000 | 0.0381 | 0.9213 |
| 0.058 | 0.61 | 56000 | 0.0380 | 0.9212 |
| 0.0481 | 0.62 | 57000 | 0.0366 | 0.9213 |
| 0.0581 | 0.63 | 58000 | 0.0367 | 0.9212 |
| 0.0521 | 0.64 | 59000 | 0.0363 | 0.9213 |
| 0.0543 | 0.65 | 60000 | 0.0358 | 0.9212 |
| 0.0594 | 0.66 | 61000 | 0.0359 | 0.9214 |
| 0.0479 | 0.67 | 62000 | 0.0354 | 0.9212 |
| 0.0512 | 0.68 | 63000 | 0.0357 | 0.9211 |
| 0.0488 | 0.69 | 64000 | 0.0341 | 0.9213 |
| 0.0485 | 0.7 | 65000 | 0.0346 | 0.9213 |
| 0.052 | 0.71 | 66000 | 0.0343 | 0.9213 |
| 0.0427 | 0.72 | 67000 | 0.0341 | 0.9212 |
| 0.0502 | 0.74 | 68000 | 0.0343 | 0.9211 |
| 0.0434 | 0.75 | 69000 | 0.0337 | 0.9213 |
| 0.0579 | 0.76 | 70000 | 0.0337 | 0.9213 |
| 0.0534 | 0.77 | 71000 | 0.0330 | 0.9212 |
| 0.0437 | 0.78 | 72000 | 0.0334 | 0.9212 |
| 0.05 | 0.79 | 73000 | 0.0332 | 0.9213 |
| 0.043 | 0.8 | 74000 | 0.0329 | 0.9212 |
| 0.0554 | 0.81 | 75000 | 0.0323 | 0.9212 |
| 0.0418 | 0.82 | 76000 | 0.0326 | 0.9212 |
| 0.0461 | 0.83 | 77000 | 0.0326 | 0.9212 |
| 0.0435 | 0.84 | 78000 | 0.0319 | 0.9212 |
| 0.0453 | 0.85 | 79000 | 0.0317 | 0.9212 |
| 0.0434 | 0.87 | 80000 | 0.0318 | 0.9212 |
| 0.0466 | 0.88 | 81000 | 0.0321 | 0.9212 |
| 0.0461 | 0.89 | 82000 | 0.0316 | 0.9212 |
| 0.0381 | 0.9 | 83000 | 0.0311 | 0.9213 |
| 0.0455 | 0.91 | 84000 | 0.0306 | 0.9212 |
| 0.0446 | 0.92 | 85000 | 0.0315 | 0.9212 |
| 0.0532 | 0.93 | 86000 | 0.0305 | 0.9212 |
| 0.052 | 0.94 | 87000 | 0.0305 | 0.9212 |
| 0.0353 | 0.95 | 88000 | 0.0305 | 0.9211 |
| 0.0469 | 0.96 | 89000 | 0.0304 | 0.9212 |
| 0.0387 | 0.97 | 90000 | 0.0303 | 0.9212 |
| 0.0478 | 0.98 | 91000 | 0.0302 | 0.9212 |
| 0.0395 | 1.0 | 92000 | 0.0299 | 0.9212 |
| 0.0387 | 1.01 | 93000 | 0.0290 | 0.9212 |
| 0.0356 | 1.02 | 94000 | 0.0287 | 0.9212 |
| 0.0381 | 1.03 | 95000 | 0.0295 | 0.9212 |
| 0.0386 | 1.04 | 96000 | 0.0284 | 0.9213 |
| 0.038 | 1.05 | 97000 | 0.0293 | 0.9212 |
| 0.0346 | 1.06 | 98000 | 0.0284 | 0.9212 |
| 0.0357 | 1.07 | 99000 | 0.0285 | 0.9212 |
| 0.0446 | 1.08 | 100000 | 0.0287 | 0.9211 |
| 0.0424 | 1.09 | 101000 | 0.0284 | 0.9213 |
| 0.0357 | 1.1 | 102000 | 0.0282 | 0.9211 |
| 0.0413 | 1.11 | 103000 | 0.0282 | 0.9211 |
| 0.0348 | 1.12 | 104000 | 0.0279 | 0.9212 |
| 0.0363 | 1.14 | 105000 | 0.0279 | 0.9212 |
| 0.0329 | 1.15 | 106000 | 0.0282 | 0.9211 |
| 0.0438 | 1.16 | 107000 | 0.0279 | 0.9212 |
| 0.037 | 1.17 | 108000 | 0.0274 | 0.9212 |
| 0.0311 | 1.18 | 109000 | 0.0278 | 0.9212 |
| 0.0297 | 1.19 | 110000 | 0.0275 | 0.9212 |
| 0.0323 | 1.2 | 111000 | 0.0271 | 0.9212 |
| 0.0387 | 1.21 | 112000 | 0.0275 | 0.9212 |
| 0.0366 | 1.22 | 113000 | 0.0269 | 0.9211 |
| 0.0345 | 1.23 | 114000 | 0.0269 | 0.9211 |
| 0.0389 | 1.24 | 115000 | 0.0261 | 0.9211 |
| 0.0381 | 1.25 | 116000 | 0.0265 | 0.9211 |
| 0.0324 | 1.27 | 117000 | 0.0265 | 0.9211 |
| 0.0345 | 1.28 | 118000 | 0.0260 | 0.9212 |
| 0.032 | 1.29 | 119000 | 0.0260 | 0.9211 |
| 0.0359 | 1.3 | 120000 | 0.0259 | 0.9211 |
| 0.0347 | 1.31 | 121000 | 0.0259 | 0.9212 |
| 0.0334 | 1.32 | 122000 | 0.0253 | 0.9211 |
| 0.0297 | 1.33 | 123000 | 0.0260 | 0.9210 |
| 0.0333 | 1.34 | 124000 | 0.0251 | 0.9212 |
| 0.0303 | 1.35 | 125000 | 0.0254 | 0.9211 |
| 0.0292 | 1.36 | 126000 | 0.0250 | 0.9211 |
| 0.0318 | 1.37 | 127000 | 0.0250 | 0.9212 |
| 0.0284 | 1.38 | 128000 | 0.0250 | 0.9211 |
| 0.0311 | 1.4 | 129000 | 0.0248 | 0.9211 |
| 0.0323 | 1.41 | 130000 | 0.0248 | 0.9211 |
| 0.0253 | 1.42 | 131000 | 0.0244 | 0.9211 |
| 0.0287 | 1.43 | 132000 | 0.0246 | 0.9211 |
| 0.0351 | 1.44 | 133000 | 0.0240 | 0.9212 |
| 0.0363 | 1.45 | 134000 | 0.0238 | 0.9211 |
| 0.0264 | 1.46 | 135000 | 0.0240 | 0.9211 |
| 0.0304 | 1.47 | 136000 | 0.0242 | 0.9211 |
| 0.0325 | 1.48 | 137000 | 0.0236 | 0.9212 |
| 0.033 | 1.49 | 138000 | 0.0239 | 0.9211 |
| 0.03 | 1.5 | 139000 | 0.0236 | 0.9211 |
| 0.0256 | 1.51 | 140000 | 0.0235 | 0.9211 |
| 0.0312 | 1.53 | 141000 | 0.0237 | 0.9211 |
| 0.0302 | 1.54 | 142000 | 0.0237 | 0.9211 |
| 0.0227 | 1.55 | 143000 | 0.0232 | 0.9212 |
| 0.0261 | 1.56 | 144000 | 0.0232 | 0.9211 |
| 0.0269 | 1.57 | 145000 | 0.0227 | 0.9211 |
| 0.0312 | 1.58 | 146000 | 0.0228 | 0.9211 |
| 0.0298 | 1.59 | 147000 | 0.0231 | 0.9211 |
| 0.0281 | 1.6 | 148000 | 0.0226 | 0.9212 |
| 0.029 | 1.61 | 149000 | 0.0227 | 0.9211 |
| 0.0324 | 1.62 | 150000 | 0.0225 | 0.9211 |
| 0.0251 | 1.63 | 151000 | 0.0223 | 0.9212 |
| 0.0278 | 1.64 | 152000 | 0.0223 | 0.9211 |
| 0.0284 | 1.65 | 153000 | 0.0224 | 0.9210 |
| 0.0254 | 1.67 | 154000 | 0.0220 | 0.9211 |
| 0.028 | 1.68 | 155000 | 0.0221 | 0.9210 |
| 0.0247 | 1.69 | 156000 | 0.0222 | 0.9211 |
| 0.0295 | 1.7 | 157000 | 0.0218 | 0.9211 |
| 0.0283 | 1.71 | 158000 | 0.0216 | 0.9211 |
| 0.0245 | 1.72 | 159000 | 0.0218 | 0.9211 |
| 0.0249 | 1.73 | 160000 | 0.0216 | 0.9211 |
| 0.0264 | 1.74 | 161000 | 0.0215 | 0.9211 |
| 0.0264 | 1.75 | 162000 | 0.0213 | 0.9211 |
| 0.0306 | 1.76 | 163000 | 0.0212 | 0.9211 |
| 0.0242 | 1.77 | 164000 | 0.0212 | 0.9212 |
| 0.0247 | 1.78 | 165000 | 0.0211 | 0.9211 |
| 0.0227 | 1.8 | 166000 | 0.0211 | 0.9210 |
| 0.0252 | 1.81 | 167000 | 0.0211 | 0.9211 |
| 0.0269 | 1.82 | 168000 | 0.0208 | 0.9211 |
| 0.0256 | 1.83 | 169000 | 0.0209 | 0.9211 |
| 0.0234 | 1.84 | 170000 | 0.0207 | 0.9211 |
| 0.0258 | 1.85 | 171000 | 0.0207 | 0.9211 |
| 0.0282 | 1.86 | 172000 | 0.0205 | 0.9210 |
| 0.0282 | 1.87 | 173000 | 0.0206 | 0.9210 |
| 0.0234 | 1.88 | 174000 | 0.0205 | 0.9211 |
| 0.0222 | 1.89 | 175000 | 0.0204 | 0.9211 |
| 0.0237 | 1.9 | 176000 | 0.0203 | 0.9211 |
| 0.0299 | 1.91 | 177000 | 0.0203 | 0.9211 |
| 0.0246 | 1.93 | 178000 | 0.0203 | 0.9211 |
| 0.0227 | 1.94 | 179000 | 0.0204 | 0.9211 |
| 0.0253 | 1.95 | 180000 | 0.0202 | 0.9211 |
| 0.0197 | 1.96 | 181000 | 0.0202 | 0.9211 |
| 0.0231 | 1.97 | 182000 | 0.0200 | 0.9211 |
| 0.0244 | 1.98 | 183000 | 0.0201 | 0.9211 |
| 0.0259 | 1.99 | 184000 | 0.0200 | 0.9211 |

### Framework versions

- Transformers 4.27.3
- Pytorch 2.0.0+cu117
- Datasets 2.10.1
- Tokenizers 0.13.2