---
license: apache-2.0
base_model: EleutherAI/pythia-70m-deduped
tags:
- generated_from_trainer
model-index:
- name: chessdevilai
  results: []
---

# chessdevilai

This model is a fine-tuned version of [EleutherAI/pythia-70m-deduped](https://huggingface.co/EleutherAI/pythia-70m-deduped) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.7609

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 1

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.1912        | 0.0100 | 62   | 1.2654          |
| 1.1714        | 0.0200 | 124  | 1.1780          |
| 1.0771        | 0.0301 | 186  | 1.1419          |
| 1.0829        | 0.0401 | 248  | 1.1046          |
| 1.0113        | 0.0501 | 310  | 1.0850          |
| 1.152         | 0.0601 | 372  | 1.0701          |
| 1.0895        | 0.0701 | 434  | 1.0544          |
| 0.9123        | 0.0802 | 496  | 1.0484          |
| 1.0489        | 0.0902 | 558  | 1.0214          |
| 1.0312        | 0.1002 | 620  | 1.0252          |
| 0.9756        | 0.1102 | 682  | 1.0020          |
| 1.0125        | 0.1202 | 744  | 0.9940          |
| 1.0581        | 0.1303 | 806  | 0.9862          |
| 1.0726        | 0.1403 | 868  | 0.9809          |
| 0.9963        | 0.1503 | 930  | 0.9830          |
| 0.9309        | 0.1603 | 992  | 0.9653          |
| 0.8858        | 0.1703 | 1054 | 0.9538          |
| 1.1137        | 0.1803 | 1116 | 0.9472          |
| 0.9024        | 0.1904 | 1178 | 0.9411          |
| 0.9812        | 0.2004 | 1240 | 0.9396          |
| 0.9916        | 0.2104 | 1302 | 0.9254          |
| 0.9509        | 0.2204 | 1364 | 0.9334          |
| 0.8848        | 0.2304 | 1426 | 0.9439          |
| 0.8302        | 0.2405 | 1488 | 0.9175          |
| 1.0111        | 0.2505 | 1550 | 0.9158          |
| 1.0273        | 0.2605 | 1612 | 0.9182          |
| 0.8968        | 0.2705 | 1674 | 0.9116          |
| 0.8892        | 0.2805 | 1736 | 0.9098          |
| 0.7539        | 0.2906 | 1798 | 0.8896          |
| 0.811         | 0.3006 | 1860 | 0.8968          |
| 0.928         | 0.3106 | 1922 | 0.8875          |
| 0.8163        | 0.3206 | 1984 | 0.8821          |
| 0.9202        | 0.3306 | 2046 | 0.8820          |
| 1.0208        | 0.3407 | 2108 | 0.8811          |
| 0.8297        | 0.3507 | 2170 | 0.8823          |
| 0.8213        | 0.3607 | 2232 | 0.8736          |
| 0.8324        | 0.3707 | 2294 | 0.8698          |
| 0.7721        | 0.3807 | 2356 | 0.8735          |
| 0.9504        | 0.3908 | 2418 | 0.8705          |
| 0.858         | 0.4008 | 2480 | 0.8620          |
| 0.8791        | 0.4108 | 2542 | 0.8540          |
| 0.8411        | 0.4208 | 2604 | 0.8606          |
| 0.8845        | 0.4308 | 2666 | 0.8496          |
| 0.7752        | 0.4409 | 2728 | 0.8462          |
| 0.8598        | 0.4509 | 2790 | 0.8481          |
| 0.7935        | 0.4609 | 2852 | 0.8412          |
| 0.7352        | 0.4709 | 2914 | 0.8392          |
| 0.8153        | 0.4809 | 2976 | 0.8426          |
| 0.7371        | 0.4910 | 3038 | 0.8332          |
| 0.7136        | 0.5010 | 3100 | 0.8300          |
| 0.9777        | 0.5110 | 3162 | 0.8294          |
| 0.8336        | 0.5210 | 3224 | 0.8306          |
| 0.7546        | 0.5310 | 3286 | 0.8234          |
| 0.8436        | 0.5410 | 3348 | 0.8237          |
| 0.9316        | 0.5511 | 3410 | 0.8224          |
| 0.6996        | 0.5611 | 3472 | 0.8191          |
| 0.7417        | 0.5711 | 3534 | 0.8146          |
| 0.8528        | 0.5811 | 3596 | 0.8110          |
| 0.6861        | 0.5911 | 3658 | 0.8095          |
| 0.8401        | 0.6012 | 3720 | 0.8096          |
| 0.7056        | 0.6112 | 3782 | 0.8080          |
| 0.8643        | 0.6212 | 3844 | 0.8004          |
| 0.7575        | 0.6312 | 3906 | 0.8018          |
| 0.8133        | 0.6412 | 3968 | 0.8008          |
| 0.8221        | 0.6513 | 4030 | 0.7940          |
| 0.8004        | 0.6613 | 4092 | 0.7948          |
| 0.7002        | 0.6713 | 4154 | 0.7984          |
| 0.8425        | 0.6813 | 4216 | 0.7892          |
| 0.6777        | 0.6913 | 4278 | 0.7876          |
| 0.9178        | 0.7014 | 4340 | 0.7865          |
| 0.787         | 0.7114 | 4402 | 0.7844          |
| 0.6979        | 0.7214 | 4464 | 0.7829          |
| 0.7954        | 0.7314 | 4526 | 0.7825          |
| 0.7937        | 0.7414 | 4588 | 0.7792          |
| 0.7849        | 0.7515 | 4650 | 0.7790          |
| 0.7108        | 0.7615 | 4712 | 0.7782          |
| 0.831         | 0.7715 | 4774 | 0.7768          |
| 0.8242        | 0.7815 | 4836 | 0.7741          |
| 0.7472        | 0.7915 | 4898 | 0.7731          |
| 0.8171        | 0.8016 | 4960 | 0.7732          |
| 0.7857        | 0.8116 | 5022 | 0.7702          |
| 0.7925        | 0.8216 | 5084 | 0.7707          |
| 0.7134        | 0.8316 | 5146 | 0.7680          |
| 0.8401        | 0.8416 | 5208 | 0.7686          |
| 0.6919        | 0.8516 | 5270 | 0.7679          |
| 0.7689        | 0.8617 | 5332 | 0.7658          |
| 0.7899        | 0.8717 | 5394 | 0.7645          |
| 0.8457        | 0.8817 | 5456 | 0.7639          |
| 0.7738        | 0.8917 | 5518 | 0.7635          |
| 0.7943        | 0.9017 | 5580 | 0.7628          |
| 0.756         | 0.9118 | 5642 | 0.7625          |
| 0.8021        | 0.9218 | 5704 | 0.7619          |
| 0.7325        | 0.9318 | 5766 | 0.7615          |
| 0.7312        | 0.9418 | 5828 | 0.7613          |
| 0.8255        | 0.9518 | 5890 | 0.7613          |
| 0.794         | 0.9619 | 5952 | 0.7610          |
| 0.7392        | 0.9719 | 6014 | 0.7609          |
| 0.841         | 0.9819 | 6076 | 0.7609          |
| 0.7018        | 0.9919 | 6138 | 0.7609          |

### Framework versions

- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.19.2
- Tokenizers 0.19.1
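
### Interpreting the evaluation loss

The reported evaluation loss can be easier to read as a perplexity. The sketch below assumes the loss is a mean per-token cross-entropy in natural-log units, which is typical for causal language model fine-tuning but is not stated in this card. The helper names (`perplexity`, `bits_per_token`) are illustrative and not part of any library.

```python
import math

# Validation losses copied from the training results table in this card.
FIRST_EVAL_LOSS = 1.2654   # step 62
FINAL_EVAL_LOSS = 0.7609   # step 6138

def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(loss_nats)

def bits_per_token(loss_nats: float) -> float:
    """Convert the same loss from nats to bits per token."""
    return loss_nats / math.log(2)

first_ppl = perplexity(FIRST_EVAL_LOSS)
final_ppl = perplexity(FINAL_EVAL_LOSS)
final_bits = bits_per_token(FINAL_EVAL_LOSS)

print(f"perplexity at first eval: {first_ppl:.3f}")
print(f"perplexity at final eval: {final_ppl:.3f}")
print(f"bits per token at final eval: {final_bits:.3f}")
```

Under that assumption, the final loss of 0.7609 corresponds to a perplexity of roughly 2.14. Because the tokenizer and dataset are not documented here, this figure is only comparable across checkpoints of this same run.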