scenario-KD-PO-MSV-EN-EN-D2_data-en-massive_all_1_166

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-en-massive_all_1_1 on the massive dataset. At the end of training (epoch 30, step 10800), it achieves the following results on the evaluation set:

Loss: 18.6035
Accuracy: 0.2922
F1: 0.2796

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 66
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
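
As a rough sketch of what the linear scheduler implies for these settings, the snippet below collects the listed hyperparameters in a dictionary and computes the learning rate at a given step. It assumes zero warmup steps (the card lists none) and takes the total of 10800 optimizer steps from the last row of the training results. The names `HPARAMS`, `TOTAL_STEPS`, and `linear_lr` are illustrative and are not taken from the original training script.

```python
# Hyperparameters as listed in this card.
HPARAMS = {
    "learning_rate": 5e-05,
    "train_batch_size": 32,
    "eval_batch_size": 32,
    "seed": 66,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 30,
}

# Assumption: total optimizer steps, read from the final row of the results table.
TOTAL_STEPS = 10800


def linear_lr(step: int,
              base_lr: float = HPARAMS["learning_rate"],
              total_steps: int = TOTAL_STEPS) -> float:
    """Linearly decay the learning rate from base_lr to 0 (assumes zero warmup steps)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Optimizer steps per epoch implied by the table: 10800 / 30.
steps_per_epoch = TOTAL_STEPS // HPARAMS["num_epochs"]

if __name__ == "__main__":
    for step in (0, 5400, 10800):
        print(step, linear_lr(step))
```

Under these assumptions there are 360 steps per epoch, which at a batch size of 32 corresponds to at most 11520 training examples per epoch, provided training ran on a single device without gradient accumulation.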

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
No log	0.28	100	16.0585	0.0733	0.0068
No log	0.56	200	14.9516	0.1179	0.0282
No log	0.83	300	15.2519	0.1360	0.0631
No log	1.11	400	14.4169	0.1781	0.0989
11.7852	1.39	500	13.7629	0.2084	0.1253
11.7852	1.67	600	14.0237	0.2160	0.1496
11.7852	1.94	700	13.4321	0.2267	0.1644
11.7852	2.22	800	13.4657	0.2486	0.1846
11.7852	2.5	900	13.5450	0.2570	0.1981
5.8982	2.78	1000	13.3906	0.2568	0.2051
5.8982	3.06	1100	13.6663	0.2642	0.2024
5.8982	3.33	1200	14.2607	0.2513	0.1952
5.8982	3.61	1300	14.3919	0.2542	0.2029
5.8982	3.89	1400	14.5446	0.2510	0.2007
4.048	4.17	1500	14.1289	0.2602	0.2153
4.048	4.44	1600	13.3403	0.2956	0.2267
4.048	4.72	1700	14.8539	0.2622	0.2131
4.048	5.0	1800	13.5653	0.2939	0.2389
4.048	5.28	1900	14.6566	0.2792	0.2383
3.0561	5.56	2000	14.7909	0.2676	0.2348
3.0561	5.83	2100	13.9744	0.3007	0.2483
3.0561	6.11	2200	15.5685	0.2777	0.2348
3.0561	6.39	2300	15.5535	0.2727	0.2297
3.0561	6.67	2400	15.1022	0.2858	0.2416
2.3857	6.94	2500	15.8558	0.2734	0.2361
2.3857	7.22	2600	16.0300	0.2645	0.2272
2.3857	7.5	2700	16.0818	0.2734	0.2460
2.3857	7.78	2800	16.2737	0.2782	0.2445
2.3857	8.06	2900	16.3877	0.2587	0.2262
1.9693	8.33	3000	17.3819	0.2583	0.2331
1.9693	8.61	3100	17.1412	0.2636	0.2366
1.9693	8.89	3200	17.3173	0.2611	0.2337
1.9693	9.17	3300	16.1257	0.2785	0.2468
1.9693	9.44	3400	17.4479	0.2671	0.2453
1.6624	9.72	3500	15.9842	0.2959	0.2595
1.6624	10.0	3600	16.6481	0.2764	0.2454
1.6624	10.28	3700	16.0613	0.2952	0.2496
1.6624	10.56	3800	17.3130	0.2796	0.2483
1.6624	10.83	3900	17.9793	0.2768	0.2415
1.4248	11.11	4000	17.9004	0.2768	0.2508
1.4248	11.39	4100	17.6532	0.2776	0.2549
1.4248	11.67	4200	17.9802	0.2763	0.2512
1.4248	11.94	4300	19.2692	0.2543	0.2468
1.4248	12.22	4400	18.8586	0.2693	0.2551
1.2048	12.5	4500	18.2546	0.2746	0.2508
1.2048	12.78	4600	18.1165	0.2729	0.2526
1.2048	13.06	4700	18.8671	0.2615	0.2417
1.2048	13.33	4800	18.8131	0.2630	0.2466
1.2048	13.61	4900	18.3799	0.2771	0.2568
1.0702	13.89	5000	18.5563	0.2691	0.2416
1.0702	14.17	5100	19.0471	0.2621	0.2449
1.0702	14.44	5200	18.6233	0.2721	0.2451
1.0702	14.72	5300	18.8386	0.2776	0.2590
1.0702	15.0	5400	19.5330	0.2655	0.2479
0.9462	15.28	5500	19.8716	0.2607	0.2499
0.9462	15.56	5600	18.5496	0.2770	0.2568
0.9462	15.83	5700	17.8301	0.2950	0.2697
0.9462	16.11	5800	17.7789	0.2951	0.2724
0.9462	16.39	5900	19.3109	0.2768	0.2673
0.8576	16.67	6000	18.1516	0.2926	0.2660
0.8576	16.94	6100	19.5121	0.2744	0.2574
0.8576	17.22	6200	19.5653	0.2763	0.2625
0.8576	17.5	6300	18.3909	0.2851	0.2640
0.8576	17.78	6400	19.0072	0.2741	0.2467
0.7527	18.06	6500	18.6327	0.2833	0.2648
0.7527	18.33	6600	18.9928	0.2803	0.2563
0.7527	18.61	6700	19.7251	0.2744	0.2603
0.7527	18.89	6800	19.1755	0.2745	0.2576
0.7527	19.17	6900	18.4740	0.2883	0.2686
0.7082	19.44	7000	18.9633	0.2866	0.2668
0.7082	19.72	7100	19.5535	0.2751	0.2648
0.7082	20.0	7200	19.1204	0.2826	0.2612
0.7082	20.28	7300	19.4658	0.2786	0.2618
0.7082	20.56	7400	18.3475	0.2930	0.2716
0.6603	20.83	7500	19.6894	0.2720	0.2568
0.6603	21.11	7600	18.3256	0.2929	0.2727
0.6603	21.39	7700	19.0269	0.2809	0.2712
0.6603	21.67	7800	18.9538	0.2834	0.2658
0.6603	21.94	7900	18.8878	0.2904	0.2742
0.6171	22.22	8000	18.9117	0.2887	0.2728
0.6171	22.5	8100	19.0627	0.2853	0.2737
0.6171	22.78	8200	19.1497	0.2822	0.2722
0.6171	23.06	8300	19.3517	0.2764	0.2673
0.6171	23.33	8400	18.9524	0.2836	0.2706
0.5721	23.61	8500	18.4516	0.2907	0.2749
0.5721	23.89	8600	18.7686	0.2881	0.2749
0.5721	24.17	8700	19.0653	0.2847	0.2697
0.5721	24.44	8800	20.1017	0.2721	0.2664
0.5721	24.72	8900	18.7587	0.2910	0.2743
0.5466	25.0	9000	19.4485	0.2827	0.2741
0.5466	25.28	9100	19.2920	0.2831	0.2677
0.5466	25.56	9200	19.1334	0.2872	0.2735
0.5466	25.83	9300	18.9784	0.2859	0.2694
0.5466	26.11	9400	18.7701	0.2914	0.2763
0.5168	26.39	9500	19.3216	0.2767	0.2665
0.5168	26.67	9600	19.3074	0.2800	0.2745
0.5168	26.94	9700	18.6569	0.2889	0.2722
0.5168	27.22	9800	19.3113	0.2800	0.2703
0.5168	27.5	9900	18.8369	0.2900	0.2774
0.5197	27.78	10000	18.7418	0.2894	0.2771
0.5197	28.06	10100	18.8462	0.2885	0.2754
0.5197	28.33	10200	18.6737	0.2913	0.2785
0.5197	28.61	10300	18.8000	0.2880	0.2755
0.5197	28.89	10400	18.5512	0.2936	0.2793
0.5027	29.17	10500	18.5273	0.2943	0.2809
0.5027	29.44	10600	18.5875	0.2920	0.2797
0.5027	29.72	10700	18.6780	0.2916	0.2807
0.5027	30.0	10800	18.6035	0.2922	0.2796
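
The table shows validation loss reaching its minimum early while training loss keeps falling, so the final checkpoint is not necessarily the best one by every metric. The sketch below picks the best evaluation step per metric from a hand-copied excerpt of the rows above. `ROWS`, `COLUMNS`, and `best_step` are illustrative names, and the excerpt was chosen to include the extreme value of each column.

```python
# Excerpt of the results table: (step, validation_loss, accuracy, f1).
ROWS = [
    (100, 16.0585, 0.0733, 0.0068),
    (1000, 13.3906, 0.2568, 0.2051),
    (1600, 13.3403, 0.2956, 0.2267),
    (2100, 13.9744, 0.3007, 0.2483),
    (5400, 19.5330, 0.2655, 0.2479),
    (10500, 18.5273, 0.2943, 0.2809),
    (10800, 18.6035, 0.2922, 0.2796),
]

# Position of each metric inside a row tuple.
COLUMNS = {"validation_loss": 1, "accuracy": 2, "f1": 3}


def best_step(metric: str, higher_is_better: bool = True) -> int:
    """Return the step whose value for `metric` is best among ROWS."""
    idx = COLUMNS[metric]
    pick = max if higher_is_better else min
    return pick(ROWS, key=lambda row: row[idx])[0]


if __name__ == "__main__":
    print("lowest validation loss:", best_step("validation_loss", higher_is_better=False))
    print("highest accuracy:", best_step("accuracy"))
    print("highest F1:", best_step("f1"))
```

Over the full table, the lowest validation loss is at step 1600, the highest accuracy at step 2100, and the highest F1 at step 10500, whereas the headline results are those of the final step, 10800.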

Framework versions

Transformers 4.33.3
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3
