scenario-KD-PR-MSV-EN-EN-D2_data-en-massive_all_1_166

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-en-massive_all_1_1 on the massive dataset. It achieves the following results on the evaluation set (these match the final logged step, 10800, in the training results):

Loss: 3.9034
Accuracy: 0.3152
F1: 0.2967
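
The card does not say how F1 is averaged. As a minimal sketch, assuming macro averaging over class labels (an assumption, not something the card states), accuracy and F1 could be computed from gold and predicted labels as follows. The function names and the toy labels are hypothetical and only illustrate the formulas.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging mode)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # gold class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, invented for illustration only.
gold = ["a", "a", "b", "c"]
pred = ["a", "b", "b", "c"]
print(accuracy(gold, pred), macro_f1(gold, pred))
```

Under macro averaging every class counts equally, so rare classes that are predicted poorly pull F1 down more than they pull accuracy down.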

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 66
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
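
To make the schedule concrete, here is a minimal sketch of what a linear learning-rate schedule implies for these settings. It assumes no warmup (the card lists none) and 10,800 total optimizer steps, taken from the final row of the training results. The dictionary keys simply mirror the card's labels, and `linear_lr` is a hypothetical helper, not the training script's own code.

```python
# Hyperparameters exactly as listed on the card.
HPARAMS = {
    "learning_rate": 5e-05,
    "train_batch_size": 32,
    "eval_batch_size": 32,
    "seed": 66,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 30,
}

# Assumptions: total steps read off the last row of the training log,
# and zero warmup steps because the card does not mention warmup.
TOTAL_STEPS = 10_800
WARMUP_STEPS = 0


def linear_lr(step, base_lr=HPARAMS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given step under linear warmup then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 10,800 steps over 30 epochs is 360 steps per epoch at batch size 32.
steps_per_epoch = TOTAL_STEPS // HPARAMS["num_epochs"]
for step in (0, 5_400, 10_800):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate falls to half its initial value at step 5,400 (epoch 15) and reaches zero at the last step.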

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
No log	0.28	100	4.2349	0.0671	0.0036
No log	0.56	200	4.0584	0.1478	0.0387
No log	0.83	300	3.8868	0.2207	0.1016
No log	1.11	400	3.8941	0.2315	0.1309
3.3584	1.39	500	3.7344	0.2735	0.1722
3.3584	1.67	600	3.8172	0.2668	0.1933
3.3584	1.94	700	3.7445	0.2709	0.2084
3.3584	2.22	800	3.6741	0.3013	0.2250
3.3584	2.5	900	3.7111	0.2996	0.2395
1.98	2.78	1000	3.6822	0.2994	0.2445
1.98	3.06	1100	3.7436	0.2984	0.2477
1.98	3.33	1200	3.7195	0.3040	0.2515
1.98	3.61	1300	3.8800	0.2706	0.2273
1.98	3.89	1400	3.7345	0.3057	0.2428
1.5602	4.17	1500	3.8605	0.3010	0.2528
1.5602	4.44	1600	3.7124	0.3140	0.2674
1.5602	4.72	1700	3.7400	0.3041	0.2521
1.5602	5.0	1800	3.9425	0.2957	0.2605
1.5602	5.28	1900	3.7719	0.3133	0.2768
1.3533	5.56	2000	3.8076	0.3100	0.2835
1.3533	5.83	2100	3.6673	0.3258	0.2794
1.3533	6.11	2200	3.8029	0.3080	0.2641
1.3533	6.39	2300	3.7847	0.3079	0.2601
1.3533	6.67	2400	3.8791	0.2994	0.2807
1.2425	6.94	2500	3.7637	0.3122	0.2892
1.2425	7.22	2600	3.8474	0.3155	0.2742
1.2425	7.5	2700	3.8424	0.3131	0.2776
1.2425	7.78	2800	3.8016	0.3113	0.2648
1.2425	8.06	2900	3.8632	0.2981	0.2643
1.1513	8.33	3000	3.8469	0.3088	0.2705
1.1513	8.61	3100	3.9476	0.2929	0.2589
1.1513	8.89	3200	3.8249	0.3178	0.2684
1.1513	9.17	3300	3.7724	0.3166	0.2801
1.1513	9.44	3400	3.7976	0.3215	0.2793
1.0936	9.72	3500	3.6198	0.3498	0.3023
1.0936	10.0	3600	3.8257	0.3075	0.2775
1.0936	10.28	3700	3.7182	0.3224	0.2892
1.0936	10.56	3800	3.8149	0.3149	0.2797
1.0936	10.83	3900	3.7853	0.3276	0.2893
1.0476	11.11	4000	3.8488	0.3177	0.2833
1.0476	11.39	4100	4.0615	0.2979	0.2812
1.0476	11.67	4200	3.8836	0.3178	0.2891
1.0476	11.94	4300	4.1136	0.2832	0.2705
1.0476	12.22	4400	3.8156	0.3144	0.2999
1.0048	12.5	4500	3.9173	0.3117	0.2946
1.0048	12.78	4600	3.7431	0.3293	0.2965
1.0048	13.06	4700	3.7538	0.3245	0.2914
1.0048	13.33	4800	3.9135	0.2957	0.2827
1.0048	13.61	4900	3.8702	0.3133	0.2926
0.9752	13.89	5000	3.8238	0.3131	0.2861
0.9752	14.17	5100	3.9863	0.2986	0.2860
0.9752	14.44	5200	3.9071	0.3068	0.2891
0.9752	14.72	5300	4.1397	0.2902	0.2831
0.9752	15.0	5400	4.0661	0.2916	0.2760
0.9544	15.28	5500	3.9804	0.3059	0.2848
0.9544	15.56	5600	4.1628	0.2815	0.2757
0.9544	15.83	5700	3.8083	0.3233	0.2940
0.9544	16.11	5800	3.8357	0.3144	0.2821
0.9544	16.39	5900	4.0037	0.2987	0.2914
0.935	16.67	6000	3.8943	0.3073	0.2803
0.935	16.94	6100	3.8387	0.3171	0.2978
0.935	17.22	6200	3.9244	0.3046	0.2799
0.935	17.5	6300	3.9478	0.3065	0.2900
0.935	17.78	6400	4.0418	0.3036	0.2754
0.9186	18.06	6500	4.1112	0.2862	0.2773
0.9186	18.33	6600	4.1101	0.2907	0.2750
0.9186	18.61	6700	4.0951	0.2908	0.2763
0.9186	18.89	6800	3.9274	0.3049	0.2824
0.9186	19.17	6900	3.9502	0.2988	0.2843
0.9081	19.44	7000	4.0642	0.2935	0.2879
0.9081	19.72	7100	3.8820	0.3102	0.2914
0.9081	20.0	7200	4.0206	0.2987	0.2893
0.9081	20.28	7300	3.9475	0.3105	0.2949
0.9081	20.56	7400	3.9688	0.3088	0.2868
0.9007	20.83	7500	3.9359	0.3088	0.2867
0.9007	21.11	7600	4.0488	0.3001	0.2862
0.9007	21.39	7700	3.8327	0.3246	0.2988
0.9007	21.67	7800	3.9259	0.3146	0.2978
0.9007	21.94	7900	3.8813	0.3191	0.2962
0.8902	22.22	8000	3.9249	0.3129	0.2953
0.8902	22.5	8100	3.9929	0.3066	0.2949
0.8902	22.78	8200	3.9557	0.3118	0.2966
0.8902	23.06	8300	4.0791	0.2933	0.2811
0.8902	23.33	8400	3.8798	0.3173	0.2949
0.8812	23.61	8500	4.0575	0.2969	0.2832
0.8812	23.89	8600	3.9538	0.3071	0.2921
0.8812	24.17	8700	4.1906	0.2817	0.2775
0.8812	24.44	8800	3.9515	0.3113	0.2941
0.8812	24.72	8900	3.8893	0.3190	0.2955
0.8781	25.0	9000	3.9491	0.3094	0.2920
0.8781	25.28	9100	3.8647	0.3171	0.2928
0.8781	25.56	9200	3.8908	0.3146	0.2994
0.8781	25.83	9300	3.9586	0.3088	0.2958
0.8781	26.11	9400	3.9277	0.3104	0.2980
0.8719	26.39	9500	3.9350	0.3097	0.2946
0.8719	26.67	9600	4.0499	0.2948	0.2890
0.8719	26.94	9700	3.9529	0.3109	0.2917
0.8719	27.22	9800	3.9768	0.3073	0.2896
0.8719	27.5	9900	3.8371	0.3239	0.3011
0.871	27.78	10000	3.9067	0.3131	0.2976
0.871	28.06	10100	3.8732	0.3183	0.2971
0.871	28.33	10200	3.9588	0.3070	0.2915
0.871	28.61	10300	3.9081	0.3143	0.2988
0.871	28.89	10400	3.8574	0.3199	0.3004
0.8673	29.17	10500	3.9120	0.3131	0.2961
0.8673	29.44	10600	3.8986	0.3147	0.2972
0.8673	29.72	10700	3.9068	0.3149	0.2967
0.8673	30.0	10800	3.9034	0.3152	0.2967
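
The headline numbers come from the last row (step 10800), but the table's strongest evaluation row is step 3500, which has the highest accuracy (0.3498), the highest F1 (0.3023), and the lowest validation loss (3.6198). The sketch below shows one way to find such a row from a tab-separated log. It uses a five-row excerpt copied from the table above, and `best_row` is a hypothetical helper.

```python
import csv
import io

# Excerpt of the training log above (tab-separated); not the full table.
LOG_EXCERPT = "\n".join([
    "Epoch\tStep\tValidation Loss\tAccuracy\tF1",
    "5.83\t2100\t3.6673\t0.3258\t0.2794",
    "9.72\t3500\t3.6198\t0.3498\t0.3023",
    "27.5\t9900\t3.8371\t0.3239\t0.3011",
    "28.89\t10400\t3.8574\t0.3199\t0.3004",
    "30.0\t10800\t3.9034\t0.3152\t0.2967",
])


def best_row(tsv_text, metric, higher_is_better=True):
    """Return the log row with the best value for the given metric column."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: float(row[metric]))


best_acc = best_row(LOG_EXCERPT, "Accuracy")
best_f1 = best_row(LOG_EXCERPT, "F1")
best_loss = best_row(LOG_EXCERPT, "Validation Loss", higher_is_better=False)
print(best_acc["Step"], best_f1["Step"], best_loss["Step"])
```

The card does not say whether a checkpoint from step 3500 was saved, so this comparison describes the log only, not the published weights.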

Framework versions

Transformers 4.33.3
Pytorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3
