anton-l HF staff committed
Commit 3144ae1
1 Parent(s): 7a311db

Update README.md

Files changed (1): README.md (+49 -52)
README.md CHANGED
@@ -8,37 +8,57 @@ metrics:
  - recall
  - accuracy
  model-index:
- - name: stack-edu-scorer
    results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # stack-edu-scorer

- This model is a fine-tuned version of [Snowflake/snowflake-arctic-embed-m](https://huggingface.co/Snowflake/snowflake-arctic-embed-m) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.3426
- - Precision: 0.5188
- - Recall: 0.3971
- - F1 Macro: 0.4258
- - Accuracy: 0.6350

- ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
@@ -52,42 +72,19 @@ The following hyperparameters were used during training:
  ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 Macro | Accuracy |
- |:-------------:|:-------:|:-----:|:---------------:|:---------:|:------:|:--------:|:--------:|
- | 0.3973 | 0.5787 | 1000 | 0.3904 | 0.4701 | 0.3433 | 0.3701 | 0.5885 |
- | 0.3848 | 1.1574 | 2000 | 0.3803 | 0.5107 | 0.3574 | 0.3863 | 0.5974 |
- | 0.3667 | 1.7361 | 3000 | 0.3715 | 0.6471 | 0.4478 | 0.4879 | 0.6103 |
- | 0.3727 | 2.3148 | 4000 | 0.3655 | 0.6140 | 0.4375 | 0.4715 | 0.6121 |
- | 0.3639 | 2.8935 | 5000 | 0.3617 | 0.6234 | 0.4519 | 0.4879 | 0.6176 |
- | 0.3684 | 3.4722 | 6000 | 0.3626 | 0.6424 | 0.4632 | 0.5020 | 0.6211 |
- | 0.3557 | 4.0509 | 7000 | 0.3589 | 0.5519 | 0.3739 | 0.4032 | 0.6175 |
- | 0.3513 | 4.6296 | 8000 | 0.3650 | 0.6328 | 0.4671 | 0.5010 | 0.6241 |
- | 0.3505 | 5.2083 | 9000 | 0.3535 | 0.5320 | 0.3850 | 0.4129 | 0.6259 |
- | 0.3549 | 5.7870 | 10000 | 0.3526 | 0.6358 | 0.4588 | 0.4949 | 0.6248 |
- | 0.3465 | 6.3657 | 11000 | 0.3580 | 0.5204 | 0.3712 | 0.3970 | 0.6166 |
- | 0.3468 | 6.9444 | 12000 | 0.3498 | 0.5266 | 0.3936 | 0.4235 | 0.6293 |
- | 0.3463 | 7.5231 | 13000 | 0.3497 | 0.6837 | 0.4661 | 0.4999 | 0.6300 |
- | 0.3404 | 8.1019 | 14000 | 0.3557 | 0.6169 | 0.4940 | 0.5285 | 0.6307 |
- | 0.3381 | 8.6806 | 15000 | 0.3493 | 0.5124 | 0.3871 | 0.4135 | 0.6290 |
- | 0.342 | 9.2593 | 16000 | 0.3482 | 0.5265 | 0.3959 | 0.4247 | 0.6337 |
- | 0.3397 | 9.8380 | 17000 | 0.3477 | 0.5210 | 0.3919 | 0.4191 | 0.6325 |
- | 0.3407 | 10.4167 | 18000 | 0.3465 | 0.5380 | 0.3895 | 0.4202 | 0.6297 |
- | 0.3303 | 10.9954 | 19000 | 0.3471 | 0.5273 | 0.3952 | 0.4234 | 0.6355 |
- | 0.3296 | 11.5741 | 20000 | 0.3447 | 0.5428 | 0.3891 | 0.4173 | 0.6313 |
- | 0.3299 | 12.1528 | 21000 | 0.3451 | 0.5173 | 0.3964 | 0.4248 | 0.6347 |
- | 0.3316 | 12.7315 | 22000 | 0.3448 | 0.6321 | 0.4809 | 0.5167 | 0.6350 |
- | 0.3289 | 13.3102 | 23000 | 0.3446 | 0.5100 | 0.3969 | 0.4242 | 0.6358 |
- | 0.3278 | 13.8889 | 24000 | 0.3445 | 0.5451 | 0.3918 | 0.4223 | 0.6327 |
- | 0.3249 | 14.4676 | 25000 | 0.3440 | 0.5282 | 0.3915 | 0.4194 | 0.6343 |
- | 0.328 | 15.0463 | 26000 | 0.3438 | 0.5670 | 0.3880 | 0.4183 | 0.6316 |
- | 0.3263 | 15.625 | 27000 | 0.3448 | 0.6290 | 0.4828 | 0.5191 | 0.6363 |
- | 0.3243 | 16.2037 | 28000 | 0.3437 | 0.5534 | 0.3950 | 0.4252 | 0.6356 |
- | 0.3265 | 16.7824 | 29000 | 0.3435 | 0.5432 | 0.3926 | 0.4217 | 0.6328 |
- | 0.3193 | 17.3611 | 30000 | 0.3432 | 0.5231 | 0.3962 | 0.4238 | 0.6348 |
- | 0.3261 | 17.9398 | 31000 | 0.3433 | 0.5517 | 0.3933 | 0.4235 | 0.6326 |
- | 0.317 | 18.5185 | 32000 | 0.3431 | 0.5527 | 0.3929 | 0.4220 | 0.6334 |
- | 0.3222 | 19.0972 | 33000 | 0.3429 | 0.5132 | 0.3976 | 0.4259 | 0.6357 |
- | 0.3223 | 19.6759 | 34000 | 0.3426 | 0.5188 | 0.3971 | 0.4258 | 0.6350 |

  ### Framework versions
 
  - recall
  - accuracy
  model-index:
+ - name: python-edu-scorer
    results: []
  ---

+ # Python-Edu Scorer

+ This model is a fine-tuned version of [Snowflake/snowflake-arctic-embed-m](https://huggingface.co/Snowflake/snowflake-arctic-embed-m) on a dataset of Python files labeled by Llama3 for educational value.
+ We used this classifier to build the [Python-Edu](https://huggingface.co/datasets/HuggingFaceTB/cosmopedia-v2) dataset.

+ ### How to use in transformers
+ To load the Python-Edu classifier, use the following code:

+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+ tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/python-edu-scorer")
+ model = AutoModelForSequenceClassification.from_pretrained("HuggingFaceTB/python-edu-scorer")
+
+ text = "This is a test sentence."
+ inputs = tokenizer(text, return_tensors="pt", padding="longest", truncation=True)
+ outputs = model(**inputs)
+ logits = outputs.logits.squeeze(-1).float().detach().numpy()
+ score = logits.item()
+ result = {
+     "text": text,
+     "score": score,
+     "int_score": int(round(max(0, min(score, 5)))),
+ }
+
+ print(result)
+ # {'text': 'This is a test sentence.', 'score': 0.07964489609003067, 'int_score': 0}
+ ```
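The `int_score` field above clamps the raw regression output to the [0, 5] range before rounding. A standalone sketch of just that mapping (the sample raw scores below are made up for illustration):

```python
def to_int_score(score: float) -> int:
    # Clamp the raw regression output to [0, 5], then round to the nearest bucket.
    return int(round(max(0, min(score, 5))))

for raw in (-0.3, 0.0796, 2.49, 2.51, 5.7):
    print(raw, "->", to_int_score(raw))
# -0.3 -> 0, 0.0796 -> 0, 2.49 -> 2, 2.51 -> 3, 5.7 -> 5
```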

  ## Intended uses & limitations

+ While the Python-Edu classifier performs well at distinguishing high-quality Python code, it has some limitations:

+ - Scope: The model's performance might change on other datasets, in particular on out-of-distribution samples. It is also focused on educational content relevant to beginners and may not perform as well on content intended for higher education or specialized domains.
+ - Bias: The model's performance depends on the quality and representativeness of the training data and of the LLM used for annotation. Biases in either can affect the classifier's judgments, and it might overfit to thoroughly commented code.
+ - Context: The classifier evaluates individual code files without considering broader context, which might limit its effectiveness in certain scenarios.

+ The training and inference code is available on GitHub:
+ https://github.com/huggingface/cosmopedia/tree/main/classification

  ## Training procedure

+ The classifier was trained on 450,000 pairs of Python code files and their scores from 1 to 5, generated by Llama3. The samples were annotated based on their educational quality, with 1 being not educational and 5 being highly educational.
+
+ We added a classification head with a single regression output to [Snowflake-arctic-embed](https://huggingface.co/Snowflake/snowflake-arctic-embed-m) and trained the model for 20 epochs with a learning rate of 3e-4. During training, the embedding and encoder layers were frozen to focus on the classification head.
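The frozen-encoder setup described above can be sketched in plain PyTorch. This is a minimal illustration with a random stand-in encoder and toy data, not the actual training script (which lives in the cosmopedia repository linked above):

```python
import torch
from torch import nn

torch.manual_seed(0)
embed_dim = 768  # hidden size of snowflake-arctic-embed-m

# Stand-in for the pretrained encoder; its weights are frozen.
encoder = nn.Linear(embed_dim, embed_dim)
for p in encoder.parameters():
    p.requires_grad = False

# Classification head with a single regression output (the 1-5 score).
head = nn.Linear(embed_dim, 1)

encoder_before = encoder.weight.clone()
head_before = head.weight.clone()

# Only the head's parameters are optimized, matching the card's setup (lr 3e-4).
optimizer = torch.optim.AdamW(head.parameters(), lr=3e-4)
loss_fn = nn.MSELoss()

x = torch.randn(64, embed_dim)            # toy "embeddings"
y = torch.randint(1, 6, (64, 1)).float()  # toy 1-5 scores

for _ in range(50):
    optimizer.zero_grad()
    loss = loss_fn(head(encoder(x)), y)
    loss.backward()
    optimizer.step()

# The encoder stays unchanged; only the head was trained.
assert torch.equal(encoder.weight, encoder_before)
assert not torch.equal(head.weight, head_before)
```

Freezing the embedding and encoder layers keeps training cheap and reduces the risk of overfitting the pretrained representation to the annotation noise.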
+
  ### Training hyperparameters

  The following hyperparameters were used during training:

  ### Training results

+ ```
+               precision    recall  f1-score   support
+
+            1       0.84      0.46      0.59      8364
+            2       0.61      0.76      0.68     19605
+            3       0.60      0.62      0.61     16187
+            4       0.72      0.50      0.59      4872
+            5       0.38      0.08      0.13       118
+
+     accuracy                           0.64     49146
+    macro avg       0.63      0.48      0.52     49146
+ weighted avg       0.66      0.64      0.63     49146
+ ```
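As a quick sanity check, the macro and weighted averages in the report follow from the per-class rows (values transcribed from the table above):

```python
# Per-class (precision, recall, f1, support) rows from the report above.
rows = [
    (0.84, 0.46, 0.59, 8364),
    (0.61, 0.76, 0.68, 19605),
    (0.60, 0.62, 0.61, 16187),
    (0.72, 0.50, 0.59, 4872),
    (0.38, 0.08, 0.13, 118),
]
total = sum(r[3] for r in rows)

# Macro average: unweighted mean over classes; weighted average: support-weighted mean.
macro = [round(sum(r[i] for r in rows) / len(rows), 2) for i in range(3)]
weighted = [round(sum(r[i] * r[3] for r in rows) / total, 2) for i in range(3)]

print(total, macro, weighted)
# 49146 [0.63, 0.48, 0.52] [0.66, 0.64, 0.63]
```

The low macro recall relative to accuracy is driven by the tiny, poorly recalled class 5 (118 samples, recall 0.08), which the weighted average largely hides.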
  ### Framework versions