usvsnsp committed on
Commit
5504192
1 Parent(s): a60b413

Update README.md

Files changed (1)
  1. README.md +12 -5
README.md CHANGED
@@ -4,9 +4,15 @@ tags:
 - generated_from_trainer
 metrics:
 - accuracy
+- f1
 model-index:
 - name: code-vs-nl
 results: []
+datasets:
+- bookcorpus
+- codeparrot/github-code
+language:
+- en
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -14,7 +20,8 @@ should probably proofread and complete it, then remove this comment. -->
 
 # code-vs-nl
 
-This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset.
+This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)
+on [bookcorpus](https://huggingface.co/datasets/bookcorpus) for natural-language text and [codeparrot/github-code](https://huggingface.co/datasets/codeparrot/github-code) for code.
 It achieves the following results on the evaluation set:
 - Loss: 0.5180
 - Accuracy: 0.9951
@@ -22,15 +29,15 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+Because it is a fine-tuned model, its architecture is the same as distilbert-base-uncased for sequence classification.
 
 ## Intended uses & limitations
 
-More information needed
+Can be used to classify documents as natural-language text or code.
 
 ## Training and evaluation data
 
-More information needed
+A mix of the two datasets above, randomly sampled in equal proportion.
 
 ## Training procedure
 
@@ -58,4 +65,4 @@ The following hyperparameters were used during training:
 - Transformers 4.25.1
 - Pytorch 1.13.1+cu116
 - Datasets 2.8.0
-- Tokenizers 0.13.2
+- Tokenizers 0.13.2
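
Note on the "Training and evaluation data" line in this commit: it says the data is an equal random mix of the two datasets but does not show how the mix was built. Below is a minimal sketch of one way such a balanced sample could be drawn, assuming the documents are already loaded as in-memory lists of strings. The function name `balanced_mix`, the label convention (0 = natural language, 1 = code), the seed, and the toy documents are all hypothetical and not taken from the commit.

```python
import random


def balanced_mix(text_docs, code_docs, n_per_class, seed=0):
    """Draw n_per_class documents from each source and shuffle them together.

    Labels are an assumption for illustration: 0 = natural language, 1 = code.
    """
    rng = random.Random(seed)
    # Sample without replacement so no document is drawn twice from a source.
    texts = rng.sample(text_docs, n_per_class)
    codes = rng.sample(code_docs, n_per_class)
    mixed = [(doc, 0) for doc in texts] + [(doc, 1) for doc in codes]
    # Shuffle so the two classes are interleaved rather than grouped.
    rng.shuffle(mixed)
    return mixed


# Toy stand-ins for the two corpora (hypothetical examples).
text_docs = [
    "He walked home in the rain.",
    "She opened the old book.",
    "They talked until dawn.",
]
code_docs = [
    "def f(x): return x + 1",
    "for (int i = 0; i < n; i++) {}",
    "SELECT * FROM users;",
]

sample = balanced_mix(text_docs, code_docs, n_per_class=2)
```

Sampling the same number of documents from each source keeps the two classes balanced. This matters when reading the reported accuracy, since on a heavily skewed mix a trivial majority-class predictor would already score high.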