---
library_name: peft
base_model: mistralai/Mistral-7B-v0.1
license: mit
language:
- en
metrics:
- perplexity
- bertscore
---

# Model Card: Mistral-7B-v0.1 QLoRA Adapter for Story Generation

Mistral-7B-v0.1 fine-tuned with QLoRA for a story-generation task.


### Model Description

We fine-tune the model on the "Hierarchical Neural Story Generation" dataset to generate short stories.

The input to the model is structured as follows:

```
### Instruction: Below is a story idea. Write a short story based on this context.

### Input: [story idea here]

### Response:
```
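
For illustration, the prompt can be assembled with a small helper like this (the `build_prompt` function and the example story idea are our own illustration, not part of the project code):

```python
def build_prompt(story_idea: str) -> str:
    """Assemble the instruction-style prompt expected by the fine-tuned model."""
    return (
        "### Instruction: Below is a story idea. Write a short story based on this context.\n\n"
        f"### Input: {story_idea}\n\n"
        "### Response:\n"
    )

prompt = build_prompt("A lighthouse keeper discovers that the light attracts more than ships.")
```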


- **Developed by:** Abdelrahman ’Boda’ Sadallah, Anastasiia Demidova, Daria Kotova
- **Model type:** Causal LM
- **Language(s) (NLP):** English
- **Finetuned from model:** mistralai/Mistral-7B-v0.1

### Model Sources

- **Repository:** https://github.com/BodaSadalla98/llm-optimized-fintuning

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

The model is the result of our AI project. If you intend to use it, please refer to the repository linked above.
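
A minimal loading sketch, assuming the adapter weights published alongside this card (the adapter path below is a placeholder, not a confirmed identifier):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,  # assumption: half precision for inference
    device_map="auto",
)
# Placeholder: replace with the repo id or local path of this adapter.
model = PeftModel.from_pretrained(base, "path/to/this-adapter")
model.eval()
```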


### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

To improve story generation, you can play with the generation parameters: temperature, top_p/top_k, repetition_penalty, etc.
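
Continuing the sketches above (reusing `prompt`, `model`, and `tokenizer` defined there), a generation call that exposes these parameters might look like this; the specific values are illustrative, not tuned recommendations:

```python
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.8,         # higher -> more diverse stories
        top_p=0.9,               # nucleus sampling
        top_k=50,
        repetition_penalty=1.1,  # discourage repeated phrases
    )
story = tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(story)
```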


## Training Details

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

GitHub repository for the dataset: https://github.com/kevalnagda/StoryGeneration


## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics


#### Testing Data

We evaluate on the test split of the same dataset.

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

We are using perplexity and BERTScore.
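
A minimal sketch of how these metrics can be computed (our own illustration; the actual evaluation script lives in the project repository):

```python
import math

import evaluate
import torch

def perplexity(model, tokenizer, text: str) -> float:
    """Perplexity of `text` under the causal language model."""
    enc = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss
    return math.exp(loss.item())

bertscore = evaluate.load("bertscore")
generated_story = "..."  # model output for a test prompt (placeholder)
reference_story = "..."  # gold story from the test split (placeholder)
scores = bertscore.compute(predictions=[generated_story], references=[reference_story], lang="en")
print(scores["f1"])
```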

### Results

- Perplexity: 8.8647
- BERTScore: 80.76

## Training Procedure


The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
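
For reproduction, the same settings can be expressed as a `transformers` `BitsAndBytesConfig` (a sketch of the equivalent configuration, not an excerpt from the actual training script):

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
    llm_int8_threshold=6.0,
    llm_int8_has_fp16_weight=False,
)
```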

### Framework versions


- PEFT 0.6.0.dev0