---
license: cc-by-4.0
base_model: kxx-kkk/FYP_sq2_mrqa_adqa_synqa
tags:
- generated_from_trainer
model-index:
- name: FYP_qa_final
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
      split: validation
    metrics:
    - type: exact_match
      value: 82.3
      name: Exact Match
    - type: f1
      value: 85.7701063996245
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad
      type: squad
      config: plain_text
      split: validation
    metrics:
    - type: exact_match
      value: 89.9
      name: Exact Match
    - type: f1
      value: 93.57935153408677
      name: F1
datasets:
- rajpurkar/squad_v2
- mrqa
- UCLNLP/adversarial_qa
- mbartolo/synQA
language:
- en
pipeline_tag: question-answering
---

# FYP_qa_final

This model is a fine-tuned version of [kxx-kkk/FYP_sq2_mrqa_adqa_synqa](https://huggingface.co/kxx-kkk/FYP_sq2_mrqa_adqa_synqa), which was itself fine-tuned in several stages from [deepset/deberta-v3-base-squad2](https://huggingface.co/deepset/deberta-v3-base-squad2). This final stage was trained on the test split of the [MRQA](https://huggingface.co/datasets/mrqa) dataset.
It achieves the following results on the evaluation set:
- Loss: 2.7493

## Model description

This model is trained to perform extractive question answering on academic essays.

## Intended uses & limitations

More information needed

## Training and evaluation data

The datasets used for training are listed below in training order:
1. [MRQA (train split)](https://huggingface.co/datasets/mrqa)
2. [UCLNLP/adversarial_qa](https://huggingface.co/datasets/UCLNLP/adversarial_qa)
3. [mbartolo/synQA](https://huggingface.co/datasets/mbartolo/synQA)
4. [MRQA (test split)](https://huggingface.co/datasets/mrqa) (this model)

## Training procedure

Training follows a transfer-learning approach: a pre-trained model is fine-tuned for extractive QA, one dataset at a time. At each stage the model was trained on a single dataset, and the resulting checkpoint served as the pre-trained model for the next stage. This model is the final checkpoint, trained on the [MRQA (test split)](https://huggingface.co/datasets/mrqa).

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.8084        | 0.48  | 300  | 3.1468          |
| 2.5707        | 0.96  | 600  | 2.9035          |
| 2.5187        | 1.44  | 900  | 2.7175          |
| 2.4463        | 1.91  | 1200 | 2.7497          |
| 2.4328        | 2.39  | 1500 | 2.7229          |
| 2.3839        | 2.87  | 1800 | 2.7493          |

### Framework versions

- Transformers 4.39.3
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
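
### Reconstructed training configuration (sketch)

For reference, the hyperparameters above roughly correspond to the following `TrainingArguments` sketch. The original training script is not included in this card, so names such as `output_dir` and the step-based evaluation schedule (inferred from the 300-step intervals in the results table) are assumptions.

```python
# Hedged reconstruction of the training configuration from the listed
# hyperparameters; output_dir and the evaluation schedule are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="FYP_qa_final",    # assumed output directory
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,                    # mixed_precision_training: Native AMP
    evaluation_strategy="steps",  # inferred: evaluation every 300 steps
    eval_steps=300,
    logging_steps=300,
)
```

The Adam betas of (0.9, 0.999) and epsilon of 1e-08 are the `Trainer` optimizer defaults, so they do not need to be set explicitly.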
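
## How to use

The model exposes the standard `question-answering` pipeline interface. The snippet below is a minimal sketch rather than an official example from the authors; the repository id `kxx-kkk/FYP_qa_final` is assumed from the model name and the namespace of the base checkpoint, so substitute the actual id if it differs.

```python
# Minimal extractive-QA sketch with the transformers pipeline.
# Assumption: the model is hosted as "kxx-kkk/FYP_qa_final"; replace with the
# actual repository id if different.
from transformers import pipeline

qa = pipeline("question-answering", model="kxx-kkk/FYP_qa_final")

context = (
    "The model was fine-tuned in several stages, ending with the test split "
    "of the MRQA dataset, and is intended for extractive question answering "
    "on academic essays."
)
result = qa(
    question="Which dataset was used in the final training stage?",
    context=context,
)
print(result["answer"], result["score"])
```

The pipeline returns a dict with `answer`, `score`, `start`, and `end`, so the predicted span can be located and highlighted in the original passage.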