---
library_name: peft
license: cc-by-nc-4.0
base_model: mistralai/Mixtral-8x7B-v0.1
datasets:
- HuggingFaceH4/ultrachat_200k
- rohansolo/BB_HindiHinglishV2
model-index:
- name: BB-Mixtral-HindiHinglish-8x7B-v0.1
  results: []
language:
- hi
---

### Model Description

Mixtral fine-tuned for Hindi and Hinglish as part of ongoing experiments by [bb deep learning systems](https://bhaiyabot.com).

- **Developed by:** [bb deep learning systems](https://bhaiyabot.com)
- **Language(s) (NLP):** English, Hindi, Romanised Hindi
- **License:** cc-by-nc-4.0
- **Finetuned from model:** [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)

### Model Sources

- **Paper:** More information coming soon

## Training Details

### Training Data

A mix of [HuggingFaceH4/ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) and [rohansolo/BB_HindiHinglishV2](https://huggingface.co/datasets/rohansolo/BB_HindiHinglishV2) was used, for a total of 573,014,566 tokens in Hindi, Romanised Hindi, and English.

### Training Procedure

Final training loss was 0.8978.

The model was trained with the following hyperparameters:

- warmup_steps: 100
- weight_decay: 0.05
- num_epochs: 1
- optimizer: paged_adamw_8bit
- lr_scheduler: cosine
- learning_rate: 0.0002
- lora_r: 32
- lora_alpha: 16
- lora_dropout: 0.05
- lora_target_modules: q_proj, k_proj, v_proj, o_proj, w1, w2, w3
- lora_target_linear:
- lora_fan_in_fan_out:
- lora_modules_to_save: embed_tokens, lm_head

The following `bitsandbytes` quantization config was used during training:

- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16

## Environmental Impact

Experiments were conducted on private infrastructure with a carbon efficiency of 0.432 kgCO$_2$eq/kWh. A cumulative 94 hours of computation was performed on hardware of type A100 SXM4 80 GB (TDP of 400 W). Total emissions are estimated at 16.24 kgCO$_2$eq, of which 0 percent was directly offset.

- **Hardware Type:** 8 x A100 SXM4 80 GB
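### Example LoRA Configuration

For reference, the LoRA hyperparameters above map onto a `peft` `LoraConfig` roughly as follows. This is a minimal sketch reconstructed from the listed values, not the original training script:

```python
from peft import LoraConfig

# Sketch reconstructed from the hyperparameter list above; field names follow
# peft's LoraConfig API rather than the original training config file.
lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "w1", "w2", "w3"],
    modules_to_save=["embed_tokens", "lm_head"],  # per lora_modules_to_save
    task_type="CAUSAL_LM",
)
```

Because `embed_tokens` and `lm_head` were included in `modules_to_save`, the adapter checkpoint carries full copies of those layers in addition to the low-rank LoRA weights.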
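## Usage

Since this repository is a `peft` adapter on top of Mixtral, one way to load it is to quantize the base model with the same 4-bit NF4 config listed above and then attach the adapter. The adapter repo id below is a placeholder; substitute the actual Hub path for this model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "mistralai/Mixtral-8x7B-v0.1"
adapter_id = "BB-Mixtral-HindiHinglish-8x7B-v0.1"  # placeholder: replace with the real Hub repo id

# Mirrors the 4-bit quantization config used during training
# (NF4, double quantization, bfloat16 compute).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)

# Prompt format is illustrative; this card does not specify a chat template.
prompt = "नमस्ते, आप कैसे हैं?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```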