---
base_model: Xclbr7/Arcanum-12b
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
---

# Aether-12b

Aether-12b is a fine-tuned large language model based on Arcanum-12b, further trained on the CleverBoi-Data-20k dataset.

## Model Details 📊

- Developed by: AIXON Lab
- Model type: Causal Language Model
- Language(s): English (primarily); may support other languages
- License: apache-2.0
- Repository: https://huggingface.co/aixonlab/Aether-12b

## Model Architecture 🏗️

- Base model: Arcanum-12b
- Parameter count: ~12 billion
- Architecture specifics: Transformer-based language model

## Open LLM Leaderboard Evaluation Results

Coming soon!

## Training & Fine-tuning 🔄

Aether-12b was fine-tuned on the following dataset:
- Dataset: theprint/CleverBoi-Data-20k
- Fine-tuning method: TRL SFTTrainer with the AdamW optimizer, a cosine-decay learning-rate scheduler, and bfloat16 precision (a minimal reproduction sketch appears under "Fine-tuning Example" at the end of this card).

Fine-tuning on the CleverBoi-Data-20k dataset improved the model in the following ways:
1. Enhanced reasoning and problem-solving capabilities
2. Broader knowledge across various topics
3. Improved performance on tasks such as writing, analysis, and problem-solving
4. Better contextual understanding and response generation

## Intended Use 🎯

Intended for use as a general-purpose assistant or a role-specific bot. See the usage example at the end of this card.

## Ethical Considerations 🤔

As a fine-tuned model based on Arcanum-12b, this model may inherit biases and limitations from its parent model and from the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.

## Acknowledgments 🙏

We acknowledge the contributions of:
- theprint for the amazing CleverBoi-Data-20k dataset
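
## Fine-tuning Example 🛠️

A minimal sketch of the fine-tuning setup described above (TRL SFTTrainer, AdamW, cosine decay, bfloat16). The hyperparameter values and the assumption that the dataset exposes a `text` column are illustrative, not the exact recipe used for Aether-12b.

```python
# Minimal SFT sketch, assuming theprint/CleverBoi-Data-20k provides a
# "text" column; learning rate, epochs, and batch size are illustrative.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("theprint/CleverBoi-Data-20k", split="train")

config = SFTConfig(
    output_dir="aether-12b-sft",
    optim="adamw_torch",            # AdamW optimizer
    lr_scheduler_type="cosine",     # cosine-decay LR scheduler
    bf16=True,                      # bfloat16 precision
    learning_rate=2e-5,             # illustrative value
    num_train_epochs=1,             # illustrative value
    per_device_train_batch_size=1,  # illustrative value
)

trainer = SFTTrainer(
    model="Xclbr7/Arcanum-12b",     # base model being fine-tuned
    train_dataset=dataset,
    args=config,
)
trainer.train()
```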
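
## Usage Example 💻

A minimal inference sketch using the `transformers` library, assuming the tokenizer ships a chat template (typical for Mistral-based models); if it does not, pass a plain prompt string to the tokenizer instead. Generation settings are illustrative.

```python
# Load the model in bfloat16 (matching the training precision) and
# generate a response to a single user message.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aixonlab/Aether-12b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize the causes of the French Revolution."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```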