engineering-lamini committed on
Commit
49d95d8
1 Parent(s): f2413e0

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -5,13 +5,13 @@ license: apache-2.0
  # Model Name: Lamini-1
 
  # Description
- Lamini-1 is a novel language model architecture designed to mitigate hallucinations in large language models (LLMs). By leveraging a massive mixture of memory experts (MoEs), Lamini-1 can store and retrieve a large number of facts precisely, allowing it to achieve near-zero training loss on a set of 100 randomly generated facts. This architecture is particularly effective in reducing hallucinations, which are a common issue in LLMs that can lead to inaccurate or fabricated responses. This checkpoint demonstrates an example MoME model trained with approximately one million memory experts.
+ Lamini-1 is a novel language model architecture designed to mitigate hallucinations in large language models (LLMs). By leveraging a massive mixture of memory experts (MoEs), Lamini-1 can store and retrieve a large number of facts precisely, allowing it to achieve near-zero training loss on a set of randomly generated facts. This architecture is particularly effective in reducing hallucinations, which are a common issue in LLMs that can lead to inaccurate or fabricated responses. This checkpoint demonstrates an example MoME model trained with approximately one million memory experts.
 
  # Warning:
  This model checkpoint is meant to demonstrate the ability of the architecture to scale to millions of experts and fit specific facts precisely. It is intended for research reproducibility purposes. It is not meant to be used for commercial applications because it is not loaded with facts from a real application. Contact us at info@lamini.ai to explore using Memory Tuning and the Lamini 1 architecture to remove hallucinations by adding your data.
 
  # Training Details:
- This checkpoint of a Lamini-1 MoME model was trained on a dataset of 100 random facts, with each fact consisting of a question and a corresponding answer. The model was trained using a combination of randomization tests and information retrieval methods to ensure that it can accurately recall and retrieve the stored facts. The training process involved selecting a subset of experts from the massive array of MoEs, freezing the backbone network and cross-attention mechanism, and taking gradient descent steps until the loss is reduced sufficiently to memorize the fact. The resulting model, Lamini-1, demonstrates improved factual recall and reduced hallucinations compared to traditional LLMs.
+ This checkpoint of a Lamini-1 MoME model was trained on a dataset of over one million random facts, with each fact consisting of a question and a corresponding answer. The model was trained using a combination of randomization tests and information retrieval methods to ensure that it can accurately recall and retrieve the stored facts. The training process involved selecting a subset of experts from the massive array of MoEs, freezing the backbone network and cross-attention mechanism, and taking gradient descent steps until the loss is reduced sufficiently to memorize the fact. The resulting model, Lamini-1, demonstrates improved factual recall and reduced hallucinations compared to traditional LLMs.
 
  # Key Features:
  Lamini-1's architecture is designed to address the issue of hallucinations in LLMs. The model's massive array of MoEs allows it to store and retrieve a large number of facts precisely, reducing the likelihood of hallucinations. Additionally, the model's ability to freeze the backbone network and cross-attention mechanism during training helps to prevent overfitting and ensures that the model learns to generalize well to new, unseen facts.
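
The Training Details paragraph in the README describes a three-step loop: select a subset of experts, keep everything else frozen, and take gradient descent steps until the loss is small enough to memorize the fact. The toy sketch below illustrates that loop in plain Python. It is not Lamini's implementation: the names `make_bank`, `select_experts`, `read`, and `memorize`, the dot-product selection rule, the mean read-out, and the squared-error loss are all assumptions chosen to keep the example small, and the real model uses a backbone network with cross-attention rather than these stand-ins.

```python
import random

def make_bank(num_experts, dim, seed=0):
    """Build a toy expert bank: frozen random keys, trainable zero-initialized values."""
    rng = random.Random(seed)
    keys = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_experts)]
    values = [[0.0] * dim for _ in range(num_experts)]
    return keys, values

def select_experts(query, keys, k):
    """Pick the k experts whose keys score highest against the query (assumed rule)."""
    scores = [sum(q * w for q, w in zip(query, key)) for key in keys]
    return sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]

def read(values, selected):
    """Combine the selected experts' values by a plain mean (assumed read-out)."""
    dim = len(values[0])
    return [sum(values[i][j] for i in selected) / len(selected) for j in range(dim)]

def memorize(query, target, keys, values, k=4, lr=1.0, tol=1e-8, max_steps=1000):
    """Update only the selected experts' values until squared error drops below tol."""
    selected = select_experts(query, keys, k)  # keys are never modified: "frozen"
    loss = float("inf")
    for _ in range(max_steps):
        out = read(values, selected)
        err = [o - t for o, t in zip(out, target)]
        loss = sum(e * e for e in err)
        if loss < tol:
            break
        # Gradient of the squared error w.r.t. each selected value is 2 * err / k.
        for i in selected:
            for j, e in enumerate(err):
                values[i][j] -= lr * 2.0 * e / k
    return selected, loss

if __name__ == "__main__":
    keys, values = make_bank(num_experts=64, dim=3, seed=1)
    selected, loss = memorize([1.0, 0.0, -1.0], [0.5, -0.25, 2.0], keys, values)
    print("experts updated:", sorted(selected))
    print("final loss below tolerance:", loss < 1e-8)
```

Only the `k` selected value vectors change during `memorize`; the keys and all other experts are left untouched, which is the property the README attributes to freezing the backbone and cross-attention. With a mean read-out, each step shrinks the error by a factor of `1 - 2 * lr / k`, so the defaults here halve it per step.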