<!-- Provide a quick summary of what the model is/does. -->

The [SwissBERT](https://huggingface.co/ZurichNLP/swissbert) model finetuned via [SimCSE](http://dx.doi.org/10.18653/v1/2021.emnlp-main.552) (Gao et al., EMNLP 2021) for sentence embeddings, using ~1 million Swiss news articles published in 2022 from [Swissdox@LiRI](https://t.uzh.ch/1hI). Following the [Sentence Transformers](https://huggingface.co/sentence-transformers) approach (Reimers and Gurevych, 2019), the average of the last hidden states (`pooler_type=avg`) is used as the sentence representation.

The fine-tuning script can be accessed [here](Link).
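The average pooling described above (`pooler_type=avg`) can be sketched with a toy tensor; the shapes below are illustrative, not the model's actual hidden size:

```python
import torch

# Toy stand-in for a model output of shape (batch=1, seq_len=4, hidden=3)
last_hidden_state = torch.tensor([[[1.0, 2.0, 3.0],
                                   [3.0, 2.0, 1.0],
                                   [0.0, 0.0, 0.0],
                                   [4.0, 4.0, 4.0]]])

# Average over the token dimension -> one vector per sentence
sentence_embedding = last_hidden_state.mean(dim=1)
print(sentence_embedding)  # tensor([[2., 2., 2.]])
```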
<!-- Provide a longer summary of what this model is. -->

- **Developed by:** [Juri Grosjean](https://huggingface.co/jgrosjean)
- **Model type:** [XMOD](https://huggingface.co/facebook/xmod-base)
- **Language(s) (NLP):** de_CH, fr_CH, it_CH, rm_CH
- **License:** [More Information Needed]
- **Finetuned from model:** [SwissBERT](https://huggingface.co/ZurichNLP/swissbert)
## Use

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
```python
import torch
from transformers import AutoModel, AutoTokenizer
```

### German example
```python
def generate_sentence_embedding(sentence, model_name="jgrosjean-mathesis/swissbert-for-sentence-embeddings"):
    # Load swissBERT model
    model = AutoModel.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model.set_default_language("de_CH")

    # Tokenize input sentence
    inputs = tokenizer(sentence, padding=True, truncation=True, return_tensors="pt", max_length=512)

    # Set the model to evaluation mode
    model.eval()

    # Pass the tokenized input through the model
    with torch.no_grad():
        outputs = model(**inputs)

    # Extract the average sentence embedding from the last hidden layer
    embedding = outputs.last_hidden_state.mean(dim=1)

    return embedding


sentence_embedding = generate_sentence_embedding("Wir feiern am 1. August den Schweizer Nationalfeiertag.")
print(sentence_embedding)
```

Output:
```
tensor([[ 5.6306e-02, -2.8375e-01, -4.1495e-02,  7.4393e-02, -3.1552e-01,
          1.5213e-01, -1.0258e-01,  2.2790e-01, -3.5968e-02,  3.1769e-01,
          1.9354e-01,  1.9748e-02, -1.5236e-01, -2.2657e-01,  1.3345e-02,
          ...]])
```
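Embeddings obtained this way are typically compared with cosine similarity. A minimal sketch, where the two tensors are hypothetical stand-ins for the outputs of `generate_sentence_embedding` above:

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-ins for two sentence embeddings of shape (1, hidden_size)
embedding_a = torch.tensor([[0.5, -0.2, 0.8, 0.1]])
embedding_b = torch.tensor([[0.4, -0.1, 0.7, 0.3]])

# Cosine similarity along the hidden dimension: values close to 1
# indicate semantically similar sentences
similarity = F.cosine_similarity(embedding_a, embedding_b, dim=1)
print(similarity.item())
```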
[More Information Needed]

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

[More Information Needed]
80 |
|
|
|
82 |
|
83 |
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
|
84 |
|
85 |
+
This multilingual model has not been fine-tuned for cross-lingual transfer. It is intended for computing sentence embeddings that can be compared mono-lingually.
|
86 |
|
87 |
### Recommendations
|
88 |
|