Overview
Language model: gbert-large-sts
Language: German
Training data: German STS benchmark train and dev set
Eval data: German STS benchmark test set
Infrastructure: 1x V100 GPU
Published: August 12th, 2021
Details
- We trained a gbert-large model to estimate the semantic similarity of German-language text pairs. The training data is a machine-translated version of the STS benchmark, which is available here.
Hyperparameters
batch_size = 16
n_epochs = 4
warmup_ratio = 0.1
learning_rate = 2e-5
lr_schedule = LinearWarmup
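To make the interplay of learning_rate, warmup_ratio and lr_schedule more concrete, here is a minimal sketch of a linear warmup schedule in plain Python. It assumes the rate ramps linearly from 0 to the peak during the first 10% of optimizer steps and then decays linearly to 0; the decay behaviour, the function name linear_warmup_lr and the total of 1000 steps are illustrative assumptions, not details taken from the actual training run.

```python
def linear_warmup_lr(step, total_steps, peak_lr=2e-5, warmup_ratio=0.1):
    """Return the learning rate for a 0-indexed optimizer step.

    Assumption: linear ramp-up to peak_lr during warmup, then linear
    decay to zero. The real training code may differ in detail.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale the peak rate by the fraction of warmup done.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale the peak rate by the fraction of steps remaining.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; depends on dataset size, batch_size and n_epochs
    for step in (0, 50, 100, 550, 1000):
        print(step, linear_warmup_lr(step, total_steps))
```

With these settings the rate peaks at 2e-5 once warmup ends (step 100 of the hypothetical 1000) and falls back to 0 at the final step.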
Performance
Stay tuned... and watch out for new papers on arxiv.org ;)
Authors
- Julian Risch: julian.risch [at] deepset.ai
- Timo Möller: timo.moeller [at] deepset.ai
- Julian Gutsch: julian.gutsch [at] deepset.ai
- Malte Pietsch: malte.pietsch [at] deepset.ai
About us
deepset is the company behind the production-ready open-source AI framework Haystack.
Some of our other work:
- Distilled roberta-base-squad2 (aka "tinyroberta-squad2")
- German BERT, GermanQuAD and GermanDPR, German embedding model
- deepset Cloud, deepset Studio
Get in touch and join the Haystack community
For more info on Haystack, visit our GitHub repo and Documentation.
We also have a Discord community open to everyone!
Twitter | LinkedIn | Discord | GitHub Discussions | Website | YouTube
By the way: we're hiring!