---
language:
- en
license: mit
widget:
- text: "Lou Gehrig who works for XCorp and lives in New York suffers from [MASK]"
  example_title: "Test for entity type: Disease"
- text: "Overexpression of [MASK] occurs across a wide range of cancers"
  example_title: "Test for entity type: Gene"
- text: "Patients treated with [MASK] are vulnerable to infectious diseases"
  example_title: "Test for entity type: Drug"
- text: "An eGFR level below [MASK] indicates chronic kidney disease"
  example_title: "Test for entity type: Measure"
- text: "In the [MASK], increased daily imatinib dose induced MMR"
  example_title: "Test for entity type: STUDY/TRIAL"
- text: "Paul Erdos died at [MASK]"
  example_title: "Test for entity type: TIME"
tags:
- fill-mask
---


This model was pretrained from scratch, with a custom vocabulary, on PubMed, a clinical trials corpus, and a small subset of BookCorpus.

It was used to perform NER as is, **with no fine-tuning**, as described [in this post](https://ajitrajasekharan.github.io/2021/01/02/my-first-post.html).

[Towards Data Science link](https://twitter.com/TDataScience/status/1486300137366466560?s=20) to the same post

[GitHub link](https://github.com/ajitrajasekharan/unsupervised_NER) to NER code that uses this model in an ensemble with bert-base-cased to detect 69 entity types (17 broad entity groups).

 <img src="https://ajitrajasekharan.github.io/images/1.png" width="600">
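
The widget sentences above probe the model by masking a word and reading its top predictions. Below is a minimal sketch of the same query from Python, assuming the Hugging Face `transformers` library is installed. `MODEL_ID` is a hypothetical placeholder, not this model's actual Hub id, and the helper names `load_fill_mask` and `top_predictions` are illustrative only. This shows just the masked-word query; the full NER method is in the post and repository linked above.

```python
from typing import Callable, List, Tuple

# Hypothetical placeholder: replace with this model's actual Hub id.
MODEL_ID = "your-namespace/your-model-id"


def load_fill_mask(model_id: str = MODEL_ID):
    """Build a fill-mask pipeline (imports transformers lazily)."""
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def top_predictions(
    fill_mask: Callable, sentence: str, k: int = 5
) -> List[Tuple[str, float]]:
    """Return the top-k (token, score) pairs for a sentence with one [MASK]."""
    if sentence.count("[MASK]") != 1:
        raise ValueError("sentence must contain exactly one [MASK] token")
    # For a single mask, the pipeline returns a list of dicts that
    # include the keys "token_str" and "score".
    results = fill_mask(sentence, top_k=k)
    return [(r["token_str"].strip(), float(r["score"])) for r in results]
```

A possible usage, once `MODEL_ID` points at the real model: `top_predictions(load_fill_mask(), "Patients treated with [MASK] are vulnerable to infectious diseases")`. The predicted tokens can then be inspected to judge which entity type the masked position suggests.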