---
license: apache-2.0
language:
  - en
tags:
  - ai
  - rvc
  - vc
  - voice-cloning
  - applio
  - titan
  - pretrained
datasets:
  - blaise-tk/TITAN-Medium
pipeline_tag: audio-to-audio
---

# TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training

## Overview

TITAN is a pretrained model for [Retrieval-based Voice Conversion (RVC)](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/) training. It is intended as a robust starting point for models that convert one speaker's voice characteristics into another's, giving high-quality results with minimal training effort.

## Model Details

### Titan-Medium

- Training environment: Trained on an RTX 3060 Ti with [Applio v3.1.1](https://github.com/IAHispano/Applio), using a batch size of 8, over 3 weeks.
- Iterations: 1,010,588 steps (467 epochs)
- Sampling rates: 40 kHz; 32 kHz (still training)
- Fine-tuning process: Started from the RVC v2 pretrained model with pitch guidance and fine-tuned on an 11.15-hour dataset sourced from [Expresso](https://arxiv.org/abs/2308.05725), also available at [datasets/blaise-tk/TITAN-Medium](https://huggingface.co/datasets/blaise-tk/TITAN-Medium).
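
As a rough sanity check on the figures listed above, the step and epoch counts can be related to the batch size and dataset length. This is only a back-of-the-envelope sketch: it assumes each step consumes one full batch and that every training segment is seen once per epoch, neither of which is stated in this card, so the derived segment count and segment length are estimates rather than reported values.

```python
# Figures quoted in the Titan-Medium details.
total_steps = 1_010_588
epochs = 467
batch_size = 8
dataset_hours = 11.15

# Steps needed to pass over the dataset once.
steps_per_epoch = total_steps / epochs

# Assumption: every step processes one full batch of training segments.
segments_per_epoch = steps_per_epoch * batch_size

# Assumption: each segment is seen exactly once per epoch, so the dataset
# duration divided by the segment count approximates the mean segment length.
avg_segment_seconds = dataset_hours * 3600 / segments_per_epoch

print(f"steps per epoch: {steps_per_epoch:.0f}")
print(f"segments per epoch (approx.): {segments_per_epoch:.0f}")
print(f"average segment length (approx.): {avg_segment_seconds:.2f} s")
```

Under those assumptions the numbers are mutually consistent: the step count divides evenly into 2,164 steps per epoch, which corresponds to roughly 17,000 short segments.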

#### Samples
*All tests were run under the same conditions, using an early checkpoint at ~700k steps.*

<table style="width:100%; text-align:center;">
  <tr>
    <th>Titan-Medium</th>
    <th>Ov2</th>
    <th>Ov2.1</th>
  </tr>
    <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>

  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>

  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>

  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>

  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.1.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>

  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.1.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>

</table>
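
The demo clips in the table follow a regular naming pattern (`Model N - Test M - Variant.wav`) under the repository's `demos/` folder. For fetching them outside the browser, a small helper along these lines may be convenient. `demo_url` is a hypothetical name introduced here for illustration and is not part of the repository; the sketch only builds URLs and does not check that a given clip exists (the table shows Ov2.1 clips only for Model 3).

```python
from urllib.parse import quote

# Base path taken from the audio sources in the samples table.
BASE = "https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/"


def demo_url(model: int, test: int, variant: str) -> str:
    """Build the download URL of one demo clip (hypothetical helper)."""
    filename = f"Model {model} - Test {test} - {variant}.wav"
    # Percent-encode the spaces so the URL is safe to pass to any HTTP client.
    return BASE + quote(filename) + "?download=true"


print(demo_url(1, 1, "Titan"))
```

The returned string can be handed to any HTTP client or download tool.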

### Titan-Large

- Details forthcoming...

## Collaborators

We appreciate the contributions of our collaborators who have helped in the development and refinement of TITAN.

- Mustar
- SimplCup
- UnitedShoes

## Beta Testers

We extend our gratitude to the beta testers who provided valuable feedback during the testing phase of TITAN.

- SimplCup
- Leo_Frixi
- Light
- SCRFilms
- Ryanz
- Litsa_the_dancer

## Citation

If you find TITAN useful for your research or projects, please cite this repository:

```
@article{titan,
  title={TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training},
  author={Blaise},
  journal={Hugging Face},
  year={2024},
  publisher={Blaise},
  url={https://huggingface.co/blaise-tk/TITAN/}
}
```