---
license: apache-2.0
language:
  - en
tags:
  - ai
  - rvc
  - vc
  - voice-cloning
  - applio
  - titan
  - pretrained
base_model: lj1995/VoiceConversionWebUI
datasets:
  - blaise-tk/TITAN-Medium
pipeline_tag: audio-to-audio
---

# TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training

## Overview

TITAN is a pretrained model for [Retrieval-based Voice Conversion (RVC)](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/) training. It is meant as a base for fine-tuning models that transform voice characteristics from one speaker to another, aiming for robust, high-quality results with minimal training effort.

## Model Details

### Titan-Medium

- Training Environment: Trained on an RTX 3060 Ti with [Applio v3.1.1](https://github.com/IAHispano/Applio), using a batch size of 8, over 3 weeks.
- Iterations (48k): 1,018,660 steps and 530 epochs
- Iterations (40k): 1,010,588 steps and 467 epochs
- Iterations (32k): 1,001,469 steps and 463 epochs
- Sampling rates: 48 kHz, 40 kHz, 32 kHz
- Fine-tuning Process: Started from the RVC v2 pretrained model with pitch guidance and fine-tuned on an 11.15-hour dataset sourced from [Expresso](https://arxiv.org/abs/2308.05725). The dataset is also available at [datasets/blaise-tk/TITAN-Medium](https://huggingface.co/datasets/blaise-tk/TITAN-Medium).
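
The step and epoch counts above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the names `RUNS`, `BATCH_SIZE`, and `steps_per_epoch` are made up for this example, and the segments-per-epoch figure assumes that one step processes exactly one batch of 8 training segments, which the list above does not state explicitly.

```python
# Step and epoch counts copied from the list above, keyed by sampling rate.
RUNS = {
    "48k": (1_018_660, 530),
    "40k": (1_010_588, 467),
    "32k": (1_001_469, 463),
}
BATCH_SIZE = 8  # batch size from the training environment entry


def steps_per_epoch(steps: int, epochs: int) -> int:
    """Return the whole number of steps per epoch (remainder is dropped)."""
    return steps // epochs


for rate, (steps, epochs) in RUNS.items():
    per_epoch = steps_per_epoch(steps, epochs)
    # Assumption: one step consumes one batch of BATCH_SIZE segments.
    segments = per_epoch * BATCH_SIZE
    print(f"{rate}: {per_epoch} steps/epoch, ~{segments} segments/epoch")
```

All three totals divide evenly by their epoch counts, giving 1922, 2164, and 2163 steps per epoch for 48k, 40k, and 32k respectively.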

#### Samples
*Tests were performed with an early checkpoint at ~700k steps, with all tests run under the same conditions.*

<table style="width:100%; text-align:center;">
  <tr>
    <th>Titan-Medium</th>
    <th>Ov2</th>
    <th>Ov2.1</th>
  </tr>
    <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>
  
    <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>

  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  
  </tr>
    <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>

    <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
        <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.1.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>

    <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
        <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.1.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>

</table>
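
The demo files in the table follow a regular naming pattern, so their URLs can be built programmatically for side-by-side listening outside the browser. The following is a minimal sketch, assuming the pattern seen in the table holds for every file; `BASE`, `VARIANTS`, and `demo_url` are illustrative names rather than part of any TITAN or Applio tooling.

```python
from urllib.parse import quote

# Base path of the demo files, taken from the table above.
BASE = "https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/"

# Variants present in the table: only Model 3 has an Ov2.1 sample.
VARIANTS = {
    1: ["Titan", "Ov2"],
    2: ["Titan", "Ov2"],
    3: ["Titan", "Ov2", "Ov2.1"],
}


def demo_url(model: int, test: int, variant: str) -> str:
    """Build the download URL for one demo, percent-encoding the spaces."""
    filename = f"Model {model} - Test {test} - {variant}.wav"
    return BASE + quote(filename) + "?download=true"


# List every demo URL: 3 models x 2 tests, with the variants shown above.
for model, variants in VARIANTS.items():
    for test in (1, 2):
        for variant in variants:
            print(demo_url(model, test, variant))
```

Percent-encoding the file names avoids relying on the client to handle the literal spaces that appear in the table's `src` attributes.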

### Titan-Large

- Details forthcoming...

## Collaborators

We appreciate the contributions of our collaborators who have helped in the development and refinement of TITAN.

- Mustar
- SimplCup
- UnitedShoes

## Beta Testers

We extend our gratitude to the beta testers who provided valuable feedback during the testing phase of TITAN.

- SimplCup
- Leo_Frixi
- Light
- SCRFilms
- Ryanz
- Litsa_the_dancer

## Citation

If you find TITAN useful for your research or projects, please cite this repository:

```
@article{titan,
  title={TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training},
  author={Blaise},
  journal={Hugging Face},
  year={2024},
  publisher={Blaise},
  url={https://huggingface.co/blaise-tk/TITAN/}
}
```