Train tokens amount
#74
opened by aynetdia
Hi,
The technical report states that phi-1.5 was trained on a dataset of 30 billion tokens (Section 1), but Table 1 and Section 2.3 indicate that it was trained for 150 billion tokens. Does this mean the model went over the synthetic 30B-token dataset 5 times, i.e. that pre-training lasted 5 epochs?
Best,
Ansar
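
For reference, the arithmetic behind the question, assuming the 30B and 150B figures from the report are taken at face value:

```python
# Figures from the phi-1.5 technical report (as cited above):
dataset_tokens = 30e9   # dataset size in tokens (Section 1)
total_tokens = 150e9    # total tokens seen during pre-training (Table 1, Section 2.3)

# If the dataset was simply repeated, the implied number of epochs is:
epochs = total_tokens / dataset_tokens
print(epochs)  # 5.0
```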
gugarosa changed discussion status to closed