Text model not being loaded with Flash Attention 2

#27

Yes, you are correct @starzmustdie.
We left that as an improvement when integrating into HF Transformers, but I just opened a public issue to track it: https://github.com/huggingface/transformers/issues/30394

Hey @VictorSanh

After some time debugging, I found this was the reason I was getting OOMs when trying to fine-tune the model.

I opened a PR which patches this (https://github.com/huggingface/transformers/pull/30395).
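For context, the underlying pattern (a top-level `attn_implementation` request not being propagated to a nested text sub-config, so the text model falls back to the default eager attention) can be sketched with plain Python. The class and attribute names here are illustrative, not the actual Transformers internals:

```python
class TextConfig:
    """Hypothetical nested config for the text backbone."""
    def __init__(self):
        # Defaults to eager attention unless explicitly overridden.
        self._attn_implementation = "eager"


class CompositeConfig:
    """Hypothetical composite (vision + text) model config."""
    def __init__(self, attn_implementation="eager"):
        self._attn_implementation = attn_implementation
        self.text_config = TextConfig()
        # The bug: without this line, the text sub-config keeps "eager"
        # even when the user requested "flash_attention_2" at the top level.
        self.text_config._attn_implementation = attn_implementation


cfg = CompositeConfig(attn_implementation="flash_attention_2")
print(cfg.text_config._attn_implementation)  # flash_attention_2
```

In practice, the user-facing symptom is that calling `from_pretrained(..., attn_implementation="flash_attention_2")` silently leaves the text model on the default attention path, with the higher memory use that entails.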

I would appreciate feedback on necessary changes.

woow let's go!! issue-to-PR time = 13 mins hehe.
cc @ArthurZ @amyeroberts

VictorSanh changed pull request status to closed
