finetuning error

#12
by adonlee - opened

RuntimeError: `<class 'flash_attn.layers.rotary.RotaryEmbedding'>' was not properly set up for sharding by zero.Init(). A subclass of torch.nn.Module must be defined before zero.Init() where an instance of the class is created.

Qwen org

Please update the deepspeed package to a recent version; newer releases resolve this zero.Init() sharding error.
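A minimal sketch of the suggested fix, assuming pip manages your environment (the version printed is just for verification):

```shell
# Upgrade DeepSpeed to the latest release, which is expected to
# handle modules like flash_attn's RotaryEmbedding under zero.Init()
pip install -U deepspeed

# Confirm which version is now installed
pip show deepspeed
```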

jklj077 changed discussion status to closed
