When FA2 is enabled ("FA2 = True" shows up when tuning):

"Unsloth 2024.8: Fast Llama patching. Transformers = 4.44.2.
GPU: NVIDIA GeForce RTX 4090. Max memory: 23.617 GB. Platform = Linux.
Pytorch: 2.4.0. CUDA = 8.9. CUDA Toolkit = 12.1.
Bfloat16 = TRUE. FA [Xformers = 0.0.27.post2. FA2 = True]"

and `use_dora = True` is set in the script, training always fails with "RuntimeError: FlashAttention only support fp16 and bf16 data type".
There also seems to be no way to disable FA2 from within the script; I have tried many FA2-related configurations without success. The only way I have found to use DoRA is to run Unsloth in an environment where FA2 is not installed.
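For reference, this is roughly the setup that triggers the error. It is a minimal sketch, not my exact script; the model name and hyperparameters are placeholders, and only `use_dora = True` is the relevant part:

```python
from unsloth import FastLanguageModel

# Placeholder model and hyperparameters; the error occurs regardless of these.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    dtype = None,          # auto-detects bfloat16 on the RTX 4090
    load_in_4bit = True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha = 16,
    use_dora = True,       # enabling DoRA is what triggers the FA2 dtype error
)
```

With `use_dora = True` removed (or with flash-attn uninstalled), the same script trains without the RuntimeError.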