
Update flash_attention_fwd_benchmark.py #2265

Open
wants to merge 7 commits into main
Conversation

@anmyachev (Contributor) commented Sep 17, 2024

CI:

Error:

torch.OutOfMemoryError: XPU out of memory. Tried to allocate 32.00 GiB. GPU 0 has a total capacity of 64.00 GiB. Of the allocated memory 32.81 GiB is allocated by PyTorch, and 0 bytes is reserved by PyTorch but unallocated. Please use `empty_cache` to release all unoccupied cached memory.

It's strange that the total capacity is 64.00 GiB. I need to understand why (in my understanding, the expected capacity should be higher).

UPD: maybe it's related to the ZE_FLAT_DEVICE_HIERARCHY environment variable (https://spec.oneapi.io/level-zero/latest/core/PROG.html#environment-variables).
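
For reference, a minimal sketch of how the capacity PyTorch reports can be inspected under a given hierarchy setting; it assumes a PyTorch build with XPU support where `torch.xpu.get_device_properties` exposes `total_memory`, and the variable has to be set before the Level Zero runtime initializes (in practice, before anything touches the XPU):

import os

# ZE_FLAT_DEVICE_HIERARCHY is read when Level Zero initializes, so set it
# before any XPU work happens in this process.
os.environ.setdefault("ZE_FLAT_DEVICE_HIERARCHY", "COMPOSITE")  # or "FLAT"

import torch

# Print what each XPU device reports as its total memory under this setting.
for i in range(torch.xpu.device_count()):
    props = torch.xpu.get_device_properties(i)
    print(f"xpu:{i} {props.name}: {props.total_memory / 2**30:.2f} GiB")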

Signed-off-by: Anatoly Myachev <[email protected]>
Signed-off-by: Anatoly Myachev <[email protected]>
Signed-off-by: Anatoly Myachev <[email protected]>
This reverts commit de9335c.
Signed-off-by: Anatoly Myachev <[email protected]>
torch_fn = lambda: torch.nn.functional.scaled_dot_product_attention(
    q, k, v, attn_mask=None, dropout_p=0.0, is_causal=False, scale=sm_scale).to(torch.float32)
atol = 1e-1 if N_CTX == 16384 else 1e-2
benchmark_suit.assert_close(triton_fn(), torch_fn(), atol=atol, rtol=1e-3, err_msg='triton to torch')
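
For context, a self-contained sketch of the validation pattern in this hunk, with made-up small shapes and `torch.testing.assert_close` standing in for the repo's `benchmark_suit.assert_close`; the real `triton_fn` wraps the Triton forward kernel, here it is only a placeholder:

import torch

# Hypothetical shapes, just to make the snippet runnable on its own.
Z, H, N_CTX, D_HEAD = 2, 4, 128, 64
sm_scale = 1.0 / D_HEAD**0.5
q, k, v = (torch.randn(Z, H, N_CTX, D_HEAD) for _ in range(3))

torch_fn = lambda: torch.nn.functional.scaled_dot_product_attention(
    q, k, v, attn_mask=None, dropout_p=0.0, is_causal=False, scale=sm_scale).to(torch.float32)
triton_fn = torch_fn  # placeholder for the Triton kernel wrapper used in the benchmark

# Looser tolerance at very long sequence lengths, mirroring the hunk above.
atol = 1e-1 if N_CTX == 16384 else 1e-2
torch.testing.assert_close(triton_fn(), torch_fn(), atol=atol, rtol=1e-3)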
@anmyachev (Contributor, author) commented Sep 19, 2024
With ZE_FLAT_DEVICE_HIERARCHY=COMPOSITE the available memory is doubled and there is no longer an out-of-memory error with upstream PyTorch (however, this affects performance).
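
For anyone reproducing this, a hedged sketch of launching the benchmark with that setting from Python; the script path is hypothetical and depends on where the benchmark lives in the checkout, and the variable has to be in the child's environment before Level Zero initializes there:

import os
import subprocess
import sys

# Run the benchmark in a child process with the COMPOSITE hierarchy exposed.
env = dict(os.environ, ZE_FLAT_DEVICE_HIERARCHY="COMPOSITE")
subprocess.run([sys.executable, "flash_attention_fwd_benchmark.py"], env=env, check=True)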

@anmyachev marked this pull request as ready for review on September 19, 2024, 18:32