Install failed #20

Open · 1787648106 opened this issue Nov 27, 2023 · 12 comments
1787648106 commented Nov 27, 2023

Hunk #1 FAILED at 816.
1 out of 1 hunk FAILED -- saving rejects to file /work/home/xxx/anaconda3/envs/vllm/lib/python3.8/site-packages/torch/utils/hipify/hipify_python.py.rej.

Is there a restriction on the graphics card model? Thanks.
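For reference, a minimal way to reproduce and inspect this kind of failure (an illustrative sketch, not from the report; assumes GNU patch, and SITE_PKG is just a stand-in variable for the site-packages path shown above):

    # Illustrative only: check whether the hunk would apply, then look at the
    # reject file left behind by the earlier failed attempt.
    SITE_PKG=$(python -c 'import site; print(site.getsitepackages()[0])')
    patch --dry-run "${SITE_PKG}/torch/utils/hipify/hipify_python.py" hipify_patch.patch
    cat "${SITE_PKG}/torch/utils/hipify/hipify_python.py.rej"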

gregoamd commented Nov 28, 2023

I had the same problem trying to apply the included patch to hipify_python.py.
I was trying to build the latest source of flash attention as follows:

  1. Created a virtual environment on an mi210 system with Ubuntu 22.04 python 3.10 and pytorch 2.1.1 (nightly)
  2. git clone https://github.com/ROCmSoftwarePlatform/flash-attention.git
  3. export GPU_ARCHS="gfx90a"
  4. export PYTHON_SITE_PACKAGES=$(python -c 'import site; print(site.getsitepackages()[0])')
  5. patch "${PYTHON_SITE_PACKAGES}/torch/utils/hipify/hipify_python.py" hipify_patch.patch
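A quick sanity check before step 5 (my own sketch, not part of the reported steps) is to confirm which file the patch will modify and what currently sits around the failing hunk:

    # Illustrative only: show the resolved target path and print the lines
    # around the hunk location (816) that the patch expects to match.
    echo "${PYTHON_SITE_PACKAGES}/torch/utils/hipify/hipify_python.py"
    sed -n '810,825p' "${PYTHON_SITE_PACKAGES}/torch/utils/hipify/hipify_python.py"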

On completing the step to patch hipify_python.py with your provided patch "hipify_patch.patch", I get the error reported by 1787648106, as follows:

"patching file /app/llm_fa/lib/python3.10/site-packages/torch/utils/hipify/hipify_python.py
patch unexpectedly ends in middle of line
Hunk #1 FAILED at 816.
1 out of 1 hunk FAILED -- saving rejects to file /app/llm_fa/lib/python3.10/site-packages/torch/utils/hipify/hipify_python.py.rej

Here is the contents of the hipify_python.py.rej file you asked for :
--- /dev/null
+++ /dev/null
@(/app/llm_fa) root@257e93192413:/app/llm_fa/flash-attention# more /app/llm_fa/lib/python3.10/site-packages/torch/utils/hipify/hipify_python.py.rej

Here is the contents of the hipify_python.py.rej file you asked for :
--- /dev/null
+++ /dev/null
@@ -816,10 +816,15 @@
return m.group(0)
# Hipify header file first if needed
if header_filepath not in HIPIFY_FINAL_RESULT:

  •                preprocess_file_and_save_result(output_directory,
    
  •                                                header_filepath,
    
  •                                                all_files, header_include_dirs, stats, hip_clang_launch
    

,

  •                                                is_pytorch_extension, clean_ctx, show_progress)
    
  •                #JCG added skip logic
    
  •                 if "composable_kernel" in header_filepath:
    
  •                     print("Force skipping hipification of CK file: " + header_filepath)
    
  •                     HIPIFY_FINAL_RESULT[header_filepath] = {"hipified_path":header_filepath}
    
  •                 else:
    
  •                     preprocess_file_and_save_result(output_directory,
    
  •                                                     header_filepath,
    
  •                                                     all_files, header_include_dirs, stats, hip_clang_l
    

aunch,

  •                                                     is_pytorch_extension, clean_ctx, show_progress)
               hipified_header_filepath = HIPIFY_FINAL_RESULT[header_filepath]["hipified_path"]
               return templ.format(os.path.relpath(hipified_header_filepath if hipified_header_filepath is
    

not None

dejay-vu commented Nov 28, 2023

Hunk #1 FAILED at 816.

1 out of 1 hunk FAILED -- saving rejects to file /work/home/xxx/anaconda3/envs/vllm/lib/python3.8/site-packages/torch/utils/hipify/hipify_python.py.rej.

Is there a restriction on the graphics card model? Thanks.

Can you provide the hipify_python.py.rej file? Are you building flash attention inside Docker and what is your PyTorch version?

Also what bash commands are you using for patching?

The current FA has been tested on MI200 & MI300. I can build the FA on Navi cards but I am not sure if it would work correctly due to the memory limitation.
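For reference, a minimal way to collect that information (an illustrative sketch, assuming a ROCm build of PyTorch and that rocminfo is installed) might be:

    # Illustrative only: report the PyTorch version, its HIP/ROCm version, and
    # the GPU architecture (gfx target) visible to ROCm.
    python -c "import torch; print(torch.__version__, torch.version.hip)"
    /opt/rocm/bin/rocminfo | grep -i gfx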

gregoamd commented Nov 28, 2023

I found the issue. Your patch relies on a specific version of PyTorch (2.0.1). The hunk failed because it did not match line 816 of hipify_python.py in the later (current) release of PyTorch I was using. I downgraded to PyTorch 2.0.1 and the patch succeeded.

However, there is another build problem: "pip install ." fails with this repeating message: "error: #error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this." during pip setup . The check should be suppressed. Also, this build from source of flash attention is targeting ROCm, not CUDA, in this instance.

Can you please provide a more detailed note on building flash attention from source for ROCm 5.7, including a prerequisite/dependency list? Thank you.
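A hypothetical workaround for the Thrust/CUB check (my own guess, not something suggested in this thread, and unverified) would be to pass the macro named in the error message through to the compiler; whether setup.py forwards these flags to hipcc is an assumption:

    # Unverified assumption: define the macro the error message mentions before building.
    export CFLAGS="${CFLAGS} -DTHRUST_IGNORE_CUB_VERSION_CHECK"
    export CXXFLAGS="${CXXFLAGS} -DTHRUST_IGNORE_CUB_VERSION_CHECK"
    pip install .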

dejay-vu commented

@gregoamd Seems to be a compatibility issue. I will investigate this.

dejay-vu commented Nov 29, 2023

@gregoamd It seems that for PyTorch v2.1+ there is no need to apply the hipify patch; I will change the patch logic to support the newer versions.

For your second issue: By "pip setup .", do you mean "pip install ."? It's quite strange because I don't think flash-attention is dependent on either CUB or hipCUB.

The prerequisites are PyTorch and Composable Kernel, which is a submodule of Flash-Attention for ROCm.
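A sketch of what that version gating could look like on the user side (my own illustration, not the project's actual logic; it reuses the PYTHON_SITE_PACKAGES variable from the steps above and assumes torch.__version__ starts with "major.minor"):

    # Apply the hipify patch only for PyTorch < 2.1, where it is reportedly needed.
    if python -c "import sys, torch; v = tuple(int(x) for x in torch.__version__.split('+')[0].split('.')[:2]); sys.exit(0 if v < (2, 1) else 1)"; then
        patch "${PYTHON_SITE_PACKAGES}/torch/utils/hipify/hipify_python.py" hipify_patch.patch
    else
        echo "PyTorch >= 2.1 detected; skipping hipify_patch"
    fi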

gregoamd commented

Yes, typo by me; it should read pip install .

gregoamd commented Nov 30, 2023

FIXED. I was able to build from source with the latest nightly release of PyTorch with ROCm 5.7 and Python 3.10. I ran this in a conda virtual environment as follows:

  1. conda create --prefix /lab/llm_fa python=3.10
  2. conda activate /lab/llm_fa
  3. pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.7
  4. cd /lab/llm_fa
  5. git clone --recursive https://github.com/ROCmSoftwarePlatform/flash-attention.git
  6. export GPU_ARCHS="gfx90a"
  7. cd flash-attention
  8. export PYTHON_SITE_PACKAGES=$(python -c 'import site; print(site.getsitepackages()[0])')
  9. Add a missing module "packaging" -> pip install packaging
  10. pip install .

Ran with vLLM ported to ROCm and achieved an ~3.5X throughput increase using flash attention on MI210.
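An illustrative post-install sanity check (my own sketch, not part of the reported steps; assumes a ROCm build of PyTorch):

    # Confirm the ROCm PyTorch build and that the flash_attn module imports.
    python -c "import torch; print(torch.__version__, torch.version.hip)"
    python -c "import flash_attn; print('flash_attn imported OK')"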

ikcikoR commented Jan 9, 2024

I'm running Python 3.10.13 in the rocm5.7 PyTorch Docker container, where Stable Diffusion works properly with my GPU (gfx1100). I tried applying both the main-branch hipify_patch.patch and hipify_python.patch from the navi_support branch, but neither worked. To be fair, I didn't override GPU_ARCHS, but that shouldn't make a difference here, since from what you're saying you did this after installing PyTorch? @howiejayz any idea what might be causing it? Also, what are the recommended versions of relevant software/libraries, or commits to this branch, that I should try this with, if any?

ikcikoR commented Jan 9, 2024

@gregoamd could you check what your currently installed torch version is? I've just realized the nightly most likely changed since you wrote that comment, so that could be one of the potential reasons.

Edit: My current torch package version is 2.3.0.dev20240109+rocm5.7
Edit 2: Guess I'll try compiling PyTorch from source, from whatever pull request was the most recent one at the time gregoamd wrote that comment.
Edit 3: It failed to compile and my brain is too small to even begin figuring out why; that thing took like 10 minutes to git clone alone.

ikcikoR commented Feb 2, 2024

Update: Tried it today and the hipify patch still doesn't work, BUT everything works without applying it.
Edit: By everything I mean I compiled this branch successfully and ran a program using it.

Running this Mixtral finetune via tabbyAPI on an RX 7900 XTX at around 50 t/s with 8k context, I believe. I'll experiment further later, but I think this issue is as good as closed, more or less? I'm assuming whatever the patch was meant to fix already got fixed upstream? Haven't looked into the code since I'm rather new to all of this, but yeah, it works now.

michaelfeil commented

Would it be possible to update the readme with the information about the hipify_patch?

ghost commented Jun 6, 2024

I also got the hipify_patch error on MI210 with Ubuntu 22.04, Python 3.10, and PyTorch 2.4.0.dev20240520+rocm6.1, though I am able to install flash-attn successfully.

If the hipify_patch is not necessary, shall we remove the description from the README so that users are not confused and don't waste time investigating the error?
