LoRA support for image classification and segmentation #2052

Open
namrahrehman opened this issue Sep 6, 2024 · 4 comments
Labels
documentation Improvements or additions to documentation

Comments

@namrahrehman

I have a question about LoRA support for image classification and segmentation. I understand that LoRA is supported for both, as shown in the following tutorials:
https://github.com/huggingface/peft/blob/main/examples/semantic_segmentation/semantic_segmentation_peft_lora.ipynb
https://huggingface.co/docs/peft/main/en/task_guides/image_classification_lora

Are LoHa, LoKr, AdaLoRA, and QLoRA also supported for image classification and segmentation, or can we only use standard LoRA?

I could not find a definite answer in the official documentation.

@BenjaminBossan
Member

AdaLoRA does not work. LoHa and LoKr support Conv2d layers, so they should work. However, they don't support quantization, so LoRA is the most feature-rich method when it comes to image models.

I agree that this information is not easy to find; I'd have to think a bit about how best to document it. From a user's perspective, the easiest way is probably to just try it out.

BenjaminBossan added the documentation label on Sep 6, 2024
@namrahrehman
Author

Okay, and by "image models" do you also mean ViTs (DINO, Swin, DeiT, etc.)? My experiments use ViT-based backbones.

@BenjaminBossan
Member

Ah yes, sorry, vision transformers should generally work, as they use linear layers. All methods except the prompt-tuning ones implement adapters for linear layers, so even AdaLoRA should work there.
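
To make the statements in this thread easier to scan, they can be written down as a small lookup table. This is only a sketch of what was said here (as of September 2024), not an official support matrix: the `Support` class and `candidates` helper are hypothetical names, the AdaLoRA Conv2d entry is inferred from "does not work" above, and `None` marks a combination the thread does not address.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Support:
    linear: bool
    conv2d: bool
    quantization: Optional[bool]  # None = not discussed in this thread


# Encodes only what this thread states or implies; verify against your PEFT version.
SUPPORT = {
    "LoRA": Support(linear=True, conv2d=True, quantization=True),
    "LoHa": Support(linear=True, conv2d=True, quantization=False),
    "LoKr": Support(linear=True, conv2d=True, quantization=False),
    "AdaLoRA": Support(linear=True, conv2d=False, quantization=None),
}


def candidates(layer_kinds: Iterable[str], quantized: bool = False) -> List[str]:
    """Return methods that cover every targeted layer kind ("linear", "conv2d")."""
    kinds = set(layer_kinds)
    result = []
    for name, support in SUPPORT.items():
        if not all(getattr(support, kind) for kind in kinds):
            continue
        # Treat "not discussed" the same as "not confirmed" for quantized models.
        if quantized and support.quantization is not True:
            continue
        result.append(name)
    return result


vit_methods = candidates({"linear"})            # ViT-style: linear targets only
cnn_methods = candidates({"linear", "conv2d"})  # models with Conv2d targets
quantized_cnn_methods = candidates({"conv2d"}, quantized=True)
```

For a ViT backbone where only linear layers are targeted, every method in the table is a candidate; once Conv2d layers or quantization enter the picture, the list narrows as described above.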

@namrahrehman
Author

Thanks for your response, @BenjaminBossan. I will get back to you with an implementation, and then we can discuss it further.

2 participants