Request to integrate Structure Sparsity-based PEFT (S2FT) #2329
Comments
Thank you for presenting this novel PEFT technique. I skimmed the paper and code, and this could indeed be a nice addition to PEFT. Feel free to open a PR with your contribution. As a tip:
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
@Hanyuezhuohua do you still intend to work on this?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Feature request
This request proposes to integrate S2FT, a pure structure sparsity-based PEFT method that concurrently achieves state-of-the-art fine-tuning performance, training efficiency, and inference scalability. More information about our NeurIPS paper, of which I'm the first author, can be found here: https://infini-ai-lab.github.io/S2FT-Page/. Here is our code for the implementation: https://github.com/Infini-AI-Lab/S2FT.
Motivation
As far as I know, S2FT is the first method to offer efficient and flexible sparsity-based PEFT for LLMs (previous work only adds sparsity to LoRA or relies on layerwise freezing). We'd like to highlight several important features of S2FT:
Model Versatility: Our structured sparsity is designed around coupled structures, which commonly exist in LLMs, VLMs, CNNs, and GNNs. Therefore, our method should work for many different architectures.
Generalization Ability: When evaluated on more recent models such as LLaMA-3-8B, we observe that our method can outperform both LoRA and full fine-tuning, likely because we only modify a small fraction of the original parameters and can therefore preserve most of the abilities acquired during pre-training.
Based on this information, although S2FT has only just been released, we think it is a new kind of PEFT method with strong potential, and integrating it should also benefit future sparsity-based PEFT methods (see the sketch below for the core idea).
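To make the core idea concrete, here is a minimal, self-contained PyTorch sketch of structured-sparsity fine-tuning on coupled channels: only a small set of intermediate channels of an MLP block is trained, by updating the corresponding columns of its down-projection while everything else stays frozen. This is not our actual S2FT implementation; the layer sizes, the 5% selection ratio, and all variable names are illustrative placeholders.

```python
# Minimal sketch: fine-tune only selected coupled channels, freeze the rest.
# Hypothetical shapes/names; not the S2FT codebase.
import torch
import torch.nn as nn

torch.manual_seed(0)

hidden, intermediate = 64, 256
mlp_up = nn.Linear(hidden, intermediate, bias=False)    # stays fully frozen
mlp_down = nn.Linear(intermediate, hidden, bias=False)  # partially trainable

# Freeze everything by default, then allow gradients only on the down-projection.
for p in (*mlp_up.parameters(), *mlp_down.parameters()):
    p.requires_grad = False
mlp_down.weight.requires_grad = True  # gradients flow here, but are masked below

# Select ~5% of the intermediate channels; only the matching columns of
# mlp_down.weight (shape: hidden x intermediate) will actually be updated.
num_selected = max(1, int(0.05 * intermediate))
selected = torch.randperm(intermediate)[:num_selected]
mask = torch.zeros_like(mlp_down.weight)
mask[:, selected] = 1.0

# Zero out gradients for all non-selected columns before each optimizer step.
mlp_down.weight.register_hook(lambda grad: grad * mask)

# weight_decay=0.0 so decoupled weight decay does not touch the masked columns.
optimizer = torch.optim.AdamW([mlp_down.weight], lr=1e-4, weight_decay=0.0)

x = torch.randn(8, hidden)
loss = mlp_down(torch.relu(mlp_up(x))).pow(2).mean()
loss.backward()
optimizer.step()

print("trainable fraction of intermediate channels:", num_selected / intermediate)
```

Because only whole columns (channels) are updated, the fine-tuned weights can be merged back into the dense matrix with no extra inference cost, which is where the inference-scalability benefit comes from.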
Your contribution
I will try to write most of the code for this new PEFT method based on the current PEFT codebase.
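If it helps the discussion, here is a purely hypothetical sketch of what a config for this method could look like inside PEFT, loosely modeled on existing tuner configs such as LoraConfig. The class name S2FTConfig and both fields are placeholders I'm proposing for illustration, not an existing or agreed-upon API.

```python
# Hypothetical sketch of a possible S2FT config for PEFT; class and field names
# below are placeholders for discussion, not an existing API.
from dataclasses import dataclass, field
from typing import List, Optional

from peft import PeftConfig  # existing PEFT base config class


@dataclass
class S2FTConfig(PeftConfig):
    # hypothetical: fraction of coupled channels/heads to unfreeze in each targeted module
    sparsity_ratio: float = field(default=0.05)
    # hypothetical: which modules the structured selection applies to
    target_modules: Optional[List[str]] = field(default=None)
    # a real integration would also register a new PeftType and a corresponding tuner class


# Example of how such a config might be constructed (illustrative only):
config = S2FTConfig(sparsity_ratio=0.05, target_modules=["v_proj", "o_proj"])
print(config)
```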