Hessian Calculation #29

Open
lucasliunju opened this issue Sep 25, 2024 · 1 comment
@lucasliunju

May I ask how to calculate the Hessian of each layer for popular LLMs such as Llama? Alternatively, could you suggest a popular repo for Hessian calculation?

Thank you very much for your great work!

@zyushun
Owner

zyushun commented Oct 16, 2024

Hi @lucasliunju, sorry for the delay!

I talked with my advisors about sharing the code for Hessian calculation. We apologize, but we need to keep this code private for now: we still have several ongoing works on this topic, and the Hessian calculation code serves as their foundation.
Nevertheless, we are very happy to share some implementation guidance to help you reproduce the results you are interested in.

Step 1: Use the codebase by Andrej Karpathy (https://colab.research.google.com/drive/1SiF0KZJp75rUeetKOWqpsA8clmHP6jMg?usp=sharing). Enlarge the vocabulary size from 2 to 4, 8, or larger, and change the sentence-level loss to a token-level pre-training loss.

Step 2: Calculate the Hessian with two backpropagation passes. For the first pass, remember to set retain_graph=True. For the second pass, you also need to define a set of new losses to obtain each entry of the Hessian. You might get some guidance from this repo, which calculates the Hessian-vector product: https://github.com/zyushun/hessian-spectrum. Calculating the full Hessian is a natural extension of the Hessian-vector product calculation in that repo.
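
The two steps above can be sketched roughly as follows. This is a minimal illustration in PyTorch, not the private code mentioned above and not the Colab model: the toy network, the sizes, and every name (`token_level_loss`, `flat_grad`, `hessian`) are assumptions made for the example. One PyTorch detail worth noting: the first pass has to be run with `create_graph=True`, which keeps the graph and also makes the gradient itself differentiable. Each entry of the flattened gradient then acts as a new scalar "loss" whose gradient is one row of the Hessian.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical toy setup: a tiny next-token predictor on one-hot inputs.
vocab_size, d_model, seq_len, batch = 4, 3, 5, 2
model = nn.Sequential(
    nn.Linear(vocab_size, d_model),
    nn.Tanh(),
    nn.Linear(d_model, vocab_size),
).double()  # float64 keeps the numerical checks below tight

tokens = torch.randint(0, vocab_size, (batch, seq_len + 1))
inputs = F.one_hot(tokens[:, :-1], vocab_size).double()
targets = tokens[:, 1:]


def token_level_loss():
    # Token-level loss: cross-entropy averaged over every position,
    # rather than a single loss per sentence.
    logits = model(inputs)  # (batch, seq_len, vocab_size)
    return F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))


params = [p for p in model.parameters() if p.requires_grad]
sizes = [p.numel() for p in params]
n = sum(sizes)

# First pass: gradient of the loss, built so it can be differentiated again.
loss = token_level_loss()
grads = torch.autograd.grad(loss, params, create_graph=True)
flat_grad = torch.cat([g.reshape(-1) for g in grads])

# Second pass: treat each gradient entry as a new scalar loss.
# Its gradient with respect to all parameters is one Hessian row.
rows = []
for i in range(n):
    row = torch.autograd.grad(
        flat_grad[i], params, retain_graph=True, allow_unused=True
    )
    rows.append(torch.cat([
        (r if r is not None else torch.zeros_like(p)).reshape(-1)
        for r, p in zip(row, params)
    ]))
hessian = torch.stack(rows)  # (n, n)

# Per-layer view: the diagonal block belonging to the first parameter tensor.
first_block = hessian[:sizes[0], :sizes[0]]

# Hessian-vector product from the same graph, for comparison.
v = torch.randn(n, dtype=torch.float64)
hv = torch.cat([
    h.reshape(-1)
    for h in torch.autograd.grad(flat_grad @ v, params, retain_graph=True)
])
```

The loop costs one backward pass per parameter, so this only scales to very small models. For larger ones, the Hessian-vector product at the end of the sketch is the practical building block, since it needs a single extra backward pass per vector.
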
Thanks again for your interest! Hope the above suggestions will help you a bit.

Sincerely,
Yushun
