May I ask how to calculate the Hessian of each layer for popular LLMs such as LLaMA? Or could you suggest a popular repo for Hessian calculation?
Thank you very much for your great work!
I talked with my advisors about sharing the code for Hessian calculation. We apologize, but we need to keep this code private for now: we still have several ongoing works on this topic, and the Hessian calculation code serves as their foundation.
Nevertheless, we are very happy to share some implementation guidance to help you reproduce the results that you are interested in.
Step 2: Calculate the Hessian with two backpropagation passes. For the first pass, remember to set retain_graph = True. For the second pass, you also need to define a set of new losses to get each entry of the Hessian. You might get some guidance from this repo, which calculates the Hessian-vector product: https://github.com/zyushun/hessian-spectrum . Calculating the full Hessian is a natural extension of the Hessian-vector product calculation in this repo.
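For readers who want a starting point, here is a minimal sketch of the two-pass idea in PyTorch. It is not the authors' private code and is not taken from the linked repo. The helper names `layer_hessian` and `hvp`, the toy linear layer, and the squared-error loss are all assumptions made for illustration. One PyTorch detail worth noting: the first pass needs `create_graph=True` (which also retains the graph by default) so that the gradient itself can be differentiated in the second pass.

```python
import torch


def layer_hessian(loss, param):
    """Dense Hessian of a scalar loss w.r.t. one parameter tensor (hypothetical helper)."""
    # First pass: create_graph=True makes the gradient itself differentiable.
    (grad,) = torch.autograd.grad(loss, param, create_graph=True)
    flat_grad = grad.reshape(-1)
    rows = []
    for i in range(flat_grad.numel()):
        # Second pass: each gradient entry acts as a new scalar "loss".
        # Its gradient w.r.t. param is row i of the Hessian.
        (row,) = torch.autograd.grad(flat_grad[i], param, retain_graph=True)
        rows.append(row.reshape(-1))
    return torch.stack(rows)


def hvp(loss, param, vec):
    """Hessian-vector product H @ vec without forming H (hypothetical helper)."""
    (grad,) = torch.autograd.grad(loss, param, create_graph=True)
    # Differentiating <grad, vec> w.r.t. param yields H @ vec.
    (hv,) = torch.autograd.grad((grad * vec).sum(), param, retain_graph=True)
    return hv


# Toy check (an assumption, not an LLM): one linear layer with squared error,
# where the Hessian w.r.t. the weight is known in closed form.
torch.manual_seed(0)
layer = torch.nn.Linear(3, 2, bias=False).double()
x = torch.randn(5, 3, dtype=torch.double)
y = torch.randn(5, 2, dtype=torch.double)
loss = 0.5 * ((layer(x) - y) ** 2).sum()

H = layer_hessian(loss, layer.weight)
# Closed form for this loss: block-diagonal with one X^T X block per output unit.
expected = torch.kron(torch.eye(2, dtype=torch.double), x.T @ x)
```

The dense Hessian of a tensor with n entries has n^2 entries and takes n second-pass calls, so for a real LLM layer this loop is only practical on a small block of parameters; the Hessian-vector product avoids forming the matrix at all.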
Thanks again for your interest! Hope the above suggestions will help you a bit.