
Question/Idea: Automatic Gradient Clearing #78

Open
AhmedThahir opened this issue Jul 25, 2024 · 0 comments

AhmedThahir commented Jul 25, 2024

Hi Prof Karpathy,
I wanted to open a discussion to ask this question, but this repository does not have discussions enabled, so I am filing an issue instead. I was watching https://youtu.be/VMj-3S1tku0 and got an idea.

Context

This is in reference to the step of clearing accumulated gradients at:

" model.zero_grad()\n",

Problem

People tend to forget to clear the accumulated gradients before calling backward() on the loss, so gradients from earlier iterations leak into the current update.
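
To make the failure mode concrete, here is a minimal sketch using micrograd's Value (from micrograd.engine); the toy expression is my own, but the accumulation behaviour is how Value.backward() works:

from micrograd.engine import Value

x = Value(2.0)
y = x * x          # dy/dx = 2x = 4
y.backward()
print(x.grad)      # 4.0

y = x * x          # rebuild the graph for a second pass
y.backward()       # forgot to zero the gradient first
print(x.grad)      # 8.0 -- the stale 4.0 was added on top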

Idea

Create a way to bind the loss function to the network once, so that the accumulated gradients are cleared automatically whenever the backward pass is performed.

Advantage

We can perform the backward pass whenever, wherever, and as many times as we want without worrying about stale accumulated gradients.

Pseudocode

class Loss(Value):
  def __init__(self, bound_network):
    super().__init__(0.0)
    self.bound_network = bound_network

  def __call__(self, batch_size=None):
    # loss function definition
    self.data = data_loss + reg_loss

  def backward(self):
    # clear the bound network's gradients before backpropagating
    self.bound_network.zero_grad()
    super().backward()

total_loss = Loss(
  bound_network = model
)

for k in range(100):
  # ...

  # no explicit model.zero_grad() needed: since total_loss is bound to the
  # network, backward() clears the accumulated gradients automatically
  total_loss.backward()

  # ...
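
Since the pseudocode above leaves data_loss and reg_loss undefined, here is a self-contained runnable sketch of the same idea against micrograd's actual API (MLP from micrograd.nn). The mean-squared-error loss, the toy data, and the use of composition instead of subclassing Value are all illustrative assumptions on my part:

from micrograd.nn import MLP

class BoundLoss:
  def __init__(self, bound_network):
    self.bound_network = bound_network
    self.value = None  # Value node produced by the most recent __call__

  def __call__(self, xs, ys):
    # illustrative mean-squared-error loss over the batch
    preds = [self.bound_network(x) for x in xs]
    self.value = sum((p - y) ** 2 for p, y in zip(preds, ys))
    return self.value

  def backward(self):
    # the point of the idea: clear stale gradients automatically
    self.bound_network.zero_grad()
    self.value.backward()

model = MLP(3, [4, 4, 1])
total_loss = BoundLoss(bound_network=model)

xs = [[2.0, 3.0, -1.0], [1.0, -1.0, 0.5]]  # toy inputs
ys = [1.0, -1.0]                           # toy targets

for k in range(100):
  loss = total_loss(xs, ys)
  total_loss.backward()  # no explicit model.zero_grad() needed
  for p in model.parameters():
    p.data -= 0.05 * p.grad

Composition sidesteps a subtlety with the subclass approach: each forward pass builds a fresh Value graph, so mutating self.data on a long-lived Loss node would not reconnect it to the new graph; holding the latest Value and delegating backward() to it avoids that.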

Questions

  1. Is my understanding of the problem correct?
  2. Is this change value-adding?
  3. Is the above pseudocode logically correct?
  4. If the answers to all of the above are yes, I could work on a PR with your guidance.