Hi Prof Karpathy,
I wanted to open a discussion to ask this question, but Discussions are not enabled on this repo. I was watching https://youtu.be/VMj-3S1tku0 and got an idea.
Context
This is in reference to the step of clearing accumulated gradients at micrograd/demo.ipynb, line 265 in c911406.
Problem
People tend to forget to clear the accumulated gradients before calling backward on the loss.
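To make the failure mode concrete, here is a minimal sketch using micrograd's Value from engine.py, showing that gradients accumulate across backward calls when they are not cleared:

from micrograd.engine import Value

x = Value(2.0)
y = x * x        # dy/dx = 2x = 4 at x = 2
y.backward()
print(x.grad)    # 4.0
y.backward()     # backward again, without zeroing the grads first
print(x.grad)    # 8.0 -- the stale gradient was accumulated into, not replaced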
Idea
Create a way to bind the loss function to the network once, and then automatically clear the accumulated gradients when performing the backward pass.
Advantage
We can perform the backward pass whenever, wherever, and as many times as we want, without worrying about accumulated gradients.
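For contrast, the training loop in demo.ipynb currently looks roughly like this (paraphrased; exact names and the learning-rate schedule may differ slightly), where the manual model.zero_grad() is exactly the step that is easy to forget:

for k in range(100):
    total_loss, acc = loss()   # forward pass
    model.zero_grad()          # manual step: easy to forget!
    total_loss.backward()
    learning_rate = 1.0 - 0.9 * k / 100
    for p in model.parameters():
        p.data -= learning_rate * p.grad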
Pseudocode
class Loss(Value):
    def __init__(self, bound_network):
        self.bound_network = bound_network

    def __call__(self, batch_size=None):
        # loss function definition
        self.data = data_loss + reg_loss

    def backward(self):
        # clear gradients of the bound network before backpropagating
        self.bound_network.zero_grad()
        super().backward()
total_loss = Loss(bound_network=model)
for k in range(100):
    # ...
    # model.zero_grad()  # not needed anymore: since total_loss is bound to the
    # network, it automatically performs model.zero_grad() before the backward pass
    total_loss.backward()
    # ...
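One wrinkle with subclassing Value is that micrograd rebuilds the loss from a fresh computation graph every iteration, so the bound object cannot be created once and backpropagated forever. Here is a hedged alternative sketch that keeps the "bind once, never forget zero_grad" property while working with micrograd's actual API; BoundLoss and loss_fn are hypothetical names I am introducing for illustration, not part of micrograd:

class BoundLoss:
    """Wraps a loss-producing callable and a network; zeroes grads on backward."""
    def __init__(self, network, loss_fn):
        self.network = network    # any micrograd nn.Module (e.g. an MLP)
        self.loss_fn = loss_fn    # callable returning a micrograd Value
        self.value = None

    def __call__(self, *args, **kwargs):
        # run the forward pass and remember the resulting loss Value
        self.value = self.loss_fn(*args, **kwargs)
        return self.value

    def backward(self):
        # clear stale grads on the bound network, then backpropagate
        self.network.zero_grad()
        self.value.backward()

total_loss = BoundLoss(network=model, loss_fn=loss)
for k in range(100):
    total_loss()            # forward pass, rebuilds the graph
    total_loss.backward()   # zero_grad happens automatically
    # ... parameter update ...

The design choice here is composition over inheritance: the wrapper owns a reference to the network and the freshly computed loss Value, rather than trying to be a Value itself.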
Questions
Is my understanding of the problem correct?
Would this change add value?
Is the above pseudocode logically correct?
If the answer to all the above are yes, I could work on a PR with your guidance.