Upperbound results #34

Open
jmin0530 opened this issue Mar 21, 2023 · 6 comments

@jmin0530

Hello!

I am confused about how to interpret the Upperbound (Joint) results. To obtain the Upperbound (Joint) result on CIFAR-100, I am running the code with the approach set to "joint" in the script.

There are four result files for each seed: avg_accs_tag, avg_accs_taw, acc_tag, and acc_taw. It is unclear to me which of these four gives the Upperbound (Joint) result reported in your paper (Fig. 8).

Thank you.

@mmasana
Owner

mmasana commented Mar 21, 2023

Hi @jmin0530 ,

The joint training is an incremental one, meaning that the network goes through a training session at each task with access to all data from previous tasks. It basically emulates being able to store everything in the exemplar memory, and thus serves as an upper-bound baseline.
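
In pseudocode, the idea is roughly the following (a minimal sketch, not the actual FACIL code; the task stream and the train/evaluate helpers are hypothetical placeholders):

```python
# Sketch of the Joint (upper-bound) baseline. Placeholder helpers, not FACIL code.
seen_data = []
for t, task_data in enumerate(task_stream):
    seen_data.extend(task_data)     # keep ALL data from tasks 0..t
    train(model, seen_data)         # a training session with everything seen so far
    evaluate(model, tasks_up_to=t)  # evaluated like any other incremental approach
```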

The metrics from the files that contain the word tag are task-agnostic (the task-ID is not known at test time; the class-IL setting). The word taw means task-aware (the task-ID is known at test time; the task-IL setting). Whenever you see avg it is a plain average, while wavg is a weighted average (weighted by the number of classes in each task). All of the ones you mention are called acc because they report the accuracy metric (number of correct predictions divided by the total number of test samples).

The survey paper mainly covers the class-IL (task-agnostic) scenario. Figure 8 in particular shows the average accuracy over all classes from all tasks learned so far. That means that classes which have not been learned yet are not included in the average, since including them would not make sense in an incremental learning scenario.
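
As an illustration, the average plotted after learning task t can be computed like this (a sketch, not the FACIL code; `acc` is a hypothetical per-task accuracy matrix where row t holds the accuracies measured after training on task t):

```python
import numpy as np

# acc[t, u]: task-agnostic (tag) accuracy on task u, measured after training task t
# num_classes[u]: number of classes in task u
def avg_acc(acc, t):
    # plain average over tasks learned so far (non-learned tasks excluded)
    return acc[t, : t + 1].mean()

def wavg_acc(acc, num_classes, t):
    # same, but weighted by the number of classes in each task
    w = np.asarray(num_classes[: t + 1], dtype=float)
    return (acc[t, : t + 1] * w).sum() / w.sum()
```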

Hope that helps!

@jmin0530
Author

Thank you for your reply, but I want to make sure I understand correctly. My "joint" approach class-incremental result (avg_accs_tag) at seed 0 is below.

0.806000 0.687500 0.669333 0.645000 0.645200 0.678000 0.655857 0.664750 0.671556 0.657800

Is this the Upperbound (Joint) result at seed 0, as plotted in Fig. 8?

@mmasana
Owner

mmasana commented Mar 23, 2023

The results for CIFAR-100 (10/10) (Figure 8 left) for joint training for the first 3 seeds:

seed 0: 0.788000 0.699500 0.726000 0.737750 0.729400 0.701333 0.694286 0.657750 0.669889 0.670000
seed 1: 0.863000 0.739000 0.700333 0.701250 0.723000 0.675833 0.705429 0.683875 0.679778 0.685100
seed 2: 0.839000 0.755500 0.740333 0.744500 0.741800 0.725333 0.720286 0.705750 0.678111 0.676000

Some comments:

  • Note that even with the same fixed seed, you might get slightly different results depending on the machine you use.
  • For Joint, we did not define an exemplar memory of 2,000 samples since, by definition, it already has access to all data from all tasks seen so far.
  • Those results were obtained by using the Continual Hyperparameter Framework (gridsearch options).
  • What you see in Figure 8 is an average over 10 seeds.
  • The average standard deviation across all seeds from Joint for this scenario (CIFAR-100 10/10) is around 2.2.

Since your results seem a bit lower than these seeds, my guess would be that you did not set up the gridsearch. If you did, please provide some more context so we can figure out where the difference comes from.
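
In case it helps you compare, the aggregation over seeds is just this (a sketch; the file name is hypothetical, and the array holds one row of avg_accs_tag per seed):

```python
import numpy as np

acc = np.loadtxt("joint_avg_accs_tag.txt")  # hypothetical file; shape (num_seeds, num_tasks)

per_task_mean = acc.mean(axis=0)     # the curve shown in Figure 8
per_task_std = acc.std(axis=0)       # std across seeds, per task
avg_std = per_task_std.mean() * 100  # in percentage points; around 2.2 for this scenario
```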

@jmin0530
Author

My results for CIFAR-100 (10/10) for joint training for the 10 seeds:

seed 0: 0.806000 0.687500 0.669333 0.645000 0.645200 0.678000 0.655857 0.664750 0.671556 0.657800
seed 1: 0.805000 0.766500 0.720000 0.732500 0.733000 0.703000 0.709000 0.685375 0.678889 0.664400
seed 2: 0.744000 0.692000 0.735333 0.697250 0.704600 0.704833 0.688571 0.678625 0.665556 0.665100
seed 3: 0.842000 0.738000 0.735333 0.719750 0.711800 0.709333 0.696857 0.679625 0.684333 0.661500
seed 4: 0.834000 0.651000 0.671333 0.662500 0.676800 0.673000 0.670286 0.642750 0.653222 0.641200
seed 5: 0.767000 0.665500 0.704333 0.724750 0.708800 0.728000 0.713429 0.688625 0.675111 0.657000
seed 6: 0.808000 0.677500 0.731667 0.736750 0.722200 0.711500 0.700429 0.669125 0.676667 0.674500
seed 7: 0.109000 0.548500 0.705000 0.711250 0.714800 0.697500 0.708000 0.685875 0.677889 0.663400
seed 8: 0.858000 0.725500 0.723667 0.720000 0.720400 0.715333 0.718571 0.681125 0.668000 0.652500
seed 9: 0.806000 0.712000 0.745000 0.702750 0.714800 0.699500 0.698714 0.687250 0.652333 0.664800
average: 0.737900 0.686400 0.714100 0.705250 0.705240 0.702000 0.695971 0.676313 0.670356 0.660220

  • My results were obtained by using the Continual Hyperparameter Framework.
  • My average standard deviation across seeds is 0.846.
  • My seed 7 task 0 result is wrong, but I can't figure out why this happened, so I am attaching the seed 7 task 0 stdout below.
    [screenshots of seed 7 task 0 stdout]
  • My GPU is an NVIDIA GeForce RTX 3090, and my CUDA version is 11.4.
  • My torch version is 1.11.0+cu113, and my torchvision version is 0.12.0+cu113.
  • My experiment arguments:
    [screenshot of experiment arguments]

@mmasana
Owner

mmasana commented Mar 27, 2023

Looking at the arguments, the difference I see is that you have "num_exemplars": 2000. As I mentioned, since Joint already has access to all the images, I set "num_exemplars": 0. Maybe the difference comes from that. I also have the exemplar sampling set to random, but that has no effect since there is no exemplar memory for Joint.
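
For reference, I pass the exemplar settings on the command line rather than editing any defaults, with something like `python3 -u src/main_incremental.py --approach joint --num-exemplars 0` (from memory, so double-check the exact flag names against src/main_incremental.py).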

The error from seed 7 could be anything. It happens rarely, but I think it is just that some combination of the initialization and batch order makes the network reach an unstable point it cannot recover from. As you can see, the loss never really moves much after the first few epochs. I do not have much insight into those cases. What I do is run one more seed and ignore this one, since it is clearly an unexpected outlier.

Setting aside the wrong result from seed 7, your average compares to ours like this:

facil: 80.7, 69.1, 72.0, 70.7, 71.0, 69.5, 69.4, 67.3, 66.5, 66.3
you: 80.8, 70.2, 71.5, 70.5, 70.4, 70.3, 69.5, 67.5, 66.9, 66.0 (no seed 7)
you: 73.8, 68.6, 71.4, 70.5, 70.5, 70.2, 69.6, 67.6, 67.0, 66.0 (with seed 7)

which is very similar, considering that the standard deviation is 2.2 for Joint.
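
To be explicit about the computation, dropping the outlier is just a row mask over a seeds-by-tasks array like the one above (a sketch; the file name is hypothetical):

```python
import numpy as np

acc = np.loadtxt("joint_avg_accs_tag.txt")      # hypothetical file; shape (10, 10)
mask = np.arange(acc.shape[0]) != 7             # exclude seed 7
mean_no_outlier = acc[mask].mean(axis=0) * 100  # in %, comparable to the rows above
```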

@arnabphoenix

Respected Sir @jmin0530,
Can you please tell me how you changed the num_exemplars argument? It is present in the exemplar.py file, but when I change it there and run main.py, it still reports that it is running with the default exemplars. Could you please tell me how to run main.py with a modified exemplars argument?
