
'Non-existent config key: MODEL.ROI_HEADS.OUTPUT_LAYER' #2

Open
ChenZhiMing66 opened this issue May 25, 2023 · 7 comments

Comments

@ChenZhiMing66

Hi. I have followed the TFA installation and dataset preparation, but I get this error when I run the code. I would be very grateful if you could tell me the reason.
[screenshot of the error]

@Phoenix-V
Owner

Hi. Could you please check or let me know which config file you are running, i.e., what "args.config" is?

Regarding the missing key: MODEL.ROI_HEADS.OUTPUT_LAYER is used to select the layer FastRCNNOutputETFLayers defined in https://github.com/Phoenix-V/DiGeo/blob/master/fsdet/modeling/roi_heads/fast_rcnn.py#L617

The default value of this argument is defined here:
https://github.com/Phoenix-V/DiGeo/blob/master/fsdet/config/defaults.py#L13

Please check whether your process successfully reaches both of these places.
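
For reference, here is a minimal sketch (assuming the standard detectron2/yacs pattern, not the repo's exact code) of why the error appears: yacs raises "Non-existent config key" when a YAML file sets a key that was never registered on the default config, so defaults.py has to add the key before any YAML is merged. The YAML path below is only a placeholder.

```python
from detectron2.config import get_cfg

# Minimal sketch: register the project-specific key on the defaults first,
# otherwise merging a YAML that sets it raises "Non-existent config key".
cfg = get_cfg()                                             # detectron2 built-in defaults
cfg.MODEL.ROI_HEADS.OUTPUT_LAYER = "FastRCNNOutputLayers"   # register the extra key with a default value
cfg.merge_from_file("configs/example_digeo_config.yaml")    # placeholder path; the key is now legal
```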

@ChenZhiMing66
Author

Thank you for your reply. I haven't modified either of the two files you mentioned, but the code still won't run.
I ran the "prior.sh" file, and "args.config" is:
[screenshot of the config path]

@Phoenix-V
Owner

Thank you for your feedback. If you are sure that you are at the root of this repo (i.e., outside the script folder) and you ran the sh file with the command provided before, I would recommend checking whether detectron2 is installed properly, and re-installing it following the instructions in TFA. I remember getting stuck on a similar issue when I first ran fsdet, and it was solved after I re-installed everything.

Meanwhile, you can also check whether the code reaches https://github.com/Phoenix-V/DiGeo/blob/master/fsdet/config/config.py#L75 correctly. If not, I would recommend checking the detectron2 documentation for a solution. When you debug, you can run on a single GPU so that you can use pdb.set_trace() to get more information.
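
As a quick sanity check (a general suggestion, not code from this repo), you can confirm that detectron2 imports cleanly and that the GPU is visible before stepping into the config code:

```python
# Sanity check: verify the detectron2 installation before debugging the config.
import torch
import detectron2

print("detectron2:", detectron2.__version__)
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

# On a single GPU you can then drop a breakpoint, e.g. inside
# fsdet/config/config.py, to inspect how the YAML is merged:
# import pdb; pdb.set_trace()
```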

@ChenZhiMing66
Author

Thank you for your patient answer. I successfully ran the code and also ran several sets of experiments, but I have some questions. I tried VOC dataset split 1, and my results are far lower than those reported in your paper, e.g., nAP50 (26.1 vs 37.9) and AP50 (66.2 vs 69.7) in 1-shot, nAP50 (40.1 vs 48.5) and AP50 (70.5 vs 72.4) in 3-shot, and nAP50 (53.5 vs 58.6) and AP50 (72.8 vs 75.4) in 5-shot. I have tried more experiments, but almost every one is 4-5 points lower than the numbers you report, and some are even about 10 points lower. What could be the reason? My experimental parameters are basically the same as in the source code you provided; I only changed the batch_size from 16 to 8 because I only have a single 3090 GPU.

Additionally, I have another question: in the pre_2shot config file, REPEAT_THRESHOLD is set to 0.05, while the same parameter for the other shots is 0.01. Why is that?
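
For context, and assuming DiGeo inherits detectron2's standard data-loader options, REPEAT_THRESHOLD controls the RepeatFactorTrainingSampler: images containing a category that appears in fewer than that fraction of training images are oversampled. A hedged sketch of where it plugs in:

```python
from detectron2.config import get_cfg

# Sketch under the assumption that the repo uses detectron2's standard
# repeat-factor sampling; the values mirror the ones quoted in this thread.
cfg = get_cfg()
cfg.DATALOADER.SAMPLER_TRAIN = "RepeatFactorTrainingSampler"
cfg.DATALOADER.REPEAT_THRESHOLD = 0.05   # 2-shot config; the other shots reportedly use 0.01
```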

@Phoenix-V
Owner

Glad to hear the previous issue was resolved.

For the performance gap, I am not sure I can provide meaningful feedback, as I have never tried a batch size of 8. However, one thing I would point out is that by default in Detectron2 the length of training is determined by the number of steps, not epochs. In other words, if you halve the batch size without doubling the steps, the model in your training process only sees half of the samples seen by my model. For debugging purposes, I would recommend first running TFA with your hyper-parameter setup and checking its performance as a baseline reference.

As for the threshold value, it was obtained purely through grid search. However, I acknowledge that it is not always the best value for all environments (e.g., PyTorch version, random seed, and even different server instances), so it may not be the best value in your environment.
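
To make the step-based point concrete, this is a rough illustration (not the repo's actual values beyond the 16 → 8 batch size mentioned above) of the usual linear-scaling adjustment in a detectron2 solver config:

```python
from detectron2.config import get_cfg

# Illustrative adjustment for halving the batch size in a step-based schedule;
# the base LR and milestone values are placeholders, not DiGeo's settings.
cfg = get_cfg()
cfg.SOLVER.IMS_PER_BATCH = 8                              # was 16
cfg.SOLVER.BASE_LR = 0.02 * (8 / 16)                      # scale the LR with the batch size (0.02 is assumed)
cfg.SOLVER.MAX_ITER = 2 * 20000                           # double the steps so the model sees as many images
cfg.SOLVER.STEPS = tuple(2 * s for s in (12000, 16000))   # shift the LR-decay milestones accordingly
```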

@ChenZhiMing66
Author

So I should double MAX_ITER, changing it from 20000 to 40000?

@Phoenix-V
Owner

Frankly speaking, I am not sure that purely changing MAX_ITER from 20K to 40K will reach the same performance as the scores obtained in the original setup, but I believe the gap should shrink.

Meanwhile, you can also consider changing the learning rate as well as the LR scheduler, as they are also very important. The steps, learning rate, and LR scheduler mostly follow the original setup in TFA.
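
For reference, these are the solver knobs detectron2 exposes for the schedule; the values shown are illustrative examples, not TFA's or DiGeo's actual choices:

```python
from detectron2.config import get_cfg

# Example schedule settings only; the real TFA/DiGeo values live in their YAML configs.
cfg = get_cfg()
cfg.SOLVER.LR_SCHEDULER_NAME = "WarmupMultiStepLR"   # step decay at SOLVER.STEPS; "WarmupCosineLR" is the alternative
cfg.SOLVER.WARMUP_ITERS = 1000                       # length of the linear warmup
cfg.SOLVER.GAMMA = 0.1                               # decay factor applied at each milestone
```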
