This is the official code for RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network. The code supports training and evaluation on FashionIQ only. Implementations of the other baselines are released together with it.
- (2021.10.26) Updated model checkpoints, training configs and tensorboard logs.
- (2021.09.10) The official code is released.
Prepare your environment with virtualenv.
python3 -m virtualenv --python=python3 venv # create virtualenv.
. venv/bin/activate # activate environment.
pip3 install -r requirements.txt # install required packages.
We provide a script for downloading the FashionIQ images. Note that it cannot guarantee that every image is downloaded, because we found that some URLs are broken.
sh script/download_fiq.sh
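Because some URLs are broken, it can help to check which images are missing after the download finishes. The following is a minimal sketch, not part of the released code: the function name `find_missing_images`, the flat directory layout, and the `.jpg` extension are all assumptions for illustration, so adjust them to wherever the script actually stores the images.

```python
from pathlib import Path


def find_missing_images(image_dir, expected_ids, extension=".jpg"):
    """Return the ids in `expected_ids` that have no image file in `image_dir`.

    Assumes (hypothetically) one file per image named `<id><extension>`.
    """
    image_dir = Path(image_dir)
    return [
        image_id
        for image_id in expected_ids
        if not (image_dir / f"{image_id}{extension}").is_file()
    ]
```

Calling `find_missing_images("path/to/images", ids)` with the ids listed in the dataset annotations returns the ids that still need to be fetched or excluded.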
We provide pretrained checkpoints for RTIC / RTIC-GCN trained on FashionIQ.
Model | Recall ((R@10 + R@50) / 2) | Checkpoint | Config | Training Log |
---|---|---|---|---|
RTIC | 39.22 | ckpt | config | tensorboard_log |
RTIC-GCN (scratch) | 39.55 | ckpt | config | tensorboard_log |
RTIC-GCN (finetune) | 40.64 | ckpt | config | tensorboard_log |
Method | Metric ((R@10 + R@50) / 2) | Paper |
---|---|---|
JVSM | 19.26 | |
TRACE w/ BERT | 34.38 | |
VAL w/ GloVe | 35.38 | |
CIRPLANT w/ OSCAR | 30.20 | |
MAAF | 36.60 | |
CurlingNet | 38.45 | |
CoSMo | 39.45 | |
RTIC w/ GloVe | 39.22 | - |
RTIC-GCN w/ GloVe (scratch) | 39.55 | - |
RTIC-GCN w/ GloVe (fine-tune) | 40.64 | - |
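The metric in the table header, (R@10 + R@50) / 2, can be sketched as follows. This is an illustrative implementation under stated assumptions, not the evaluation code of this repository: the function names and the input format (one ranked list of candidate ids per query, plus one target id per query) are hypothetical, and any per-category averaging is left out.

```python
def recall_at_k(ranked_ids, target_ids, k):
    """Percentage of queries whose target id appears in the top-k candidates."""
    hits = sum(
        1 for ranked, target in zip(ranked_ids, target_ids) if target in ranked[:k]
    )
    return 100.0 * hits / len(target_ids)


def mean_recall_10_50(ranked_ids, target_ids):
    """Average of R@10 and R@50, i.e. (R@10 + R@50) / 2."""
    r10 = recall_at_k(ranked_ids, target_ids, 10)
    r50 = recall_at_k(ranked_ids, target_ids, 50)
    return (r10 + r50) / 2


# Toy example: 4 queries, each ranking candidates 0..99 in the same order.
ranked = [list(range(100)) for _ in range(4)]
targets = [0, 20, 60, 9]  # ranks 1, 21, 61 and 10
score = mean_recall_10_50(ranked, targets)  # R@10 = 50.0, R@50 = 75.0
```

In the toy example two of the four targets fall in the top 10 and three in the top 50, so the score is (50.0 + 75.0) / 2 = 62.5.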
We provide sample training scripts for running different configurations.
The default configurations are stored in cfg/default.yaml, which represents the "unified environment" in our paper.
To try the "optimal environment", please use the +optimize=<something> option.
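The commands in this section pass dotted `key=value` arguments such as `config.EXPR_NAME=...` to override values from the default configuration. The sketch below illustrates that idea in plain Python. It is an assumption about the general mechanism, not the code that main.py uses: `apply_overrides`, the nested dictionary, and the value `some_baseline` are hypothetical, and `+`-prefixed options such as `+optimize=...` are not handled.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted `key=value` overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Walk down the nested dictionaries, creating levels if needed.
            node = node.setdefault(name, {})
        node[leaf] = value
    return result


# Hypothetical stand-in for the defaults; not the contents of cfg/default.yaml.
default_cfg = {
    "config": {
        "EXPR_NAME": "default",
        "TRAIN": {"MODEL": {"composer_model": {"name": "rtic"}}},
    }
}
cfg = apply_overrides(
    default_cfg,
    [
        "config.EXPR_NAME=testrun",
        "config.TRAIN.MODEL.composer_model.name=some_baseline",
    ],
)
```

The defaults are copied before being modified, so the same base configuration can be reused for several runs with different overrides.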
(1) RTIC (unified env)
EXPR_NAME=testrun
python main.py \
config.EXPR_NAME=${EXPR_NAME}
(2) RTIC (optimal env)
EXPR_NAME=testrun
python main.py \
+optimize=rtic \
config.EXPR_NAME=${EXPR_NAME}
(3) RTIC-GCN (optimal env, scratch)
EXPR_NAME=testrun_gcn LOAD_FROM=testrun
python main.py \
+optimize=rtic_gcn_scratch \
+gcn=enabled \
config.LOAD_FROM=${LOAD_FROM} \
config.EXPR_NAME=${EXPR_NAME}
(4) RTIC-GCN (optimal env, finetune)
EXPR_NAME=testrun_gcn LOAD_FROM=testrun
python main.py \
+optimize=rtic_gcn_finetune \
+gcn=enabled \
config.LOAD_FROM=${LOAD_FROM} \
config.EXPR_NAME=${EXPR_NAME}
(5) Other Baselines
You can train any other baseline by simply changing config.TRAIN.MODEL.composer_model.name.
(w/o GCN)
EXPR_NAME=testrun
python main.py \
config.TRAIN.MODEL.composer_model.name=<any-composer-method-you-want-to-try> \
config.EXPR_NAME=${EXPR_NAME}
(w/ GCN)
EXPR_NAME=testrun_gcn LOAD_FROM=testrun
python main.py \
+gcn=enabled \
config.TRAIN.MODEL.composer_model.name=<any-composer-method-you-want-to-try> \
config.LOAD_FROM=${LOAD_FROM} \
config.EXPR_NAME=${EXPR_NAME}
If you find this work useful for your research, please cite our paper:
@article{shin2021rtic,
title={RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network},
author={Shin, Minchul and Cho, Yoonjae and Ko, Byungsoo and Gu, Geonmo},
journal={arXiv preprint arXiv:2104.03015},
year={2021}
}