RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network

This is the official code for RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network. The code supports training and evaluation on FashionIQ only. Implementations of the other baselines are released alongside it.

Updates

(2021.10.26) Updated model checkpoints, training configs and tensorboard logs.
(2021.09.10) The official code is released.

Requirements

Prepare your environment with virtualenv.

python3 -m virtualenv --python=python3 venv # create virtualenv.
. venv/bin/activate # activate environment.
pip3 install -r requirements.txt # install required packages.

Download Data

We provide a script for downloading FashionIQ images. Note that not every image is guaranteed to download, because some of the source URLs are broken.

sh script/download_fiq.sh
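
Because some URLs are broken, it can help to check which images are missing after the download finishes. The following is a minimal sketch of such a check, not part of this repository. The function name `find_missing_images`, the flat `<image_id>.jpg` file layout, and the example paths are all assumptions made for illustration; adapt them to wherever the script actually stores the images.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_images(
    image_dir: str, expected_ids: Iterable[str], suffix: str = ".jpg"
) -> List[str]:
    """Return the sorted IDs that have no file named <id><suffix> in image_dir.

    Hypothetical helper: the one-file-per-ID layout is an assumption,
    not something the download script is known to guarantee.
    """
    root = Path(image_dir)
    unique_ids = set(expected_ids)  # ignore duplicate IDs in the input
    return sorted(
        image_id
        for image_id in unique_ids
        if not (root / f"{image_id}{suffix}").is_file()
    )
```

For example, `find_missing_images("path/to/images", ids)` returns the subset of `ids` that still needs to be fetched or excluded from training, where both the path and `ids` are placeholders.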

Model Zoo

We provide pretrained checkpoints for RTIC / RTIC-GCN trained on FashionIQ.

Model	Recall ((R@10 + R@50) / 2)	Checkpoint	Config	Training Log
RTIC	39.22	ckpt	config	tensorboard_log
RTIC-GCN (scratch)	39.55	ckpt	config	tensorboard_log
RTIC-GCN (finetune)	40.64	ckpt	config	tensorboard_log

Benchmark Score on FashionIQ Dataset

Method	Metric ((R@10 + R@50) / 2)	Paper
JVSM	19.26	pdf
TRACE w/ BERT	34.38	pdf
VAL w/ GloVe	35.38	pdf
CIRPLANT w/ OSCAR	30.20	pdf
MAAF	36.60	pdf
CurlingNet	38.45	pdf
CoSMo	39.45	pdf
RTIC w/ GloVe	39.22	-
RTIC-GCN w/ GloVe (scratch)	39.55	-
RTIC-GCN w/ GloVe (fine-tune)	40.64	-
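
The metric in the table header, (R@10 + R@50) / 2, averages recall at two cutoffs. The sketch below shows one way to compute it for a single set of queries. It is an illustration under stated assumptions, not the evaluation code of this repository: the function names are hypothetical, each query is assumed to have exactly one target item, and any averaging across FashionIQ categories is left out.

```python
from typing import Sequence


def recall_at_k(
    ranked_ids: Sequence[Sequence[str]], target_ids: Sequence[str], k: int
) -> float:
    """Percentage of queries whose target is among the top-k retrieved items."""
    if len(ranked_ids) != len(target_ids):
        raise ValueError("expected one ranking per query")
    if not target_ids:
        raise ValueError("at least one query is required")
    # A query counts as a hit when its target appears in the first k results.
    hits = sum(
        target in ranking[:k] for ranking, target in zip(ranked_ids, target_ids)
    )
    return 100.0 * hits / len(target_ids)


def mean_recall(
    ranked_ids: Sequence[Sequence[str]],
    target_ids: Sequence[str],
    ks: Sequence[int] = (10, 50),
) -> float:
    """Average of recall@k over the given cutoffs, i.e. (R@10 + R@50) / 2 by default."""
    return sum(recall_at_k(ranked_ids, target_ids, k) for k in ks) / len(ks)
```

Here `ranked_ids[i]` is the list of gallery item IDs retrieved for query `i`, ordered from most to least similar, and `target_ids[i]` is the ground-truth item for that query.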

Quick Start

We provide sample training commands for several configurations. The default configuration is stored in cfg/default.yaml and corresponds to the "unified environment" in our paper. To try the "optimal environment", add the +optimize=<something> option.

(1) RTIC (unified env)

EXPR_NAME=testrun python main.py \
    config.EXPR_NAME=${EXPR_NAME}

(2) RTIC (optimal env)

EXPR_NAME=testrun python main.py \
    +optimize=rtic \
    config.EXPR_NAME=${EXPR_NAME}

(3) RTIC-GCN (optimal env, scratch)

EXPR_NAME=testrun_gcn LOAD_FROM=testrun python main.py \
    +optimize=rtic_gcn_scratch \
    +gcn=enabled \
    config.LOAD_FROM=${LOAD_FROM} \
    config.EXPR_NAME=${EXPR_NAME}

(4) RTIC-GCN (optimal env, finetune)

EXPR_NAME=testrun_gcn LOAD_FROM=testrun python main.py \
    +optimize=rtic_gcn_finetune \
    +gcn=enabled \
    config.LOAD_FROM=${LOAD_FROM} \
    config.EXPR_NAME=${EXPR_NAME}

(5) Other Baselines

You can train any of the other baselines by changing config.TRAIN.MODEL.composer_model.name.

(w/o GCN)
EXPR_NAME=testrun python main.py \
    config.TRAIN.MODEL.composer_model.name=<any-composer-method-you-want-to-try> \
    config.EXPR_NAME=${EXPR_NAME}

(w/ GCN)
EXPR_NAME=testrun_gcn LOAD_FROM=testrun python main.py \
    +gcn=enabled \
    config.TRAIN.MODEL.composer_model.name=<any-composer-method-you-want-to-try> \
    config.LOAD_FROM=${LOAD_FROM} \
    config.EXPR_NAME=${EXPR_NAME}
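
The commands above change settings by passing dotted `key=value` overrides such as `config.EXPR_NAME=testrun`. The sketch below illustrates how an override of that shape can address a nested configuration. It is a simplified illustration and not how main.py parses its arguments: `apply_override` is a hypothetical helper, the nested dictionary only mimics the key names seen in the commands, and the default values in it are placeholders.

```python
from typing import Any, Dict


def apply_override(config: Dict[str, Any], override: str) -> None:
    """Set a nested key in place from a 'dotted.path=value' string.

    Hypothetical helper for illustration; values are kept as strings.
    """
    path, separator, value = override.partition("=")
    if not separator or not path:
        raise ValueError(f"expected 'key.path=value', got {override!r}")
    *parents, leaf = path.split(".")
    node = config
    for key in parents:
        # Walk down the tree, creating intermediate levels when absent.
        node = node.setdefault(key, {})
    node[leaf] = value


# Placeholder defaults that only mirror the key names used in the commands.
settings = {
    "config": {
        "EXPR_NAME": "placeholder",
        "TRAIN": {"MODEL": {"composer_model": {"name": "placeholder"}}},
    }
}
apply_override(settings, "config.EXPR_NAME=testrun")
apply_override(settings, "config.TRAIN.MODEL.composer_model.name=some_baseline")
```

After these two calls, `settings["config"]["EXPR_NAME"]` is `"testrun"` and the composer model name is `"some_baseline"`, while every other key keeps its default.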

Citation

If you find this work useful for your research, please cite our paper:

@article{shin2021rtic,
  title={RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network},
  author={Shin, Minchul and Cho, Yoonjae and Ko, Byungsoo and Gu, Geonmo},
  journal={arXiv preprint arXiv:2104.03015},
  year={2021}
}

License

MIT License

