Customize dataset for Knowledge Graph-based model #982

mvOvOmg · 2021-09-30T07:07:27Z

mvOvOmg
Sep 30, 2021

Hello, developers,
I have to use my original dataset, which I have already split into training, validation and test sets.
Without the KG part, I have to rename the files to A.train.inter, A.valid.inter and A.test.inter, and set 'benchmark_filename' to ['train', 'valid', 'test'].
I am wondering how to construct a customized dataset for Knowledge Graph-based models?

What about the .kg and .link files? Should I also provide three different .kg and .link files?
I downloaded ML-KG to see how the knowledge graph is constructed, and it contains many .kg files. Should I construct my dataset in the same way?

Sincerely,
Mina

Sherry-XLL · 2021-09-30T08:34:01Z

Sherry-XLL
Sep 30, 2021
Maintainer

@mvOvOmg Hello, Mina. Thanks for your attention to RecBole! For the first question, you don't need to manually divide the original dataset into three parts; just prepare the atomic files required by RecBole. In other words, your atomic files contain the entire dataset, and RecBole divides it into training, validation and test parts. You can also change the data splitting strategy through the config settings.

Knowledge-based recommendation models use KG information to make recommendations, so you need to specify and load the KG information of the dataset. If you want to use KG models on the ml-1m dataset, put ml-1m.inter, ml-1m.link and ml-1m.kg into RecBole/dataset/ml-1m, then specify and load the KG entity columns in the configuration file as follows.

USER_ID_FIELD: user_id
ITEM_ID_FIELD: item_id
HEAD_ENTITY_ID_FIELD: head_id
TAIL_ENTITY_ID_FIELD: tail_id
RELATION_ID_FIELD: relation_id
ENTITY_ID_FIELD: entity_id
load_col:
    inter: [user_id, item_id]
    kg: [head_id, relation_id, tail_id]
    link: [item_id, entity_id]
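
For a custom dataset, the three files can also be written by hand. Below is a minimal sketch, assuming RecBole's tab-separated atomic file format with a typed header line (`field:type`). The dataset name `A`, the helper names `write_atomic_file` and `build_toy_dataset`, and every row of data are made up for illustration; only the column names come from the config above.

```python
import csv
import tempfile
from pathlib import Path


def write_atomic_file(path, header, rows):
    """Write a tab-separated file whose first line is the typed header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)


def build_toy_dataset(root, name="A"):
    """Create <root>/<name>/ holding toy .inter, .kg and .link files."""
    data_dir = Path(root) / name
    data_dir.mkdir(parents=True, exist_ok=True)
    # User-item interactions.
    write_atomic_file(data_dir / f"{name}.inter",
                      ["user_id:token", "item_id:token"],
                      [("u1", "i1"), ("u1", "i2"), ("u2", "i1")])
    # Knowledge graph triples: head entity, relation, tail entity.
    write_atomic_file(data_dir / f"{name}.kg",
                      ["head_id:token", "relation_id:token", "tail_id:token"],
                      [("e1", "directed_by", "e9"), ("e2", "directed_by", "e9")])
    # Mapping from items to KG entities.
    write_atomic_file(data_dir / f"{name}.link",
                      ["item_id:token", "entity_id:token"],
                      [("i1", "e1"), ("i2", "e2")])
    return data_dir


if __name__ == "__main__":
    out = build_toy_dataset(tempfile.mkdtemp())
    for p in sorted(out.iterdir()):
        print(p.name, "->", p.read_text(encoding="utf-8").splitlines()[0])
```

The header names have to match the `*_FIELD` settings and the `load_col` lists, otherwise the columns will not be found when the dataset is loaded.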

For the second question, you should generate the KG atomic files (*.kg and *.link) for the MovieLens dataset based on the ML-KG that you have downloaded. To make RecBole easier to use, we have developed RecSysDatasets, a repository of public data sources for recommender systems (RS) that also covers knowledge graph generation. You can refer to MovieLens-KG.md for more details about the conversion process.

2 replies

mvOvOmg Sep 30, 2021
Author

Thanks for replying.
My dataset cannot be split by ratio or by time, so I have to split it before using RecBole.
The dataset is not MovieLens; it has its own structure, and I am now converting it to fit RecBole.

I am sorry that I have not read all of the RecBole code, so I do not know which models need the hop1-3.kg files. Can I generate just one whole knowledge graph and the .link file?
The whole dataset would then be
A.train.inter, A.valid.inter, A.test.inter, A.kg, A.link.
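
The layout proposed above could be checked and configured roughly as follows. This is only a sketch: it assumes that each entry of `benchmark_filename` is the middle part of a split file name (so `valid` maps to `A.valid.inter`) and that the KG loader looks for a single `A.kg` and a single `A.link`. The helpers `required_files` and `missing_files` and the `data_path` value are hypothetical, and the `run_recbole` call with the CKE model is left as a comment because it needs RecBole installed.

```python
import tempfile
from pathlib import Path

# Each suffix must match the split file name: "valid" -> A.valid.inter.
BENCHMARK_FILENAME = ["train", "valid", "test"]


def required_files(dataset, benchmark_filename=BENCHMARK_FILENAME):
    """List the file names expected for a pre-split dataset with one KG."""
    inter = [f"{dataset}.{part}.inter" for part in benchmark_filename]
    return inter + [f"{dataset}.kg", f"{dataset}.link"]


def missing_files(data_dir, dataset, benchmark_filename=BENCHMARK_FILENAME):
    """Return the expected files that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in required_files(dataset, benchmark_filename)
            if not (data_dir / name).is_file()]


# Settings mirroring the YAML config earlier in the thread, plus the split suffixes.
config_dict = {
    "data_path": "dataset/",  # assumption: parent folder that contains A/
    "benchmark_filename": BENCHMARK_FILENAME,
    "USER_ID_FIELD": "user_id",
    "ITEM_ID_FIELD": "item_id",
    "HEAD_ENTITY_ID_FIELD": "head_id",
    "TAIL_ENTITY_ID_FIELD": "tail_id",
    "RELATION_ID_FIELD": "relation_id",
    "ENTITY_ID_FIELD": "entity_id",
    "load_col": {
        "inter": ["user_id", "item_id"],
        "kg": ["head_id", "relation_id", "tail_id"],
        "link": ["item_id", "entity_id"],
    },
}

# With RecBole installed, something like this should start a run:
# from recbole.quick_start import run_recbole
# run_recbole(model="CKE", dataset="A", config_dict=config_dict)

if __name__ == "__main__":
    print(missing_files(Path("dataset") / "A", "A"))
```

Running `missing_files` before training gives a quick hint when a suffix such as `val` does not match a file named `A.valid.inter`.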

Sherry-XLL Oct 1, 2021
Maintainer

@mvOvOmg hop is the number of neighbor hops generated for the items in the knowledge graph. In our setting, the maximum is 3 and the default is 1.

The image above is taken from RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems.

From the way the hop parameter is handled in the source code, you can see that hop=1 means using only hop1.kg, hop=2 means using hop1.kg and hop2.kg, and hop=3 means using hop1.kg, hop2.kg and hop3.kg. You can choose the value of hop according to your needs. If you want to generate only one whole knowledge graph that includes all three hops, just set hop=3 when preprocessing.

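for i in range(self.hop):
    # Hop i reads its own file: hop1.kg first, then hop2.kg, then hop3.kg.
    kg_file = self.kg_file_list[i]
    # Mark the current seeds as visited before expanding them.
    history_entities |= seed_entities

    seed_entities, temp_triples = extract_hop_graph(kg_file, self.selected_relations,
                                                    seed_entities, history_entities)
    # Accumulate the triples collected at this hop.
    hop_triples += temp_triples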
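
To make the loop easier to follow, here is a small in-memory sketch of the same idea. It is not the RecSysDatasets implementation: `extract_hop` is a hypothetical stand-in for `extract_hop_graph`, it works on one list of triples instead of one file per hop, it ignores relation filtering, and the toy triples are made up.

```python
def extract_hop(triples, seed_entities, history_entities):
    """Hypothetical stand-in: keep triples whose head is a seed entity.

    Returns the next seeds (tails not visited yet) and the kept triples.
    """
    hop_triples = [t for t in triples if t[0] in seed_entities]
    next_seeds = {tail for _, _, tail in hop_triples} - history_entities
    return next_seeds, hop_triples


def collect_hops(triples, item_entities, hop=1):
    """Collect the triples reachable from item_entities within `hop` hops."""
    seed_entities = set(item_entities)
    history_entities = set()
    hop_triples = []
    for _ in range(hop):
        # Mark the current seeds as visited so they are not expanded again.
        history_entities |= seed_entities
        seed_entities, temp_triples = extract_hop(
            triples, seed_entities, history_entities)
        hop_triples += temp_triples
    return hop_triples


# Toy KG: m1 -> d1 -> c1 -> k1 is a chain; m2 is not linked to any item.
TRIPLES = [
    ("m1", "directed_by", "d1"),
    ("d1", "born_in", "c1"),
    ("c1", "located_in", "k1"),
    ("m2", "directed_by", "d2"),
]

if __name__ == "__main__":
    for hop in (1, 2, 3):
        print(hop, collect_hops(TRIPLES, {"m1"}, hop=hop))
```

In this sketch each extra hop adds one more layer of triples around the item entities, and the list returned for hop=3 plays the role of the single merged knowledge graph that would be written to one .kg file.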
