Welcome to the repository for the model of KG Language (MKGL). This project investigates the potential of LLMs in understanding and interacting with knowledge graphs, a domain that has received limited exploration in the context of NLP.
Large language models (LLMs) have significantly advanced performance across a spectrum of natural language processing (NLP) tasks. Yet their application to knowledge graphs (KGs), which describe facts as triplets and leave little room for hallucination, remains an underexplored frontier. In this project, we investigate the integration of LLMs with KGs by introducing a specialized KG Language (KGL), in which every sentence consists of exactly three words: an entity noun, a relation verb, and another entity noun.
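As a rough illustration of the three-word structure described above, a KGL sentence can be modeled as a plain triplet. This is a minimal sketch and not code from this repository: the names `KGLSentence` and `parse_triplet` are hypothetical, the tab-separated line format is an assumption, and the example fact is illustrative only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class KGLSentence:
    """One KGL sentence: entity noun, relation verb, entity noun."""
    head: str      # first entity noun
    relation: str  # relation verb
    tail: str      # second entity noun

    def words(self) -> Tuple[str, str, str]:
        # A KGL sentence always has exactly three "words".
        return (self.head, self.relation, self.tail)


def parse_triplet(line: str, sep: str = "\t") -> KGLSentence:
    """Parse one line of the form head<sep>relation<sep>tail (assumed format)."""
    parts = line.rstrip("\n").split(sep)
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    return KGLSentence(*parts)


if __name__ == "__main__":
    # Illustrative triplet, not taken from the project's datasets.
    sentence = parse_triplet("Paris\tcapital_of\tFrance")
    print(sentence.words())
```

In a KG completion setting, the task then amounts to predicting the missing third word given the first two.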
To run this project, please first install all required packages:
pip install --upgrade pandas transformers peft==0.9 bitsandbytes swifter deepspeed easydict pyyaml
Please install the PyG packages from the bundled wheels, which is much faster:
pip install --find-links MKGL/pyg_wheels/ torch-scatter torch-sparse torchdrug
Then preprocess the datasets.
For standard KG completion:
python preprocess.py -c config/fb15k237.yaml
python preprocess.py -c config/wn18rr.yaml
For the inductive setting:
python preprocess.py -c config/fb15k237_ind.yaml --version v1
python preprocess.py -c config/wn18rr_ind.yaml --version v1
If you have only one GPU (preferably with 80GB of memory under the default settings), run the model with the following command:
python main.py -c config/fb15k237.yaml
If you have access to multiple GPUs, run the model with the following command:
accelerate launch --gpu_ids 'all' --num_processes 8 --mixed_precision bf16 main.py -c config/fb15k237.yaml
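The multi-GPU command above hardcodes `--num_processes 8`, which is normally set to the number of GPUs on the machine. The sketch below shows one way to assemble either of the two commands from a GPU count. The helper `build_run_command` is hypothetical and not part of this repository; it only reuses the flags shown above.

```python
from typing import List


def build_run_command(num_gpus: int, config: str = "config/fb15k237.yaml") -> List[str]:
    """Return the argument list for launching main.py on `num_gpus` GPUs."""
    if num_gpus < 1:
        raise ValueError("at least one GPU is required")
    if num_gpus == 1:
        # Single-GPU case: run main.py directly.
        return ["python", "main.py", "-c", config]
    # Multi-GPU case: one process per GPU via accelerate.
    return [
        "accelerate", "launch",
        "--gpu_ids", "all",
        "--num_processes", str(num_gpus),
        "--mixed_precision", "bf16",
        "main.py", "-c", config,
    ]


if __name__ == "__main__":
    # Print the command for a 4-GPU machine; pass the list to subprocess.run to execute it.
    print(" ".join(build_run_command(4)))
```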
You can also use the provided scripts to run the model:
sh scripts/fb15k237.sh
Please consider citing our paper if it is helpful to your work!
@inproceedings{MKGL,
author = {Lingbing Guo and
Zhongpu Bo and
Zhuo Chen and
Yichi Zhang and
Jiaoyan Chen and
Lan Yarong and
Mengshu Sun and
Zhiqiang Zhang and
Yangyifei Luo and
Qian Li and
Qiang Zhang and
Wen Zhang and
Huajun Chen},
title = {MKGL: Mastery of a Three-Word Language},
booktitle = {{NeurIPS}},
year = {2024}
}
We thank LLaMA, Hugging Face Transformers, Alpaca, Alpaca-LoRA, and many other related projects for their open-source contributions.