We use Poetry to manage the package.
To install Poetry:
curl -sSL https://install.python-poetry.org | python3 -
To install the packages:
poetry install
If your poetry.lock file is out of date, refresh it with:
poetry update --lock
Currently, we do not track the poetry.lock file, since the project is in an early development stage.
We provide our newly sampled queries at the following link (https://drive.google.com/file/d/1UrOMijvCL7OVqDQsYKYQ7ETV-u7nXENO/view?usp=sharing). After unzipping the query data, an example data folder should look like this:
data/KG&queries/
- kgindex.json
- train_kg.tsv
- valid_kg.tsv
- test_kg.tsv
- test_type0000_EFO3-small_qaa.json
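Before running anything, it can help to confirm that the unzipped folder matches the layout above. The following is a minimal sketch, not part of the repository: the helper name `missing_data_files` is hypothetical, and the file list is copied from the example layout, so other query files may exist that it does not know about.

```python
from pathlib import Path

# File names copied from the example layout; the last entry is only the
# one query file shown there, and other query files may also be present.
EXPECTED_FILES = [
    "kgindex.json",
    "train_kg.tsv",
    "valid_kg.tsv",
    "test_kg.tsv",
    "test_type0000_EFO3-small_qaa.json",
]


def missing_data_files(data_folder):
    """Return the expected file names that are absent from data_folder."""
    folder = Path(data_folder)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    # The folder name contains '&', which is harmless inside a Python string.
    missing = missing_data_files("data/KG&queries")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("Data folder looks complete.")
```

An empty result means every file from the example layout was found.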
To reproduce the results of the baselines in the paper, we provide checkpoints for six representative models (BetaE, LogicE, ConE, CQD, LMPNN, FIT) on each knowledge graph, which can be downloaded from here.
They should be unzipped and put in the ckpt folder.
An example of the ckpt subfolder, which includes the models trained on the knowledge graph ``FB15k-237'', should look like this:
ckpt/FB15k-237
- BetaE_full/checkpoint
- LogicE_full/450000.ckpt
- ConE_full/300000.ckpt
- CQD/FB15k-237-model-rank-1000-epoch-100-1602508358.pt
- LMPNN/lmpnn-FB15k-237.ckpt
- FIT/torch_0.005_0.001.ckpt
where each subfolder holds the checkpoint for one model, and the subfolder name is the name of the model.
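Since each model lives in its own subfolder, the baselines available for a given knowledge graph can be discovered by listing those subfolders. This is a rough sketch under that assumption only: `available_models` is a hypothetical helper, and it checks that a subfolder is non-empty without inspecting or loading the checkpoint itself.

```python
from pathlib import Path


def available_models(ckpt_root, kg_name):
    """List the non-empty model subfolders under ckpt_root/kg_name."""
    kg_dir = Path(ckpt_root) / kg_name
    if not kg_dir.is_dir():
        return []
    return sorted(
        sub.name
        for sub in kg_dir.iterdir()
        if sub.is_dir() and any(sub.iterdir())
    )


if __name__ == "__main__":
    print(available_models("ckpt", "FB15k-237"))
```

A model that is missing from the returned list either was not unzipped into the ckpt folder or has an empty subfolder.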
The backbone models include a query embedding method and a knowledge graph embedding with a hypernetwork that addresses relation tail prediction. The checkpoints we used can be found here (https://drive.google.com/drive/folders/1xVd-PihAbwi4RE9rmkMGkHtgQLssJZE2?usp=share_link). They should be unzipped and put in the pretrain folder. An example of the pretrain subfolder for models trained on the knowledge graph ``FB15k-237'' should look like this:
pretrain/FB15K-237
- cache_ada.pt
- FB15K-237_best_valid.model
- lmpnn-FB15K-237.ckpt
In this structure:
lmpnn-FB15K-237.ckpt refers to the LMPNN model. FB15K-237_best_valid.model corresponds to the knowledge graph embedding with a hypernetwork. cache_ada.pt contains the cached adaptive values.
We implement NLI using LMPNN and ComplEx as the backbone models. We also present options to cache the scaling values and parallelize the entire search process.
To reproduce our method on the Real EFO1 benchmark, please run the following command:
python solve_EFOX_small_NS3.py --batch_size 32 --data_folder "data/KG&queries" --num_domain 2000 --num_candidate 2000
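The data folder name contains `&`, which a POSIX shell treats as a control operator, so the path has to reach the script as a single argument. When launching the run from Python, passing an argument list to `subprocess.run` avoids shell parsing altogether. The sketch below assumes it is executed from the repository root inside the Poetry environment; `build_command` and `run_reproduction` are hypothetical helper names, and the flag values are the ones from the command above.

```python
import shlex
import subprocess
import sys


def build_command(data_folder="data/KG&queries", batch_size=32,
                  num_domain=2000, num_candidate=2000):
    """Build the argument list for solve_EFOX_small_NS3.py."""
    return [
        sys.executable, "solve_EFOX_small_NS3.py",
        "--batch_size", str(batch_size),
        "--data_folder", data_folder,  # one argv entry, '&' included
        "--num_domain", str(num_domain),
        "--num_candidate", str(num_candidate),
    ]


def run_reproduction():
    """Run the script without a shell, so '&' is never interpreted."""
    cmd = build_command()
    # shlex.join shows the shell-safe form of the same command.
    print(shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

Calling `run_reproduction()` prints the quoted command and then starts the run; `check=True` raises an error if the script exits with a non-zero status.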