Deep AUC Maximization on Graph Property Prediction

This repo contains our code submission for the OGB challenge. We focus on ogbg-molhiv, a binary classification task that predicts a target molecular property: whether or not a molecule inhibits HIV virus replication. The evaluation metric is AUROC. To the best of our knowledge, this is the first solution to directly optimize the AUC score for this task. Our AUC-Margin loss improves the baseline (DeepGCN) to 0.8159 and achieves SOTA performance of 0.8352 when jointly trained with Neural FingerPrints. Our approaches are implemented in LibAUC, an ML library for AUC optimization.
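For readers less familiar with the metric, AUROC can be read as the fraction of (positive, negative) pairs that the model ranks correctly. The snippet below is a minimal sketch of that definition, not the OGB evaluator: the function name `auroc` and the toy labels and scores are made up for illustration, and the quadratic loop is only suitable for small inputs. Reported numbers should come from the official evaluator.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy labels and scores, invented for illustration only (not ogbg-molhiv data).
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
```

Because the metric depends only on how positives are ordered relative to negatives, a loss that targets this ordering directly can behave differently from a per-example loss such as cross-entropy.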

Results on ogbg-molhiv

Our method ranks 1st on the leaderboard as of 10/11/2021! Our results on the ogbg-molhiv dataset, alongside some strong baselines, are shown below:

| Method | Test AUROC | Validation AUROC | Parameters | Hardware |
|---|---|---|---|---|
| DeepGCN | 0.7858±0.0117 | 0.8427±0.0063 | 531,976 | Tesla V100 (32GB) |
| DeeperGCN+FLAG | 0.7942±0.0120 | 0.8425±0.0061 | 531,976 | Tesla V100 (32GB) |
| Neural FingerPrints | 0.8232±0.0047 | 0.8331±0.0054 | 2,425,102 | Tesla V100 (32GB) |
| Graphormer | 0.8051±0.0053 | 0.8310±0.0089 | 47,183,040 | Tesla V100 (16GB) |
| DeepAUC (Ours) | 0.8159±0.0059 | 0.8054±0.0080 | 1,019,407 | Tesla V100 (32GB) |
| DeepAUC+FPs (Ours) | 0.8352±0.0054 | 0.8238±0.0061 | 1,019,407** | Tesla V100 (32GB) |

Note: the parameter count marked ** does not include the parameters of the Random Forest model.

Requirements

Install base packages:

Python>=3.7
Pytorch>=1.9.0
tensorflow>=2.0.0
pytorch_geometric>=1.6.0
ogb>=1.3.2 
dgl>=0.5.3 
numpy==1.20.3
pandas==1.2.5
scikit-learn==0.24.2
deep_gcns_torch

Install LibAUC (needed for the AUC-Margin loss and the PESG optimizer):
```
pip install LibAUC
```

Training

The training process has two steps: (1) we train a DeepGCN model from scratch using our AUC-margin loss; (2) we jointly finetune the pretrained model from (1) with FingerPrints models.
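For intuition about the AUC-margin loss used in step (1), the sketch below evaluates a margin-based AUC surrogate on a batch of scores. It is an illustrative reading of the min-max formulation in the paper cited at the end of this README, with the inner variables set to their closed-form optimum for the batch (`a` and `b` as the class means, `alpha` as the clipped margin gap). The function name `auc_margin_objective` is hypothetical. This is not LibAUC's code, and the library may differ in details such as class-ratio weighting and how `a`, `b`, and `alpha` are updated during training.

```python
def auc_margin_objective(pos_scores, neg_scores, margin=1.0):
    """Margin-based AUC surrogate evaluated on one batch of scores (a sketch).

    pos_scores / neg_scores: model outputs for positive / negative examples.
    margin: desired gap between the mean positive and mean negative score.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")

    # Closed-form minimizers of the two squared terms: the class means.
    a = sum(pos_scores) / len(pos_scores)
    b = sum(neg_scores) / len(neg_scores)

    # Within-class spread around each mean.
    var_pos = sum((s - a) ** 2 for s in pos_scores) / len(pos_scores)
    var_neg = sum((s - b) ** 2 for s in neg_scores) / len(neg_scores)

    # How far the class means are from being separated by `margin`.
    gap = margin + b - a

    # Maximizer of 2 * alpha * gap - alpha**2 over alpha >= 0.
    alpha = max(0.0, gap)

    return var_pos + var_neg + 2.0 * alpha * gap - alpha ** 2
```

Under this reading, the penalty vanishes once the mean positive score exceeds the mean negative score by at least `margin` and each class's scores are tightly clustered, which is the role played by the `--margin` flag in the commands below.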

Training from scratch using AUC-margin loss:

Train the DeepGCN model with the AUC-Margin loss and the PESG optimizer using the default parameters:

python main.py --use_gpu --conv_encode_edge --num_layers 14 --block res+ --gcn_aggr softmax --t 1.0 --learn_t --dropout 0.2 \
    --dataset ogbg-molhiv \
    --loss auroc \
    --optimizer pesg \
    --batch_size 512 \
    --lr 0.1 \
    --gamma 500 \
    --margin 1.0 \
    --weight_decay 1e-5 \
    --random_seed 0 \
    --epochs 300
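The Results section reports 10 runs per stage with random seeds 0 to 9. One way to script that sweep for the command above is sketched here. The flags are copied from the command as written, while the helper name `build_commands` and the idea of launching runs sequentially are assumptions for illustration, not part of this repo.

```python
import shlex

# Flags copied from the training command above; only --random_seed varies.
BASE_ARGS = [
    "python", "main.py",
    "--use_gpu", "--conv_encode_edge",
    "--num_layers", "14", "--block", "res+", "--gcn_aggr", "softmax",
    "--t", "1.0", "--learn_t", "--dropout", "0.2",
    "--dataset", "ogbg-molhiv",
    "--loss", "auroc", "--optimizer", "pesg",
    "--batch_size", "512", "--lr", "0.1",
    "--gamma", "500", "--margin", "1.0",
    "--weight_decay", "1e-5", "--epochs", "300",
]


def build_commands(seeds):
    """Return one argument list per seed (hypothetical helper)."""
    return [BASE_ARGS + ["--random_seed", str(seed)] for seed in seeds]


commands = build_commands(range(10))

if __name__ == "__main__":
    for cmd in commands:
        # Print each command; swap in subprocess.run(cmd, check=True) to launch it.
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the commands first makes it easy to check them before committing GPU time to ten 300-epoch runs.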

Jointly training with FingerPrints Model

Extract fingerprints and train a Random Forest, following PaddleHelix:

python extract_fingerprint.py
python random_forest.py

Finetune the pretrained model with the FingerPrints model using the AUC-margin loss and the default parameters:

python finetune.py --use_gpu --conv_encode_edge --num_layers 14 --block res+ --gcn_aggr softmax --t 1.0 --learn_t --dropout 0.2 \
    --dataset ogbg-molhiv \
    --loss auroc \
    --optimizer pesg \
    --batch_size 512 \
    --lr 0.01 \
    --gamma 300 \
    --margin 1.0 \
    --weight_decay 1e-5 \
    --random_seed 0 \
    --epochs 100

Results

Step (1) improves the original baseline (DeepGCN) from 0.7858 to 0.8159 test AUROC, a gain of about 3 points. Step (2) reaches a new SOTA of 0.8352, about 1 point above the strongest previous baseline in the table above (Neural FingerPrints, 0.8232). For each step, we train the model 10 times using different random seeds (0 to 9).
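Scores from the 10 seeded runs can be reduced to the mean±std form used in the results table with a few lines of Python. This is a sketch only: the helper names are hypothetical, the choice of population standard deviation (`pstdev`) rather than sample standard deviation is an assumption, and the scores in the example are placeholders, not results from this repo.

```python
from statistics import mean, pstdev


def summarize(scores):
    """Return (mean, standard deviation) of per-seed scores.

    Uses the population standard deviation; this is an assumption, and
    statistics.stdev would give the sample standard deviation instead.
    """
    return mean(scores), pstdev(scores)


def format_result(scores):
    """Format per-seed scores in the table's mean±std style."""
    m, s = summarize(scores)
    return f"{m:.4f}±{s:.4f}"


# Placeholder values for illustration only; NOT results from this repo.
placeholder_scores = [0.80, 0.82, 0.81, 0.83]
summary = format_result(placeholder_scores)
```

Replace `placeholder_scores` with the test AUROC from each of the seeded runs to obtain a single summary entry.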

Citation

If you have any questions, please open a new issue in this repo or contact Zhuoning Yuan [yzhuoning@gmail.com]. If you find this work useful, please cite the following paper for our method and library:

@inproceedings{yuan2021robust,
    title={Large-scale Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification},
    author={Yuan, Zhuoning and Yan, Yan and Sonka, Milan and Yang, Tianbao},
    booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
    year={2021}
}
