
Efficient Graph Condensation via Gaussian Process (GCGP)

📄 Read the Paper: https://arxiv.org/abs/2501.02565


🧠 Abstract

Graph condensation reduces the size of a graph while maintaining model performance, addressing the scalability challenges that GNNs face on large datasets. Existing methods often rely on bi-level optimization, which requires repeated GNN training and limits scalability.

This paper proposes Graph Condensation via Gaussian Process (GCGP), a computationally efficient method that uses a Gaussian Process (GP) to predict labels for the input nodes directly, without iterative GNN training.

Key innovations:

  • A covariance function aggregates local neighborhoods to capture complex node dependencies.
  • Concrete random variables approximate binary adjacency matrices in a differentiable form, enabling gradient-based optimization of discrete graph structures.
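A minimal sketch of the second idea, assuming a PyTorch setting (the function and argument names here are ours, not the repository's API): a binary Concrete sample adds Logistic noise to edge logits and squashes it with a temperature-scaled sigmoid, giving a near-binary yet differentiable adjacency.

import torch

def sample_concrete_adjacency(logits: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """Differentiable (binary Concrete) relaxation of a 0/1 adjacency matrix."""
    u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)   # U(0,1) noise
    noise = torch.log(u) - torch.log1p(-u)              # Logistic(0,1) sample
    a = torch.sigmoid((logits + noise) / temperature)   # relaxed Bernoulli entries
    a = torch.triu(a, diagonal=1)                       # upper triangle, no self-loops
    return a + a.T                                      # symmetric adjacency

As the temperature approaches zero the samples approach hard 0/1 edges, while gradients with respect to the logits stay well defined, which is what makes gradient-based optimization of the discrete structure possible.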

🔬 Methodology


Figure 1: Graph condensation compresses a large graph $G$ into a smaller yet informative graph $G^{\mathcal{S}}$ that preserves performance on downstream tasks such as GNN training.


Conventional graph condensation methods use a bi-level optimization framework:

  • Inner loop: Train a GNN on the condensed graph.
  • Outer loop: Update the condensed graph based on performance loss.

This is computationally expensive due to repeated GNN training.

🧪 GCGP: A Simpler Alternative

GCGP replaces iterative GNN training with a Gaussian Process, treating the condensed synthetic graph $G^{\mathcal{S}}$ as the GP's observations. The GP combines these observations with its prior to make predictions on the original graph $G$.
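As a rough illustration of this step (our own sketch, with a simple k-hop aggregation kernel standing in for the paper's covariance function; gp_predict, adj_norm, and ridge are illustrative names):

import torch

def aggregate(x: torch.Tensor, adj_norm: torch.Tensor, k: int) -> torch.Tensor:
    # k rounds of neighborhood aggregation with a normalized adjacency,
    # so the covariance below reflects local graph structure
    for _ in range(k):
        x = adj_norm @ x
    return x

def gp_predict(x, adj_norm, x_s, adj_s, y_s, k=2, ridge=0.5):
    """GP posterior mean on the original graph, with the condensed graph
    (x_s, adj_s, y_s) serving as the GP's observations."""
    h = aggregate(x, adj_norm, k)        # original-graph representations
    h_s = aggregate(x_s, adj_s, k)       # condensed-graph representations
    k_ts = h @ h_s.T                     # test/observation cross-covariance
    k_ss = h_s @ h_s.T                   # observation covariance
    eye = torch.eye(k_ss.shape[0], device=k_ss.device)
    return k_ts @ torch.linalg.solve(k_ss + ridge * eye, y_s)

Because this posterior mean is available in closed form, evaluating it (and differentiating through it) takes the place of the inner GNN-training loop.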


Figure 2: The GCGP workflow includes:

  1. Using the condensed graph $G^{\mathcal{S}}$ as GP observations.
  2. Predicting node labels in the original graph $G$.
  3. Optimizing the condensed graph by minimizing the discrepancy between predictions and ground-truth labels.
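Putting the two sketches above together, the outer optimization could look like the following toy loop (hypothetical shapes and hyperparameters; it reuses sample_concrete_adjacency and gp_predict from above and is not the repository's code):

import torch

torch.manual_seed(0)
N, d, C, n_cond = 500, 32, 7, 35
x = torch.randn(N, d)                           # original node features
adj_norm = torch.eye(N)                         # placeholder normalized adjacency
y = torch.nn.functional.one_hot(torch.randint(0, C, (N,)), C).float()
y_s = torch.nn.functional.one_hot(torch.arange(n_cond) % C, C).float()  # fixed condensed labels

x_s = torch.randn(n_cond, d, requires_grad=True)              # learnable condensed features
adj_logits = torch.zeros(n_cond, n_cond, requires_grad=True)  # learnable condensed structure
opt = torch.optim.Adam([x_s, adj_logits], lr=0.01)

for epoch in range(200):
    adj_s = sample_concrete_adjacency(adj_logits)    # step 1: differentiable A^S
    pred = gp_predict(x, adj_norm, x_s, adj_s, y_s)  # step 2: GP predictions on G
    loss = torch.nn.functional.mse_loss(pred, y)     # step 3: discrepancy to labels
    opt.zero_grad()
    loss.backward()
    opt.step()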

🛠️ Implementation

🔧 Requirements

  • python=3.8.20
  • ogb=1.3.6
  • pytorch=1.12.1
  • pyg=2.5.2
  • numpy=1.24.3

💡 Tip: Install ogb first to avoid CUDA device recognition issues.

To set up the environment, run:

conda env create -f environment.yml

📂 Small Datasets (Cora, Citeseer, Pubmed, Photo, Computers)

Navigate to the gcgp folder:

cd gcgp

Run GCGP on a dataset (e.g., Cora):

python main.py --dataset Cora --cond_ratio 0.5 --ridge 0.5 --k 4 --epochs 200 --learn_A 0
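As far as we can infer from the method description (check main.py for the authoritative definitions): --cond_ratio sets the condensed-graph size relative to the original, --ridge the regularization added to the GP solve, --k the number of neighborhood-aggregation hops in the covariance, and --learn_A whether the condensed adjacency is optimized (1) or kept fixed (0).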

To reproduce all results:

sh run.sh
  • Outputs will be saved in ./gcgp/outputs/
  • Final results are collected in ./gcgp/results.csv via results.py

For generalization experiments:

sh run_generalization.sh
  • Outputs: ./gcgp/outputs_generalization/
  • Results: ./gcgp/results_generalization.csv

For efficiency/time evaluation:

sh run_time.sh
  • Outputs: ./gcgp/outputs_time/

🗂️ Large Datasets (ogbn-arxiv and Reddit)

🔹 ogbn-arxiv Dataset

Go to the folder:

cd gcgp_ogb

Run GCGP:

python main.py --dataset ogbn-arxiv --cond_size 90 --ridge 5 --k 2 --epochs 200 --learn_A 0

To reproduce all results:

sh run.sh
  • Outputs: ./gcgp_ogb/outputs/
  • Results: ./gcgp_ogb/results.csv

For time analysis:

sh run_time.sh
  • Outputs: ./gcgp_ogb/outputs_time/

🔹 Reddit Dataset

Navigate to:

cd gcgp_reddit

Run GCGP:

python main.py --dataset Reddit --cond_size 77 --ridge 0.1 --k 2 --epochs 270 --learn_A 0

To reproduce all results:

sh run.sh
  • Outputs: ./gcgp_reddit/outputs/
  • Results: ./gcgp_reddit/results.csv

For training time evaluation:

sh run_time.sh
  • Outputs: ./gcgp_reddit/outputs_time/

📖 Cite Our Paper

If you find our paper or code useful, please cite:

@article{wang2025efficient,
  title={Efficient Graph Condensation via Gaussian Process},
  author={Wang, Lin and Li, Qing},
  journal={arXiv preprint arXiv:2501.02565},
  year={2025}
}

📄 License

MIT License © 2025 WANG Lin
