
KG-FakeBench: Generating and Detecting Knowledge-Graph-Grounded Fake Information with LLMs

KG-FakeBench is a large-scale benchmark for evaluating large language models (LLMs) on detecting AI-generated misinformation under controlled factual deviations.

The benchmark leverages knowledge graphs (KGs) to generate misinformation that is factually incorrect yet semantically plausible, and introduces a KG-consistent evidence framework for structured, evidence-based detection.

📄 Paper: coming soon


🔑 Key Contributions

  • KG-Grounded Benchmark
    A large-scale dataset generated via structured KG deviations, ensuring fine-grained control over factual plausibility and transparent provenance.

  • KG-Consistent Evidence Detection
    A structured pipeline that extracts (s, r, o) triples and uses them as external evidence for LLM-based verification.

  • Comprehensive Evaluation
    Evaluation across standard, chain-of-thought (CoT), and KG-grounded prompting shows that external evidence improves detection reliability and exposes systematic model behavior and bias.
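The evidence format at the core of these contributions can be sketched as follows. This is an illustrative sketch, not the repository's actual API: the `Triple` class and `build_verification_prompt` template are hypothetical stand-ins showing how extracted (s, r, o) triples might be rendered as external evidence in an LLM verification prompt.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    s: str  # subject entity
    r: str  # relation
    o: str  # object entity

    def as_evidence(self) -> str:
        # Render the triple as one line of textual evidence.
        return f"({self.s}, {self.r}, {self.o})"

def build_verification_prompt(statement: str, evidence: list[Triple]) -> str:
    # Hypothetical prompt template: the exact wording used by the
    # benchmark is not specified here; only the evidence-grounded
    # structure (triples first, then the claim) is illustrated.
    lines = "\n".join(t.as_evidence() for t in evidence)
    return (
        "Evidence triples:\n"
        f"{lines}\n\n"
        f"Statement: {statement}\n"
        "Is the statement supported by the evidence? Answer True or False."
    )
```

The point of the structured format is provenance: every piece of evidence shown to the verifier LLM traces back to a specific KG triple.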


📊 Dataset

  • 28,900 synthetic samples
    • 14,450 high-plausibility
    • 14,450 low-plausibility
  • 1,239 real samples

The dataset is derived from WikiGraphs, ensuring structured and verifiable factual grounding.

📂 See data/ for details.


🧠 Methodology

KG-FakeBench consists of two main components:

🔹 1. KG-Driven Fake Information Generation

  • Extract reference triples ⟨s, r, o⟩ from KG
  • Generate fake triples ⟨s, r, o′⟩ with controlled plausibility
  • Use LLMs to synthesize natural language misinformation
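The controlled-plausibility step above can be sketched as a simple object-replacement routine. This is a minimal illustration, not the repository's implementation: the function name and the idea of drawing high-plausibility replacements from same-type entities versus low-plausibility ones from the global entity pool are assumptions made here for clarity.

```python
import random

def perturb_triple(triple, same_type_pool, global_pool,
                   high_plausibility=True, rng=None):
    """Replace the object o of a reference triple (s, r, o) with o'.

    Illustrative sketch: high-plausibility fakes swap o for another
    entity of the same semantic type (e.g. city -> city), while
    low-plausibility fakes draw from the global entity pool, which
    may yield type-violating objects.
    """
    rng = rng or random.Random(0)  # seeded for reproducibility
    s, r, o = triple
    pool = same_type_pool if high_plausibility else global_pool
    candidates = [e for e in pool if e != o]  # never return the true object
    return (s, r, rng.choice(candidates))
```

The perturbed triple would then be handed to an LLM to verbalize into a natural-language statement, completing the generation pipeline.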

🔹 2. KG-Consistent Evidence Detection

  • Tokenize and normalize input statements
  • Retrieve candidate entities via similarity matching
  • Ground statements into KG triples
  • Use triple-based prompting for factual verification
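The first two detection steps can be sketched with standard-library fuzzy matching. This is an assumption-laden illustration: the benchmark's actual normalizer and similarity matcher are not specified here, so `difflib` ratio matching over tokens and bigrams stands in for whatever retrieval the pipeline uses.

```python
import difflib

def normalize(text: str) -> list[str]:
    # Simple stand-in for the pipeline's normalization step:
    # lowercase, strip basic punctuation, split on whitespace.
    return text.lower().replace(",", " ").replace(".", " ").split()

def retrieve_entities(statement: str, kg_entities: list[str],
                      cutoff: float = 0.8) -> list[str]:
    """Illustrative candidate-entity retrieval: fuzzy-match each token
    and each adjacent-token bigram of the statement against KG entity
    labels. The real pipeline's matcher may differ."""
    tokens = normalize(statement)
    spans = tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    labels = [e.lower() for e in kg_entities]
    hits = []
    for span in spans:
        for match in difflib.get_close_matches(span, labels, n=1, cutoff=cutoff):
            entity = kg_entities[labels.index(match)]
            if entity not in hits:  # de-duplicate, preserve order found
                hits.append(entity)
    return hits
```

Matched entities anchor the statement to KG triples, which are then injected into the verification prompt as evidence.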

📂 Repository Structure

KG-FakeBench/
├── data/                                        # Dataset (fake + real samples)
├── code/
│   ├── KG-Driven Fake Information Generation/   # Generation pipeline
│   └── KG-Consistent Evidence Detection/        # Detection pipeline
└── README.md

βš™οΈ Code

The implementation is divided into:

  • KG-Driven Fake Information Generation
  • KG-Consistent Evidence Detection

📂 See code/ for details.


⚠️ Disclaimer

This dataset contains synthetic misinformation generated for research purposes only.
It is intended to support the development of robust and trustworthy detection systems.


🔗 Resources

  • 📦 Dataset: data/
  • ⚙️ Code: code/

🤝 Contributing

Contributions are welcome. Please open an issue or submit a pull request.
