UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection
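UQLM's own API is not reproduced here, but the core idea behind black-box, UQ-based detection is easy to sketch: sample the model several times at nonzero temperature and treat low agreement between samples as a hallucination signal. In the sketch below, `ask_llm` is a hypothetical stand-in for any prompt-to-string LLM client, and the threshold is illustrative.

```python
from difflib import SequenceMatcher

def consistency_score(prompt, ask_llm, n_samples=5):
    """Black-box UQ sketch: sample the LLM several times and measure
    pairwise agreement between answers; unstable answers are a common
    hallucination signal. `ask_llm` is a hypothetical callable that
    maps a prompt string to an answer string."""
    answers = [ask_llm(prompt) for _ in range(n_samples)]
    total, pairs = 0.0, 0
    for i in range(len(answers)):
        for j in range(i + 1, len(answers)):
            total += SequenceMatcher(None, answers[i], answers[j]).ratio()
            pairs += 1
    return total / pairs  # near 1.0 = stable answer, near 0.0 = unstable

# Illustrative usage: flag low-agreement answers for review.
# if consistency_score(question, ask_llm) < 0.7:
#     mark_for_review(question)
```

Real packages typically replace the string-similarity step with stronger semantic measures (entailment, embedding similarity), but the sample-and-compare loop is the same.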
Lightweight hallucination detection framework for RAG applications
Up-to-date curated list of state-of-the-art research, papers, and resources on hallucinations in large vision-language models
[ACL 2024] User-friendly evaluation framework: an eval suite plus benchmarks such as UHGEval, HaluEval, and HalluQA
HaluMem is the first operation-level hallucination evaluation benchmark tailored to agent memory systems.
🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".
[ACL 2026] Paper list on Video LLM hallucination. Stars and contributions welcome!
Unofficial implementation of Microsoft's Claimify paper: extracts specific, verifiable, decontextualized claims from LLM Q&A for use in hallucination, groundedness, relevance, and truthfulness detection
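Claimify's actual pipeline is more involved, with multiple stages, but the claim-decomposition step the description refers to can be sketched as a single prompted call. `CLAIM_PROMPT` and `ask_llm` below are hypothetical names for illustration, not the repo's API.

```python
import json

# Hypothetical prompt in the spirit of claim decomposition: split an
# answer into atomic, self-contained claims that can be verified
# independently for groundedness or truthfulness.
CLAIM_PROMPT = """Decompose the answer below into a JSON list of short,
specific, self-contained factual claims. Resolve pronouns using the question.

Question: {question}
Answer: {answer}

JSON list of claims:"""

def extract_claims(question, answer, ask_llm):
    """`ask_llm` is a hypothetical callable: prompt -> completion string."""
    raw = ask_llm(CLAIM_PROMPT.format(question=question, answer=answer))
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Fall back to one claim per non-empty line if the model
        # does not return valid JSON.
        return [line.strip("- ") for line in raw.splitlines() if line.strip()]
```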
A benchmark for evaluating hallucinations in large vision-language models
Code release for THRONE, a CVPR 2024 paper on measuring object hallucinations in LVLM-generated text.
Repository for the paper "A Unified Definition of Hallucination: It’s The World Model, Stupid!" https://arxiv.org/abs/2512.21577
When AI makes $10M decisions, hallucinations aren't bugs; they're business risks. We built verification infrastructure that makes AI agents accountable without slowing them down.
TrustScoreEval: trust scores for AI/LLM responses. Detects hallucinations, flags misinformation, and validates outputs. Build trustworthy AI.
A blind benchmark for legal citation verification: 4-label classification over IL + federal primary law
MCP server for Eruka: anti-hallucination context memory for AI agents
HALLUCINATED BY CURSOR WITH CODEX PLUGIN :: BEWARE :: BaseX Coding Language - Revolutionary Base 5.10 Quantum Teleportation & Infinite Storage System by Joshua Hendricks Cole
A comprehensive study on reducing hallucinations in Large Language Models through strategic prompt engineering techniques (Chain-of-Verification, Chain-of-Thought, and a hybrid of the two)
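As a rough illustration of the Chain-of-Verification loop such prompt-engineering studies evaluate, here is a minimal sketch, assuming `ask_llm` maps a prompt to a completion string; it is not the repository's code.

```python
def chain_of_verification(question, ask_llm):
    """Minimal Chain-of-Verification sketch: draft an answer, generate
    verification questions, answer them independently, then revise."""
    draft = ask_llm(f"Answer concisely:\n{question}")
    plan = ask_llm(
        "List 3 short fact-checking questions, one per line, that would "
        f"verify this answer:\nQ: {question}\nA: {draft}"
    )
    checks = []
    for vq in [line for line in plan.splitlines() if line.strip()][:3]:
        # Each verification question is answered without showing the
        # draft, so the model cannot simply restate its own mistake.
        checks.append((vq, ask_llm(vq)))
    evidence = "\n".join(f"{q} -> {a}" for q, a in checks)
    return ask_llm(
        f"Question: {question}\nDraft answer: {draft}\n"
        f"Verification Q&A:\n{evidence}\n"
        "Rewrite the draft, correcting anything the verification contradicts."
    )
```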
Moving away from binary "hallucination" evals toward a more nuanced, context-dependent evaluation technique.
An interactive Python chatbot demonstrating real-time contextual hallucination detection in Large Language Models using the "Lookback Lens" method. The project implements the attention-based lookback-ratio feature extraction and a trained classifier to identify when an LLM deviates from the provided context during generation.
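The lookback ratio the description mentions is simple to compute once attention weights are collected. A minimal sketch, assuming the weights for one decoding step have been gathered into a `[n_layers, n_heads, seq_len]` array (a hypothetical layout; real model outputs need reshaping into this form):

```python
import numpy as np

def lookback_ratios(attn, n_ctx):
    """Per-head lookback ratio at one decoding step: attention mass on
    the provided context divided by attention mass on the context plus
    the newly generated tokens. Spans of generated text whose ratios
    drop are candidates for contextual hallucination.

    attn:  array of shape [n_layers, n_heads, seq_len] with the current
           step's attention weights (hypothetical layout for this sketch).
    n_ctx: number of context/prompt tokens at the start of the sequence.
    """
    ctx_mass = attn[:, :, :n_ctx].sum(axis=-1)   # attention to context
    new_mass = attn[:, :, n_ctx:].sum(axis=-1)   # attention to generated text
    return ctx_mass / (ctx_mass + new_mass + 1e-9)  # [n_layers, n_heads]
```

Averaged over a span of generated tokens, these per-head ratios form the feature vector that the trained classifier consumes.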
Dataset generation and pre-processing scripts for the research "Leveraging the Domain Adaptation of Retrieval Augmented Generation (RAG) Models in Conversational AI for Enhanced Customer Service"