FNP – FNP 2026

FNP 2026 proceedings PDF.

When Tables Go Crazy: Evaluating Multimodal Models on French Financial Documents

Links:
[2602.10384] When Tables Go Crazy: Evaluating Multimodal Models on French Financial Documents
Github contains both dataset and scripts!

Benchmark:
French visual finance docs; includes evaluation on multiple VLMs.
Charts, checkboxes, graphs, and weird edge cases are specifically present.

Bits:
LLMs, when asked to generate questions, usually generate only questions they have answers to.

CFQA: A Chinese Financial Question Answering Benchmark from Corporate Annual Reports

Links:
ZackZhu00/CFQA_Chinese_Finance_Question_Answering · Datasets at Hugging Face
zhutianning/Hallucination-detection-for-RAG: This project aims to design and implement a hallucination detection and evaluation pipeline for Retrieval-Augmented Generation (RAG) systems processing multi-modal financial reports (text, tables, images/charts).
Unclear relation: Github (?) / Addressing investor concerns: a Chinese financial question-answering benchmark with LLM-based evaluation | EPJ Data Science | Springer Nature Link (??)

Takeaway:
RAG decreases hallucinations and improves fact-extraction scores, but the benefit is not clear for other tasks involving reasoning, and it decreases scores with some models (!)

Verifiable Financial Enterprise Question Answering via Inference-Time Grounding and Traceability

LLMs, grounding, verifiable citations.
Framework, modular, real-time.
Verifies citations and fixes citation drift by looking at overlaps/presence with source documents.
Sentence-level citations allegedly lead to better groundedness.

TODO:
Many, many interesting citations to parse.
Related paper I just found: [2604.23588] FinGround: Detecting and Grounding Financial Hallucinations via Atomic Claim Verification
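
Sketch for the citation check noted under the Verifiable Financial Enterprise QA paper (verifying citations and fixing drift via overlap with source documents). This is only an illustration of the idea as written in these notes, not the paper's actual method: the token-overlap score, the 0.6 threshold, and the names `overlap` and `check_citation` are all assumptions.

```python
import re
from typing import Dict, Optional, Tuple


def tokenize(text: str) -> set:
    """Lowercased word tokens; a crude stand-in for real tokenization."""
    return set(re.findall(r"\w+", text.lower()))


def overlap(sentence: str, passage: str) -> float:
    """Fraction of the sentence's tokens that also appear in the passage."""
    sent = tokenize(sentence)
    if not sent:
        return 0.0
    return len(sent & tokenize(passage)) / len(sent)


def check_citation(
    sentence: str,
    cited_id: str,
    sources: Dict[str, str],
    threshold: float = 0.6,  # assumed cutoff, not taken from the paper
) -> Tuple[str, Optional[str]]:
    """Classify one sentence-level citation.

    Returns ("ok", cited_id) if the cited source supports the sentence,
    ("drifted", other_id) if a different source supports it better,
    ("unsupported", None) if no source reaches the threshold.
    """
    scores = {sid: overlap(sentence, text) for sid, text in sources.items()}
    if scores.get(cited_id, 0.0) >= threshold:
        return "ok", cited_id
    best_id = max(scores, key=scores.get, default=None)
    if best_id is not None and scores[best_id] >= threshold:
        return "drifted", best_id
    return "unsupported", None


# Hypothetical source passages, made up for the example.
sources = {
    "doc1": "Revenue grew 12 percent in 2023 driven by retail banking.",
    "doc2": "The board approved a dividend of 2 euros per share.",
}
print(check_citation("Revenue grew 12 percent in 2023.", "doc2", sources))
```

Here the sentence cites `doc2` but is only supported by `doc1`, so the check reports drift and proposes the better-matching source. A real system would likely swap the token overlap for something stronger (entailment or numeric matching), since lexical overlap misses paraphrases.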