Liang, J.[Jian],
Huang, W.K.[Wen-Ke],
Wan, G.C.[Guan-Cheng],
Yang, Q.[Qu],
Ye, M.[Mang],
LoRASculpt: Sculpting LoRA for Harmonizing General and Specialized
Knowledge in Multimodal Large Language Models,
CVPR25(26170-26180)
IEEE DOI
2508
Visualization, Large language models, Prevention and mitigation,
Redundancy, Trajectory, Tuning
BibRef
Cao, Y.[Yue],
Xing, Y.[Yun],
Zhang, J.[Jie],
Lin, D.[Di],
Zhang, T.W.[Tian-Wei],
Tsang, I.[Ivor],
Liu, Y.[Yang],
Guo, Q.[Qing],
SceneTAP: Scene-Coherent Typographic Adversarial Planner against
Vision-Language Models in Real-World Environments,
CVPR25(25050-25059)
IEEE DOI Code:
WWW Link.
2508
Printing, Visualization, Codes, Cognition, Planning,
physical adversarial attack, typographic attack, llm agent,
BibRef
Wang, Y.B.[Yan-Bo],
Guan, J.[Jiyang],
Liang, J.[Jian],
He, R.[Ran],
Do We Really Need Curated Malicious Data for Safety Alignment in
Multi-modal Large Language Models?,
CVPR25(19879-19889)
IEEE DOI
2508
Large language models, Current measurement, Boosting, Data models,
Safety, Security, distribution gap, MLLM safety
BibRef
Peng, R.[Ruotian],
He, H.Y.[Hai-Ying],
Wei, Y.[Yake],
Wen, Y.D.[Yan-Dong],
Hu, D.[Di],
Patch Matters: Training-free Fine-grained Image Caption Enhancement via
Local Perception,
CVPR25(3963-3973)
IEEE DOI Code:
WWW Link.
2508
Visualization, Filtering, Computational modeling, Source coding,
Semantics, Pipelines, Text to image, Reliability, Text to video,
hallucinations
BibRef
Yang, Z.[Zhihe],
Luo, X.[Xufang],
Han, D.Q.[Dong-Qi],
Xu, Y.J.[Yun-Jian],
Li, D.S.[Dong-Sheng],
Mitigating Hallucinations in Large Vision-Language Models via DPO:
On-Policy Data Hold the Key,
CVPR25(10610-10620)
IEEE DOI
2508
Training, Fault diagnosis, Benchmark testing, Data models,
Optimization, reinforcement learning, dpo, hallucinations,
large language models
BibRef
Bae, K.[Kyungho],
Kim, J.[Jinhyung],
Lee, S.[Sihaeng],
Lee, S.[Soonyoung],
Lee, G.[Gunhee],
Choi, J.[Jinwoo],
MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through
Disentangled Spatial-Temporal Representations,
CVPR25(13744-13753)
IEEE DOI
2508
Technological innovation, Attention mechanisms,
Large language models, Benchmark testing, Predictive models,
Context modeling
BibRef
Yin, H.[Hao],
Si, G.Z.[Guang-Zong],
Wang, Z.[Zilei],
ClearSight: Visual Signal Enhancement for Object Hallucination
Mitigation in Multimodal Large Language Models,
CVPR25(14625-14634)
IEEE DOI Code:
WWW Link.
2508
Training, Visualization, Accuracy, Prevention and mitigation,
Large language models, Computational modeling, Coherence, Decoding,
attention mechanism
BibRef
Yang, L.[Le],
Zheng, Z.W.[Zi-Wei],
Chen, B.[Boxu],
Zhao, Z.Y.[Zheng-Yu],
Lin, C.H.[Chen-Hao],
Shen, C.[Chao],
Nullu: Mitigating Object Hallucinations in Large Vision-Language
Models via HalluSpace Projection,
CVPR25(14635-14645)
IEEE DOI Code:
WWW Link.
2508
Visualization, Costs, Codes, Large language models,
Computational modeling, Null space, Feature extraction,
ai safety
BibRef
Wu, Y.C.[Yuan-Chen],
Zhang, L.[Lu],
Yao, H.[Hang],
Du, J.L.[Jun-Long],
Yan, K.[Ke],
Ding, S.H.[Shou-Hong],
Wu, Y.S.[Yun-Sheng],
Li, X.Q.[Xiao-Qiang],
Antidote: A Unified Framework for Mitigating LVLM Hallucinations in
Counterfactual Presupposition and Object Perception,
CVPR25(14646-14656)
IEEE DOI
2508
Prevention and mitigation, Focusing, Benchmark testing,
Reliability, Optimization, Synthetic data
BibRef
Tu, Y.[Yahan],
Hu, R.[Rui],
Sang, J.[Jitao],
ODE: Open-Set Evaluation of Hallucinations in Multimodal Large
Language Models,
CVPR25(19836-19845)
IEEE DOI Code:
WWW Link.
2508
Visualization, Protocols, Codes, Large language models,
Benchmark testing, Question answering (information retrieval),
Contamination
BibRef
Liu, J.Z.[Jia-Zhen],
Fu, Y.H.[Yu-Han],
Xie, R.[Ruobing],
Xie, R.[Runquan],
Sun, X.[Xingwu],
Lian, F.Z.[Feng-Zong],
Kang, Z.[Zhanhui],
Li, X.R.[Xi-Rong],
PhD: A ChatGPT-Prompted Visual hallucination Evaluation Dataset,
CVPR25(19857-19866)
IEEE DOI
2508
Visualization, Image synthesis, Large language models, Pipelines,
Question generation, Distance measurement, computer vision, mllms,
hallucination evaluation
BibRef
Jiang, Z.Q.[Zhang-Qi],
Chen, J.K.[Jun-Kai],
Zhu, B.[Beier],
Luo, T.J.[Ting-Jin],
Shen, Y.K.[Yan-Kun],
Yang, X.[Xu],
Devils in Middle Layers of Large Vision-Language Models:
Interpreting, Detecting and Mitigating Object Hallucinations via
Attention Lens,
CVPR25(25004-25014)
IEEE DOI
2508
Training, Visualization, Computational modeling, Semantics,
Reliability, Lenses, vision-language models, hallucinations
BibRef
Park, E.[Eunkyu],
Kim, M.[Minyeong],
Kim, G.[Gunhee],
HalLoc: Token-level Localization of Hallucinations for Vision
Language Models,
CVPR25(29893-29903)
IEEE DOI Code:
WWW Link.
2508
Training, Location awareness, Visualization, Accuracy,
Computational modeling, Prevention and mitigation,
hallucination detection benchmark for vision and language models
BibRef
Suo, W.[Wei],
Zhang, L.J.[Li-Jun],
Sun, M.Y.[Meng-Yang],
Wu, L.Y.B.[Lin Yuan-Bo],
Wang, P.[Peng],
Zhang, Y.N.[Yan-Ning],
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding,
CVPR25(29904-29914)
IEEE DOI Code:
WWW Link.
2508
Visualization, Adaptation models, Codes, Benchmark testing,
Hybrid power systems, Cognition, Decoding, Faces,
contrastive decoding
BibRef
An, W.B.[Wen-Bin],
Tian, F.[Feng],
Leng, S.[Sicong],
Nie, J.H.[Jia-Hao],
Lin, H.[Haonan],
Wang, Q.Y.[Qian-Ying],
Chen, P.[Ping],
Zhang, X.Q.[Xiao-Qin],
Lu, S.J.[Shi-Jian],
Mitigating Object Hallucinations in Large Vision-Language Models with
Assembly of Global and Local Attention,
CVPR25(29915-29926)
IEEE DOI Code:
WWW Link.
2508
Visualization, Codes, Grounding, Prevention and mitigation,
Computational modeling, Decoding, Object recognition, Assembly,
large vision-language models
BibRef
Zhuang, X.W.[Xian-Wei],
Zhu, Z.H.[Zhi-Hong],
Xie, Y.X.[Yu-Xin],
Liang, L.M.[Li-Ming],
Zou, Y.X.[Yue-Xian],
VASparse: Towards Efficient Visual Hallucination Mitigation via
Visual-Aware Token Sparsification,
CVPR25(4189-4199)
IEEE DOI Code:
WWW Link.
2508
Training, Visualization, Codes, Prevention and mitigation,
Benchmark testing, Inference algorithms, Decoding
BibRef
Basak, D.[Debolena],
Bhatt, S.[Soham],
Kanduri, S.[Sahith],
Desarkar, M.S.[Maunendra Sankar],
Aerial Mirage: Unmasking Hallucinations in Large Vision Language
Models,
WACV25(5500-5508)
IEEE DOI
2505
Training, Reviews, Annotations, Surveillance, Computational modeling,
Decision making, Data models, Reliability, Drones
BibRef
Tang, F.L.[Fei-Long],
Liu, C.Z.[Cheng-Zhi],
Xu, Z.X.[Zhong-Xing],
Hu, M.[Ming],
Huang, Z.[Zile],
Xue, H.C.[Hao-Chen],
Chen, Z.Y.[Zi-Yang],
Peng, Z.L.[Ze-Lin],
Yang, Z.W.[Zhi-Wei],
Zhou, S.J.[Si-Jin],
Li, W.X.[Wen-Xue],
Li, Y.L.[Yu-Long],
Song, W.X.[Wen-Xuan],
Su, S.Y.[Shi-Yan],
Feng, W.[Wei],
Su, J.[Jionglong],
Lin, M.[Minquan],
Peng, Y.F.[Yi-Fan],
Cheng, X.L.[Xue-Lian],
Razzak, I.[Imran],
Ge, Z.Y.[Zong-Yuan],
Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with
Attention Causal Decoding,
CVPR25(26147-26159)
IEEE DOI
2508
Heart, Visualization, Large language models, Video sequences,
Interference, Question answering (information retrieval)
BibRef
Yang, J.N.[Jia-Ning],
Chen, X.[Xuweiyi],
Madaan, N.[Nikhil],
Iyengar, M.[Madhavan],
Qian, S.[Shengyi],
Fouhey, D.F.[David F.],
Chai, J.[Joyce],
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding
and Less Hallucination,
CVPR25(29501-29512)
IEEE DOI
2508
Technological innovation, Solid modeling, Grounding,
Benchmark testing, Reliability engineering, Sparks, Tuning,
visual grounding
BibRef
Yoon, D.[Dokyoon],
Song, Y.[Youngsook],
Park, W.[Woomyong],
Stop learning it all to mitigate visual hallucination, Focus on the
hallucination target,
CVPR25(4200-4208)
IEEE DOI
2508
Learning systems, Visualization, Large language models, Focusing,
Information filters, Data augmentation, Reliability,
preference learning
BibRef
Chen, J.Z.[Jun-Zhe],
Zhang, T.S.[Tian-Shu],
Huang, S.Y.[Shi-Yu],
Niu, Y.W.[Yu-Wei],
Zhang, L.F.[Lin-Feng],
Wen, L.J.[Li-Jie],
Hu, X.M.[Xu-Ming],
ICT: Image-Object Cross-Level Trusted Intervention for Mitigating
Object Hallucination in Large Vision-Language Models,
CVPR25(4209-4221)
IEEE DOI Code:
WWW Link.
2508
Visualization, Head, Costs, Computational modeling, Data models,
Information and communication technology, Decoding,
inference intervention
BibRef
Kim, B.[Bumsoo],
Shin, W.[Wonseop],
Lee, K.[Kyuchul],
Jung, Y.[Yonghoon],
Seo, S.[Sanghyun],
Make VLM Recognize Visual Hallucination on Cartoon Character Image
with Pose Information,
WACV25(5398-5407)
IEEE DOI Code:
WWW Link.
2505
Training, Visualization, Solid modeling, Accuracy,
Large language models, Semantics, Text to image,
large-scale text-to-image (TTI) models
BibRef
Huang, P.H.[Po-Hsuan],
Li, J.L.[Jeng-Lin],
Chen, C.P.[Chin-Po],
Chang, M.C.[Ming-Ching],
Chen, W.C.[Wei-Chao],
Who Brings the Frisbee: Probing Hidden Hallucination Factors in Large
Vision-Language Model via Causality Analysis,
WACV25(6125-6135)
IEEE DOI
2505
Training, Visualization, Prevention and mitigation,
Computational modeling, Semantics, Natural languages,
causal analysis
BibRef
Liu, S.[Shi],
Zheng, K.[Kecheng],
Chen, W.[Wei],
Paying More Attention to Image: A Training-free Method for Alleviating
Hallucination in LVLMs,
ECCV24(LXXXIII: 125-140).
Springer DOI
2412
BibRef
Zhang, J.[Jinrui],
Wang, T.[Teng],
Zhang, H.G.[Hai-Gang],
Lu, P.[Ping],
Zheng, F.[Feng],
Reflective Instruction Tuning: Mitigating Hallucinations in Large
Vision-language Models,
ECCV24(LXVIII: 196-213).
Springer DOI
2412
BibRef
Kaul, P.[Prannay],
Li, Z.Z.[Zhi-Zhong],
Yang, H.[Hao],
Dukler, Y.[Yonatan],
Swaminathan, A.[Ashwin],
Taylor, C.J.,
Soatto, S.[Stefano],
THRONE: An Object-Based Hallucination Benchmark for the Free-Form
Generations of Large Vision-Language Models,
CVPR24(27218-27228)
IEEE DOI
2410
Measurement, Training, Ethics, Accuracy, Computational modeling,
Graphics processing units, hallucination, benchmark, LLM, LVLM,
large vision-language model
BibRef
Jiang, C.Y.[Chao-Ya],
Xu, H.Y.[Hai-Yang],
Dong, M.F.[Meng-Fan],
Chen, J.X.[Jia-Xing],
Ye, W.[Wei],
Yan, M.[Ming],
Ye, Q.H.[Qing-Hao],
Zhang, J.[Ji],
Huang, F.[Fei],
Zhang, S.K.[Shi-Kun],
Hallucination Augmented Contrastive Learning for Multimodal Large
Language Model,
CVPR24(27026-27036)
IEEE DOI Code:
WWW Link.
2410
Representation learning, Visualization, Codes,
Large language models, Natural languages, Contrastive learning
BibRef
Huang, Q.D.[Qi-Dong],
Dong, X.Y.[Xiao-Yi],
Zhang, P.[Pan],
Wang, B.[Bin],
He, C.H.[Cong-Hui],
Wang, J.Q.[Jia-Qi],
Lin, D.[Dahua],
Zhang, W.M.[Wei-Ming],
Yu, N.H.[Neng-Hai],
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models
via Over-Trust Penalty and Retrospection-Allocation,
CVPR24(13418-13427)
IEEE DOI Code:
WWW Link.
2410
Training, Measurement, Costs, Codes, Large language models, Focusing,
Hallucination, Large vision-language model, Multimodal LLM, LLM
BibRef
Yu, Q.F.[Qi-Fan],
Li, J.C.[Jun-Cheng],
Wei, L.H.[Long-Hui],
Pang, L.[Liang],
Ye, W.T.[Wen-Tao],
Qin, B.S.[Bo-Sheng],
Tang, S.L.[Si-Liang],
Tian, Q.[Qi],
Zhuang, Y.T.[Yue-Ting],
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual
Instruction Data,
CVPR24(12944-12953)
IEEE DOI Code:
WWW Link.
2410
Measurement, Visualization, Toxicology, Correlation, Codes,
Large language models, Hallucinations,
Vision-language reasoning
BibRef
Favero, A.[Alessandro],
Zancato, L.[Luca],
Trager, M.[Matthew],
Choudhary, S.[Siddharth],
Perera, P.[Pramuditha],
Achille, A.[Alessandro],
Swaminathan, A.[Ashwin],
Soatto, S.[Stefano],
Multi-Modal Hallucination Control by Visual Information Grounding,
CVPR24(14303-14312)
IEEE DOI
2410
Training, Visualization, Grounding, Linguistics, Sampling methods,
Inference algorithms, Vision, language, reasoning
BibRef
Ouali, Y.[Yassine],
Bulat, A.[Adrian],
Martinez, B.[Brais],
Tzimiropoulos, G.[Georgios],
CLIP-DPO: Vision-language Models as a Source of Preference for Fixing
Hallucinations in LVLMs,
ECCV24(LXXVI: 395-413).
Springer DOI
2412
BibRef
Ye-Bin, M.[Moon],
Hyeon-Woo, N.[Nam],
Choi, W.[Wonseok],
Oh, T.H.[Tae-Hyun],
Beaf: Observing Before-after Changes to Evaluate Hallucination in
Vision-language Models,
ECCV24(XI: 232-248).
Springer DOI
2412
BibRef
Kim, M.[Minchan],
Kim, M.[Minyeong],
Bae, J.[Junik],
Choi, S.[Suhwan],
Kim, S.[Sungkyung],
Chang, B.[Buru],
Exploiting Semantic Reconstruction to Mitigate Hallucinations in
Vision-language Models,
ECCV24(LXXXVI: 236-252).
Springer DOI
2412
BibRef
Guan, T.R.[Tian-Rui],
Liu, F.[Fuxiao],
Wu, X.[Xiyang],
Xian, R.Q.[Rui-Qi],
Li, Z.X.[Zong-Xia],
Liu, X.Y.[Xiao-Yu],
Wang, X.[Xijun],
Chen, L.[Lichang],
Huang, F.[Furong],
Yacoob, Y.[Yaser],
Manocha, D.[Dinesh],
Zhou, T.Y.[Tian-Yi],
Hallusionbench: An Advanced Diagnostic Suite for Entangled Language
Hallucination and Visual Illusion in Large Vision-Language Models,
CVPR24(14375-14385)
IEEE DOI Code:
WWW Link.
2410
Visualization, Analytical models, Accuracy, Statistical analysis,
Computational modeling, Benchmark testing, Vision language model,
VLM Evaluation
BibRef
Leng, S.[Sicong],
Zhang, H.[Hang],
Chen, G.Z.[Guan-Zheng],
Li, X.[Xin],
Lu, S.J.[Shi-Jian],
Miao, C.Y.[Chun-Yan],
Bing, L.[Lidong],
Mitigating Object Hallucinations in Large Vision-Language Models
through Visual Contrastive Decoding,
CVPR24(13872-13882)
IEEE DOI
2410
Training, Visualization, Accuracy, Computational modeling,
Benchmark testing, Decoding, Multimodality,
Vision and Language
BibRef
Wang, Z.[Zhecan],
Bingham, G.[Garrett],
Yu, A.W.[Adams Wei],
Le, Q.V.[Quoc V.],
Luong, T.[Thang],
Ghiasi, G.[Golnaz],
Haloquest: A Visual Hallucination Dataset for Advancing Multimodal
Reasoning,
ECCV24(LXXVII: 288-304).
Springer DOI
2412
BibRef
Wang, T.J.J.[Tzu-Jui Julius],
Laaksonen, J.[Jorma],
Langer, T.[Tomas],
Arponen, H.[Heikki],
Bishop, T.E.[Tom E.],
Learning by Hallucinating:
Vision-Language Pre-training with Weak Supervision,
WACV23(1073-1083)
IEEE DOI
2302
Visualization, Vocabulary, Computational modeling, Detectors,
Benchmark testing, Transformers, un-supervised learning
BibRef
Chapter on Implementations and Applications, Databases, QBIC, Video Analysis, Hardware and Software, Inspection continues in
Jailbreaking Language Models.