CPJ: Explainable Agricultural Pest Diagnosis via Caption-Prompt-Judge with LLM-Judged Refinement
By: Wentao Zhang, Tao Fang, Lina Lu and more
Potential Business Impact:
Helps farmers quickly identify plant sicknesses.
Accurate and interpretable crop disease diagnosis is essential for agricultural decision-making, yet existing methods often rely on costly supervised fine-tuning and perform poorly under domain shifts. We propose Caption-Prompt-Judge (CPJ), a training-free few-shot framework that enhances Agri-Pest VQA through structured, interpretable image captions. CPJ employs large vision-language models to generate multi-angle captions, refined iteratively via an LLM-as-Judge module, which then inform a dual-answer VQA process for both recognition and management responses. Evaluated on CDDMBench, CPJ significantly improves performance: using GPT-5-mini captions, GPT-5-Nano achieves +22.7 pp in disease classification and +19.5 points in QA score over no-caption baselines. The framework provides transparent, evidence-based reasoning, advancing robust and explainable agricultural diagnosis without fine-tuning. Our code and data are publicly available at: https://github.com/CPJ-Agricultural/CPJ-Agricultural-Diagnosis.
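The caption, judge, and dual-answer stages described in the abstract can be pictured roughly as follows. This is only a minimal sketch, not the authors' implementation: the function names (`refine_caption`, `dual_answer`), the score threshold, the round limit, and the prompt wording are all assumptions made for illustration, and the model calls are replaced by stub functions so the example runs without any API access.

```python
from typing import Any, Callable, Tuple

# Hypothetical signatures: a captioner takes (image, feedback) and returns text;
# a judge takes a caption and returns (score, feedback).
Captioner = Callable[[Any, str], str]
Judge = Callable[[str], Tuple[float, str]]


def refine_caption(image: Any, captioner: Captioner, judge: Judge,
                   threshold: float = 0.8, max_rounds: int = 3) -> Tuple[str, float]:
    """Regenerate a caption until the judge's score reaches the threshold.

    The threshold and max_rounds values are illustrative assumptions.
    """
    feedback = ""
    best_caption, best_score = "", float("-inf")
    for _ in range(max_rounds):
        caption = captioner(image, feedback)
        score, feedback = judge(caption)
        if score > best_score:
            best_caption, best_score = caption, score
        if score >= threshold:
            break
    return best_caption, best_score


def dual_answer(image: Any, caption: str, question: str,
                vqa: Callable[[Any, str], str]) -> Tuple[str, str]:
    """Ask the VQA model for a recognition answer and a management answer."""
    context = f"Caption: {caption}\nQuestion: {question}\n"
    recognition = vqa(image, context + "Identify the crop and the disease.")
    management = vqa(image, context + "Suggest management steps.")
    return recognition, management


# Stub models standing in for real vision-language model calls (assumptions).
def stub_captioner(image: Any, feedback: str) -> str:
    return "leaf with brown spots" if feedback else "leaf"


def stub_judge(caption: str) -> Tuple[float, str]:
    if "spots" in caption:
        return 0.9, ""
    return 0.5, "describe the visible symptoms"


def stub_vqa(image: Any, prompt: str) -> str:
    return "management" if "management" in prompt else "recognition"


caption, score = refine_caption(None, stub_captioner, stub_judge)
answers = dual_answer(None, caption, "What is wrong with this plant?", stub_vqa)
```

With the stubs above, the first caption is rejected by the judge, the feedback is passed back to the captioner, and the second caption is accepted; the accepted caption is then used as context for both answers.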
Similar Papers
Are vision-language models ready to zero-shot replace supervised classification models in agriculture?
CV and Pattern Recognition
Helps farmers spot plant problems better.
AgroBench: Vision-Language Model Benchmark in Agriculture
CV and Pattern Recognition
Helps AI tell sick plants from healthy ones.
Rethinking Plant Disease Diagnosis: Bridging the Academic-Practical Gap with Vision Transformers and Zero-Shot Learning
CV and Pattern Recognition
Helps farmers spot plant sickness from photos.