CienaLLM: Generative Climate-Impact Extraction from News Articles with Autoregressive LLMs
By: Javier Vela-Tambo, Jorge Gracia, Fernando Dominguez-Castro
Potential Business Impact:
Helps computers find climate news facts fast.
Understanding and monitoring the socio-economic impacts of climate hazards requires extracting structured information from heterogeneous news articles at scale. To that end, we have developed CienaLLM, a modular framework based on schema-guided Generative Information Extraction. CienaLLM uses open-weight Large Language Models for zero-shot information extraction from news articles, and supports configurable prompts and output schemas, multi-step pipelines, and cloud or on-premise inference. To systematically assess how the choice of LLM family, size, precision regime, and prompting strategy affects performance, we run a large factorial study across models, precisions, and prompt engineering techniques. An additional response parsing step nearly eliminates format errors while preserving accuracy. Larger models deliver the strongest and most stable performance, while quantization offers substantial efficiency gains with modest accuracy trade-offs. Prompt strategies show heterogeneous, model-specific effects. CienaLLM matches or outperforms the supervised baseline in accuracy for extracting drought impacts from Spanish news, although at a higher inference cost. Although evaluated on droughts, the schema-driven and model-agnostic design can be adapted to related information extraction tasks (e.g., other hazards, sectors, or languages) by editing prompts and schemas rather than retraining. We release code, configurations, and schemas to support reproducible use.
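To make the idea of schema-guided extraction with a response parsing step concrete, here is a minimal sketch in Python. It is not CienaLLM's actual code: the schema fields, the prompt wording, and the function names `build_prompt` and `parse_response` are all assumptions chosen for illustration, and the LLM call itself is left out. The sketch only shows how an editable schema could drive both the prompt and the cleanup of a loosely formatted model reply.

```python
import json
import re
from typing import Optional

# Hypothetical output schema (field name -> description).
# These fields are illustrative and are not taken from the paper.
SCHEMA = {
    "agriculture": "true if the article reports drought impacts on agriculture",
    "water_supply": "true if the article reports drought impacts on water supply",
}


def build_prompt(article: str, schema: dict) -> str:
    """Build a zero-shot prompt that asks for one boolean per schema field."""
    fields = "\n".join(f'- "{name}": {desc}' for name, desc in schema.items())
    return (
        "Read the news article and answer with a single JSON object "
        "containing exactly these boolean fields:\n"
        f"{fields}\n\nArticle:\n{article}\n\nJSON:"
    )


def parse_response(raw: str, schema: dict) -> Optional[dict]:
    """Recover a schema-conformant dict from a raw model reply.

    Returns None when no JSON object can be recovered.
    """
    # Take the span from the first "{" to the last "}", which tolerates
    # code fences and conversational text around the JSON object.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    result = {}
    for name in schema:
        value = data.get(name, False)  # missing fields default to False
        if isinstance(value, str):
            # Accept common textual spellings of a positive answer.
            value = value.strip().lower() in {"true", "yes", "1"}
        result[name] = bool(value)
    return result
```

Under these assumptions, adapting the extractor to another hazard or sector would mean editing `SCHEMA` and the prompt text only; `parse_response` keeps working because it reads its expected fields from the same schema.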