EMNLP 2023
Deep Natural Language Feature Learning for Interpretable Prediction
Felipe Urrutia, Cristian Buc Calderon, Valentin Barrière
Cited by 3
Abstract
We propose a general method to break down a complex main task into a set of easier intermediate sub-tasks, formulated in natural language as binary questions related to the final target task. Our method represents each example by a vector consisting of the answers to these questions. We call this representation Natural Language Learned Features (NLLF). NLLF is generated by a small transformer language model (e.g., BERT) trained in a Natural Language Inference (NLI) fashion, using weak labels automatically obtained from a Large Language Model (LLM). We show that the LLM normally struggles with the main task when using in-context learning, but can handle these easier sub-tasks and produce useful weak labels to train a BERT model. The NLI-like training of BERT enables zero-shot inference with any binary question, not only those seen during training. We show that this NLLF vector not only improves performance by enhancing any classifier, but can also be used as input to an easy-to-interpret machine learning model such as a decision tree. This decision tree is interpretable yet achieves high performance, surpassing that of a pre-trained transformer in some cases. We have successfully applied this method to two completely different tasks: detecting incoherence in students' answers to open-ended mathematics exam questions, and screening abstracts for a systematic literature review of scientific papers on climate change and agroecology.
[Figure: an interpretable ML model screens an abstract using the answers to binary questions, plus one lexical feature, and outputs "Include ✔".]
Q: Does the article address the relationship between agroecological practices and climate change? A: Yes
Q: Does the article analyze how agroecology affects nitrogen dynamics? A: Yes
Q: Does the article assess agroecological practices' impact on climate change? A: Yes
Lexical feature: The text has a word with the prefix "convent"
Example input, title: Contribution of crop residue, soil, and fertilizer nitrogen to nitrous oxide emissions varies with long-term crop rotation and tillage
Example input, abstract: Agriculture is an important contributor to N2O emissions (a potent greenhouse gas), with high peaks occurring when soil mineral nitrogen (N) is high (e.g., after mineralization of organic N and N fertilizer application). Nitrogen dynamics in soil, and consequently N2O emissions, are affected by crop and soil management practices (e.g., crop rotation and tillage), an effect mostly assessed in the literature through comparisons of total N2O emission. Hence, information is scarce on the effect of these management practices on specific N sources affecting N2O emissions (i.e., N fertilizer, soil, above- and belowground crop residues), a knowledge gap explored in this study with the use of N-15 tracers. The isotope approach enabled …
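The NLLF idea from the abstract (represent each text by its answers to binary questions, then train an interpretable model on that vector) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the questions, the toy texts and labels, and the keyword-based `toy_yes_probability`, which merely stands in for the NLI-trained BERT. A one-question decision stump stands in for the decision tree; a real pipeline would more likely use a fine-tuned transformer and a library tree such as scikit-learn's `DecisionTreeClassifier`.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical binary questions; the keyword lists only drive the toy scorer below.
QUESTION_KEYWORDS: Dict[str, List[str]] = {
    "Does the article mention climate?": ["climate"],
    "Does the article mention nitrogen?": ["nitrogen"],
}


def toy_yes_probability(text: str, question: str) -> float:
    """Stand-in for the NLI-trained BERT: P(answer is "yes") for (text, question)."""
    lowered = text.lower()
    return 1.0 if all(k in lowered for k in QUESTION_KEYWORDS[question]) else 0.0


def nllf_vector(text: str, questions: Sequence[str],
                scorer: Callable[[str, str], float]) -> List[float]:
    """Represent a text by one answer score per binary question."""
    return [scorer(text, q) for q in questions]


def fit_stump(X: Sequence[Sequence[float]], y: Sequence[int]) -> Tuple[int, float]:
    """Pick the single question whose 'yes' answer best predicts the label.

    Returns (question index, training accuracy). This is a depth-1 tree,
    used here as a simplified stand-in for a full decision tree.
    """
    best_feature, best_accuracy = 0, -1.0
    for j in range(len(X[0])):
        predictions = [1 if row[j] >= 0.5 else 0 for row in X]
        accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
        if accuracy > best_accuracy:
            best_feature, best_accuracy = j, accuracy
    return best_feature, best_accuracy


# Toy screening data (invented): label 1 = include, 0 = exclude.
questions = list(QUESTION_KEYWORDS)
texts = [
    "Crop rotation changes nitrogen dynamics and climate-relevant N2O emissions.",
    "Tillage affects soil nitrogen under a changing climate.",
    "A survey of transformer language models.",
    "Nitrogen fixation in marine bacteria.",
]
labels = [1, 1, 0, 0]

X = [nllf_vector(t, questions, toy_yes_probability) for t in texts]
feature_index, train_accuracy = fit_stump(X, labels)
print(f"Decision rule: include if '{questions[feature_index]}' is answered yes")
```

Because each feature is the answer to a readable question, the learned rule can be read directly as a statement about the text, which is the interpretability property the abstract emphasizes.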