Extracting Abstraction Dimensions by Identifying Syntax Pattern from Texts
By: Jian Zhou, Jiazheng Li, Sirui Zhuge and more
Potential Business Impact:
Helps computers understand and find information in text.
This paper proposes an approach for automatically discovering the subject, action, object and adverbial dimensions of texts, so that texts can be operated on efficiently and queried in natural language. Each dimension is represented as an abstraction tree built from subclass relations. The high quality of the trees guarantees that all subjects, actions, objects and adverbials within the texts, together with their subclass relations, can be represented. The independence of the trees ensures that there is no redundant representation between trees. The expressiveness of the trees ensures that the majority of sentences can be accessed from each tree and the remaining sentences can be accessed from at least one tree, so that the tree-based search mechanism can support querying in natural language. Experiments show that the average precision, recall and F1-score of the abstraction trees constructed from the subclass relations of subject, action, object and adverbial all exceed 80%. Applying the approach to natural-language querying shows that different types of question patterns for querying a subject or object have high coverage of the texts, and that searching multiple trees on subject, action, object and adverbial according to the question pattern quickly reduces the search space to locate target sentences, which supports precise operations on texts.
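The multi-tree search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `AbstractionTree` class, the `search` function, the concept names and the three hand-annotated sentences are all hypothetical, and the step that extracts subjects, actions, objects and adverbials from syntax patterns is not shown. The sketch only assumes that each dimension is a tree of subclass relations with sentences indexed at its nodes, and that a query narrows the search space by intersecting the candidates returned by each tree.

```python
from collections import defaultdict


class AbstractionTree:
    """Toy tree of concepts linked by subclass relations (hypothetical structure)."""

    def __init__(self):
        self.children = defaultdict(set)   # concept -> direct subclasses
        self.sentences = defaultdict(set)  # concept -> ids of sentences indexed there

    def add_subclass(self, parent, child):
        self.children[parent].add(child)

    def index(self, concept, sentence_id):
        self.sentences[concept].add(sentence_id)

    def lookup(self, concept):
        """Return ids of sentences indexed at `concept` or any of its descendants."""
        found, seen, stack = set(), set(), [concept]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            found |= self.sentences.get(node, set())
            stack.extend(self.children.get(node, ()))
        return found


def search(trees, query):
    """Intersect the candidate sets of every dimension named in the query."""
    result = None
    for dimension, concept in query.items():
        hits = trees[dimension].lookup(concept)
        result = hits if result is None else result & hits
    return result if result is not None else set()


# Hand-annotated toy data (assumed, not taken from the paper).
# 0: "The parser quickly extracts triples."
# 1: "The tagger labels tokens."
# 2: "The parser builds trees."
annotations = {
    0: {"subject": "parser", "action": "extract", "object": "triple", "adverbial": "quickly"},
    1: {"subject": "tagger", "action": "label", "object": "token"},
    2: {"subject": "parser", "action": "build", "object": "tree"},
}

trees = defaultdict(AbstractionTree)
trees["subject"].add_subclass("tool", "parser")   # a parser is a kind of tool
trees["subject"].add_subclass("tool", "tagger")   # a tagger is a kind of tool
trees["action"].add_subclass("process", "extract")
trees["action"].add_subclass("process", "label")
trees["action"].add_subclass("process", "build")

for sentence_id, roles in annotations.items():
    for dimension, concept in roles.items():
        trees[dimension].index(concept, sentence_id)
```

In this sketch, `search(trees, {"subject": "tool"})` reaches all three sentences through the subclass relations, while adding `"action": "extract"` to the query leaves only sentence 0. Each additional dimension can only shrink the candidate set, which is the sense in which searching several trees reduces the search space.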