OpenFact: Factuality Enhanced Open Knowledge Extraction
Abstract
We focus on the factuality property during the extraction of an OpenIE corpus named OpenFact, which contains more than 12 million high-quality knowledge triplets. We break down the factuality property into two important aspects, expressiveness and groundedness, and we propose a comprehensive framework to handle both. To enhance expressiveness, we formulate each knowledge piece in OpenFact based on a semantic frame. We also design templates and extra constraints, and employ human effort, so that most OpenFact triplets contain enough detail. For groundedness, we require the main arguments of each triplet to contain linked Wikidata entities. A human evaluation suggests that the OpenFact triplets are much more accurate and contain denser information than those of OPIEC-Linked (Gashteovski et al., 2019), a recent high-quality OpenIE corpus grounded to Wikidata. Further experiments on knowledge base completion and knowledge base question answering show that OpenFact is more effective than OPIEC-Linked as supplementary knowledge when Wikidata serves as the major KG.
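The groundedness requirement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the `is_grounded` helper, the PropBank-style role labels (`ARG0`, `ARG1`), and the reading that every main argument must carry at least one linked entity are all assumptions made for illustration. The entity IDs in the usage example are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Argument:
    """One argument of a frame-based triplet (hypothetical structure)."""
    role: str                      # e.g. "ARG0", "ARG1" (assumed role labels)
    text: str                      # surface text of the argument
    wikidata_ids: List[str] = field(default_factory=list)  # linked entity IDs


@dataclass
class FrameTriplet:
    """A knowledge piece built around a semantic frame (hypothetical structure)."""
    predicate: str
    arguments: List[Argument]


def is_grounded(triplet: FrameTriplet,
                main_roles: Tuple[str, ...] = ("ARG0", "ARG1")) -> bool:
    """Return True if every main argument has at least one linked entity.

    Assumption: "main arguments" are those whose role is in `main_roles`.
    A triplet with no main arguments at all is treated as not grounded.
    """
    main_args = [a for a in triplet.arguments if a.role in main_roles]
    return bool(main_args) and all(a.wikidata_ids for a in main_args)


# Usage: the entity IDs below are illustrative placeholders.
grounded = FrameTriplet("write", [
    Argument("ARG0", "the author", ["Q_PLACEHOLDER_1"]),
    Argument("ARG1", "the novel", ["Q_PLACEHOLDER_2"]),
])
ungrounded = FrameTriplet("write", [
    Argument("ARG0", "the author", ["Q_PLACEHOLDER_1"]),
    Argument("ARG1", "something", []),  # no linked entity
])
kept = [t for t in (grounded, ungrounded) if is_grounded(t)]
```

Under these assumptions, only the first triplet survives the filter, because the second has a main argument with no linked entity.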

