Bag-of-Frames: Improving Bag-of-Words for a better similarity measure
Abstract
This paper introduces the Bag-of-Frames (BoF) model, a novel approach to textual document representation that extends and improves the classical Bag-of-Words (BoW) model by using VerbNet frames instead of words. While BoW treats documents as collections of word frequencies, BoF captures the semantic content conveyed by verbal frames rather than by lexical items. Our experiments suggest that BoF can improve performance in various natural language processing tasks. The dimensionality of BoF is considerably lower than that of BoW, which results in reduced complexity and improved performance. BoF discovers frame-level similarity where BoW finds none, because BoW operates only at the word level.
We implement BoF and present empirical results for estimating similarity between sets of sentences. Each sentence, like each document, is represented by a vector, so sentences and documents can be viewed as points in a multidimensional vector space.
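To make the contrast between word-level and frame-level similarity concrete, the following is a minimal sketch, not the paper's implementation. It assumes cosine similarity as the measure (the abstract does not name one) and uses a tiny hand-written verb-to-frame table, `TOY_FRAMES`, whose labels are hypothetical and are not taken from VerbNet.

```python
from collections import Counter
from math import sqrt

# Hypothetical verb -> frame-label lookup, invented for illustration only.
# A real system would consult VerbNet; these labels are NOT VerbNet data.
TOY_FRAMES = {
    "buy": "obtain",
    "purchase": "obtain",
    "give": "transfer",
    "hand": "transfer",
}


def bow_vector(tokens):
    """Bag-of-Words: one dimension per distinct word, valued by its count."""
    return Counter(tokens)


def bof_vector(tokens, frames=TOY_FRAMES):
    """Bag-of-Frames (sketch): one dimension per frame label, valued by its count.

    Tokens with no entry in the lookup table are ignored.
    """
    return Counter(frames[t] for t in tokens if t in frames)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two sentences that share no words but use verbs mapped to the same toy frame.
s1 = ["mary", "buy", "book"]
s2 = ["john", "purchase", "novel"]

bow_sim = cosine(bow_vector(s1), bow_vector(s2))  # no shared words
bof_sim = cosine(bof_vector(s1), bof_vector(s2))  # same frame label

# Number of distinct dimensions needed by each representation for these sentences.
bow_dims = len(set(s1) | set(s2))
bof_dims = len(set(bof_vector(s1)) | set(bof_vector(s2)))
```

In this toy case the two sentences have no words in common, so the BoW vectors are orthogonal, while both verbs fall under the same frame label and the BoF vectors coincide. The frame vocabulary is also smaller than the word vocabulary, which mirrors the dimensionality argument in the abstract.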


