
TriJoin: A Time-Efficient and Scalable Three-Way Distributed Stream Join System

Stream join is one of the most fundamental operations in data stream processing applications. Existing distributed stream join systems efficiently support the two-way join, which is a join operation between two streams. When built on the two-way join, a three-way join must be split into two two-way joins, where the second two-way join has to wait for the join result transmitted from the first. We show through experiments that such a design incurs prohibitively high processing latency. To solve this problem, we propose TriJoin, a time-efficient three-way distributed stream join system. We design a symmetric wait-free structure based on symmetric tuple partitioning and reused join. TriJoin uses reused join to join each new tuple with the intermediate result of the other two streams and with locally stored tuples. For a new tuple, TriJoin only joins it with the intermediate result to generate the final result without waiting, greatly reducing the processing latency. In TriJoin, we design two partitioning and storage schemes according to two different forms of three-way stream join. We implement TriJoin and conduct comprehensive experiments to evaluate its performance using real-world traces. Results show that TriJoin reduces the processing latency by up to 68% compared to existing designs.
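The idea of joining a new tuple against a maintained intermediate result, rather than waiting on an upstream two-way join, can be illustrated with a small single-process sketch. This is not TriJoin's implementation: the class name `ReusedThreeWayJoin`, the stream labels `R`, `S`, and `T`, and the restriction to an equi-join on one shared key are all assumptions made for illustration, and the sketch omits windows, eviction, and any partitioning across nodes.

```python
from collections import defaultdict

STREAMS = ("R", "S", "T")  # hypothetical stream labels


class ReusedThreeWayJoin:
    """Toy three-way equi-join on a single shared key (illustrative only)."""

    def __init__(self):
        # Raw tuples seen so far, indexed by stream and then by join key.
        self.store = {s: defaultdict(list) for s in STREAMS}
        # Pairwise intermediate results, indexed by stream pair and join key.
        self.pairs = {
            frozenset(p): defaultdict(list)
            for p in (("R", "S"), ("R", "T"), ("S", "T"))
        }

    def insert(self, stream, key, value):
        """Add one tuple and return the three-way results it completes."""
        others = [s for s in STREAMS if s != stream]

        # 1. Join the new tuple with the intermediate result of the other
        #    two streams; no upstream join has to finish first.
        results = []
        for pair in self.pairs[frozenset(others)][key]:
            row = dict(pair)
            row[stream] = value
            results.append(row)

        # 2. Extend the intermediate results that involve this stream so
        #    that later arrivals on the remaining stream can reuse them.
        for other in others:
            bucket = self.pairs[frozenset((stream, other))][key]
            for other_value in self.store[other][key]:
                bucket.append({stream: value, other: other_value})

        # 3. Keep the raw tuple for future pairwise joins.
        self.store[stream][key].append(value)
        return results


join = ReusedThreeWayJoin()
join.insert("R", 1, "r1")
join.insert("S", 1, "s1")
print(join.insert("T", 1, "t1"))
```

In this sketch every stream is handled by the same code path, so a tuple arriving on any of the three streams produces its final results in a single lookup. The cost is extra memory for the pairwise intermediate results.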

Related Results

Using join.me to help library patrons
Purpose: As the Informatics Librarian at Olivet Nazarene University, my staff and I are often responsible for troubleshooting our patrons' technology issues. My experience with join....
Wadeable stream habitat monitoring at Chattahoochee River National Recreation Area: 2021 change report
The Southeast Coast Network (SECN) stream habitat monitoring protocol collects data to give park resource managers insight into the status of and trends in stream and near-channel ...
Continental hydrosystem modelling: the concept of nested stream–aquifer interfaces
Abstract. Recent developments in hydrological modelling are based on a view of the interface being a single continuum through which water flows. These coupled hydrological-hydrogeo...
EPD Electronic Pathogen Detection v1
Electronic pathogen detection (EPD) is a non-invasive, rapid, affordable, point-of-care test for COVID-19 resulting from infection with the SARS-CoV-2 virus. EPD scanning techno...
A Bayesian hierarchical model of channel network dynamics reveals the impact of stream dynamics and connectivity on metapopulation
The active portion of river networks varies in time thanks to event-based and seasonal cycles of expansion-retraction, mimicking the unsteadiness of the underlying...
General Curved Surface Fitting and Calculation of Flow Along Arbitrarily Twisted Stream Surface
This paper consists of two parts. (1) General curved surface fitting and grid refining. A method of fitting a set of given discrete points on several stream lines to give a smooth ...
Unravelling alkalinity and dissolved inorganic carbon dynamics in an alpine stream network
Alkalinity in river ecosystems plays a crucial role in regulating carbon cycle across basin, regional, and global scales. Streamflow alkalinity acts as a pH buffer and drives the r...
Finitely Presented Heyting Algebras
In this paper we study the structure of finitely presented Heyting algebras. Using algebraic techniques (as opposed to techniques from proof-theory) we show that every s...
