TinyLFU-Based Semi-Stream Cache Join for Near-Real-Time Data Warehousing
Abstract
Semi-stream join is an emerging research problem in near-real-time data warehousing. A semi-stream join is a join between a fast stream (S) and a slow disk-based relation (R). Large volumes of data are now generated every day and must be analyzed promptly to support business decisions. The CACHEJOIN (Cache Join) algorithm was proposed with this need in mind. Its limitation is that it does not handle frequently changing trends in the stream data efficiently. To overcome this limitation, we propose TinyLFU-CACHEJOIN, a modified version of the original CACHEJOIN algorithm designed to improve its performance. TinyLFU-CACHEJOIN keeps in the cache only those records of R that have a high hit rate in S, which allows it to cope with sudden and abrupt trend changes in S. We developed a cost model for TinyLFU-CACHEJOIN and validated it empirically. We also compared the performance of TinyLFU-CACHEJOIN with that of the existing CACHEJOIN algorithm on a skewed synthetic dataset. The experiments showed that TinyLFU-CACHEJOIN significantly outperforms CACHEJOIN.
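To make the caching idea concrete, the following is a minimal sketch of a frequency-based admission filter in front of a cache of R records. It is not the paper's algorithm or cost model, which the abstract does not detail. It assumes the general TinyLFU idea: an approximate frequency sketch with periodic aging, and a rule that admits a new record only if its join key is estimated to be more frequent in S than the key it would evict. All names (`CountMinSketch`, `AdmissionCache`, `semi_stream_join`), the parameter values, and the use of an in-memory dict as a stand-in for the disk-based relation are hypothetical.

```python
import hashlib
from collections import OrderedDict


class CountMinSketch:
    """Approximate per-key counter; counters are halved periodically (aging)."""

    def __init__(self, width=1024, depth=4, sample_size=1000):
        self.width, self.depth, self.sample_size = width, depth, sample_size
        self.table = [[0] * width for _ in range(depth)]
        self.additions = 0

    def _index(self, key, row):
        # One hash function per row, derived by prefixing the row number.
        digest = hashlib.sha256(f"{row}:{key!r}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, key):
        for row in range(self.depth):
            self.table[row][self._index(key, row)] += 1
        self.additions += 1
        if self.additions >= self.sample_size:
            self._age()

    def estimate(self, key):
        return min(self.table[r][self._index(key, r)] for r in range(self.depth))

    def _age(self):
        # Halving lets old trends fade so newly hot keys can win admission.
        for row in self.table:
            for i in range(len(row)):
                row[i] //= 2
        self.additions //= 2


class AdmissionCache:
    """LRU-ordered cache of R records guarded by a frequency-based admission test."""

    def __init__(self, capacity, sketch):
        self.capacity, self.sketch = capacity, sketch
        self.entries = OrderedDict()  # join key -> R record, oldest first

    def get(self, key):
        self.sketch.add(key)  # every stream tuple counts as one access
        if key in self.entries:
            self.entries.move_to_end(key)
            return self.entries[key]
        return None

    def offer(self, key, record):
        """Try to cache a record; return True if it was admitted."""
        if key in self.entries or len(self.entries) < self.capacity:
            self.entries[key] = record
            return True
        victim = next(iter(self.entries))  # least recently used key
        if self.sketch.estimate(key) > self.sketch.estimate(victim):
            del self.entries[victim]
            self.entries[key] = record
            return True
        return False  # candidate is colder than the victim: keep the victim


def semi_stream_join(stream, relation, cache):
    """Join (key, payload) stream tuples with R; return (output, disk_reads)."""
    output, disk_reads = [], 0
    for key, payload in stream:
        record = cache.get(key)
        if record is None:
            record = relation.get(key)  # assumption: dict stands in for disk R
            disk_reads += 1
            if record is None:
                continue  # no matching R record, so no join output
            cache.offer(key, record)
        output.append((key, payload, record))
    return output, disk_reads
```

In this sketch a key seen once in S cannot displace a cached key that has been seen more often, while aging allows a key that becomes frequent later to overtake stale entries. A real implementation would also batch disk access to R rather than issuing one lookup per miss.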

