MR-AT: Map Reduce based Apriori Technique for Sequential Pattern Mining using Big Data in Hadoop
The Apriori algorithm, which mines frequent itemsets, is one of the best-known and most widely implemented data mining methods. In recent years, a number of algorithms have been introduced on parallel and distributed platforms to improve its performance. They differ from one another in the load balancing technique, memory system, data decomposition technique, and data layout used in their implementation. Most of the problems that arise with distributed frameworks stem from the overhead of managing distributed systems and the lack of high-level parallel programming languages. In addition, grid computing always carries the risk of node failure, which causes tasks to be re-executed multiple times. The MapReduce approach developed by Google can be used to address these problems. MapReduce is a programming model for large-scale distributed data processing on large clusters of commodity computers. It is efficient, scalable, and easy to use, and it is also used in cloud computing. This paper presents an enhanced version of the Apriori algorithm, referred to as Improved Parallel and Distributed Apriori (IPDA). It is built on Hadoop MapReduce, a scalable environment used to analyse Big Data. By generating split-frequent data locally and eliminating infrequent data early, the proposed work aims primarily to reduce the heavy demand on available resources and the communication overhead incurred whenever frequent data are retrieved.
The paper reports test results showing that IPDA outperforms traditional Apriori and Parallel and Distributed Apriori in execution time and in the number of rules generated across various minimum support values.
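The idea of finding frequent data locally in each split and discarding infrequent data early can be illustrated with a small sketch. The abstract does not give IPDA's actual procedure, so the code below swaps in the well-known two-phase partition scheme (often called SON) as a stand-in: any itemset that is globally frequent must be frequent in at least one split, so only locally frequent itemsets need to be counted globally. The function names, the `max_size` cap, and the brute-force enumeration of itemsets (instead of level-wise Apriori candidate generation) are all assumptions made for brevity, and plain Python loops stand in for Hadoop map and reduce tasks.

```python
from collections import Counter
from itertools import combinations


def local_frequent(split, min_support_ratio, max_size=2):
    """Map, phase 1: itemsets that are frequent inside a single split."""
    threshold = min_support_ratio * len(split)
    counts = Counter()
    for transaction in split:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    # Early elimination: locally infrequent itemsets never leave the split.
    return {itemset for itemset, n in counts.items() if n >= threshold}


def count_candidates(split, candidates):
    """Map, phase 2: count each global candidate inside a single split."""
    counts = Counter()
    for transaction in split:
        items = set(transaction)
        for candidate in candidates:
            if items.issuperset(candidate):
                counts[candidate] += 1
    return counts


def mine(splits, min_support_ratio, max_size=2):
    """Driver (hypothetical): two map/reduce rounds over the splits."""
    # Reduce, phase 1: the union of locally frequent itemsets is the candidate set.
    candidates = set()
    for split in splits:
        candidates |= local_frequent(split, min_support_ratio, max_size)
    # Reduce, phase 2: sum per-split counts and keep globally frequent itemsets.
    totals = Counter()
    for split in splits:
        totals.update(count_candidates(split, candidates))
    n_transactions = sum(len(split) for split in splits)
    threshold = min_support_ratio * n_transactions
    return {c: k for c, k in totals.items() if k >= threshold}


if __name__ == "__main__":
    splits = [
        [["a", "b"], ["a", "c"]],
        [["a", "b"], ["b", "c"]],
    ]
    print(mine(splits, min_support_ratio=0.5))
```

Only the candidate set and the per-split counts cross split boundaries in this sketch, which is the kind of communication saving the abstract describes; how IPDA itself achieves it may differ.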
Auricle Technologies, Pvt., Ltd.
Related Results
Enhancing Big Data Security in Hadoop using Machine Learning
In the era of Big Data, where vast amounts of information are generated and analysed to extract valuable insights, ensuring the security of data has become paramount. Hadoop, as a ...
Distributed frequent hierarchical pattern mining for robust and efficient large-scale association discovery
Frequent pattern mining is a classic data mining technique, generally applicable to a wide range of application domains, and a mature area of research. The fundamental challenge ar...
Light at the End of the Tunnel: Mining Justice and Health
The mining industry provides valuable mined commodities and financial support for communities worldwide. Mining has become safer for workers. Significant injustices, however, are c...
Secure Cloud Data with Attribute-based Honey Encryption
Encryption is a technique to convert plain text into cipher text, which is unreadable without an appropriate decryption key. Hadoop is a platform to process and st...
Hadoop Tools
As the name indicates, this chapter explains the various additional tools provided by Hadoop. The additional tools provided by Hadoop distribution are Hadoop Streaming, Hadoop Arch...
A Comparative Study on Association Rule Mining Algorithms on the Hospital Infection Control Dataset
Administrative procedures in various organizations produce numerous crucial records and data. These records and data are also used in other processes like customer relationship man...
Apriori Algorithm and Hybrid Apriori Algorithm in the Data Mining: A Comprehensive Review
Data mining has the potential to empower healthcare organizations by allowing them to analyze various aspects of patient information and discover connections between seemingly unre...
YouTube: big data analytics using Hadoop and map reduce
We live today in a digital world a tremendous amount of data is generated by each digital service we use. This vast amount of data generated is called Big Data. According to Wikipe...

