
Clustering analysis for classifying fake real estate listings

With the rapid growth of online property rental and sale platforms, fake real estate listings have become a significant concern. These deceptive listings waste the time and effort of buyers and sellers and pose potential risks, so effective methods for distinguishing genuine from fake listings are needed. Accurately identifying fake real estate listings is a critical challenge, and clustering analysis can significantly improve this process. While clustering has been widely used to detect fraud in various fields, its application in the real estate domain has been limited, focusing primarily on auctions and property appraisals. This study aims to fill this gap by using clustering to classify properties into fake and genuine listings based on datasets curated by industry experts. A K-means model was developed to group properties into clusters that clearly distinguish fake from genuine listings. To assure the quality of the training data, pre-processing procedures were applied to the raw dataset. Several techniques were used to determine the optimal value for each parameter of the K-means model. The number of clusters was selected using the Silhouette coefficient, the Calinski-Harabasz index, and the Davies-Bouldin index. Two clusters proved to be the best choice, and the Canberra distance performed best when compared with overlap similarity and Jaccard as the distance measure. The clustering results were assessed using two machine learning algorithms: Random Forest and Decision Tree. The results show that the optimized K-means significantly improves the accuracy of the Random Forest classification model, boosting it by 96%. Furthermore, this research demonstrates that clustering helps create a balanced dataset containing fake and genuine clusters.
This balanced dataset holds promise for future investigations, particularly for deep learning models that require balanced data to perform optimally. The study presents a practical and effective way to identify fake real estate listings through clustering analysis, ultimately contributing to a more trustworthy and secure real estate market.
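The cluster-count selection named in the abstract (Silhouette coefficient, Calinski-Harabasz index, Davies-Bouldin index) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes scikit-learn, uses synthetic two-group data in place of the expert-curated listings, and runs standard Euclidean K-means rather than the Canberra distance reported in the study. The helper name `score_cluster_counts` is hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)


def score_cluster_counts(X, k_values, seed=0):
    """Fit K-means for each candidate k and collect three validity indices."""
    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(X)
        results[k] = {
            # Higher is better: cohesion relative to separation, in [-1, 1].
            "silhouette": silhouette_score(X, labels),
            # Higher is better: between-cluster vs within-cluster dispersion.
            "calinski_harabasz": calinski_harabasz_score(X, labels),
            # Lower is better: average similarity of each cluster to its nearest one.
            "davies_bouldin": davies_bouldin_score(X, labels),
        }
    return results


# Assumption: synthetic stand-in data with two well-separated groups,
# standing in for "fake" and "genuine" listings.
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5.0, -5.0], [5.0, 5.0]],
    cluster_std=1.0,
    random_state=0,
)

scores = score_cluster_counts(X, range(2, 7))
best_by_silhouette = max(scores, key=lambda k: scores[k]["silhouette"])
best_by_davies_bouldin = min(scores, key=lambda k: scores[k]["davies_bouldin"])

for k, s in scores.items():
    print(k, {name: round(value, 3) for name, value in s.items()})
print("best k by silhouette:", best_by_silhouette)
print("best k by Davies-Bouldin:", best_by_davies_bouldin)
```

The three indices do not always agree on real data, so in practice they are usually read side by side for each candidate k rather than reduced to a single winner.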

PeerJ

Maifuza Mohd Amin, Nor Samsiah Sani, Mohammad Faidzul Nasrudin, Salwani Abdullah, Amit Chhabra, Faizal Abd Kadir

PeerJ Computer Science

2024


Abstract. Shopee is a company engaged in online-based buying and selling services. One of the latest features of Shopee is the ShopeeFood service and has standard rules that must b...

The Kernel Rough K-Means Algorithm

Background: Clustering is one of the most important data mining methods. The k-means (c-means) and its derivative methods are the hotspot in the field of clustering research in re...

DISCOURSE: KNOWLEDGE, NEWS, AND FAKE INTERTWINED

Discourse has been a focal point for linguists over an extended period. The multidisciplinary character of the term ‘discourse’ has resulted in diverse approaches aiming to define ...

Metadata analysis of retracted fake papers in Naunyn-Schmiedeberg’s Archives of Pharmacology

An increasing fake paper problem is a cause for concern in the scientific community. These papers look scientific but contain manipulated data or are completely fictitious....

Image clustering using exponential discriminant analysis

Local learning based image clustering models are usually employed to deal with images sampled from the non‐linear manifold. Recently, linear discriminant analysis (LDA) based vario...

Hold On! Your Emotion and Behaviour when Falling for Fake News in Social Media

Researchers are concerned about the impact of fake news on democracy, while it could also escalate to life-threatening problems. Fake news continues to spread, so does people's beh...

Deep Learning for Forgery Face Detection Using Fuzzy Fisher Capsule Dual Graph

In digital manipulation, creating fake images/videos or swapping face images/videos with another person is done by using a deep learning algorithm is termed deep fake. Fake pornogr...

Construction of Real Estate Debt Crisis Early Warning Model Based on RBF Neural Network

The current market economic environment is constantly changing, and real estate companies are constantly facing various risks in the course of their operations, which have created ...


Related Results