Outlier detection and rejection in scatterplots: Do outliers influence intuitive statistical judgments?
According to a growing body of research, human adults are remarkably accurate at extracting intuitive statistics from graphs, such as finding the best-fitting regression line through a scatterplot. Here, we ask whether humans can also perform outlier rejection, a non-trivial statistical problem. In three experiments, we investigated human adults’ capacity to evaluate the linear trend of a flashed scatterplot comprising 0-4 outlier datapoints. Experiment 1 showed that participants did not spontaneously reject outliers: when outliers were not mentioned, their presence biased the participants’ trend judgments and regression line estimates. In experiment 2, where participants were explicitly asked to exclude outliers, the outlier-induced bias was reduced but remained significant. In experiment 3, where participants were asked to explicitly detect any outlier before adjusting their regression line, outlier detection was satisfactory, but the detected outliers continued to bias the regression responses, unless they were quite distant from the main regression line. We propose a simple model for outlier detection, according to which humans detect outliers by computing a z-score that estimates how far a given datapoint is from the distribution of distances to the regression line. Detection is not rejection, however, and our results suggest that humans can remain biased by outliers that they have detected.
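The z-score idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes the "distance to the regression line" is the vertical residual from an ordinary least-squares fit, and the cutoff `threshold=2.5`, the function names, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def residual_z_scores(x, y):
    """Fit an OLS line and z-score each point's vertical residual.

    Assumption: vertical residuals stand in for "distance to the
    regression line"; the abstract does not specify the distance measure.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    return (residuals - residuals.mean()) / residuals.std(ddof=1)

def detect_outliers(x, y, threshold=2.5):
    """Flag points whose absolute residual z-score exceeds a cutoff.

    The cutoff value is a hypothetical choice, not taken from the paper.
    """
    return np.abs(residual_z_scores(x, y)) > threshold

# Synthetic scatterplot: a noisy linear trend plus one planted outlier.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
y = 0.8 * x + rng.normal(0, 0.05, size=x.size)
y[5] += 1.0  # planted outlier, well above the trend at low x

flags = detect_outliers(x, y)

# Detection versus rejection: compare the slope fitted on all points
# with the slope fitted after dropping the flagged points.
slope_all = np.polyfit(x, y, 1)[0]
slope_clean = np.polyfit(x[~flags], y[~flags], 1)[0]
print("flagged indices:", np.flatnonzero(flags))
print("slope with outlier:", slope_all)
print("slope without flagged points:", slope_clean)
```

Comparing the two slopes separates detection (flagging a point) from rejection (excluding it from the fit), which is the distinction the abstract draws.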
Related Results
Outlier Detection and Correction for the Deviations of Tooth Profiles of Gears
To decrease the influence of outlier on the measurement of tooth profiles, this paper proposes a method of outlier detection and correction based on the grey system theory. After s...
Investigating Outlier Detection Techniques Based on Kernel Rough Clustering
Background: Data quality is crucial to the success of big data analytics. However, the presence of outliers affects data quality and data analysis. Employing effective outlier dete...
An Improved Innovation Robust Outliers Detection Method for Airborne Array Position and Orientation Measurement System
The airborne array position and orientation measurement system (array POS) is a key device for high-resolution multi-dimensional real-time imaging motion compensation of military r...
A Monte Carlo-Based Outlier Diagnosis Method for Sensitivity Analysis
An iterative outlier elimination procedure based on hypothesis testing, commonly known as Iterative Data Snooping (IDS) among geodesists, is often used for the quality control of t...
A Monte Carlo-Based Outlier Diagnosis Method for Sensitivity Analysis
An iterative outlier elimination procedure based on hypothesis testing, commonly known as Iterative Data Snooping (IDS) among geodesists, is often used for the quality control of m...
Outlier detection for mixed data
Abstract
Outlier detection is crucial in various sectors such as finance, insurance, medicine, and IT security. Its application helps to identify suspicious behaviors and e...
Model Hibrid Harmonik, ARMA dan Outlier Curah Hujan di Surabaya, Malang dan Banyuwangi [Hybrid Harmonic, ARMA, and Outlier Model of Rainfall in Surabaya, Malang, and Banyuwangi]
The main factors affecting the climate in Indonesia are monsoons, El Nino Southern Oscillation (ENSO) and sunspot cycles. Another influence is local characteristics, namely the geo...
Improving Human Mobility Forecasts: A Study on Outlier Correction with Multi-Agent Techniques
Abstract
This research is part of designing and implementing an AutoML platform for time series forecasting. Having discussed missing value imputation and outlier detection...

