Checking for Outliers


SPSS Survival Manual by Julie Pallant: Many statistical techniques are sensitive to outliers. The techniques we covered earlier under the descriptive section can also be used to check for outliers. However, there is an alternative way to assess them.
Procedure for Identifying Outliers:
  1. From the menu at the top of the screen, click on Analyze, then Descriptive Statistics, then Explore.
  2. In the Display section, make sure Both is selected. This provides both Statistics and Plots.
  3. Click on your variable (e.g. most important problems in 12 months) and move it into the Dependent List box.
  4. Click on id in your variable list and move it into the Label Cases by box. This will give you the ID number of each outlying case.
  5. Click on the Statistics button. Click on Outliers. Click on Continue.
  6. Click on the Plots button. Click on Histogram. Ask for a Stem-and-leaf plot as well. Click on Continue.
  7. Click on the Options button. Click on Exclude cases pairwise. Click on Continue and then OK.
The output generated by this analysis is read as follows.

Reading the Output:
  1. Look at the Histogram and check the tails of the distribution for data points sitting on their own, out at the extremes.
  2. Inspect the Boxplot to see whether SPSS has identified any outliers. Outliers are displayed as little circles with an ID number attached.
  3. Make sure that each outlier's score is genuine and not an error.
  4. The Descriptives table gives you an indication of how much of a problem these outlying cases are likely to be. The value to look at is the 5% Trimmed Mean. To obtain it, SPSS removes the top and bottom 5 per cent of cases and calculates a new mean. Comparing the original mean with the trimmed mean shows whether your more extreme scores are having a lot of influence on the mean. If the two values are very different, you need to investigate these data points further.
  5. The Extreme Values table gives you the highest and lowest values recorded for the variable, along with the ID of the person with each score. This helps to identify the cases that have the outlying values.
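
For readers who want to see the arithmetic behind these checks, here is a minimal Python sketch (not SPSS, and not taken from the book). It compares the mean with a 5% trimmed mean, flags scores lying more than 1.5 box lengths beyond the quartiles, and lists the highest and lowest scores by case ID. The function names and the data are made up for illustration, and SPSS's own algorithms for the trimmed mean and the boxplot may differ in detail, so the numbers need not match SPSS output exactly.

```python
import statistics


def trimmed_mean(values, proportion=0.05):
    """Mean after dropping the lowest and highest `proportion` of cases."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # whole cases cut from each tail
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.mean(kept)


def boxplot_outliers(scores, multiplier=1.5):
    """Return {case_id: score} for scores far outside the middle 50%.

    Assumption: quartiles come from statistics.quantiles, which only
    approximates the hinges a boxplot would use.
    """
    q1, _, q3 = statistics.quantiles(list(scores.values()), n=4)
    box_length = q3 - q1
    low = q1 - multiplier * box_length
    high = q3 + multiplier * box_length
    return {cid: s for cid, s in scores.items() if s < low or s > high}


def extreme_values(scores, n=5):
    """Return the n lowest and n highest (case_id, score) pairs."""
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return {"lowest": ranked[:n], "highest": ranked[-n:][::-1]}


# Hypothetical data: case id -> score, with one suspicious value (case 20).
values = [10, 11, 12, 12, 13, 13, 14, 14, 14, 15,
          15, 15, 16, 16, 17, 17, 18, 18, 19, 60]
scores = dict(enumerate(values, start=1))

mean = statistics.mean(scores.values())
trimmed = trimmed_mean(scores.values())

print(f"Mean: {mean:.2f}  5% trimmed mean: {trimmed:.2f}")
print("Flagged cases:", boxplot_outliers(scores))
print("Extreme values:", extreme_values(scores, n=3))
```

In this made-up example the single large score pulls the ordinary mean well above the trimmed mean, which is the kind of gap the Descriptives table is meant to reveal.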

11 comments:

Anonymous,  September 24, 2011 at 10:35 PM  

This was very informative and to the point. Excellent!

Anonymous,  July 31, 2012 at 9:43 PM  

This Was Really Helpful Thanks ....

Anonymous,  November 9, 2012 at 9:56 AM  

sooooo helpful!!!!

Anonymous,  February 26, 2013 at 6:10 PM  

Great article, extremely helpful. Thank you!

Anonymous,  June 11, 2013 at 3:56 PM  

Excellent !!! Thanks a lot !!

Unknown November 19, 2013 at 8:40 AM  

Hi, thanks for this info! Question: How does one define "very different?"

"...If you find these two mean values are very different, you need to investigate the data points further."

Paul,  December 6, 2014 at 1:02 PM  

For my data set, all outliers disappeared when I changed the scale of the y-axis from linear to log. What happened?
