Teaching:TUW - UE InfoVis WS 2008/09 - Gruppe 02 - Aufgabe 1 - Scatterplot
Revision as of 00:42, 2 November 2008
A scatterplot (also called a scatter chart, scatter diagram or scatter graph [Wikipedia]) is a diagram in which the values of two metric variables are plotted on the horizontal and vertical axes of a Cartesian coordinate system. Each resulting point in the graph represents one record from a data set. The distribution pattern of the points from multiple records reveals, among other qualities, the correlation between the selected variables in the data set. The scatterplot is not to be confused with the correlation plot [Information Technology Lab, NIST #2], which displays already computed correlation coefficients of different data groups, while the term correlation diagram does not seem to have a fixed meaning.
Revealed Information
Type of Correlation
Perfect linear correlation results in all samples lying on the regression line, whose slope is positive or negative depending on the sign of the correlation coefficient [University of Illinois]. Note that the magnitude of the (nonzero) slope carries no meaning in this kind of diagram [Wikipedia Correlation, EN], since it depends on the axis scales.
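The point about axis scales can be checked numerically. The following is a minimal sketch, not taken from the cited sources: the helper names `pearson_r` and `slope` and the sample values are illustrative assumptions. It rescales the x values (as a change of unit would) and compares the least-squares slope with the Pearson correlation coefficient before and after.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def slope(xs, ys):
    """Slope of the least-squares regression line of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical, perfectly linear sample: y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]

# Same records with x expressed in a unit 100 times smaller
xs_scaled = [x * 100 for x in xs]

r_original = pearson_r(xs, ys)
r_rescaled = pearson_r(xs_scaled, ys)
slope_original = slope(xs, ys)
slope_rescaled = slope(xs_scaled, ys)
```

In this sketch the slope shrinks by the scaling factor while the correlation coefficient is unchanged, which is why only the sign of the slope, not its steepness, should be read from the diagram.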
Other patterns of linear correlation.
(regression function, "scatterplot smoothing" [NetMBA])
sign, strength (TODO: add about figures with: perfect positive, strong tight negative, weak loose positive, no correlation, clusters)
4 figures:
- perfect positive
- high negative
- low positive (with regression line)
- no correlation
- some with regression line
Generally: refer to regression analysis for further ...
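Sample data for the four planned figures could be produced along the following lines. This is only a sketch under stated assumptions: the target coefficients 1.0, -0.9, 0.3 and 0.0 are illustrative choices for "perfect", "high", "low" and "no" correlation, and `correlated_sample` is a hypothetical helper that mixes a standard normal variable with independent noise so that the population correlation equals `rho`.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlated_sample(rho, n, seed=0):
    """Draw n points (x, y) whose population correlation is rho.

    x and the noise term are independent standard normal values;
    y = rho * x + sqrt(1 - rho^2) * noise has unit variance and
    correlation rho with x.
    """
    rng = random.Random(seed)
    noise_weight = math.sqrt(1.0 - rho * rho)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + noise_weight * noise)
    return xs, ys

# Assumed target coefficients for the four figure types
patterns = {
    "perfect positive": 1.0,
    "high negative": -0.9,
    "low positive": 0.3,
    "no correlation": 0.0,
}

# Sample coefficient actually obtained for each pattern
sample_r = {}
for name, rho in patterns.items():
    xs, ys = correlated_sample(rho, n=2000)
    sample_r[name] = pearson_r(xs, ys)
```

The sample coefficients only approximate the targets, except in the perfect case, where the noise weight is zero and every point lies exactly on the line.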
Density, Outliers and Clusters
density (-> cluster analysis) & outliers
- 1 image for clusters
- 1 with outlier
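For the outlier figure, a point that stands apart from the general trend can also be flagged numerically. The sketch below is one possible approach and not a method taken from the references: it fits a least-squares line and marks points whose residual exceeds an assumed threshold of two residual standard deviations. The helper names, the threshold and the sample data are all hypothetical.

```python
import math

def fit_line(xs, ys):
    """Least-squares regression line of ys on xs: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def flag_outliers(xs, ys, threshold=2.0):
    """Indices of points whose residual exceeds threshold * residual spread."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residuals of a least-squares fit with intercept average to zero,
    # so their root mean square is their standard deviation.
    spread = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    if spread == 0.0:
        return []  # all points lie exactly on the line
    return [i for i, r in enumerate(residuals) if abs(r) / spread > threshold]

# Hypothetical sample: y = x, with one record moved far off the line
xs = [float(x) for x in range(1, 11)]
ys = list(xs)
ys[4] = 15.0

outlier_indices = flag_outliers(xs, ys)
```

Because the outlier itself inflates the residual spread, a fixed threshold like this is only a rough screening aid; the scatterplot remains the primary tool for judging such points.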
Scatterplots of Higher Dimensions
A scatterplot is not necessarily limited to two variables: higher dimensions can be displayed spatially or through point properties (color, size, shape).
TODO: add figure with colored 3D plot, sunflower plot [addictedtor.free.fr], [York University], jitter plot
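Sunflower plots and jitter plots both address points that coincide exactly and would otherwise be drawn on top of each other. As a rough sketch (the helper names, the jitter amount of 0.2 and the sample points are assumptions), the first step of a sunflower plot is counting how many records share each position, while a jitter plot shifts every point by a small random offset.

```python
import random
from collections import Counter

def overlap_counts(points):
    """Number of records at each distinct (x, y) position.

    In a sunflower plot this count determines how many petals are drawn.
    """
    return Counter(points)

def jitter(points, amount=0.2, seed=0):
    """Shift each point by a uniform random offset in [-amount, amount]."""
    rng = random.Random(seed)
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]

# Hypothetical discrete records with repeated positions
points = [(1, 1), (1, 1), (1, 1), (2, 3), (2, 3), (3, 2)]

counts = overlap_counts(points)
jittered = jitter(points)
```

The jitter amount should stay well below the spacing of the discrete values so that shifted points remain visually attributable to their original position.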
Treating Discrete Data
[Wikipedia, DE]
References
- Wikipedia, EN: http://en.wikipedia.org/wiki/Scatterplot
- Wikipedia, DE: http://de.wikipedia.org/wiki/Streudiagramm
- Wikipedia Correlation, EN: http://en.wikipedia.org/wiki/Correlation
- University of Illinois: http://www.mste.uiuc.edu/courses/ci330ms/youtsey/scatterinfo.html
- Information Technology Lab, NIST #1: http://www.itl.nist.gov/div898/handbook/eda/section3/eda33q.htm
- Information Technology Lab, NIST #2: http://www.itl.nist.gov/div898/handbook/eda/section3/linecorr.htm
- NetMBA: http://www.netmba.com/statistics/plot/scatter/
- ChartItNow: http://www.chartitnow.com/scatter%20diagram.html
- addictedtor.free.fr: http://addictedtor.free.fr/graphiques/graphcode.php?graph=59
- York University: http://www.math.yorku.ca/SCS/sasmac/sunplot.html
- NLVM: http://matti.usu.edu/nlvm/nav/frames_asid_144_g_4_t_5.html