Revision as of 01:42, 17 November 2010

Graphical Inference for Infovis

The following article summarizes the work of [Wickham et al., 2010] on graphical inference.

Introduction

Information visualization provides tools to uncover new relations in data. While these techniques can be very effective, they are jeopardized by apophenia, the human tendency to see patterns in noise. Statistics, on the other hand, provides methods to examine whether such relationships can be inferred from sample data, or whether the hypotheses are unsupported and have to be rejected. Graphical inference tries to find a balance between the two: information visualisation to improve the identification of new hypotheses, and statistics to reveal faulty conclusions.

The authors present two experimental protocols, Rorschach and Line-Up, which show how the techniques mentioned above (statistics and information visualisation) can be combined.

Motivation

Statistical methods generally try to show whether a hypothesis is supported by the data. More specifically, statistics investigates whether a difference exists (testing) or how big the difference is (estimation). Graphical inference asks whether a difference actually exists, so it works as a testing procedure.

Statistical Foundation

For such a statistical test one defines a null hypothesis H₀, which is tested against the alternative hypothesis H₁. Because statistical results are based on probabilities, mistakes can happen. The two possible errors are classified as follows:

	Null Hypothesis (H₀) is true	Alternative Hypothesis (H₁) is true
Null Hypothesis is accepted	Right decision	Type II Error (False Negative)
Null Hypothesis is rejected	Type I Error (False Positive)	Right decision

(See the External Links section for further information on statistical hypothesis testing.)

The statistical testing process can be compared to the criminal justice system, in which an accused is judged guilty or innocent. The null hypothesis corresponds to innocence: during the trial the defense tries to show that the null hypothesis is true, while the prosecution advocates the alternative hypothesis.

The statistical test compares the accused with known innocents, using a specific metric. To assess the guilt of the accused, the proportion of innocents who look more guilty than the accused is computed. A type I error corresponds to convicting an innocent, a type II error to acquitting a guilty person.

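In statistics we use a test statistic (e.g. the t-statistic) to calculate the p-value: the probability, assuming the null hypothesis is true, of observing data at least as extreme as the data actually observed. A small p-value is evidence against the null hypothesis. When visual testing is used instead, the data is plotted and the visual difference is measured (tested) by a human judge or jury.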
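The "proportion of innocents who look more guilty than the accused" can be made concrete with a permutation test. The following is only a minimal Python sketch, not a method taken from the paper: the function name `permutation_p_value`, the choice of the absolute difference in group means as the metric, and the default of 999 null samples are all assumptions made for illustration.

```python
import random


def permutation_p_value(group_a, group_b, n_null=999, seed=0):
    """Estimate a p-value for the absolute difference in group means.

    Hypothetical helper: the "innocents" are datasets in which the group
    labels have been shuffled, so any difference is due to chance alone.
    """
    rng = random.Random(seed)

    def mean_diff(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = mean_diff(group_a, group_b)  # the "accused"
    pooled = list(group_a) + list(group_b)
    split = len(group_a)

    more_extreme = 0
    for _ in range(n_null):
        rng.shuffle(pooled)  # break any link between value and group label
        if mean_diff(pooled[:split], pooled[split:]) >= observed:
            more_extreme += 1

    # Count the observed statistic itself so the estimate is never exactly 0.
    return (more_extreme + 1) / (n_null + 1)
```

A small return value means that few shuffled ("innocent") datasets look as extreme as the real one, which is evidence against the null hypothesis.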

Protocols

The paper presents two protocols for graphical inference, Rorschach and Line-Up, which are described in the following sections.

Rorschach

The Rorschach protocol (named after the Rorschach test, in which a subject has to interpret abstract ink blots) is used to calibrate the analyst's intuition by showing only null plots, i.e. plots of data generated under the null hypothesis.

An example of such a Rorschach plot is given in Figure 1: nine histograms summarizing the accuracy with which 500 participants perform nine tasks. What do you see?

The goal of this exercise is to accustom the analyst to random deviations, and thereby reduce the effect of apophenia for the given type of visualisation. To keep the analysts alert, however, plots of the real data may occasionally be interspersed.
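As a rough illustration of the protocol, the sketch below builds a set of null datasets and, with a small probability, swaps one of them for the real data. It is written in Python under stated assumptions and does not reproduce the authors' software: the names `rorschach_panels`, `null_generator` and `p_real` are hypothetical, and the caller has to supply a function that draws one dataset under the null hypothesis.

```python
import random


def rorschach_panels(real, null_generator, n=9, p_real=0.0, seed=None):
    """Return (panels, has_real) for a Rorschach-style display.

    real           -- the real dataset (a sequence of values)
    null_generator -- hypothetical callable taking a random.Random and
                      returning one dataset drawn under the null hypothesis
    n              -- number of panels to show
    p_real         -- probability that one panel is replaced by the real data
    """
    rng = random.Random(seed)
    panels = [null_generator(rng) for _ in range(n)]

    # Occasionally slip in the real data to keep the analyst alert.
    has_real = rng.random() < p_real
    if has_real:
        panels[rng.randrange(n)] = list(real)
    return panels, has_real
```

Each returned panel would then be drawn with the same plot type (for example a histogram) so that the analyst sees how much structure pure chance can produce.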

Line-Up

The idea behind the line-up (named after the police lineup) is to show the real data plot camouflaged by decoys. If the observer is able to identify the real data, we can assume that it differs from the null plots. The line-up procedure consists of the following steps:

generate n - 1 decoys (null datasets)
make a plot of the decoys and the real data (positioning the real data plot randomly)
let an observer assess which plot shows the real data.
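The steps above can be sketched in a few lines of Python. This is an illustrative sketch under assumptions, not the authors' implementation: the function name `make_lineup` is hypothetical, the data is assumed to be paired (x, y) values for a scatterplot, and decoys are generated by permuting y, which corresponds to the null hypothesis that x and y are unrelated.

```python
import random


def make_lineup(x, y, n=20, seed=None):
    """Return (panels, true_position) for a line-up of n datasets.

    Each panel is a list of (x, y) pairs. The n - 1 decoys are null
    datasets created by shuffling y (assumed null: x and y unrelated).
    """
    rng = random.Random(seed)

    # Step 1: generate n - 1 decoys.
    decoys = []
    for _ in range(n - 1):
        shuffled = list(y)
        rng.shuffle(shuffled)
        decoys.append(list(zip(x, shuffled)))

    # Step 2: place the real data at a random (0-based) position.
    true_position = rng.randrange(n)
    panels = decoys[:true_position] + [list(zip(x, y))] + decoys[true_position:]

    # Step 3 (showing the panels to an observer) is left to the caller.
    return panels, true_position
```

Keeping `true_position` separate from the panels makes it easier to hide the answer from whoever shows the plots to the observer.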

The probability (p-value) of such a line-up is easily calculated: if the real data does not differ from the decoys, an observer picks it with probability 1/n. A practicable choice of 19 decoys (n = 20) gives a probability of 1/20 = 0.05 (the classical significance threshold) of picking the real plot by chance. To obtain more precise p-values, the judge (a single observer) can be replaced by a jury.
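One way to turn a jury's verdict into a p-value is a binomial tail probability. The Python sketch below assumes that each juror independently views a line-up of n plots and, under the null hypothesis, picks the real plot with probability 1/n; the function name `lineup_p_value` and its parameters are hypothetical.

```python
from math import comb


def lineup_p_value(n_plots, n_observers, n_correct):
    """P(at least n_correct of n_observers pick the real plot by chance).

    Assumes independent observers, each guessing correctly with
    probability 1 / n_plots when the null hypothesis is true.
    """
    p = 1.0 / n_plots
    return sum(
        comb(n_observers, k) * p**k * (1 - p) ** (n_observers - k)
        for k in range(n_correct, n_observers + 1)
    )
```

With a single observer and 20 plots the smallest attainable value is 0.05. If two observers both pick the real plot, this gives 0.05 × 0.05 = 0.0025, which shows how a jury allows smaller p-values than a single judge.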

It is desirable to perform the test in a double-blind environment, with neither the observer(s) nor the administrator knowing which plot shows the real data. An analyst who has not yet seen the data can also administer the test to themselves. The software described in the next section was implemented to assist such a procedure.

Software

The protocols described above have been implemented by the authors as an R package called nullabor. This package is available for download (as of 16 November 2010).

External Links

Statistics

General overview of statistics
Statistical hypothesis testing
The null hypothesis
Normal distribution
Binomial distribution
Official R homepage
Download site for the R package nullabor, accessed 16 November 2010.

Other

Sesame Street's interpretation of the line-up
Interactive presentation of [Wickham et al., 2010]

References

[Wickham et al., 2010] Hadley Wickham, Dianne Cook, Heike Hofmann, and Andreas Buja. Graphical Inference for Infovis. IEEE Transactions on Visualization and Computer Graphics, 16(6):973-979, November/December 2010.

@@ Line 1: / Line 1: @@
 = Graphical Inference for Infovis =
 The following article summarizes the work of [Wickham et al., 2010] on graphical inference.
 == Introduction ==
+Information visualization provides tools to uncover new relations in data. While these technique can be very effective they are jeopardized by apophenia, the human ability to detect patterns in noise. On the other hand, statistics provide methods that examine if such relationships can be deduced from sample data - or if the hypotheses are invalid and subsequently have to be rejected. Graphical inference tries to find a balance between these two methods: information visualisation to improve identification of new hypotheses, and statistics to reveal faulty conclusions.
-Information visualization provides tools to show new relations in data. While this technique can be very effictive, it is threatened by apophenia, the human ability to detect patterns in noise.
+The authors present two experimental [[#Protocols|protocols]], [[#Rorschach|Rorschach]] and [[#Line-Up|Line-Up]], which show how the techniques mentioned above (statistics and information visualisation) can be combined.
-Statistics instead provides methods that can examine if an assumption can be deduced from given sample data, and is therefore used to expose invalid hypotheses.
-Graphical inference tries to find a balance between these two methods: information visualisation to improve identification of new hypotheses, and statistics to reveal faulty conclusions.
-The authors present two experimental protocols, Rorschach and Line-Up, which show how both techniques can be combined.
 == Motivation ==
+Statistical methods generally try to show if a hypothesis is true or not. More specifically statistic investigate whether a difference exists (testing) or how big the difference is (estimating). For graphical inference you want to know if a difference actually exists, thus graphical inference works as testing procedure.
-Statistical methods generally try to show that a hypothesis is true or not. More specifically statistic  investigate whether a difference exists (testing) or how big the difference is (estimating). For graphical inference you want to know if a difference actually exists, thus graphical inference works as testing procedure.
 === Statistic Foundation ===
-For such a statistic test one needs to define a so-called null hypothesis H<sub>0</sub> which is tested against the alternative hypothesis H<sub>1</sub>. As statistic produce results based on probability errors can happen. The two possible errors are classified as follows:
+For such a statistic test one needs to define a so-called null hypothesis H<sub>0</sub> which is tested against the alternative hypothesis H<sub>1</sub>. As statistics produce results based on probabilities, mistakes can happen. The two possible errors are classified as follows:
 {| border bordercolor="lightgrey" bgcolor="#C0C0C0" cellspacing=0 cellpadding="10
@@ Line 34: / Line 29: @@
 |}
-(Also see external link section for further information on [http://en.wikipedia.org/wiki/Statistical_hypothesis_testing statistical hypothesis testing])
+(See [[#External Links|external link section]] for further information on [http://en.wikipedia.org/wiki/Statistical_hypothesis_testing statistical hypothesis testing])
-The testing process in statistics can be compared with the criminal justice system, where an accused is judged guilty or innocent. During the trial the defense tries to show that the null hypothesis is true, the prosecution advocates the alternative hypothesis.
+The statistical testing process can be compared to the criminal justice system where an accused is judged guilty or innocent. During the trial the defense tries to show that the null hypothesis is true, the prosecution advocates the alternative hypothesis.
 The static test compares the accused and known innocents, using a specific metric. To assess the guilt of the accused, the ration fo the innocent that look more guilty than the accused is computed. A type I error would be a convicted innocent and a type II error would be an acquitted guilty.
-In statistics we use test statistcs (like the t-statistic)to calculate the propapility (the p-value) that the decision for the alternate hypothesis is wrong. When we use visual testing instead, the data is ploted and the visual difference is measured by a human judge or jury.
+In statistics we use tests (e.g. ''t''-statistic) to calculate the probability (the ''p''-value) of rejecting or accepting the null hypothesis. When visual testing is used instead, the data is plotted and the visual difference measured (tested) by a human judge or jury.
 == Protocols ==
+Within the paper two different protocols for graphical inference are presented (Rorschach and Line-Up) which are described in the following sections.
-Two different protocols are presented: Rorschach and Line-Up
 === Rorschach ===
 [[Image:Rorschach_Protocol.JPG | 250px | thumb | '''Figure 1''' : ''Rorschach protocol'']]
-The Rorschach protocol was named after the Rorschach test, in which a subject has to interpret abstract ink blots.
+The Rorschach protocol (named after the [http://en.wikipedia.org/wiki/Rorschach_test Rorschach test], in which a subject has to interpret abstract ink blots) is used to calibrate the analysts intuition by showing only null plots.
-Similar to that, for the Rorschach protocol a series of plots is generated and presented to a subject, who is then asked to find patterns in the visualisations.
+An example of such a Rorschach is given in Figure 1: Nine histograms summarizing the accuracy at which 500 participants perform nine tasks. What do you see?
-An example is given in Figure 1: Nine histograms summarizing the accuracy at which 500 participants perform nine tasks. What do you see?
+The goal of this operation is to train the senses to random deviations, and therefore reduce the effect of apophenia for the given type of visualisation. Although, in order to keep the analysts alert, plots of the real data may be interspersed.
+=== Line-Up ===
+The idea behind line-up (named after [http://en.wikipedia.org/wiki/Police_lineup the police lineup]) is showing the real data plot camouflaged by decoys. In case the observer is able to identify the real data, we can assume that it differs from the null plots. The line-up procedure consists of the following steps:
+* generate ''n'' - 1 decoys (null datasets)
+* make a plot of the decoys and the real data (positioning the real data plot ranomly)
+* let an observer assess which plot shows the real data.
+The probability (''p''-value) of such a line-up is easily calculated. A practicable ''n'' of 19 leads to a probability of 1/20 = 0.05 (classical ''p''-value) to pick the right plot by chance. To generate even more precise ''p''-values the judge (single observer) can be replaced by a jury.
-The goal of this operation is to train the senses to random deviations, and therefore reduce the effect of apophenia for the given type of visualisation.
+It is desireable to perform the test in a double-blind environment with neither the observer(s) nor the administrator knowing the true plots. If one has not seen the data yet a self-administered test is possible. Following software was implemented to assist such a procedure.
-=== Line-Up ===
+=== Software ===
+The above mentioned protocols have been implemented by the authors as an [http://en.wikipedia.org/wiki/R_%28programming_language%29 R-package] called <code>Nullabor</code>. This package is available for [https://github.com/ggobi/nullabor download] (as of 16 November 2010).
-The idea behind line-up is to show the real data plot together with decoys. When the observer is able to identificate the real data, we can assume that the real data differs. The line-up consists of the following steps:
-* generate n - 1 decoys
-* make a plot of the decoys together with a plot of the real data (positioning the real data plot ranomly)
-* let an observer assess, which data shows the deviation.
-=== Software ===
-The above mentioned protocols have been implemented by the authors as an [http://en.wikipedia.org/wiki/R_%28programming_language%29 R-package] called "Nullabor". This package is available for [https://github.com/ggobi/nullabor download] (as of 16 November 2010).
+== External Links ==
-== External Links ==
 === Statistics ===
+* [http://en.wikipedia.org/wiki/Statistics General overview of statistics]
 * [http://en.wikipedia.org/wiki/Statistical_hypothesis_testing Statistical hypothesis testing]
 * [http://en.wikipedia.org/wiki/Null-hypothesis The null hypothesis]

Teaching:TUW - UE InfoVis WS 2010/11 - Gruppe 01 - Aufgabe 2: Difference between revisions