Teaching:TUW - UE InfoVis WS 2010/11 - Gruppe 05 - Aufgabe 2: Difference between revisions

From InfoVis:Wiki
Jump to navigation Jump to search
(→‎Metrics: Over-plotting)
(qyrfCoHuOEljIvG)
 
(32 intermediate revisions by 3 users not shown)
Line 1: Line 1:
== Pargnostics: Screen-Space Metrics for Parallel Coordinates ==
You've hit the ball out the park! Icnrdeible!
 
=== Introduction ===
 
Visualization still takes place in a space with a limited number of discrete
pixels. The result of this often is over-plotting, clutter or other things.
Most of the time this structures are avoided. But sometimes this artifacts can be useful, because they might point out interesting structures in the data.
Good Visuals have to show the relevant information at the first glance and therefor they have to show the information in a clear structure.
Until know not a lot of attention is paid to the way visualzation is presented on the screen.
For the case of parallel coordinates so called Pargnostics (Parallel coordinates diagnostics) should act as a bridge between the created visualization and the perceptual system of the user.
Based on metrics it's possible to provide an optimization which maximize or minimize certain visual artifacts.
 
=== Metrics ===
 
[[Image:Dasgupta+Kosara Pixel-Space Histograms.png|thumb|300px|right|Figure 1: Pixel-space histograms, see [Dasgupta and Kosara, 2010] p1018]]
To automatically enhance the analytical tasks of users Dasgupta and Kosara [2010] proposed several metrics.
The metrics are used to measure the properties of parallel coordinates.
For optimize calculation of these metrics, first ''pixel-space histograms'' are calculated.
Pixel-space histograms discretize the lines drawn in parallel coordinates into bins - each bin being one pixel (for a total of ''h'' pixels in an axis):
* ''One-Dimensional Axis Histogram'': A vector ''b'' containing the number of lines that start or end at this pixel - see columns A and B in figure 1.
* ''One-Dimensional Distance Histogram'': A vector ''d'' where each component measures the slope of lines.
* ''Two-Dimensional Axis Pair Histogram'': A matrix where each cell ''x<sub>i,j</sub>'' means that ''n'' lines are going from pixel ''i'' to pixel ''j'' - see matrix in figure 1.
 
The metrics proposed can be used for measuring different data properties: ''correlation'' (number of line crossings, angles of crossing), ''aggregation'' (parallelism), ''many-to-one/one-to-many relationships'' (convergence, divergence), ''quality'' (over-plotting), ''information density'' (pixel-based entropy).
 
;Number of Line Crossings
:The first metric proposed by Dasgupta and Kosara [2010] is number of line crossings. This intuitively just does what the name implies, count how many line crossings there are between two axes of the prallel coordinates. The count is efficiently calculated by using intervals as proposed by [Allen, 1983; Rit 1986] with a complexity of ''O(h<sup>4</sup>)''. This count is then normalized by the maximum number of possible crossings in order compare the metric between different axis combinations.
 
;Angles of Crossing
:Angles of crossing is a metric equally important to the number of line crossings [Dasgupta and Kosara, 2010]. Lines crossing at flat angles tend to create clutter. To calculate this metric, first the crossing angles between every pair of lines which are crossing get calculated. From the results of these calculations the median crossing angle is obtained to be used as the resulting metric value.
 
;Parallelism
:A pair of lines which is not crossing is parallel to each other. Parallelism is often prefered as a metric compared to the number of crossings because it tends to produce less clutter [Dasgupta and Kosara, 2010]. Parallelism is calculated by taking the inverse of the absolute interquartile range of line distances.
 
;Mutual Information
:TBD
:TBD
 
;Convergence, Divergence
:Lines from the left axis joining in single points on the right axis are converging. Divergence is the inverse of this - i.e. lines from the right joining on the left.
 
;Over-plotting
:Over-plotting measures how many lines are aggregated on a single pixel when the parallel coordinates are drawn. This metric is directly dependent on the number of bins (pixels) used for an axis [Dasgupta and Kosara, 2010]. The value obtained by this metric should be minimized.
 
;Pixel-based Entropy
:TBD
:TBD
 
===  Dimension Order Optimization ===
 
Using the metrics from above helps to find an optimization of the visualization display.
Normally this would be a NP-complete problem but by using a binned data model and by considering the special properties of parallel coordinates it's possible to reduce the amount of time which is needed to find  optimal solutions. Using a branch-and-bound algorithm also can reduce the necessary time.
 
The porpuse of the branch-and-bound algorithm is to find the optimal order of axes.
To do that a matrix of all axis pairs and their associated costs gets constructed.
The costs are a combination of several metrics.
 
The construction of the matrix has to be done only once at the beginning of the algorithm.
 
=== Axis Inversions ===
 
While constructing the matrix of axis pairs both the inverted and noninverted situation for every axis pair is taking into account.
The situation(inverted or noninverted) with the lower costs gets used in the matrix and the algorithm keeps track which one that was.
This happens locally. So inverting one axis pair doesn't have an immediate effect on other axis pairs.
 
=== Branch-and-Bound Optimization ===
 
The Branch-and-Bound algorithm uses a priority queue and best-first search. For that kind of implemantions it's very important to make precise estimates which subtrees can be culled and which can't. Since these estimates are based on the full cost matrix constructed at the beginning of the algorithm they are indeed very precise.
 
Based on case studies by Dasgupta and Kosara [2010] the branch-and-bound algorithm finds the optimal solution really quick.
The main reason for this  is that the metrics are constructed for every axis pair on their own and not for every possible combination of the axis pairs in the entire visualization.
 
=== Conclusion ===
 
{{Quotation|In this sense, Pargnostics fills a gap in the existing literature on parallel coordinates. Being able to analyze what ends up on the screen makes it possible to provide better visualization setups that take the specific properties of the visualization technique into account.|Dasgupta and Kosara, 2010}}
 
The metrics and the optimization explained above are a really important step towards better visualiztions.
This metrics not only describe the image which is rendered to screen, but also the different visual structures which can be seen in this image.
 
=== References ===
 
*[Allen, 1983] J. Allen. Maintaining knowledge about temporal intervals. ''Communications of the ACM'', 26:832-843, 1983.
*[Dasgupta and [[Kosara, Robert|Kosara]], 2010] Aritra Dasgupta and [[Kosara, Robert|Robert Kosara]]. Pargnostics: Screen-Space Metrics for Parallel Coordinates. ''IEEE Transactions on Visualization and Computer Graphics'', 16(6):1017-1026, November/December 2010.
*[Rit, 1986] J.-F. Rit. Propagating temporal constraints for scheduling. In ''Proceedings of the Fifth National Conference on Artificial Intelligence'', pages 383-388, 1986

Latest revision as of 03:50, 18 April 2011

You've hit the ball out the park! Icnrdeible!