Teaching:TUW - UE InfoVis WS 2005/06 - Gruppe G4 - Aufgabe 3 - Design: Difference between revisions

From InfoVis:Wiki
 
(45 intermediate revisions by 5 users not shown)
Line 1: Line 1:
== Topic ==
== Topic ==
'''MP3 Archive Visualization'''
'''MP3 Archive Visualization - "Interpret Analyser"'''
 
== Specification of the Application Area and the given Dataset ==
== Specification of the Application Area and the given Dataset ==
=== Application area Analysis ===
=== Application area Analysis ===


The application area in this case is the creation of a clearly arranged visualization for a big music archive consisting of several thousand files. This can be achieved by using the values of the ID3-tags as well as the attributes of the music files itself.
Since we have chosen the MP3 Archive Visualization, our job will be the creation of a clearly arranged visualization for a big music archive consisting of several thousand files.  


Since the existing data mainly consist of discrete values, our mission to produce an appropriate visualization will basically be achieved with the creation of quantitative data by using a reasonable combination of given values.  
This can be achieved by using the already existing values of the container format ID3, additional attributes of the iTunes library, attributes of the music files themselves, as well as some system values.


Most Audio formats nowadays use the container format ID3 to store additional Information about the file, apart from the given file properties.
Due the fact that these given sources already provide loads of different Information, we will try to create and present additional Information by combining some prior chosen values in a reasonable way.  


ID3 generally supports loads of different input values and to visualize them all would go beyond the scope of our task. Besides in most cases only the most common values like Album, Interpret, Year… are specified correctly.
Thus we have to keep in mind that ID3 for example theoretically indeed supports a huge amount of input values but in most cases only the most common values like Album, Interpret or Year are specified correctly.  
Therefore we will concentrate our work for our prototype on these entries.
 
Therefore we will only use some of these entries in our prototype.


=== Dataset Analysis ===
=== Dataset Analysis ===


The values we will use in our project consist of nominal, discrete and ordinal data types and are for themselves all one-dimensional.


{|
The table below shows a complete listing:
|- style="background:#FF6633"
!Attribute
!Data type
!Description
|-
|Title
|Discrete
|Name of the Song
|- style="background:#FFCC99"
|Interpret
|Discrete
|Name of the Artist/Band who plays the song
|-
|Album
|Discrete
|Name of the Album/Compilation the song belongs to
|- style="background:#FFCC99"
|Year
|Ordinal
|Year the song was recorded in
|-
|Genre
|Nominal
|Genre the song belongs to
|- style="background:#FFCC99"
|Size
|Quantitative 
|Size of the file
|-
|Duration
|Quantitative 
|Overall play length of the song
|- style="background:#FFCC99"
|Bit rate
|Ordinal
|Bit rate the file was encoded with
|-
|Sample rate
|Ordinal
|Sample rate the file was encoded with
|- style="background:#FFCC99"
|File Location
|Discrete
|Full system path to the file
|}


[[Image:tabelle_datentyp_g4.png]]


A complete data record consists of all attributes listed above.
The complete data set is multi-dimensional and consists of all attributes listed above.
 
All data in this set is one-dimensional.


== Analysis of the Target User Group ==
== Analysis of the Target User Group ==
Line 72: Line 29:
=== Who should use this kind of visualization technique? ===
=== Who should use this kind of visualization technique? ===


This visualization technique is mainly meant for the 'end-users', that is someone who collects lots of mp3s. Our visualization should help the user to get an overview of his collection and discover coherences between them. The user should also get some more information about his collection which he did not know before or which he could not find in any usual representation.
This visualization technique is mainly meant for the 'end-users', that is someone who collects lots of MP3s. With 'lots of MP3s' we mean quite a few GBs, just more than 30 GBs. Our visualization should help the user to get an overview of his collection and his listening-habits.  
This visualization technique could also be interesting for the band and the music industry, if they want to produce a new album. for example: a band (like "Radiohead") who changed their music-style over the years wants to know which style is preferred more. But therefore they have to compare the data-sets from many users.


=== What are the characteristics of the target group? ===
=== What are the characteristics of the target group? ===


People of this group are music enthusiasts. They have thousands of mp3s on their harddisk and love it to collect them. These people mainly receive their mp3s from the internet instead of buying CDs, because they like it to see their whole musiccollection at a glance and want to browse through it in many different ways.
People of this group are music enthusiasts. They have thousands of mp3s on their hard-disk and love it to collect them. Most of them have lost track of their collection, on the strength of the abundance of their collection. These people mainly receive their mp3s from the internet instead of buying CDs, because they like to see their whole music-collection at a glance.


=== Are there any known / often used Methods / Visualisation Techniques? ===
=== Are there any known / often used Methods / Visualisation Techniques? ===


No, until now only the textbased listing-technique is used.
No, we don't know any similar visualization technique. ITunes only shows textbased info about how often a song was heard.


== Intended Purpose of our Visualization ==
== Intended Purpose of our Visualization ==


=== What should be achieved with this visualization? ===
=== What should be achieved with this visualization? ===
A better information representation of the mp3s should be achieved by combining several ID3-tags. The representation of the data and the contained information should be effective, precise and self-explanatory, so that the user can get a good and expressive overview of his collection and the contained information.
A better information representation of the MP3s should be achieved. Our visualization should help the user to get an overview of his collection and his listening-habits. He will get information about the tracks, in reference to a special artist, which he often listens to and to those which he has never heard before. The representation of the data should be expressive, precise and self-explanatory.


=== Which tasks should be solved? ===
=== Which tasks should be solved? ===
The data readout of contentspecific characteristics should be possible.
By using this visualization technique, the user will get information about a chosen artist and his discography. For example: in his database the user has got the band "Radiohead", who produced albums over 15 years and in this period they changed their music-style from alternative rock to experimental electronic. The visualization will show him from which producing period he has got more MP3s and which period he likes more, by counting the number of listenings of each song. The result could be that he has got more MP3s from their early years, but likes the experimental electronic tracks more.
These characteristics could be e.g.:
* 'the longest song' of the mp3-collection or
* 'the most present genre' or
* 'the most often heard song' or
* 'the Top Ten'
* etc.


=== Questions that should be solved with this visualization technique ===
=== Questions that should be solved with this visualization technique ===
It should be possible to get the frequency distribution of the mp3s relating to their characteristics. But the precondition therefor is that the ID3-tags are available


== Proposal of Design ==
== Proposal of Design ==


=== Types of Visualization applied ===
=== Kind of Visualization / Visualization Details ===


When the user opens the "Interpret-Analyser" he will be prompted via a text-message in the main-window to click on an artist/band in the right upper window. The artists/bands are sorted alphabetically and the subject of interest can be found by scrolling the window vertically.  If the user chose an artist/band in the upper right window, the main window will visualize him following details on the demanded item:


As there are no comparable features implemented in existing mp3 archival storage systems we decided to implement a visualization kit exploring the quantitative distribution of an mp3-archive according to the different genres. As a default the main window shows several genres the authors considered to be the right ones for music main-genres in the root directory including their percentage-rate corresponding to the whole file-archive.


*The x-axis shows the songs listed vertically by-publication-year generated out of the ID3-Data
*The y-axis shows the number of songs published per year according to the specific number in the users' iTunes-library


By a single click on one of the main genres the user gets additional information in the upper right window on the one hand and of course a file-listing in the lower left window on the other hand. Double clicking a genre in the main window brings the user straight to a sub-level containing the corresponding sub-genres again shown in the main window. Through the path shown on top of the main window the system allows to easily jump back to wherever necessary out of every level.


The respective maximum on the y-axis will give a first overview on how many songs the specific library contains per artist/band per year. Though the users' library might not be complete the visualization allows drawing conclusions according to the artist/band-activities over the last years. In any cases we assume that the user applies the "Interpret-Analyser" to artists/bands whereof he collected the whole discography and not only one song.


Further on the user is able to read out file-specific information like Title, Interpret,
For each song (= one data point) one horizontal bar is drawn along the y-axis. That means for example if the library contains 34 songs by the band "Queen" with publication-year "1985", 34 bars are drawn at the x-axis value "1985" along the vertical y-axis.
Album, Year, Duration & Bit-Rate through a click on the particular file listed below, but is also allowed to save a play-list from the files selected via the file-explorer.


In addition to that each bar drawn vertically has a specific colour, representing the date when he was last played. As it is shown in our Mock-Up below the range goes from blue (representing songs that have not been played for a long time) to red (representing songs that have been played recently).


Following feature turns the visualization kit into a rather complex and outstanding tool. By default the tool sorts the input data through a given hierarchy according to the main genres and alternatively to the sub-genres following the ID3-Tags. Additionally the User has the ability to change this system and to create his own view according to his point of interest. That means that he is able to change the main genres and the sub-genres as well. As there are always different views especially on the terms of sub-genres the user is even allowed to change the structure of the given categories. That means that one is permitted to extract an ambiguous sub-genre to directly attach it to another genre. This will be able via the right middle window showing the sub-genres and their actual belongings as well. Probably the system will give the possibility to cancel unmeant genres at all.
Further on the user can interact and influence the characteristic of the visualization by using a slider positioned in the lower right in the graphic below. Via the slider a more objective image can be drawn according to the actual point of interest. This slider with a value-range from "0" to "10" represents the counts how often a song was played. It allows setting a threshold. The default value is "3" and means that songs that were played less than 3 times do not appear coloured, but as grey bars vertically above the coloured ones along the y-axis. If for example someone drags the slider to the position with value "10" and only 1 song out of 27 with a special publishing year was played more than 9 times the "Interpret-Analyser" shows 1 coloured and 26 grey bars at the according year. This could for instance help if someone is on the way to filter out his absolute favourites of an artist/band.
 
As it is mentioned above the "Interpret-Analyser" represents highly interesting visualizations for End-Users but it might also prevent outstanding features for the Music-Industry respectively bands, who work on a Come-Back. This however would assume to arising the data of a rather big audience, what could for example be achieved via a contest.


=== Visual Mapping ===
=== Visual Mapping ===




*Due to the enormous number of music-genres ID3-Tags provide we will probably choose a tree structure in addition to the text-based genre-navigator on the right in lower levels. This structure should be similar to a hyperbolic tree.
'''2D Diagram''':


Dimension "Genre-Multitude" --> visual attribute "Tree Branches"


*'''"X-AXIS":''' the x-axis shows the songs listed vertically by-publication-year generated out of the ID3-Data.


*To afford best visualization we are right now not sure whether we should display the percentage-rate in bars as shown in the fake-screenshot below, or should rather use a Bubble-Chart, where the size of the bubbles is determined by the values of occurrence per genre.


Dimension "File-Occurrence" --> visual attribute "Area"
*'''"Y-AXIS":''' the y-axis shows the number of songs published per year according to the specific number in the users' iTunes-library. The height of the vertical bar-column represents the "Song-Occurrence" per year.


=== Specification of used Techniques / applied Principles ===


*'''"Colour":''' each bar drawn vertically has a specific colour, representing the date when he was last played. As it is shown in our Mock-Up below the range goes from blue (representing songs that have not been played for a long time) to red (representing songs that have been played recently). Grey bars represent songs that did not pass the  adjusted threshold.


*Hyperbolic Trees:
=== Specification of used Techniques / applied Principles ===
(s. slide 18, 19, 20 of Info_Vis4.pdf handed out in the course “188.305 VO Informationsvisualisierung”)


Due to the enormous number of music-genres ID-3 Tags provide we probably choose a tree structure
similar to a hyperbolic tree to visualize the raw data extracted from the corresponding mp3-archiv.


*'''Bar X Plot:''' In this plot, one vertical bar is drawn for each data point [StatSoft, 2003]


*Bubble Chart:
(s. similar to bubble-maps, slide 52 of Info_Vis4.pdf handed out in the course 188.305 VO Informationsvisualisierung, http://peltiertech.com/Excel/ChartsHowTo/HowToBubble.html)


Bubble chart, where the size of the bubbles is determined by the values of occurrence.
*'''Histograms, 2D:''' 2D histograms present a graphical representation of the frequency distribution of the selected variable(s) in which the columns are drawn over the class intervals and the heights of the columns are proportional to the class frequencies. [StatSoft, 2003]




*Details on demand:
*'''Colour-Range, Linking & Brushing:''' A colour-Range representing the levels between "not played for a long time" and "recently played". (s. slide 100 of Info_Vis0.pdf handed out in the course 188.305 VO InfoVis)
(s. slide 55 of Info_Vis4.pdf handed out in the course 188.305 VO Informationsvisualisierung)


Used several times. (File-Explorer, Genre-Information, ...)


*'''Scatterplot, 2D:''' The scatterplot visualizes a relation (correlation) between two variables X and Y (e.g., weight and height). Individual data points are represented in two-dimensional space (see below), where axes represent the variables (X on the horizontal axis and Y on the vertical axis). The two coordinates (X and Y) that determine the location of each point correspond to its specific values on the two variables. [StatSoft, 2003]


*Linking & Brushing:
(s. slide 100 of Info_Vis0.pdf handed out in the course 188.305 VO Informationsvisualisierung)


Detail Windows containing percentage-distribution according to the specific genre
*'''Dynamic Queries:''' Adjusting the slider generates dynamic queries. [StatSoft, 2003]


=== Possibilities of User-Interaction ===
*Focus & Context: Tiled Multi-Level Browser
(slide 69 of Info_Vis0.pdf handed out in the course 188.305 VO Informationsvisualisierung)


Overview Window, Details on Demand Window


=== Possibilities of User-Interaction ===
*Select item of interest (artist/band)
**Get artist/band-details




* Assemble information about occurence-percentage per genre
*Adjust the slider to influence the threshold
* Assemble information according to ID3-Tags
**Get individual Visualizations according to the users' point of interest
* Group genres
* Build paths
* Build new tree
* Modify leaves
* Save play-list


=== Mockup / Fake Screenshot ===
=== Mockup / Fake Screenshot ===




[[Image:MP3_Viewer.gif]]
[[Image:Interpret-Analyser.png]]
 


== References ==
== References ==
Line 184: Line 123:
[Wikipedia, 2005b] MP3, Wikipedia, Last updated: 21 November, 2005, Retrieved at: November 22, 2005, http://en.wikipedia.org/wiki/Mp3
[Wikipedia, 2005b] MP3, Wikipedia, Last updated: 21 November, 2005, Retrieved at: November 22, 2005, http://en.wikipedia.org/wiki/Mp3


[Id3.org, 2004] MP3, Id3.org, Last updated: 28. Februar, 2004, Retrieved at: November 22, 2005, http://www.id3.org/frames.html
[Id3.org, 2004] ID3v2 frames, Id3.org, Last updated: 28. February, 2004, Retrieved at: November 22, 2005, http://www.id3.org/frames.html


[Montclaire, 2000] Data Types, Department of Science and Mathematics at Montclair State University, Last updated: 3. August, 2000, Retrieved at: November 22, 2005, http://www.csam.montclair.edu/~mcdougal/SCP/D_types.htm
[Montclaire, 2000] Data Types, Department of Science and Mathematics at Montclair State University, Last updated: 3. August, 2000, Retrieved at: November 22, 2005, http://www.csam.montclair.edu/~mcdougal/SCP/D_types.htm
[StatSoft, 2003] Graphical Analytic Techniques, Last updated: 2003, Retrieved at: November 24, 2005, http://www.statsoft.com/textbook/stgraph.html

Latest revision as of 22:58, 12 December 2005

Topic

MP3 Archive Visualization - "Interpret Analyser"

Specification of the Application Area and the given Dataset

Application area Analysis

Since we have chosen the MP3 Archive Visualization, our job is to create a clearly arranged visualization for a large music archive consisting of several thousand files.

This can be achieved by using the existing values of the ID3 container format, additional attributes of the iTunes library, attributes of the music files themselves, and some system values.

Because these sources already provide a large amount of information, we will try to create and present additional information by combining some previously chosen values in a reasonable way.

We have to keep in mind that ID3, for example, theoretically supports a huge number of input values, but in most cases only the most common ones, such as Album, Interpret (the artist) or Year, are specified correctly.

Therefore we will only use some of these entries in our prototype.

Dataset Analysis

The values we will use in our project are of nominal, discrete and ordinal data types, and each of them is one-dimensional on its own.

The table below shows a complete listing:

The complete data set is multi-dimensional and consists of all attributes listed above.
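
A single record of such a multi-dimensional data set could be sketched as follows. This is only an illustration: the class name `Track`, the field names and the choice of fields are assumptions based on the attributes mentioned in this design (ID3 values plus play count and last-played date from the iTunes library), not a specification of the prototype.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Track:
    """One data record; every field name here is an illustrative assumption."""
    title: str
    interpret: str                      # artist/band, taken from the ID3 tag
    album: str
    year: Optional[int]                 # publication year from ID3, None if the tag is missing
    play_count: int = 0                 # assumed to come from the iTunes library
    last_played: Optional[date] = None  # assumed iTunes attribute, None = never played


# Hypothetical example record
example = Track("Example Song", "Example Band", "Example Album", 1985,
                play_count=4, last_played=date(2005, 11, 20))
```

Each field on its own is one-dimensional; the record as a whole is one point in the multi-dimensional data set.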

Analysis of the Target User Group

Who should use this kind of visualization technique?

This visualization technique is mainly meant for end users, that is, people who collect lots of MP3s. By 'lots of MP3s' we mean a collection of more than 30 GB. Our visualization should help the user to get an overview of his collection and his listening habits. The technique could also be interesting for bands and the music industry when they plan to produce a new album. For example, a band (like "Radiohead") that changed its musical style over the years may want to know which style is preferred. To find out, however, they would have to compare the data sets of many users.

What are the characteristics of the target group?

People of this group are music enthusiasts. They have thousands of MP3s on their hard disk and love collecting them. Most of them have lost track of their collection because of its sheer size. These people mainly obtain their MP3s from the internet instead of buying CDs, because they like to see their whole music collection at a glance.

Are there any known / often used Methods / Visualisation Techniques?

No, we do not know of any similar visualization technique. iTunes only shows text-based information about how often a song was played.

Intended Purpose of our Visualization

What should be achieved with this visualization?

A better representation of the information about the MP3s should be achieved. Our visualization should help the user to get an overview of his collection and his listening habits. For a specific artist, he will learn which tracks he listens to often and which he has never played. The representation of the data should be expressive, precise and self-explanatory.

Which tasks should be solved?

By using this visualization technique, the user gets information about a chosen artist and that artist's discography. For example, the user's database contains the band "Radiohead", who released albums over a period of 15 years and during that time changed their musical style from alternative rock to experimental electronic. The visualization shows him from which period he owns more MP3s and, by counting how often each song was played, which period he likes more. The result could be that he owns more MP3s from their early years but likes the experimental electronic tracks more.
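
The comparison described in this task can be sketched as a small aggregation. The function name `summarize_by_year`, the input format (pairs of publication year and play count) and the sample numbers are assumptions made for illustration only.

```python
from collections import defaultdict


def summarize_by_year(tracks):
    """Return {year: (number_of_tracks, total_play_count)} for (year, play_count) pairs."""
    summary = defaultdict(lambda: [0, 0])
    for year, play_count in tracks:
        summary[year][0] += 1           # how many MP3s the user owns from this year
        summary[year][1] += play_count  # how often songs from this year were played
    return {year: tuple(values) for year, values in sorted(summary.items())}


# Hypothetical library: more early tracks, but the later ones are played more often
tracks = [(1993, 1), (1993, 0), (1993, 2), (2000, 9), (2000, 12)]
summary = summarize_by_year(tracks)
```

With this invented data the user owns more tracks from 1993 but has played the tracks from 2000 far more often, which is the kind of result the paragraph above describes.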

Questions that should be solved with this visualization technique

Proposal of Design

Kind of Visualization / Visualization Details

When the user opens the "Interpret-Analyser", a text message in the main window prompts him to click on an artist/band in the upper right window. The artists/bands are sorted alphabetically, and the one of interest can be found by scrolling the window vertically. Once the user has chosen an artist/band in the upper right window, the main window visualizes the following details for the selected item:


  • The x-axis shows the publication year, taken from the ID3 data; the songs of each year are stacked vertically above it
  • The y-axis shows the number of songs per publication year in the user's iTunes library


The maximum on the y-axis gives a first overview of how many songs per year the library contains for the artist/band. Although the user's library might not be complete, the visualization allows conclusions to be drawn about the artist's/band's activity over the years. In any case, we assume that the user applies the "Interpret-Analyser" to artists/bands whose whole discography he has collected, not just a single song.

For each song (= one data point) one horizontal bar is drawn, and the bars of one year are stacked along the y-axis. For example, if the library contains 34 songs by the band "Queen" with the publication year "1985", 34 bars are stacked at the x-axis value "1985" along the vertical y-axis.
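
The stacking of one bar per song can be sketched as a simple layout step. The function name `stack_bars` and the input format (pairs of title and publication year) are hypothetical; the sketch only assigns positions and does not draw anything.

```python
from collections import defaultdict


def stack_bars(songs):
    """Give each (title, year) pair a position (title, year, level).

    The level counts up from 0 within each year, so the bars of one
    year end up stacked on top of each other along the y-axis.
    """
    next_level = defaultdict(int)
    positions = []
    for title, year in songs:
        positions.append((title, year, next_level[year]))
        next_level[year] += 1
    return positions


# 34 hypothetical songs from 1985 produce levels 0..33 at x = 1985
queen_1985 = [(f"Song {i}", 1985) for i in range(34)]
levels = [level for _, _, level in stack_bars(queen_1985)]
```

The height of the stack for a year is then simply the number of songs with that publication year.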

In addition, each bar in the vertical stack has a specific colour representing the date when the song was last played. As shown in our mock-up below, the range goes from blue (songs that have not been played for a long time) to red (songs that have been played recently).
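
One possible way to realise such a colour range is a linear interpolation between blue and red. This is a sketch under assumptions: the function name `last_played_colour`, the cut-off `max_age_days` (after which a song counts as fully "blue") and the linear scale are not specified in the design and are chosen here for illustration. How never-played songs are coloured is left open.

```python
from datetime import date


def last_played_colour(last_played, today, max_age_days=365):
    """Map a last-played date to an (r, g, b) tuple between blue and red.

    max_age_days is an assumed cut-off: songs not played for at least
    that many days are drawn pure blue, songs played today pure red.
    """
    age = (today - last_played).days
    age = max(0, min(age, max_age_days))  # clamp to the supported range
    recency = 1.0 - age / max_age_days    # 1.0 = played today, 0.0 = long ago
    red = round(255 * recency)
    return (red, 0, 255 - red)


today = date(2005, 12, 12)
recent = last_played_colour(today, today)
old = last_played_colour(date(2004, 1, 1), today)
```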

Furthermore, the user can interact with and influence the visualization by using a slider positioned in the lower right of the graphic below. With the slider, the picture can be adjusted to the current point of interest. The slider has a value range from "0" to "10" and represents how often a song was played; it sets a threshold. The default value is "3", which means that songs played fewer than 3 times do not appear coloured but as grey bars stacked above the coloured ones along the y-axis. If, for example, someone drags the slider to the value "10" and only 1 song out of 27 from a particular publication year was played more than 9 times, the "Interpret-Analyser" shows 1 coloured and 26 grey bars at that year. This could help, for instance, when someone wants to filter out his absolute favourites of an artist/band.
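
The threshold rule can be sketched as follows. The function name `split_by_threshold` and the list-of-play-counts input are assumptions; the rule itself (a song is coloured if it was played at least as often as the slider value) follows from the two examples in the text.

```python
def split_by_threshold(play_counts, threshold=3):
    """Split the play counts of one year into (coloured, grey) lists.

    A song stays coloured if it was played at least `threshold` times;
    all other songs are drawn as grey bars above the coloured ones.
    """
    if not 0 <= threshold <= 10:
        raise ValueError("the slider only covers values from 0 to 10")
    coloured = [count for count in play_counts if count >= threshold]
    grey = [count for count in play_counts if count < threshold]
    return coloured, grey


# Hypothetical year with 27 songs, only one of them played more than 9 times
year_counts = [10] + [2] * 26
coloured, grey = split_by_threshold(year_counts, threshold=10)
```

For the example from the text this yields 1 coloured and 26 grey bars. With the slider at "0" every song stays coloured.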

As mentioned above, the "Interpret-Analyser" offers highly interesting visualizations for end users, but it might also present outstanding features for the music industry or for bands working on a comeback. This, however, would require collecting the data of a rather large audience, which could for example be achieved via a contest.

Visual Mapping

2D Diagram:


  • "X-AXIS": the x-axis shows the publication year, taken from the ID3 data; the songs of each year are stacked vertically above it.


  • "Y-AXIS": the y-axis shows the number of songs per publication year in the user's iTunes library. The height of the vertical bar column represents the "Song-Occurrence" per year.


  • "Colour": each bar in the vertical stack has a specific colour representing the date when the song was last played. As shown in our mock-up below, the range goes from blue (songs that have not been played for a long time) to red (songs that have been played recently). Grey bars represent songs that did not reach the adjusted threshold.

Specification of used Techniques / applied Principles

  • Bar X Plot: In this plot, one vertical bar is drawn for each data point [StatSoft, 2003]


  • Histograms, 2D: 2D histograms present a graphical representation of the frequency distribution of the selected variable(s) in which the columns are drawn over the class intervals and the heights of the columns are proportional to the class frequencies. [StatSoft, 2003]


  • Colour-Range, Linking & Brushing: A colour range representing the levels between "not played for a long time" and "recently played" (see slide 100 of Info_Vis0.pdf handed out in the course 188.305 VO InfoVis).


  • Scatterplot, 2D: The scatterplot visualizes a relation (correlation) between two variables X and Y (e.g., weight and height). Individual data points are represented in two-dimensional space (see below), where axes represent the variables (X on the horizontal axis and Y on the vertical axis). The two coordinates (X and Y) that determine the location of each point correspond to its specific values on the two variables. [StatSoft, 2003]


  • Dynamic Queries: Adjusting the slider generates dynamic queries. [StatSoft, 2003]

Possibilities of User-Interaction

  • Select item of interest (artist/band)
    • Get artist/band-details


  • Adjust the slider to influence the threshold
    • Get individual visualizations according to the user's point of interest

Mockup / Fake Screenshot

References

[Wikipedia, 2005a] ID3, Wikipedia, Last updated: 21 November, 2005, Retrieved at: November 22, 2005, http://en.wikipedia.org/wiki/ID3

[Wikipedia, 2005b] MP3, Wikipedia, Last updated: 21 November, 2005, Retrieved at: November 22, 2005, http://en.wikipedia.org/wiki/Mp3

[Id3.org, 2004] ID3v2 frames, Id3.org, Last updated: 28. February, 2004, Retrieved at: November 22, 2005, http://www.id3.org/frames.html

[Montclaire, 2000] Data Types, Department of Science and Mathematics at Montclair State University, Last updated: 3. August, 2000, Retrieved at: November 22, 2005, http://www.csam.montclair.edu/~mcdougal/SCP/D_types.htm

[StatSoft, 2003] Graphical Analytic Techniques, Last updated: 2003, Retrieved at: November 24, 2005, http://www.statsoft.com/textbook/stgraph.html