Data Libraries: Difference between revisions
Revision as of 08:13, 2 September 2010
- UCI Machine Learning Repository
- UCI KDD Archive UCI Knowledge Discovery in Databases Archive
- UW XML Repository University of Washington XML Data Repository
- StatLib Data, Software and News from the Statistics Community
- Time Series Analysis and Its Applications Book Example Data
- Time Series Data Library
- TimeWeb Web-based time series databanks
- PKDD 2000 Challenge Challenges of European Conferences on Principles and Practice of Knowledge Discovery in Databases
- PhysioNet the research resource for complex physiologic signals
- MatrixMarket A visual repository of test data for use in comparative studies of algorithms for numerical linear algebra, featuring nearly 500 sparse matrices from a variety of applications, as well as matrix generation tools and services.
- LifeLine project Visualising Migrations, Transitions and Trajectories
- 10x10 Every hour, 10x10 gathers the 100 most important words and pictures in the world, based on what's happening in the news.
- InfoVis Cyberinfrastructure Data Bases
- Enron Email Dataset
- Historical Stock Data Historical Data for S&P 500 Stocks
- Ensembl Provides sequence databases of gene, transcript and protein predictions.
- EPA air quality AirData Web site gives you access to air pollution data for the entire United States.
- Network intrusion dataset
- Internet Backbone Data Internet Mapping Project
- UCR Time Series Data Mining Archive A resource for researchers interested in the clustering, classification, indexing, segmentation, change point detection and rule extraction of time series. (by Eamonn Keogh)
- Deutsche Bundesbank Zeitreihen
- Source data sources of the Worldmapper project (Worldmapper project)
- Worldbank Development Data & Statistics
- Business Intelligence Network 2006 Data Visualization Competition (Excel spreadsheet)
- Timeseries by Eamonn Keogh et al.
- U.S. Department of Labor Bureau of Labor Statistics
- National Atlas
- UK Air Quality Archive Air Quality data in the UK from the present back to 1960
- Mitsubishi Electric Research Labs (MERL) Motion sensor data from a network of over 200 sensors for a year (WMD 2007).
- European Road Safety Observatory Road Safety Data
- CARE Community database on road accidents resulting in death or injury (EU)
- IRTAD international database that gathers data on traffic and road accidents from 28 out of the 30 OECD Member countries
- Trends Online A Compendium of Data on Global Change
- Finder! Finder is a browser-based application for finding, organizing and sharing GeoData in common formats.
- ICWSM 2009 Data Challenge data set containing 44 million blog posts; suitable for link analysis, social network extraction, analysis of influence among bloggers, ...
- CKAN Comprehensive Knowledge Archive Network
- Numbrary Numbrary is a free online service dedicated to finding, using and sharing numbers on the web.
- Infochimps Free Redistributable Rich Data Sets
- CISL Research Data Archive (RDA) Large and diverse collection of meteorological and oceanographic observations, operational and reanalysis model outputs, and remote sensing datasets
- CISL Research Data Archive (RDA) Dataset selection according to variables, time resolution, etc.
- pachube a service that enables you to connect, tag and share real time sensor data from objects, devices, buildings and environments around the world.
- The World Bank Open Data
- Google Public Data Explorer
InfoVis Contest Datasets
- InfoVis 2003 Contest Tree data
- InfoVis 2004 Contest Meta Data of publications
- InfoVis 2005 Contest Technology Trends in the United States
- InfoVis 2006 Contest (http://sun.cs.lsus.edu/iv06/) 1% public use microdata sample from the 2000 Census
- InfoVis 2007 Contest (http://eagereyes.org/InfoVisContest2007Data.html) Movie Database
- InfoVis 2008 Contest (http://www.merl.com/wmd/infovis.html) MERL motion sensor data
VAST Challenge Datasets
- VAST 2008 Challenge (http://www.cs.umd.edu/hcil/VASTchallenge08/)
  - Phone records
  - Geo-temporal records
  - Wikipedia edit data and history
  - Location tracking
- VAST 2009 Challenge (http://hcil.cs.umd.edu/localphp/hcil/vast/index.php)
  - Badge and computer network traffic
  - Social network (with a very small geospatial component)
  - Video
- VAST 2010 Challenge (http://www.cs.umd.edu/hcil/VASTchallenge2010/) Linkage between the illegal arms dealing and the pandemic outbreak
  - Text Records - Investigations into Arms Dealing
  - Hospitalization Records - Characterization of Pandemic Spread
  - Genetic Sequences - Tracing the Mutations of a Disease