Data Libraries: Difference between revisions

From InfoVis:Wiki
Arind (talk | contribs)
(26 intermediate revisions by 2 users not shown)
Line 2: Line 2:
*[http://www.wolframalpha.com Wolfram|Alpha]
*[http://lib.berkeley.edu/wikis/datalab/Main/GoogleSearch Social Science Data Search] Google Custom Search Engine @ Berkeley Library (targets 800+ academic, government agency, non-profit, and other web sites that provide high quality, downloadable statistical information and data sets. Emphasis is on data pertaining to the social sciences, health, developing countries, energy, natural resources, and the environment)
*[https://developers.google.com/search/docs/data-types/dataset Google Dataset Search]


== Misc ==


*[http://hcil.cs.umd.edu/localphp/hcil/vast/archive/ Visual Analytics Benchmark Repository]
Line 12: Line 13:
*[http://lib.stat.cmu.edu/general/tsa/tsa.html Time Series Analysis and Its Applications] Book Example Data
*[http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/ Time Series Data Library]
*[http://www.bized.ac.uk/timeweb/ TimeWeb] Web-based time series databanks
*[http://www.bized.co.uk/timeweb/ TimeWeb] Web-based time series databanks
*[http://lisp.vse.cz/pkdd99/Challenge/ PKDD 2000 Challenge] Challenges of European Conferences on Principles and Practice of Knowledge Discovery in Databases
*[http://www.physionet.org/ PhysioNet] the research resource for complex physiologic signals
Line 44: Line 45:
*[http://numbrary.com/ Numbrary] Numbrary is a free online service dedicated to finding, using and sharing numbers on the web.
*[http://infochimps.org/ Infochimps] Free Redistributable Rich Data Sets
*[http://www.redliondata.com/ Red Lion Data] Large catalog of retailer location datasets in CSV
*[http://dss.ucar.edu/catalogs/free.html CISL Research Data Archive] CISL Research Data Archive (RDA) - large and diverse collection of meteorological and oceanographic observations, operational and reanalysis model outputs, and remote sensing datasets
*[http://search.dss.ucar.edu/cgi-bin/rdabrowse?nb=true&c=list&cv=All+RDA+Datasets CISL Research Data Archive] CISL Research Data Archive (RDA) dataset selection according to variables, time resolution, etc.
Line 61: Line 63:
*[http://aws.amazon.com/publicdatasets Amazon Public Data Sets]
*[http://www.githubarchive.org/ GitHub Archive] GitHub's public timeline is a huge time-oriented data source (e.g. commits to hosted open source projects)
*[http://www.correlatesofwar.org/ Correlates of War] All wars in a CSV file.
*[http://www.ihapss.jhsph.edu/data/NMMAPS/documentation/frame.htm NMMAPS] health, climate, pollution time series
*[http://www.quandl.com/ Quandl] financial, economic and social datasets
*[http://openfoodfacts.org/ Open Food Facts] Open Food Facts is a free, open and collaborative database of food products from the entire world.
*[https://www.wikidata.org/ Wikidata] Wikidata is a free linked database that can be read and edited by both humans and machines.
*[https://aws.amazon.com/public-data-sets/ AWS Public Data Sets] AWS hosts a variety of public data sets that anyone can access for free.
*[http://www.makeovermonday.co.uk/data/ Makeover Monday] Datasets available from the Makeover Monday initiative
*[https://datahub.io Datahub] the free, powerful data management platform from Open Knowledge International, based on the CKAN data management system.
== Medicine ==
*[http://physionet.org/challenge/ PhysioNet/Computing in Cardiology Challenges]
*[https://wiki.openmrs.org/display/RES/Demo+Data OpenMRS Demo Data]


== Geography ==
Line 68: Line 81:
*[http://www.openstreetmap.org/ OpenStreetMap]
*[http://www.geocommons.com/ GeoCommons] Both data and a mapmaker
*[http://www.gadm.org/country Global Administrative Areas] administrative borders for many countries in the world in different formats
*[http://www.datamaps.eu/2013/07/17/oesterreichs-verwaltungsgrenzen-im-geojson-format/ Austria's borders in GeoJSON]


== World ==
Line 76: Line 93:
*[http://stats.oecd.org/ OECD Statistics] Major source for economic indicators.
*[http://data.worldbank.org World Bank] Data for hundreds of indicators and developer-friendly.
*[http://aiddata.org AidData] Open Data for International Development
*[https://github.com/factbook/factbook.json factbook.json] World Factbook Country Profiles in JSON - Free Open Public Domain Data


== Open Government Data / Government and Politics==
=== European Union ===
*[http://www.europeandataportal.eu/ European Data Portal]
=== Austria ===
*[http://gov.opendata.at Open Government Data Austria]
Line 84: Line 107:
*[http://data.gv.at Offene Daten Österreich]
*[http://data.umweltbundesamt.at/ data.umweltbundesamt.at] Environment Agency Austria
*[http://www.datamaps.eu/2013/07/17/oesterreichs-verwaltungsgrenzen-im-geojson-format/ Austria's borders in GeoJSON]
*[http://www.aussda.at/ AUSSDA] Austrian Social Science Data Archive


=== UK ===
Line 103: Line 128:
* [http://eagereyes.org/InfoVisContest2007Data.html InfoVis 2007 Contest] Movie Database
* [http://www.merl.com/wmd/infovis.html InfoVis 2008 Contest] MERL motion sensor data
== BioVis Contest Datasets ==
* [http://www.biovis.net/year/2013/info/contest-data BioVis 2013 Contest] Protein Mutations and their effect on Protein Function
== Network Data ==
* [https://snap.stanford.edu/data/ SNAP] Stanford Large Network Dataset Collection by Jure Leskovec


== Other Lists ==
*[http://graphics.stanford.edu/~klingner/online_databases.html Jeff Klingner's List of Online Databases]
*[http://www.visualisingdata.com/index.php/2013/07/a-big-collection-of-sites-and-services-for-accessing-data/ Essential Resources: A big collection of sites and services for accessing data] (by Andy Kirk)


== Tools for Creating Synthetic Datasets ==
*[http://www.gris.tu-darmstadt.de/research/vissearch/projects/pcdc-synthetic-data-generation/index.html PCDC - On the Highway to Data] A Tool for the Fast Generation of Large Synthetic Data Sets (by TU Darmstadt)


== Map Data ==
*[http://www.gadm.org/ Global Administrative Areas] GADM is a spatial database of the location of the world's administrative areas (or administrative boundaries) for use in GIS and similar software.
*[http://vdstech.com/map-data.aspx VDS Technologies GIS & Mapping Components]
== Armed Conflicts Data ==
* Armed Conflict Location and Event Data (ACLED): C. Raleigh, A. Linke, H. Hegre, and J. Karlsen. “Introducing ACLED: An Armed Conflict Location and Event Dataset: Special Data Feature.” In: Journal of Peace Research 47.5 (2010), pp. 651–660.
* Uppsala Conflict Data Project – Georeferenced Event Dataset (GED): R. Sundberg and E. Melander. “Introducing the UCDP Georeferenced Event Dataset.” In: Journal of Peace Research 50.4 (2013), pp. 523–532.
* [http://www.start.umd.edu/gtd Global Terrorism Database (GTD)]
* Social Conflict Analysis Database (SCAD): I. Salehyan, C. S. Hendrix, J. Hamner, C. Case, C. Linebarger, E. Stull, and J. Williams. “Social Conflict in Africa: A New Database.” In: International Interactions 38.4 (2012), pp. 503–511.
* [https://www.meltt.net/ VEHICLE] Web application to analyze the results of hierarchically integrated conflict event data (Benedikt Mayer, 2024)


[[Category:Web resources]]

Latest revision as of 08:05, 13 August 2024

Search Engines

  • Wolfram|Alpha
  • Social Science Data Search Google Custom Search Engine @ Berkeley Library (targets 800+ academic, government agency, non-profit, and other web sites that provide high quality, downloadable statistical information and data sets. Emphasis is on data pertaining to the social sciences, health, developing countries, energy, natural resources, and the environment)
  • Google Dataset Search

Misc

Medicine

Geography

(based on a list in Nathan Yau's book "Visualize This")

  • TIGER From the Census Bureau, probably the most extensive detailed data about roads, railroads, rivers, and ZIP codes you can find.
  • OpenStreetMap
  • GeoCommons Both data and a mapmaker


World

(based on a list in Nathan Yau's book "Visualize This")


  • AidData Open Data for International Development
  • factbook.json World Factbook Country Profiles in JSON - Free Open Public Domain Data

Open Government Data / Government and Politics

European Union

Austria

UK

  • Data.gov.uk Catalog for data supplied by government organizations.

USA

  • Census Bureau extensive demographics.
  • Data.gov Catalog for data supplied by government organizations.
  • DataSF Data specific to San Francisco.
  • NYC Data specific to New York.
  • Follow the Money Big set of tools and datasets to investigate money in state politics.
  • OpenSecrets provides details on government spending and lobbying.

InfoVis Contest Datasets

BioVis Contest Datasets

Network Data

  • SNAP Stanford Large Network Dataset Collection by Jure Leskovec

Other Lists

Tools for Creating Synthetic Datasets

Map Data

Armed Conflicts Data

  • Armed Conflict Location and Event Data (ACLED): C. Raleigh, A. Linke, H. Hegre, and J. Karlsen. “Introducing ACLED: An Armed Conflict Location and Event Dataset: Special Data Feature.” In: Journal of Peace Research 47.5 (2010), pp. 651–660.
  • Uppsala Conflict Data Project – Georeferenced Event Dataset (GED): R. Sundberg and E. Melander. “Introducing the UCDP Georeferenced Event Dataset.” In: Journal of Peace Research 50.4 (2013), pp. 523–532.
  • Global Terrorism Database (GTD)
  • Social Conflict Analysis Database (SCAD): I. Salehyan, C. S. Hendrix, J. Hamner, C. Case, C. Linebarger, E. Stull, and J. Williams. “Social Conflict in Africa: A New Database.” In: International Interactions 38.4 (2012), pp. 503–511.
  • VEHICLE Web application to analyze the results of hierarchically integrated conflict event data (Benedikt Mayer, 2024)