
Save Consumers Time and Money: Thou Shall Not Forget Digital Native Big Data Consumers



To accommodate all types of data consumers, the Census Bureau (CB) distributed its 9,060-variable Census 2010 Summary File 1 (SF1) tables as 49 segment files, each holding no more than 256 variables so as to stay within the column limit of older spreadsheet software. Because the data are provided only in segments and not in full, digital native big data consumers in the U.S. and around the world, who have the technical and logistical capacity to process big data, must handle the summary files in the same manner as every other data consumer. Cumulatively, this amounts to thousands of person-hours spent on the multi-step process of preparing and merging the segments to extract the information needed. These costs could have been avoided had the CB, or the repositories distributing the SF1, also made the full datasets available as single files with big data consumers in mind. This is precisely the repository niche the CISER Data Archive found and filled: it makes full SF1 datasets available to big data consumers as a free, one-click download.
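
To make the merging step mentioned above concrete, here is a minimal sketch of joining segment files into one wide table. It is an illustration under stated assumptions, not the CB's or the CISER Data Archive's actual procedure: it assumes each segment is a comma-delimited file that has already been given a header row and that all segments share a record key, here called `LOGRECNO`. The function name `merge_segments`, the variable names `P0010001` and `H0010001` as used here, and the sample values are made up for the example.

```python
import csv
import io

KEY = "LOGRECNO"  # assumed record key shared by every segment


def merge_segments(segment_files):
    """Join segment tables on KEY and return one wide row per record."""
    merged = {}  # key value -> combined row (dict of column -> value)
    for handle in segment_files:
        for row in csv.DictReader(handle):
            # Add this segment's columns to the row for the same record.
            merged.setdefault(row[KEY], {}).update(row)
    return [merged[key] for key in sorted(merged)]


# Two tiny, made-up segments standing in for the 49 real ones.
seg1 = io.StringIO("LOGRECNO,P0010001\n0000001,100\n0000002,250\n")
seg2 = io.StringIO("LOGRECNO,H0010001\n0000001,40\n0000002,90\n")

full = merge_segments([seg1, seg2])
for record in full:
    print(record)
```

Every consumer who needs variables from more than one segment has to repeat some version of this join, which is the effort a single full-file download removes.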

Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Posters, Demos and Developer "How-To's"
