Big Data Processing in the Cloud: a Hydra/Sufia Experience

http://www.doria.fi/handle/10024/97632

http://www.urn.fi/URN:NBN:fi-fe2014070432268

OR2014_BigData.pdf (Kansalliskirjasto - Doria)

Presentation

Brittle, Collin ; Xie, Zhiwu

2014

This presentation addresses the challenge of processing big data in a cloud-based data repository. Using the Hydra Project’s Hydra and Sufia Ruby gems and working with the Hydra community, we created a special repository for the project and set up background jobs. Our approach is to create the metadata with these jobs, which are distributed across multiple computing cores. This allows us to scale out our infrastructure on an as-needed basis and decouples automatic metadata creation from the response times seen by the user. The trade-off is that the metadata is not available immediately after ingestion, although the object itself is. By distributing the jobs, we can compute complex properties without impacting the repository server. Hydra and Sufia gave us a head start by providing a simple self-deposit repository, complete with background job support via Redis and Resque.
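The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the actual system uses Ruby, Redis, and Resque, whereas this example is an in-process Python stand-in built on a thread pool. The names `ToyRepository`, `characterize`, and `wait_for_jobs` are hypothetical, and the size and checksum properties are assumed placeholders for whatever metadata the real jobs compute. In a real deployment the workers would run as separate processes or machines so that the work spreads across cores.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def characterize(data: bytes) -> dict:
    """Hypothetical metadata job: derive simple properties from the raw bytes."""
    return {"size_bytes": len(data), "sha256": hashlib.sha256(data).hexdigest()}


class ToyRepository:
    """In-memory stand-in for a repository with a background job queue."""

    def __init__(self, workers: int = 4) -> None:
        self.objects = {}    # object_id -> raw bytes, filled at ingest time
        self.metadata = {}   # object_id -> dict, filled later by a worker
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._jobs = []

    def ingest(self, object_id: str, data: bytes) -> None:
        """Store the object right away and queue its metadata job."""
        self.objects[object_id] = data
        self._jobs.append(self._pool.submit(self._run_job, object_id))

    def _run_job(self, object_id: str) -> None:
        # Runs on a worker thread, off the ingest path.
        self.metadata[object_id] = characterize(self.objects[object_id])

    def wait_for_jobs(self) -> None:
        """Block until every queued job has finished (re-raises job errors)."""
        for job in self._jobs:
            job.result()
        self._jobs.clear()


if __name__ == "__main__":
    repo = ToyRepository(workers=2)
    repo.ingest("obj-1", b"hello")
    print("obj-1" in repo.objects)   # the object is stored before ingest returns
    repo.wait_for_jobs()
    print(repo.metadata["obj-1"]["size_bytes"])
```

The point of the design is that `ingest` returns as soon as the object is stored, so user-facing response time does not depend on how expensive the metadata computation is; the cost is that a reader may briefly see an object with no metadata.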

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

General Track Papers and Panels

The session was recorded and is available for watching at https://connect.funet.fi/p3rrge6sfiw/ (this presentation starts at 0:00:55).

Language

English

Subjects

big data

cloud

hydra

sufia

resque

Part of

Parallel session 3A

Open Repositories 2014
