| Speakers | Isabel Drost |
|---|---|
| Day | Sunday |
| Room | Janson |
| Start time | 14:00 |
| End time | 14:45 |
| Duration | 00:45 |
| Event type | Podium |
| Track | Scalability |
| Language | English |
| Media | Video (DIVX) |
Storage has become ever cheaper in recent years: one terabyte of hard disk space currently costs less than 100 Euros. As a result, a growing number of businesses have started collecting and digitizing data. Customer transaction logs, news articles published over decades, and crawls of parts of the world wide web are only a few of the use cases that produce large amounts of data. But with petabytes of data at your fingertips, the question arises of how to make ad-hoc as well as continuous processing efficient.
The goal of Apache Hadoop is to make large-scale data analysis easy. Hadoop implements a distributed filesystem based on the ideas behind GFS, the Google File System. With Map/Reduce it provides an easy way to implement parallel algorithms.
After motivating the need for a distributed processing library, the talk gives an introduction to Hadoop, detailing its strengths and weaknesses, and shows how to quickly get your own Map/Reduce jobs up and running. The talk closes with an overview of the Hadoop ecosystem.
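To give a flavour of what such a job looks like, below is a minimal sketch of the classic word-count example in Java, written against the `org.apache.hadoop.mapreduce` API of more recent Hadoop releases (the exact classes have shifted between versions, and this example is illustrative rather than taken from the talk): the mapper emits a (word, 1) pair for every token, the framework groups the pairs by word, and the reducer sums the counts. The input and output HDFS paths are hypothetical command-line arguments.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reduce phase: sum the counts collected for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input dir
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output dir
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Packaged into a jar, a job like this would typically be launched with something along the lines of `hadoop jar wordcount.jar WordCount /input /output`.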