FOSDEM is the biggest free and non-commercial event organized by and for the community. Its goal is to provide Free and Open Source developers a place to meet. No registration necessary.

Speaker: Michaël Figuière
Day: Saturday
Room: AW1.124
Capacity: 59
Start time: 17:30
End time: 18:00
Duration: 00:30
Track: Data Analytics devroom

A real-time search engine with Lucene and S4

Introduction to the S4 data stream processing framework and its application to real-time indexing for the Apache Lucene search engine.

Search engines have been around for a while, but only recently has attention turned to searching real-time content. To make this possible, the whole indexing pipeline has to be real-time: both the data processing and the insertion into the index itself. Lucene has been extended to support the latter, but the former still has to be handled.

S4 is an emerging technology from Yahoo that simplifies real-time distributed data processing. The goal of this presentation is to show how S4 can be used to run expensive pre-processing on a stream of incoming data right before it is indexed, thus providing a powerful real-time search capability.
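
As a rough illustration of the idea (not the speaker's code, and not the S4 or Lucene API), the sketch below routes each incoming event by key to a per-key processing element, which pre-processes the text and inserts it into a toy in-memory index so that it is searchable immediately. All names here (InvertedIndex, EnrichPE, Dispatcher, the "author" key, the stopword list) are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Assumption: a tiny stopword list standing in for real text analysis.
STOPWORDS = {"a", "the", "is", "of", "and"}


class InvertedIndex:
    """Toy in-memory index: a document is searchable as soon as add() returns."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of document ids

    def add(self, doc_id, terms):
        for term in terms:
            self._postings[term].add(doc_id)

    def search(self, term):
        return sorted(self._postings.get(term.lower(), set()))


class EnrichPE:
    """Hypothetical processing element: one instance exists per key value."""

    def __init__(self, key, index):
        self.key = key
        self.index = index
        self.seen = 0  # per-key state kept between events

    def process(self, event):
        self.seen += 1
        # Stand-in for the expensive pre-processing step.
        terms = [t for t in event["text"].lower().split() if t not in STOPWORDS]
        terms.append(f"author:{self.key}")
        # Hand the enriched document straight to the index.
        self.index.add(event["id"], terms)


class Dispatcher:
    """Routes each event to the processing element that owns its key."""

    def __init__(self, key_field, pe_factory):
        self.key_field = key_field
        self.pe_factory = pe_factory
        self.instances = {}

    def dispatch(self, event):
        key = event[self.key_field]
        pe = self.instances.get(key)
        if pe is None:
            # Create the per-key instance lazily, on the first event for that key.
            pe = self.instances[key] = self.pe_factory(key)
        pe.process(event)


index = InvertedIndex()
dispatcher = Dispatcher("author", lambda key: EnrichPE(key, index))

stream = [
    {"id": 1, "author": "alice", "text": "The future of real-time search"},
    {"id": 2, "author": "bob", "text": "Lucene and S4"},
    {"id": 3, "author": "alice", "text": "Search is a stream"},
]
for event in stream:
    dispatcher.dispatch(event)

print(index.search("search"))
print(index.search("author:bob"))
```

Keying the processing elements means per-key state (here, a simple event counter) stays local to one instance, which is what would allow such a pipeline to be spread across machines. A real deployment would replace the toy index with Lucene and the dispatcher with the stream framework itself.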

Level of experience of audience: Average knowledge of Lucene

Additional note:

The talk will briefly introduce the S4 philosophy. The rest of the presentation will be driven by a real-world example: the development of a real-time search engine.

Events that start after this one (within 30 minutes):

When | Event | Track | Where
18:00-18:15 | How Seeks let you do your Web search at home | Data Analytics | AW1.124
18:00-18:15 | OpenSC in 2015 | Security & hardware crypto | AW1.105
18:00-18:15 | Aalto-1: A nanosatellite using open source | Lightning Talks | Ferrer
18:00-18:25 | Multi-Master Replication Approaches | MySQL & friends | H.2213
18:00-18:30 | Lessons Open Sourcing Java Taught Me | Free Java | AW1.125
18:00-18:45 | "KDE Abstracted My Abstraction Layer" - Multimedia Style | Crossdesktop | H.1309
18:00-19:00 | Mancoosi tools for the analysis and quality assurance of FOSS distributions | CrossDistro | H.1308
18:00-19:00 | Introduction to FreeBSD | BSD | AW1.126
18:00-19:00 | Gentoo Q&A session | CrossDistro | H.1302
18:00-19:00 | Booting and upgrading a flashless system | Embedded | Lameere
18:15-18:30 | Graph databases, the Web of Data storage engines | Data Analytics | AW1.124
18:15-18:30 | GNUstep on OpenBSD, a short overview | World of GNUstep | AW1.117
18:15-19:00 | Open Panel Discussion | Security & hardware crypto | AW1.105
18:20-18:35 | chicken: Cheney-on-the-MTA | Lightning Talks | Ferrer
18:20-18:50 | GNU recutils - your data in plain text | GNU | H.2214
18:30-19:00 | Comparing Scalable NOSQL Databases: Functionality and Measurements | Data Analytics | AW1.124
18:30-18:55 | A practical overview of Maatkit | MySQL & friends | H.2213
18:30-19:00 | The Rise and Fall and Rise of Java | Free Java | AW1.125
18:30-19:00 | The latest on Gorm and GNUstep theming | World of GNUstep | AW1.117
18:30-19:00 | CloudB: a distributed hybrid storage system for the Mono framework | Mono | AW1.120