BEGIN:VCALENDAR VERSION:2.0 PRODID:-//Pentabarf//Schedule 0.3//EN CALSCALE:GREGORIAN METHOD:PUBLISH X-WR-CALDESC;VALUE=TEXT:Open source search devroom X-WR-CALNAME;VALUE=TEXT:Open source search devroom X-WR-TIMEZONE;VALUE=TEXT:Europe/Brussels BEGIN:VEVENT METHOD:PUBLISH UID:3494@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T103000 DTEND:20150131T103500 SUMMARY:Welcoming Remarks DESCRIPTION: CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/search_welcome/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Leslie Hawthorn":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:3391@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T103500 DTEND:20150131T112000 SUMMARY:Sphinx Search technical highlights DESCRIPTION:
This talk presents key technical highlights of Sphinx as a full-text search engine, from an overview of its internal architecture to scalability and high-availability strategies.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/search_sphinx/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Vlad Fedorkov":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:3474@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T112500 DTEND:20150131T121000 SUMMARY:Effective spelling correction with term relation graphs using Lucene FSTs DESCRIPTION:Our approach for doing spelling correction has deviated considerably from the default approach Lucene is offering. While Lucene's FuzzyQuery uses a compiled Automaton to filter the dictionary rather efficiently, we directly use normal Automatons and intersect that automaton with a separate FST. This proves to be more efficient for our use case, since it saves memory and time to compile the automaton and also part of the time to identify the matching terms in the dictionary.
This approach gives us the possibility to store any meta-information with each term, which is then used to pick the top N spelling corrections. We build a term co-occurrence graph, where each vertex is a possible spelling correction of a term and each link is the co-occurrence in the same document. Each vertex and link gets a score based on the meta-information and edit distance. Then we use graph reduction techniques until the graph contains the desired number of spelling corrections.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/effective_spelling_correction_with_term_relation_graphs_using_lucene_fsts/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Anna Ohanyan":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:3388@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T121500 DTEND:20150131T130000 SUMMARY:Apache Solr as a compressed, scalable and high performance time series database DESCRIPTION:How can you store 86 billion measurement points in 30 gigabytes and search through them within a few milliseconds on your laptop? A relational database? No! Apache Solr empowers you to do this if you follow a few concepts. Solr's horizontal scalability removes the limits of your laptop and lets you scale out easily to meet your needs. In this code-intensive session we give an introduction to storing time series data in Apache Solr.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/apache_solr_as_a_compressed,_scalable_and_high_performance_time_series_database/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Florian Lautenschlager":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:3473@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T130500 DTEND:20150131T135000 SUMMARY:Elasticsearch from the Bottom Up DESCRIPTION:This talk will teach you about Elasticsearch and Lucene's architecture.
The key data structure in search is the powerful inverted index, which is actually simple to understand. We start there, then ascend through abstraction layers to get an overview of how a distributed search cluster processes searches and changes.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/elasticsearch_from_the_bottom_up/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Alex Brasetvik":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:3453@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T140000 DTEND:20150131T144500 SUMMARY:Apache Lucene 5 DESCRIPTION:Around FOSDEM 2015, the first alpha/beta releases of Apache Lucene 5 will likely be downloadable. This talk will present the improvements and new features, but also some incompatible changes in the Lucene 5 release. Lucene 5 will focus on data safety: The move to Java 7 will be completed. Lucene now uses all the brand new features (NIO.2) of Java 7 to make the indexing process more stable and the resulting indexes durable. Checksums are used during merging to prevent bugs in the underlying JVM or data corruption due to networking errors (e.g., while distributing indexes during recovery in Elasticsearch) from persisting in newly created index segments.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/apache_lucene_5/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Uwe Schindler":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:3477@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T145000 DTEND:20150131T153500 SUMMARY:Searching over streams with Luwak and Apache Samza DESCRIPTION:Traditional searches take the form of individual queries over large mostly-stable corpuses of documents. In this talk, we'll show how we invert this paradigm to allow for searching over streams of documents by combining Samza, a distributed stream-processing framework, with Luwak, a library for efficiently running large numbers of queries over individual documents.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/searching_over_streams_with_luwak_and_apache_samza/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Alan Woodward":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:3449@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T154500 DTEND:20150131T161500 SUMMARY:EBISearch - Biological data search engine DESCRIPTION:EBISearch is a text search engine providing access to biological data resources hosted at EMBL-EBI. We will speak about the history of this engine, the infrastructure used, some statistics and future plans.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/ebisearch_biological_data_search_engine/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Nicola Buso":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:3463@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T163000 DTEND:20150131T171500 SUMMARY:The Typed Index DESCRIPTION:Besides issues of scaling your search, there are very important aspects concerning search quality that should not be neglected. Search quality is mainly controlled by analysis and query parsing. I will talk about frequent problems concerning Lucene analysis and I will show with several examples that one analyzer (normal form) is usually not sufficient, even in a monolingual environment but especially in multilingual environments. I will show how our typed index approach allows us to solve many of these problems and how we plan to properly treat even mixed-language documents based on this approach.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/the_typed_index/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Christoph Goller":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:3472@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T172000 DTEND:20150131T180500 SUMMARY:Querying your datagrid with Lucene, Hadoop and Spark DESCRIPTION:Key/Value stores rely on a simple data model represented by a map, where each key appears once. Using such a structure does not necessarily mean giving up on query expressiveness and capability. This talk will demonstrate what Infinispan can do to empower your analytics needs, from directly running Lucene Queries in a cluster to Hadoop Map Reduce and Spark.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/querying_your_datagrid_with_lucene,_hadoop_and_spark/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Gustavo Fernandes":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:3400@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T180500 DTEND:20150131T185000 SUMMARY:ELK, making sense of your data (not just for logs!) DESCRIPTION:Elasticsearch, together with its cousins Logstash and Kibana, provides you with a great environment to analyse your data. In this talk we’re going to look inside the Eclipse project, analysing what is going on, its bugs, etc. If you were wondering what ELK can do for you, this is your talk.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/elk,_making_sense_of_your_data_not_just_for_logs!/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Pere Urbon-Bayes":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:3495@FOSDEM15@fosdem.org TZID:Europe-Brussels DTSTART:20150131T185000 DTEND:20150131T190000 SUMMARY:Closing Remarks DESCRIPTION: CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Open source search URL:https:/fosdem.org/2015/schedule/2015/schedule/event/search_closing/ LOCATION:UA2.114 (Baudoux) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Leslie Hawthorn":invalid:nomail END:VEVENT END:VCALENDAR