Brussels / 2 & 3 February 2019


Super-speedy scoring in Lucene 8


Lucene 8 will bring remarkable speed-ups for queries over large datasets. In this talk I will describe how they have been implemented, from new data structures through to changes in the scoring API, and the trade-offs required to make them possible.

  • Lightning overview of the structure of an inverted index, showing how current queries are executed and documents scored
  • Extension of an inverted index to include Impacts (single level, multi-level)
  • How a scorer can use impacts to skip large blocks of documents
  • Restrictions on Similarity implementations to make this possible (scores must not decrease as term frequency increases, must not be negative, etc)
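
To make the block-skipping idea in the outline concrete, here is a minimal sketch in Python. It is not Lucene's API or its actual data layout: the names `Block`, `build_blocks`, `score`, `top_k` and the `BLOCK_SIZE` of 4 are all hypothetical, the similarity is a toy formula, and each block stores a single (max frequency, min norm) pair rather than the multi-level impacts the talk covers. The only assumption it relies on is the one the outline names: the score never decreases as term frequency grows and never increases as the length norm grows, so the stored pair gives an upper bound for every document in its block.

```python
import heapq
from dataclasses import dataclass

BLOCK_SIZE = 4  # toy value chosen for illustration only


@dataclass
class Block:
    postings: list  # [(doc_id, freq, norm), ...] in doc_id order
    max_freq: int   # highest term frequency in the block
    min_norm: int   # smallest length norm in the block


def score(freq, norm):
    """Toy similarity: non-negative, non-decreasing in freq, non-increasing in norm."""
    return freq / (freq + 1.0 + 0.1 * norm)


def build_blocks(postings):
    """Split a postings list into fixed-size blocks and record one impact per block."""
    blocks = []
    for i in range(0, len(postings), BLOCK_SIZE):
        chunk = postings[i:i + BLOCK_SIZE]
        blocks.append(Block(chunk,
                            max(f for _, f, _ in chunk),
                            min(n for _, _, n in chunk)))
    return blocks


def top_k(blocks, k):
    """Return the k best (score, doc_id) pairs and the number of postings scored."""
    heap = []   # min-heap, so heap[0] is the current k-th best hit
    scored = 0
    for block in blocks:
        # Upper bound on any score in this block, valid because score() is monotonic.
        bound = score(block.max_freq, block.min_norm)
        if len(heap) == k and bound <= heap[0][0]:
            continue  # nothing in this block can enter the top k: skip it whole
        for doc_id, freq, norm in block.postings:
            scored += 1
            s = score(freq, norm)
            if len(heap) < k:
                heapq.heappush(heap, (s, doc_id))
            elif s > heap[0][0]:
                heapq.heapreplace(heap, (s, doc_id))
    return sorted(heap, key=lambda h: (-h[0], h[1])), scored


# Made-up postings for one term: (doc_id, freq, norm)
POSTINGS = [
    (1, 5, 10), (2, 1, 20), (3, 8, 10), (4, 2, 30),
    (7, 1, 40), (9, 2, 50), (12, 1, 30), (15, 1, 60),
    (20, 9, 10), (21, 1, 80), (25, 3, 20), (30, 1, 90),
]

blocks = build_blocks(POSTINGS)
hits, scored = top_k(blocks, k=2)
print([doc_id for _, doc_id in hits], scored)
```

With this made-up data the middle block is never scored, because its upper bound is below the second-best score already collected, so only 8 of the 12 postings are visited. If the similarity were allowed to decrease with frequency or return negative values, the per-block bound would no longer be safe, which is why such restrictions are needed.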

Speakers

Alan Woodward

Links