Indexing and querying over versioned text

Current information retrieval systems use inverted index structures for efficient query processing. Because many data sets are extremely large, these index structures are usually kept in compressed form, and many techniques for optimizing compressed size and query processing speed have been proposed. In this work, we focus on versioned document collections, that is, collections where each document is modified over time, resulting in multiple versions of the document. Examples include Wikipedia and historical web pages stored in the Internet Archive. We organize our work into the following layers.