Offline search

A program for experimenting with offline search in the MapReduce paradigm. Processing 10 queries takes 30 minutes on a 500-machine cluster. The search is based on a linear pass over all texts, run in parallel across a large number of nodes. The classic search approach is much faster (seconds per query), but it requires a special data structure, an inverted index, and building that index takes time. Offline search is therefore a reasonable tradeoff between flexibility of development and speed of search.
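To make the contrast concrete, here is a minimal single-process sketch of both approaches in Python. It is not the program described above: the function names (`map_scan`, `reduce_matches`, `offline_search`, `build_inverted_index`, `indexed_search`), the whitespace tokenization, and the "all query terms must occur" matching rule are assumptions made for illustration. In a real MapReduce job the map step would run on many nodes over shards of the corpus.

```python
from collections import defaultdict


def map_scan(doc_id, text, queries):
    """Map step (hypothetical): scan one document and emit (query, doc_id)
    for every query whose terms all occur in the document."""
    tokens = set(text.lower().split())
    for query in queries:
        if all(term in tokens for term in query.lower().split()):
            yield query, doc_id


def reduce_matches(pairs):
    """Reduce step (hypothetical): group matching document ids by query."""
    results = defaultdict(list)
    for query, doc_id in pairs:
        results[query].append(doc_id)
    return {query: sorted(ids) for query, ids in results.items()}


def offline_search(docs, queries):
    """Linear pass over every document; no index is built beforehand."""
    pairs = (
        pair
        for doc_id, text in docs.items()
        for pair in map_scan(doc_id, text, queries)
    )
    return reduce_matches(pairs)


def build_inverted_index(docs):
    """Classic approach: pay an up-front cost to map each term to its documents."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def indexed_search(index, query):
    """Answer one query by intersecting the posting sets of its terms."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


if __name__ == "__main__":
    docs = {1: "the quick brown fox", 2: "lazy brown dog", 3: "quick red fox"}
    print(offline_search(docs, ["quick fox", "brown"]))
    print(indexed_search(build_inverted_index(docs), "quick fox"))
```

In this sketch, changing the matching rule only means editing `map_scan` and rerunning the scan, while the indexed variant answers queries quickly but must be rebuilt whenever the tokenization or the corpus changes. That is the flexibility-versus-speed tradeoff described above.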
