SAILER: An Effective Search Engine for Unified Retrieval of Heterogeneous XML and Web Documents
- Guoliang Li (Tsinghua University)
- Jianhua Feng (Tsinghua University)
- Jianyong Wang (Tsinghua University)
- Xiaoming Song (Tsinghua University)
- Lizhu Zhou (Tsinghua University)
This paper studies the problem of unified ranked retrieval over heterogeneous XML documents and Web data. We propose an effective search engine called Sailer that adaptively and flexibly answers keyword queries over such heterogeneous data. We model Web pages and XML documents as graphs. We introduce the concept of pivotal trees to answer keyword queries and present a method for identifying the top-$k$ pivotal trees with the highest ranks in these graphs. Moreover, we propose effective indexes to facilitate unified ranked retrieval. We have conducted an extensive experimental study on real datasets, and the results show that Sailer achieves both high search efficiency and high accuracy, and significantly outperforms existing approaches.