DOI: 10.1145/1989284.1989322

Rewrite rules for search database systems

Published: 13 June 2011

Abstract

The results of a search engine can be improved by consulting auxiliary data. In a search database system, the association between the user query and the auxiliary data is driven by rewrite rules that augment the user query with a set of alternative queries. This paper develops a framework that formalizes the notion of a rewrite program, which is essentially a collection of hedge-rewriting rules. When applied to a search query, the rewrite program produces a set of alternative queries that constitutes a least fixpoint (lfp). The main focus of the paper is on the lfp-convergence of a rewrite program, where a rewrite program is lfp-convergent if the least fixpoint of every search query is finite. Determining whether a given rewrite program is lfp-convergent is undecidable; to accommodate that, the paper proposes a safety condition, and shows that safety guarantees lfp-convergence, and that safety can be decided in polynomial time. The effectiveness of the safety condition in capturing lfp-convergence is illustrated by an application to a rewrite program in an implemented system that is intended for widespread use.
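
To make the idea of a least fixpoint of alternative queries concrete, here is a minimal sketch in Python. It is not the framework from the paper: it assumes queries are flat tuples of tokens rather than hedges, that a rule is a hypothetical `(pattern, replacement)` pair of token tuples, and that the fixpoint is the set of everything reachable from the query, including the query itself. The names `rewrite_once`, `least_fixpoint`, and `max_size` are invented for illustration. The size cap is only a crude guard against divergence and is unrelated to the paper's safety condition.

```python
from collections import deque


def rewrite_once(query, rules):
    """Yield every query obtained by applying one rule at one position.

    `query` is a tuple of tokens; each rule is a (pattern, replacement)
    pair of token tuples (an assumed, simplified rule format).
    """
    for pattern, replacement in rules:
        n = len(pattern)
        for i in range(len(query) - n + 1):
            if query[i:i + n] == pattern:
                yield query[:i] + replacement + query[i + n:]


def least_fixpoint(query, rules, max_size=1000):
    """Return the set of queries reachable from `query` under `rules`.

    Returns None if the set grows to `max_size` and would still grow,
    which hints (but does not prove) that the fixpoint is infinite.
    """
    seen = {query}
    frontier = deque([query])
    while frontier:
        current = frontier.popleft()
        for candidate in rewrite_once(current, rules):
            if candidate not in seen:
                if len(seen) >= max_size:
                    return None  # gave up: fixpoint may be infinite
                seen.add(candidate)
                frontier.append(candidate)
    return seen


# A rule set whose fixpoint is finite for this query.
expand = [(("nyc",), ("new", "york"))]
print(sorted(least_fixpoint(("hotels", "nyc"), expand)))

# A rule that keeps growing the query, so the fixpoint is infinite.
grow = [(("a",), ("a", "a"))]
print(least_fixpoint(("a",), grow, max_size=50))
```

The second rule set shows why convergence matters: a single rule that lengthens its own match produces an unbounded family of alternative queries, and a bounded search can only give up rather than decide the question.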

Published In

PODS '11: Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2011
332 pages
ISBN:9781450306607
DOI:10.1145/1989284

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. rewriting
  2. search database system

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '11
Acceptance Rates

PODS '11 Paper Acceptance Rate 25 of 113 submissions, 22%;
Overall Acceptance Rate 642 of 2,707 submissions, 24%
