skip to main content
article
Free access

A probabilistic relational model and algebra

Published: 01 September 1996 Publication History

Abstract

Although the relational model for databases provides a great range of advantages over other data models, it lacks a comprehensive way to handle incomplete and uncertain data. Uncertainty in data values, however, is pervasive in all real-world environments and has received much attention in the literature. Several methods have been proposed for incorporating uncertain data into relational databases. However, the current approaches have many shortcomings and have not established an acceptable extension of the relational model. In this paper, we propose a consistent extension of the relational model. We present a revised relational structure and extend the relational algebra. The extended algebra is shown to be closed, a consistent extension of the conventional relational algebra, and reducible to the latter.

References

[1]
BARBARA, D., GARCIA-MOLINA, H., AND PORTER, D. 1992. The Management of Probabilistic Data. IEEE Trans. Knowl. Data Eng. 4, 5 (Oct.), 487-502.
[2]
Bt~~KLE,% B. P. ANi) PE?a~, F. E. 1983. Information-Theoretical Characterization of Fuzzy Relational Databases. IEEE Trans. Syst. Man Cybern. 13, 1 (Jan./Feb.), 74-77.
[3]
BUCKLES, B. P. AND PETRY, F.E. 1984. Extending the Fuzzy Database with Fuzzy Numbers. lnf Sri. (New York) 34, 145-155.
[4]
C^v^H,o, R. AND PWT^gELU, M. 1987. The Theory of Probabilistic Databases. In Proceedlngs of the 13th VLDB Conference. Brighton, Eng., 71-81.
[5]
CODD, E. F. 1979. Extending the Database Relational Model to Capture More Meaning. ACM Trans. Database Syst. 4, 4 (Der.), 397-434.
[6]
CODD, E. F. 1990. The Relational Model for Database Management: Version 2. Addison- Wesley, Reading, Mass.
[7]
DATE, C J. 1986. Relational Database: Selected Writings. Addison-Wesley, Reading, Mass.
[8]
D~v, D., BARROt, T. M. ^,~t) S^n^gla, A.N. 1995. Logical Design of Temporal Databases: A Decision Theoretic Approach. Working paper. Louisiana State University.
[9]
J^~NES, E. T. 1968. Prior Probabilities. IEEE Trans. Syst. Sri. Cybernetics. 4, 3 (Sept.), 227-241.
[10]
JErl~REy, R. 1983. The Logic ofDecision. University of Chicago Press, Chicago, Ill.
[11]
KLm, G. J. AND FOLGER, T. A. 1988. Fuzzy Sets, Uncertaint~, and Information. Prentice Hall, Englewood Cliffs, N.J.
[12]
L1PSKI, W., J~. 1979. On Semantic Issues Connected with Incomplete Information Databases. ACM Trans. Database Syst. 4, 3 (Sept.), 262-296.
[13]
M^IrR, D. 1983. The Theory of Relational Databases. Computer Science Press, Rockville, MD.
[14]
MENDELSON, H. AND SAHARIA, A. N. 1986. Incomplete Information Costs and Database Design. ACM Trans. Database Syst. 11, 2 (June), 159-185.
[15]
PE^RL, J. 1986. Fusion, Propagation, and Structuring in Belief Networks. Artif Intell. 29, 241-288.
[16]
PEARL, J. 1989. Probabilistic Semantics for Nonmonotonic Reasoning: A Survey. In Proeeedings of the First Conference on Principles of Knowledge Representation and Reasoning. Morgan Kaufmann, 505-516.
[17]
PR^DE, H. ^NU TESTI~:M^LE, C. 1984. Generalizing Database Relational Algebra for the Treatment of Incomplete or Uncertain Information and Vague Queries. lnf Sri. INew York~ 34, 115-143.
[18]
R^Jt~, K. V. S. V. N. ^sri M^JUMDAR, A. K. 1988. Fuzzy Functional Dependencies and Lossless Join Decomposition of Fuzzy Relational Database Systems. ACM Trans. Database Syst. 13, 2 (JuneL 129-166.
[19]
SNODGR^SS, R.T. 1987. The Temporal Query Language TQuel. ACM Trans. Database Syst. 12, 2 IJune), 247-298.
[20]
Tss,~c;, F. S. C, CH~r~, A. L. P. ^NI~ Y^N(~, W.-P. 1993. Answering Heterogeneous Database Queries with Degrees of Uncertainty. Distributed and Parallel Datobases: An International Journal. 1, 3 (July), 281-302.
[21]
WONG, E. 1982. A Statistical Approach to Incomplete Information in Database Systems. ACM Trans. Database OEvst. 7, 3 (Sept.), 470-488.
[22]
ZEMANKOVA, M. AND KANDEL, A. 1985. Implementing Imprecision in Information Systems. Inf Sri. (New Yorkl 37, 107-141.

Cited By

View all

Recommendations

Reviews

Edward Y. Lee

The authors summarize work to develop a more robust and complete relational database model, the probabilistic relational model (PRM), for handling uncertainty in the data values. The first section is an introduction. Section 2 examines previous research dealing with data uncertainty. Section 3 gives a description of the PRM that conforms to the first normal form (1nf) with a consistent extension of the traditional relational algebra and is reducible to the latter. Section 4 provides a more detailed discussion of the relational algebra for the PRM. Section 5 discusses some of the properties of the new relational algebra, and incompleteness in the joint probability of objects is treated in section 6. Section 7, the conclusion, suggests future directions for research. The main audience for this paper is relational database researchers who have dealt with related topics. More practical users of relational databases may wish to skip sections 4 through 6, reading the remaining sections for an overview of research on probabilistic relational models. The authors do not indicate that any practical applications have been developed using their PRM. The example they provide along with their explanation of the model sounds like a possible application to an employee database, but its practicality is not demonstrated. The summary of previous research into the PRM in section 2 is fairly complete. The authors point out some of the pitfalls of other models and some of the extensions others have attempted in order to obtain a closed model for the probabilistic relational algebra. One of the authors' main objections to the other models is that they cannot express the model in 1nf, so it becomes harder to prove the closure properties of the extension and relational algebraic operations. In section 3, the authors provide more details about their model and point out the major differences between it and other models. They do not allow two tuples with the same value for all nonprobability attributes to be present in a relation. In their representation, certainty about an object is a special case where the probabilities associated with the key value for that object add up to exactly one. The extension of the relational algebra is uni sorted, and the only valid object in the algebra is a relation. Sections 4 and 5 give a more detailed theoretical proof and verification of the extended probabilistic relational algebra. They also introduce the coalescence-PLUS and coalescence-MAX operations on value-equivalent tuples to satisfy the deterministic assumption. In section 6, to satisfy the incomplete distribution of the attributes of an object and the interpretation of the null value, the authors introduce the concepts of conditionalization and the n th moment as related to the selection and natural-joint operations, so that such extended operations can only be applied to matched tuples on non-null attribute values. With this additional extension to the PRM, the authors claim that the closure proof of their model is complete. As part of the summary of their PRM in section 7, the authors indicate that one of the restrictions on the model is that only discrete joint probability distributions are allowed. However, since most business data are collected in representations or in the forms of discrete distributions, and most continuous distributions can be converted to discrete distributions, the restriction is not a severe limitation. Issues that should be explored in future research include storage structure, access path and query optimization, the need to develop a nonprocedural query language, and the belief revision of probabilistic data. The authors are in the process of looking into these and other issues so that a prototype probabilistic database management system can be developed. The paper's 21 references represent a fairly complete sample of related work.

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Database Systems
ACM Transactions on Database Systems  Volume 21, Issue 3
Sept. 1996
132 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/232753
  • Editor:
  • Won Kim
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 September 1996
Published in TODS Volume 21, Issue 3

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. data incompleteness
  2. data uncertainty
  3. probabilistic relation
  4. probability calculus
  5. relational algebra
  6. relational model

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)186
  • Downloads (Last 6 weeks)28
Reflects downloads up to 06 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media