DOI: 10.1145/375663
SIGMOD '01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data
ACM 2001 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
SIGMOD/PODS01: ACM SIGMOD International Conference on Management of Data, Santa Barbara, California, USA, May 21-24, 2001
ISBN:
978-1-58113-332-5
Published:
01 May 2001


Article
Efficient computation of Iceberg cubes with complex measures

It is often too expensive to compute and materialize a complete high-dimensional data cube. Computing an iceberg cube, which contains only aggregates above certain thresholds, is an effective way to derive nontrivial multi-dimensional aggregations for ...

Article
On computing correlated aggregates over continual data streams

In many applications from telephone fraud detection to network management, data arrives in a stream, and there is a need to maintain a variety of statistical summary information about a large number of customers in an online fashion. At present, such ...

Article
Iceberg-cube computation with PC clusters

In this paper, we investigate the approach of using a low-cost PC cluster to parallelize the computation of iceberg-cube queries. We concentrate on techniques directed towards online querying of large, high-dimensional datasets where it is assumed that ...

Article
Outlier detection for high dimensional data

The outlier detection problem has important applications in the field of fraud detection, network robustness analysis, and intrusion detection. Most such applications are high dimensional domains in which the data can contain hundreds of dimensions. ...

Article
Bit-sliced index arithmetic

The bit-sliced index (BSI) was originally defined in [ONQ97]. The current paper introduces the concept of BSI arithmetic. For any two BSI's X and Y on a table T, we show how to efficiently generate new BSI's Z, V, and W, such that Z = X + Y, V = X - Y, ...

Article
Space-efficient online computation of quantile summaries

An ε-approximate quantile summary of a sequence of N elements is a data structure that can answer quantile queries about the sequence to within a precision of εN.

We present a new online algorithm for computing ε-approximate quantile summaries of very ...

Article
Probe, count, and classify: categorizing hidden web databases

The contents of many valuable web-accessible databases are only accessible through search interfaces and are hence invisible to traditional web “crawlers.” Recent studies have estimated the size of this “hidden web” to be 500 billion pages, while the ...

Article
Data bubbles: quality preserving performance boosting for hierarchical clustering

In this paper, we investigate how to scale hierarchical clustering methods (such as OPTICS) to extremely large databases by utilizing data compression methods (such as BIRCH or random sampling). We propose a three step procedure: 1) compress the data ...

Article
Mining needle in a haystack: classifying rare classes via two-phase rule induction

Learning models to classify rarely occurring target classes is an important problem with applications in network intrusion detection, fraud detection, or deviation detection in general. In this paper, we analyze our previously proposed two-phase rule ...

Article
Efficient evaluation of XML middle-ware queries

We address the problem of efficiently constructing materialized XML views of relational databases. In our setting, the XML view is specified by a query in the declarative query language of a middle-ware system, called SilkRoute. The middle-ware system ...

Article
Filtering algorithms and implementation for very fast publish/subscribe systems

Publish/Subscribe is the paradigm in which users express long-term interests (“subscriptions”) and some agent “publishes” events (e.g., offers). The job of Publish/Subscribe software is to send events to the owners of subscriptions satisfied by those ...

Article
Adaptable query optimization and evaluation in temporal middleware

Time-referenced data are pervasive in most real-world databases. Recent advances in temporal query languages show that such database applications may benefit substantially from built-in temporal support in the DBMS. To achieve this, temporal query ...

Article
Optimizing multidimensional index trees for main memory access

Recent studies have shown that cache-conscious indexes such as the CSB+-tree outperform conventional main memory indexes such as the T-tree. The key idea of these cache-conscious indexes is to eliminate most of the child pointers from a node to increase the ...

Article
Locally adaptive dimensionality reduction for indexing large time series databases

Similarity search in large time series databases has attracted much research interest recently. It is a difficult problem because of the typically high dimensionality of the data. The most promising solutions involve performing dimensionality reduction ...

Article
Main-memory index structures with fixed-size partial keys

The performance of main-memory index structures is increasingly determined by the number of CPU cache misses incurred when traversing the index. When keys are stored indirectly, as is standard in main-memory databases, the cost of key retrieval in terms ...

Article
Automatic segmentation of text into structured records

In this paper we present a method for automatically segmenting unformatted text records into structured elements. Several useful data sources today are human-generated as continuous text whereas convenient usage requires the data to be organized as ...

Article
Efficient and effective metasearch for text databases incorporating linkages among documents

Linkages among documents have a significant impact on the importance of documents, as it can be argued that important documents are pointed to by many documents or by other important documents. Metasearch engines can be used to facilitate ordinary users ...

Article
Independence is good: dependency-based histogram synopses for high-dimensional data

Approximating the joint data distribution of a multi-dimensional data set through a compact and accurate histogram synopsis is a fundamental problem arising in numerous practical scenarios, including query optimization and approximate query answering. ...

Article
STHoles: a multidimensional workload-aware histogram

Attributes of a relation are not typically independent. Multidimensional histograms can be an effective tool for accurate multiattribute query selectivity estimation. In this paper, we introduce STHoles, a “workload-aware” histogram that allows bucket ...

Article
Global optimization of histograms

Histograms are frequently used to represent the distribution of data values in an attribute of a relation. Most previous work has focused on identifying the optimal histogram (given a limited number of buckets) for a single attribute independent of ...

Article
Improving index performance through prefetching

This paper proposes and evaluates Prefetching B+-Trees (pB+-Trees), which use prefetching to accelerate two important operations on B+-Tree indices: searches and range scans. To accelerate searches, pB+-Trees use prefetching to effectively create wider ...

Article
Efficient and tunable similar set retrieval

Set value attributes are a concise and natural way to model complex data sets. Modern Object Relational systems support set value attributes and allow various query capabilities on them. In this paper we initiate a formal study of indexing techniques ...

Article
PREFER: a system for the efficient execution of multi-parametric ranked queries

Users often need to optimize the selection of objects by appropriately weighting the importance of multiple object attributes. Such optimization problems appear often in operations research and applied mathematics as well as everyday life; e.g., a ...

Article
Query optimization in compressed database systems

Over the last decades, improvements in CPU speed have outpaced improvements in main memory and disk access rates by orders of magnitude, enabling the use of data compression techniques to improve the performance of database systems. Previous work ...

Article
SPARTAN: a model-based semantic compression system for massive data tables

While a variety of lossy compression schemes have been developed for certain forms of digital data (e.g., images, audio, video), the area of lossy compression techniques for arbitrary data tables has been left relatively unexplored. Nevertheless, such ...

Article
A robust, optimization-based approach for approximate answering of aggregate queries

The ability to approximately answer aggregation queries accurately and efficiently is of great benefit for decision support and data mining tools. In contrast to previous sampling-based studies, we treat the problem as an optimization problem whose goal ...

Article
Materialized view selection and maintenance using multi-query optimization

Materialized views have been found to be very effective at speeding up queries, and are increasingly being supported by commercial databases and data warehouse systems. However, whereas the amount of data entering a warehouse and the number of ...

Article
Generating efficient plans for queries using views

We study the problem of generating efficient, equivalent rewritings using views to compute the answer to a query. We take the closed-world assumption, in which views are materialized from base relations, rather than views describing sources in terms of ...

Article
Optimizing queries using materialized views: a practical, scalable solution

Materialized views can provide massive improvements in query processing time, especially for aggregation queries over large tables. To realize this potential, the query optimizer must know how and when to exploit materialized views. This paper presents ...

Article
Dynamic buffer allocation in video-on-demand systems

In video-on-demand (VOD) systems, as the size of the buffer allocated to user requests increases, initial latency and memory requirements increase. Hence, the buffer size must be minimized. The existing static buffer allocation scheme, however, ...

Contributors
  • RMIT University
  • University of California, Irvine

    Acceptance Rates

    SIGMOD '01 Paper Acceptance Rate 44 of 293 submissions, 15%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%
    Year         Submitted   Accepted   Rate
    SIGMOD '19   430         88         20%
    SIGMOD '18   461         90         20%
    SIGMOD '15   415         106        26%
    SIGMOD '14   421         107        25%
    SIGMOD '13   372         76         20%
    SIGMOD '12   289         48         17%
    SIGMOD '03   342         53         15%
    SIGMOD '02   240         42         18%
    SIGMOD '01   293         44         15%
    SIGMOD '00   248         42         17%
    SIGMOD '97   202         42         21%
    SIGMOD '96   290         47         16%
    Overall      4,003       785        20%