Natural language processing

26 Results for: Book/Issue: SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval

Searched The ACM Guide to Computing Literature (3,843,204 records)

Showing 1 - 20 of 26 Results

Article
September 2001
AUTINDEX: an automatic multilingual indexing system
- Bärbel Ripplinger
- Paul Schmidt
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Page 452, https://doi.org/10.1145/383952.384093
Total Citations: 1; Total Downloads: 276; Last 12 Months: 1; Last 6 weeks: 1
Article
September 2001
Query clustering using content words and user feedback
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 442–443, https://doi.org/10.1145/383952.384083

Query clustering is crucial for automatically discovering frequently asked queries (FAQs) or most popular topics on a question-answering search engine. Due to the short length of queries, the traditional approaches based on keywords are not suitable for ...
Total Citations: 16; Total Downloads: 652; Last 12 Months: 2; Last 6 weeks: 0
Article
September 2001
Automatic web search query generation to create minority language corpora
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 432–433, https://doi.org/10.1145/383952.384072

The Web is a valuable source of language-specific resources, but collecting, organizing and utilizing this information is difficult. We describe CorpusBuilder, an approach for automatically generating Web-search queries to collect documents in a minority ...
Total Citations: 5; Total Downloads: 346; Last 12 Months: 0; Last 6 weeks: 0
Article
September 2001
Generic topic segmentation of document texts
- Marie-Francine Moens
- Rik De Busser
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 418–419, https://doi.org/10.1145/383952.384065

Topic segmentation is an important initial step in many text-based tasks. A hierarchical representation of a text's topics is useful in retrieval and allows judging relevancy at different levels of detail. This short paper describes research on generic ...
Total Citations: 9; Total Downloads: 576; Last 12 Months: 1; Last 6 weeks: 0
Article
September 2001
Query-biased web page summarisation: a task-oriented evaluation
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 412–413, https://doi.org/10.1145/383952.384062

We present a system that offers a new way of assessing web document relevance and a new approach to the web-based evaluation of such a system. Provisionally named WebDocSum, the system is a query-biased web page summariser that aims to provide an ...
Total Citations: 11; Total Downloads: 433; Last 12 Months: 0; Last 6 weeks: 0
Article
September 2001
Interactive phrase browsing within compressed text
- Raymond Wan
- Alistair Moffat
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 410–411, https://doi.org/10.1145/383952.384061
Total Citations: 1; Total Downloads: 328; Last 12 Months: 0; Last 6 weeks: 0
Article
September 2001
Structure and content-based segmentation of speech transcripts
- Dulce Ponceleon
- Savitha Srinivasan
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 404–405, https://doi.org/10.1145/383952.384041

An algorithm for the segmentation of an audio/video source into topically cohesive segments based on automatic speech recognition (ASR) transcriptions is presented. A novel two-pass algorithm is described that combines a boundary-based method with a ...
Total Citations: 7; Total Downloads: 288; Last 12 Months: 2; Last 6 weeks: 0
Article
September 2001
Quantifying the utility of parallel corpora
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 398–399, https://doi.org/10.1145/383952.384037

Our English-Chinese cross-language IR system is trained from parallel corpora; we investigate its performance as a function of training corpus size for three different training corpora. We find that the performance of the system as trained on the three ...
Total Citations: 9; Total Downloads: 305; Last 12 Months: 7; Last 6 weeks: 0
Article
September 2001
Anchor text mining for translation extraction of query terms
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 388–389, https://doi.org/10.1145/383952.384031

This paper presents an approach to automatically extracting the bilingual translations of many Web query terms by mining Web anchor texts. Some preliminary experiments are conducted using 109,416 Web pages containing both Chinese and English ...
Total Citations: 4; Total Downloads: 500; Last 12 Months: 1; Last 6 weeks: 0
Article
September 2001
Searcher performance in question answering
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 375–381, https://doi.org/10.1145/383952.384028

There are many tasks that require information finding. Some can be largely automated, and others greatly benefit from successful interaction between system and searcher. We are interested in the task of answering questions where some synthesis of ...
Total Citations: 8; Total Downloads: 504; Last 12 Months: 4; Last 6 weeks: 1
Article
September 2001
High performance question/answering
- Marius A. Pasca
- Sandra M. Harabagiu
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 366–374, https://doi.org/10.1145/383952.384025

In this paper we present the features of a Question/Answering (Q/A) system that had unparalleled performance in the TREC-9 evaluations. We explain the accuracy of our system through the unique characteristics of its architecture: (1) usage of a wide-...
Total Citations: 76; Total Downloads: 1,171; Last 12 Months: 19; Last 6 weeks: 3
Article
September 2001
Exploiting redundancy in question answering
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 358–365, https://doi.org/10.1145/383952.384024

Our goal is to automatically answer brief factual questions of the form "When was the Battle of Hastings?" or "Who wrote The Wind in the Willows?". Since the answer to nearly any such question can now be found somewhere on the Web, the problem ...
Total Citations: 118; Total Downloads: 939; Last 12 Months: 12; Last 6 weeks: 1
Article
September 2001
Finding topic words for hierarchical summarization
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 349–357, https://doi.org/10.1145/383952.384022

Hierarchies have long been used for organization, summarization, and access to information. In this paper we define summarization in terms of a probabilistic language model and use the definition to explore a new technique for automatically ...
Total Citations: 99; Total Downloads: 1,407; Last 12 Months: 19; Last 6 weeks: 4
Article
September 2001
Topic segmentation with an aspect hidden Markov model
- David M. Blei
- Pedro J. Moreno
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 343–348, https://doi.org/10.1145/383952.384021

We present a novel probabilistic method for topic segmentation on unstructured text. One previous approach to this problem utilizes the hidden Markov model (HMM) method for probabilistically modeling sequence data [7]. The HMM treats a document as ...
Total Citations: 104; Total Downloads: 1,256; Last 12 Months: 24; Last 6 weeks: 5
Article
September 2001
A study of smoothing methods for language models applied to Ad Hoc information retrieval
- Chengxiang Zhai
- John Lafferty
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 334–342, https://doi.org/10.1145/383952.384019

Language modeling approaches to information retrieval are attractive and promising because they connect the problem of retrieval with that of language model estimation, which has been studied extensively in other application areas such as speech ...
Total Citations: 944; Total Downloads: 3,496; Last 12 Months: 31; Last 6 weeks: 3
Article
September 2001
A meta-learning approach for text categorization
- Wai Lam
- Kwok-Yin Lai
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 303–309, https://doi.org/10.1145/383952.384011

We investigate a meta-model approach, called Meta-learning Using Document Feature characteristics (MUDOF), for the task of automatic textual document categorization. It employs a meta-learning phase using document feature characteristics. Document ...
Total Citations: 25; Total Downloads: 833; Last 12 Months: 4; Last 6 weeks: 0
Article
September 2001
Enhanced topic distillation using text, markup tags, and hyperlinks
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 208–216, https://doi.org/10.1145/383952.383990

Topic distillation is the analysis of hyperlink graph structure to identify mutually reinforcing authorities (popular pages) and hubs (comprehensive lists of links to authorities). Topic distillation is becoming common in Web search engines, but the ...
Total Citations: 82; Total Downloads: 929; Last 12 Months: 5; Last 6 weeks: 0
Article
September 2001
Automatic generation of concise summaries of spoken dialogues in unrestricted domains
- Klaus Zechner
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 199–207, https://doi.org/10.1145/383952.383989

Automatic summarization of open domain spoken dialogues is a new research area. This paper introduces the task, the challenges involved, and presents an approach to obtain automatic extract summaries for multi-party dialogues of four different genres, ...
Total Citations: 32; Total Downloads: 576; Last 12 Months: 2; Last 6 weeks: 0
Article
September 2001
On feature distributional clustering for text categorization
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 146–153, https://doi.org/10.1145/383952.383976

We describe a text categorization approach that is based on a combination of feature distributional clusters with a support vector machine (SVM) classifier. Our feature selection approach employs distributional clustering of words via the recently ...
Total Citations: 85; Total Downloads: 1,162; Last 12 Months: 7; Last 6 weeks: 0
Article
September 2001
A study of thresholding strategies for text categorization
- Yiming Yang
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 137–145, https://doi.org/10.1145/383952.383975

Thresholding strategies in automated text categorization are an underexplored area of research. This paper presents an examination of the effect of thresholding strategies on the performance of a classifier under various conditions. Using k-Nearest ...
Total Citations: 250; Total Downloads: 1,737; Last 12 Months: 36; Last 6 weeks: 1
