- research-article, August 2009
Phrase clustering for discriminative learning
We present a simple and scalable algorithm for clustering tens of millions of phrases and use the resulting clusters as features in discriminative classifiers. To demonstrate the power and generality of this approach, we apply the method in two very ...
- research-article, August 2009
Distant supervision for relation extraction without labeled data
Modern models of relation extraction for tasks like ACE are based on supervised learning of relations from small hand-labeled corpora. We investigate an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-...
- research-article, August 2009
A polynomial-time parsing algorithm for TT-MCTAG
This paper investigates the class of Tree-Tuple MCTAG with Shared Nodes, TT-MCTAG for short, an extension of Tree Adjoining Grammars that has been proposed for natural language processing, in particular for dealing with discontinuities and word order ...
- research-article, August 2009
An optimal-time binarization algorithm for linear context-free rewriting systems with fan-out two
Linear context-free rewriting systems (LCFRSs) are grammar formalisms with the capability of modeling discontinuous constituents. Many applications use LCFRSs where the fan-out (a measure of the discontinuity of phrases) is not allowed to be greater ...
- research-article, August 2009
Learning context-dependent mappings from sentences to logical form
We consider the problem of learning context-dependent mappings from sentences to logical form. The training examples are sequences of sentences annotated with lambda-calculus meaning representations. We develop an algorithm that maintains explicit, ...
- research-article, August 2009
Coordinate structure analysis with global structural constraints and alignment-based local features
We propose a hybrid approach to coordinate structure analysis that combines a simple grammar to ensure consistent global structure of coordinations in a sentence, and features based on sequence alignment to capture local symmetry of conjuncts. The ...
- research-article, August 2009
k-best A* parsing
A* parsing makes 1-best search efficient by suppressing unlikely 1-best items. Existing k-best extraction methods can efficiently search for top derivations, but only after an exhaustive 1-best pass. We present a unified algorithm for k-best A* parsing ...
- research-article, August 2009
A comparative study of hypothesis alignment and its improvement for machine translation system combination
Recently, confusion network decoding has shown the best performance in combining outputs from multiple machine translation (MT) systems. However, overcoming the different word orders present in multiple MT systems during hypothesis alignment still remains the ...
- research-article, August 2009
Confidence measure for word alignment
In this paper we present a confidence measure for word alignment based on the posterior probability of alignment links. We introduce a sentence alignment confidence measure and an alignment link confidence measure. Based on these measures, we improve the ...
- research-article, August 2009
Better word alignments with supervised ITG models
This work investigates supervised word alignment methods that exploit inversion transduction grammar (ITG) constraints. We consider maximum margin and conditional likelihood objectives, including the presentation of a new normal form grammar for ...
- research-article, August 2009
A non-contiguous tree sequence alignment-based model for statistical machine translation
The tree sequence based translation model allows the violation of syntactic boundaries in a rule to capture non-syntactic phrases, where a tree sequence is a contiguous sequence of subtrees. This paper goes further to present a translation model based on ...
- research-article, August 2009
Robust approach to abbreviating terms: a discriminative latent variable model with global information
The present paper describes a robust approach for abbreviating terms. First, in order to incorporate non-local information into abbreviation generation tasks, we present both implicit and explicit solutions: the latent variable model, or alternatively, ...
- research-article, August 2009
Dialogue segmentation with large numbers of volunteer internet annotators
This paper shows the results of an experiment in dialogue segmentation. In this experiment, segmentation was done on a level of analysis similar to adjacency pairs. The method of annotation was somewhat novel: volunteers were invited to participate over ...
- research-article, August 2009
SMS based interface for FAQ retrieval
Short Messaging Service (SMS) is popularly used to provide information access to people on the move. This has resulted in the growth of SMS-based Question Answering (QA) services. However, automatically handling SMS questions poses significant challenges ...
- research-article, August 2009
Semi-supervised cause identification from aviation safety reports
We introduce cause identification, a new problem involving classification of incident reports in the aviation domain. Specifically, given a set of pre-defined causes, a cause identification system seeks to identify all and only those causes that can ...
- research-article, August 2009
Incorporating information status into generation ranking
We investigate the influence of information status (IS) on constituent order in German, and integrate our findings into a loglinear surface realisation ranking model. We show that the distribution of pairs of IS categories is strongly asymmetric. ...
- research-article, August 2009
Dependency based Chinese sentence realization
This paper describes log-linear models for a general-purpose sentence realizer based on dependency structures. Unlike traditional realizers using grammar rules, our method realizes sentences by linearizing dependency relations directly in two steps. ...
- research-article, August 2009
Case markers and morphology: addressing the crux of the fluency problem in English-Hindi SMT
We report in this paper our work on accurately generating case markers and suffixes in English-to-Hindi SMT. Hindi is a relatively free word-order language, and makes use of a comparatively richer set of case markers and morphological suffixes for ...
- research-article, August 2009
A graph-based semi-supervised learning for question-answering
We present a graph-based semi-supervised learning for the question-answering (QA) task for ranking candidate sentences. Using textual entailment analysis, we obtain entailment scores between a natural language question posed by the user and the ...
- research-article, August 2009
Modeling latent biographic attributes in conversational genres
This paper presents and evaluates several original techniques for the latent classification of biographic attributes such as gender, age and native language, in diverse genres (conversation transcripts, email) and languages (Arabic, English). First, we ...