A Polynomial Algorithm for the Inference of Context Free Languages

Authors Clark, Alexander and Eyraud, Rémi and Habrard, Amaury Year 2008 Abstract We present a polynomial algorithm for the inductive inference of a large class of context free languages, one that includes all regular languages. The algorithm uses a representation which we call Binary Feature Grammars, based on a set of features and capable of representing richly structured context free languages as well as some context sensitive languages. More precisely, we focus on a particular case of this representation where the features correspond to contexts appearing in the language.
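
To make the notion of contextual features concrete: for a finite positive sample, each substring u can be associated with the set of contexts (l, r) such that lur is a sample string. The sketch below (hypothetical names, toy sample) computes exactly that map; it illustrates the representation only, not the paper's inference algorithm.

```python
from collections import defaultdict

def context_sets(sample):
    # Map each non-empty substring u of the sample to its observed contexts:
    # the pairs (l, r) such that l + u + r is itself a sample string.
    contexts = defaultdict(set)
    for w in sample:
        n = len(w)
        for i in range(n):
            for j in range(i + 1, n + 1):
                contexts[w[i:j]].add((w[:i], w[j:]))
    return contexts

sample = ["ab", "aabb", "aaabbb"]   # toy positive data from {a^n b^n}
con = context_sets(sample)
print(con["ab"])                    # {('', ''), ('a', 'b'), ('aa', 'bb')}
```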

Polynomial Identification in the Limit of Substitutable Context-free Languages

Authors Clark, Alexander and Eyraud, Rémi Year 2007 Abstract This paper formalises the idea of substitutability introduced by Zellig Harris in the 1950s and makes it the basis for a learning algorithm, from positive data only, for a subclass of context-free languages. We show that there is a polynomial characteristic set, and thus prove polynomial identification in the limit of this class. We discuss the relationship of this class of languages to other common classes discussed in grammatical inference.
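
Harris's substitutability can be exhibited directly on positive data: u and v are weakly substitutable when some context (l, r) has both lur and lvr in the sample. The sketch below (hypothetical names, toy data) lists such pairs; the paper's learner goes further, exploiting the fact that for substitutable languages this relation behaves as a congruence.

```python
from collections import defaultdict

def substitutable_pairs(sample):
    # Group non-empty substrings by the contexts they occur in; any two
    # strings sharing a context are weakly substitutable in Harris's sense.
    by_context = defaultdict(set)
    for w in sample:
        for i in range(len(w)):
            for j in range(i + 1, len(w) + 1):
                by_context[(w[:i], w[j:])].add(w[i:j])
    pairs = set()
    for strings in by_context.values():
        ordered = sorted(strings)
        for a in range(len(ordered)):
            for b in range(a + 1, len(ordered)):
                pairs.add((ordered[a], ordered[b]))
    return pairs

print(substitutable_pairs(["ab", "aabb"]))  # includes ('aabb', 'ab'): both fill the empty context
```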

PAC-Learning Unambiguous NTS Languages

Authors Clark, Alexander Year 2006 Abstract Non-terminally separated (NTS) languages are a subclass of deterministic context free languages where there is a stable relationship between the substrings of the language and the non-terminals of the grammar. We show that when the distribution of samples is generated by a PCFG, based on the same grammar as the target language, the class of unambiguous NTS languages is PAC-learnable from positive data alone, with polynomial bounds on data and computation.

Partially Distribution-Free Learning of Regular Languages from Positive Samples

Authors Clark, Alexander and Thollard, Franck Year 2004 Abstract Regular languages are widely used in NLP today in spite of their shortcomings. Efficient algorithms that can reliably learn these languages, and which in realistic applications must use only positive samples, are necessary. These languages are not learnable under traditional distribution-free criteria. We claim that an appropriate learning framework is PAC learning where the distributions are constrained to be generated by a class of stochastic automata with support equal to the target concept.

PAC-learnability of Probabilistic Deterministic Finite State Automata

Authors Clark, Alexander and Thollard, Franck Year 2004 Abstract We study the learnability of Probabilistic Deterministic Finite State Automata under a modified PAC-learning criterion. We argue that it is necessary to add additional parameters to the sample complexity polynomial, namely a bound on the expected length of strings generated from any state, and a bound on the distinguishability between states. With this, we demonstrate that the class of PDFAs is PAC-learnable using a variant of a standard state-merging algorithm and the Kullback-Leibler divergence as error function.
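
A minimal sketch of the merge test in such a state-merging learner, assuming empirical suffix distributions are stored as dictionaries mapping strings to relative frequencies; the L-infinity comparison and the threshold convention below are illustrative, not the paper's exact procedure.

```python
def l_infinity(p, q):
    # Largest absolute difference in suffix probability between two states.
    support = set(p) | set(q)
    return max(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)

def should_merge(candidate, target, mu):
    # States whose suffix distributions differ by at least mu are taken to be
    # distinguishable; merge only when well inside that bound.
    return l_infinity(candidate, target) < mu / 2

q1 = {"a": 0.5, "ab": 0.3, "": 0.2}    # hypothetical empirical suffix distributions
q2 = {"a": 0.45, "ab": 0.35, "": 0.2}
print(should_merge(q1, q2, mu=0.2))    # True: distance 0.05 < 0.1
```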

Combining Distributional and Morphological Information for Part of Speech Induction

Authors Clark, Alexander Year 2003 Abstract In this paper we discuss unsupervised algorithms for clustering words into classes from unlabelled text, based on distributional and morphological information. We show how the use of morphological information can improve performance on rare words, and that this improvement is robust across a wide range of languages.
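
A toy illustration of combining the two information sources, not the paper's algorithm: concatenate distributional features (neighbouring-word counts) with morphological features (orthographic suffix indicators) and cluster the resulting vectors. The corpus, suffix list, and use of k-means (via scikit-learn) are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

def word_features(corpus, vocab, suffixes=("ing", "ed", "s", "ly")):
    # One row per word: left-neighbour counts, right-neighbour counts,
    # then indicator features for a few orthographic suffixes.
    index = {w: i for i, w in enumerate(vocab)}
    feats = np.zeros((len(vocab), 2 * len(vocab) + len(suffixes)))
    for sent in corpus:
        for i, w in enumerate(sent):
            if i > 0:
                feats[index[w], index[sent[i - 1]]] += 1
            if i + 1 < len(sent):
                feats[index[w], len(vocab) + index[sent[i + 1]]] += 1
    for w, i in index.items():
        for j, suf in enumerate(suffixes):
            feats[i, 2 * len(vocab) + j] = float(w.endswith(suf))
    return feats

corpus = [["the", "dog", "runs"], ["the", "cat", "runs"], ["a", "dog", "sleeps"]]
vocab = sorted({w for s in corpus for w in s})
labels = KMeans(n_clusters=3, n_init=10).fit_predict(word_features(corpus, vocab))
print(dict(zip(vocab, labels)))   # words with similar contexts/suffixes share labels
```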


Shallow Parsing Using Probabilistic Grammatical Inference

Authors Thollard, Franck and Clark, Alexander Year 2002 Abstract This paper presents an application of grammatical inference to the task of shallow parsing. We first learn a deterministic probabilistic automaton that models the joint distribution of chunk (syntactic phrase) tags and part-of-speech tags, and then use this automaton as a transducer to find the most likely chunk tag sequence using a dynamic programming algorithm. We discuss an efficient means of incorporating lexical information, which uses a mutual information criterion to automatically identify particular words that are useful, together with an application of bagging that improves our results.
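
The decoding step can be sketched as a standard Viterbi search over joint (chunk tag, POS tag) states. The transition table, tag sets, and function below are hypothetical stand-ins for the probabilities the learned automaton would supply.

```python
import math

def best_chunk_sequence(pos_seq, chunk_tags, trans):
    # Viterbi over joint (chunk, POS) states: the POS tags are observed,
    # so we maximise over chunk-tag paths only. trans maps the pair
    # ((chunk, pos), (chunk', pos')) to a transition probability.
    delta = {c: 0.0 for c in chunk_tags}        # uniform start, in log space
    back = []
    for t in range(1, len(pos_seq)):
        new_delta, ptr = {}, {}
        for c in chunk_tags:
            scores = {
                p: delta[p]
                + math.log(trans.get(((p, pos_seq[t - 1]), (c, pos_seq[t])), 1e-9))
                for p in chunk_tags
            }
            best = max(scores, key=scores.get)
            new_delta[c], ptr[c] = scores[best], best
        delta = new_delta
        back.append(ptr)
    seq = [max(delta, key=delta.get)]           # best final chunk tag
    for ptr in reversed(back):                  # follow back-pointers
        seq.append(ptr[seq[-1]])
    return list(reversed(seq))

trans = {(("B", "DT"), ("I", "NN")): 0.9, (("I", "DT"), ("I", "NN")): 0.2}
print(best_chunk_sequence(["DT", "NN"], ["B", "I"], trans))  # ['B', 'I']
```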

Unsupervised Induction of Stochastic Context Free Grammars with Distributional Clustering

Authors Clark, Alexander Year 2001 Abstract An algorithm is presented for learning a phrase-structure grammar from tagged text. It clusters sequences of tags together based on local distributional information, and selects clusters that satisfy a novel mutual information criterion. This criterion is shown to be related to the entropy of a random variable associated with the tree structures, and it is demonstrated that it selects linguistically plausible constituents. This is incorporated into a Minimum Description Length algorithm.
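
The boundary-based flavour of such a criterion can be shown with a small worked computation: measure the mutual information between the tag immediately before and the tag immediately after each occurrence of a candidate sequence. The function below is a plain empirical MI over (left, right) pairs, assumed for illustration rather than taken verbatim from the paper.

```python
import math
from collections import Counter

def boundary_mutual_information(occurrences):
    # occurrences: one (tag before, tag after) pair per occurrence of the
    # candidate tag sequence. Returns the empirical MI in bits.
    joint = Counter(occurrences)
    left = Counter(l for l, _ in occurrences)
    right = Counter(r for _, r in occurrences)
    n = len(occurrences)
    mi = 0.0
    for (l, r), c in joint.items():
        mi += (c / n) * math.log2(c * n / (left[l] * right[r]))
    return mi

occ = [("DT", "VB"), ("DT", "VB"), ("IN", "NN"), ("IN", "NN")]
print(boundary_mutual_information(occ))  # 1.0 bit: the two boundaries are correlated
```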

Inducing Syntactic Categories by Context Distribution Clustering

Authors Clark, Alexander Year 2000 Abstract This paper addresses the issue of the automatic induction of syntactic categories from unannotated corpora. Previous techniques give good results, but fail to cope well with ambiguity or rare words. An algorithm, context distribution clustering (CDC), is presented which can be naturally extended to handle these problems.
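
A rough sketch of the idea behind CDC, under the assumption that a word is represented by its empirical distribution over (previous word, next word) contexts and that distributions are compared with the KL divergence; the toy corpus and names are illustrative, and the paper's algorithm adds iteration and smoothing to handle rare words.

```python
import math
from collections import Counter

def context_distribution(word, corpus):
    # Empirical distribution over (previous word, next word) contexts.
    counts = Counter()
    for sent in corpus:
        for i, w in enumerate(sent):
            if w == word:
                prev = sent[i - 1] if i > 0 else "<s>"
                nxt = sent[i + 1] if i + 1 < len(sent) else "</s>"
                counts[(prev, nxt)] += 1
    total = sum(counts.values())
    return {c: k / total for c, k in counts.items()}

def kl(p, q, eps=1e-9):
    # KL(p || q), flooring q so unseen contexts do not give infinities.
    return sum(pc * math.log(pc / q.get(c, eps)) for c, pc in p.items())

corpus = [["the", "dog", "barks"], ["the", "cat", "barks"], ["a", "dog", "sleeps"]]
p_dog = context_distribution("dog", corpus)
p_cat = context_distribution("cat", corpus)
print(kl(p_cat, p_dog))  # ~0.69: "cat" occurs in a context "dog" also occurs in
```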
