Computational Learning of Syntax


Authors

Clark, Alexander

Year

2017

Abstract

Learnability has traditionally been considered to be a crucial constraint on theoretical syntax; however, the issues involved have been poorly understood, partly as a result of the lack of simple learning algorithms for various types of formal grammars. Here I discuss the computational issues involved in learning hierarchically structured grammars from strings of symbols alone. The methods involved are based on an abstract notion of the derivational context of a syntactic category, which in the most elementary case of context-free grammars leads to learning algorithms based on a form of traditional distributional analysis. Crucially, these techniques can be extended to work with mildly context-sensitive grammars (and beyond), thus leading to learning methods that can in principle learn classes of grammars that are powerful enough to represent all natural languages. These learning methods require that the syntactic categories of the grammars be visible in a certain technical sense: They must be well characterized either by the sets of symbols that they generate or by the sets of contexts in which they can appear. However, there are still significant gaps between these theoretical results and their direct implementation as models of language acquisition; I discuss how these remaining problems can be overcome.
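As a rough illustration of the "traditional distributional analysis" the abstract refers to, the sketch below (hypothetical code, not taken from the paper) collects, for every substring of a toy corpus, the set of contexts in which it occurs, and then groups substrings whose context sets coincide as crude candidates for syntactic categories. The corpus, the function names (contexts_of_substrings, group_by_contexts), and the exact-match grouping criterion are illustrative assumptions; the paper's actual algorithms for context-free and mildly context-sensitive grammars are considerably more refined.

```python
from collections import defaultdict


def contexts_of_substrings(corpus):
    """Map each substring to the set of contexts (left, right) in which it occurs.

    A context is the pair of word sequences surrounding an occurrence of the
    substring within a sentence, in the spirit of the derivational contexts
    described in the abstract.
    """
    table = defaultdict(set)
    for sentence in corpus:
        words = sentence.split()
        n = len(words)
        for i in range(n):
            for j in range(i + 1, n + 1):
                sub = tuple(words[i:j])
                context = (tuple(words[:i]), tuple(words[j:]))
                table[sub].add(context)
    return table


def group_by_contexts(table):
    """Group substrings whose context sets are identical.

    Each group with more than one member is a crude candidate for a
    syntactic category (a congruence class of the toy corpus).
    """
    groups = defaultdict(list)
    for sub, ctxs in table.items():
        groups[frozenset(ctxs)].append(sub)
    return [subs for subs in groups.values() if len(subs) > 1]


if __name__ == "__main__":
    # Tiny illustrative corpus (made up for this sketch).
    corpus = [
        "the dog barked",
        "the cat barked",
        "the dog slept",
        "the cat slept",
    ]
    for group in group_by_contexts(contexts_of_substrings(corpus)):
        print([" ".join(s) for s in group])
```

On this toy corpus the sketch groups, for example, "dog" with "cat" and "barked" with "slept", since each pair occurs in exactly the same contexts; this is the most elementary form of the visibility requirement mentioned in the abstract, where a category is characterized by the contexts in which its members can appear.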
