Learning Grammars: Identification in the Limit (Gold)
Each learning trial is a string labeled grammatical or ungrammatical.
All possible strings are available to the learner, but they are presented
in no particular order.
The learner makes hypotheses about the underlying grammar, modifying them
as new strings become available.
Stationarity: learning does not change the structures involved
in learning in a way that would alter what can be learned.
A grammar is identified "in the limit" if, after some finite number
of trials, the hypothesis is correct and no longer changes (though
the learner need not be aware of this).
Note: this is a very strong criterion; it assumes that learning
stops.
Given both positive and negative evidence, grammars up to and
including context-sensitive grammars can be identified
in the limit (by enumerating the grammars, each time stopping at
the first one which is consistent with the inputs up to that point).
Given positive evidence only, identification fails for any class of
languages that contains all finite languages and at least one
infinite language, so negative evidence is required
in order to constrain the space of hypotheses.
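The enumeration strategy can be sketched in a few lines of Python. This is only a toy illustration, not Gold's formal construction: the three candidate grammars, their ordering, and the helper names (regex_grammar, enumeration_learner) are assumptions made up for the example, and the target grammar is assumed to appear somewhere in the enumeration.

```python
import re
from typing import Callable, Iterable, Iterator, List, Tuple

# A grammar is modelled as a named membership test (hypothetical representation).
Grammar = Tuple[str, Callable[[str], bool]]


def regex_grammar(pattern: str) -> Grammar:
    """Build a toy grammar whose language is the set of full matches of `pattern`."""
    return pattern, lambda s: re.fullmatch(pattern, s) is not None


def enumeration_learner(
    grammars: List[Grammar], trials: Iterable[Tuple[str, bool]]
) -> Iterator[str]:
    """After each labeled trial, yield the first grammar in the enumeration
    that is consistent with every trial seen so far."""
    seen: List[Tuple[str, bool]] = []
    current = 0
    for string, is_grammatical in trials:
        seen.append((string, is_grammatical))
        # A grammar rejected once stays rejected, so the search can resume
        # from the current hypothesis instead of restarting at index 0.
        while not all(grammars[current][1](s) == label for s, label in seen):
            current += 1  # IndexError here means no candidate fits the data
        yield grammars[current][0]


# Assumed enumeration order; the target language is (ab)+.
GRAMMARS = [regex_grammar(r"a+"), regex_grammar(r"[ab]+"), regex_grammar(r"(ab)+")]
TRIALS = [("ab", True), ("abab", True), ("ba", False)]

hypotheses = list(enumeration_learner(GRAMMARS, TRIALS))
print(hypotheses)
```

In this toy run the overly general hypothesis [ab]+ survives both positive examples and is abandoned only when the string "ba" arrives labeled ungrammatical, which is the role negative evidence plays in the argument above.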
The Problem
Children do learn language, though not necessarily in Gold's
sense.
Children seem not to get negative evidence.
Parents, except uptight upper-middle-class Americans, rarely correct
their children's grammar; parents are interested in the
comprehensibility and the truth of their children's utterances.
When children's grammar is corrected, this seems to have
little effect.
How Could Language Be Learned?
Language identification in the limit is too strict, and children
don't learn grammar in isolation anyway.
There is an inductive bias: universal grammar, consisting
of specialized modules and, on one account,
of parameters which are set on the basis of the input.
Children start with parameters, for example, the
one that specifies whether a language is head-initial or head-final.
Input tells the child which way to set the parameter, and lots
of other properties of the language fall into place automatically.
Problem: How does the child map the aspects of the parameter
(subject, head, etc.) onto the input?
When learning, children apply the subset principle: they
always set parameters in such a way that the language generated
by the grammar is the smallest one consistent with the input.
Problem: Children do over-generalize.
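A minimal sketch of the subset principle, assuming each parameter setting can be represented by the (here tiny and finite) set of sentences it generates. The two settings and their sentence sets are hypothetical toy data, as is the function name smallest_consistent.

```python
from typing import Dict, FrozenSet, List, Set


def smallest_consistent(
    candidates: Dict[str, FrozenSet[str]], positive_data: Set[str]
) -> List[str]:
    """Return the settings whose language covers the positive data and has
    no other consistent language as a proper subset."""
    consistent = {
        name: lang for name, lang in candidates.items() if positive_data <= lang
    }
    return sorted(
        name
        for name, lang in consistent.items()
        if not any(other < lang for other in consistent.values())
    )


# Hypothetical toy parameter: is an overt subject required or optional?
CANDIDATES = {
    "subject_required": frozenset({"she sings", "it rains"}),
    "subject_optional": frozenset({"she sings", "it rains", "sings", "rains"}),
}

conservative = smallest_consistent(CANDIDATES, {"she sings"})
after_counterexample = smallest_consistent(CANDIDATES, {"she sings", "sings"})
print(conservative, after_counterexample)
```

With only "she sings" in the input, the learner stays with the smaller language; it moves to the larger one only once a positive example such as "sings" falls outside the smaller language. A learner of this kind never needs negative evidence to retreat, which is why observed over-generalization is a problem for the account.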
In learning words, children are guided by innate constraints, for
example, whole-object, taxonomic, and mutual exclusivity
assumptions.
Children actually do get negative evidence, though perhaps
indirectly.
When grammatical errors interfere with comprehensibility, there
is negative feedback, though not normally about what the error was.
Perhaps correction does help children when it comes at the
right time in the learning process.
Semantics constrains the space of hypotheses.
Children use innate assumptions about what words and structures
can mean to guide them in making grammatical hypotheses (spelled
out in great detail by Pinker).
Pinker's approach requires a level of "semantic structure"
intermediate between conceptual and syntactic structure,
innate linking rules mapping syntactic onto semantic structure, and
constraints on which semantic structures are relevant to
grammar (e.g., aspects of motion, location, force, object type).
Stationarity does not hold; learners can build on representations
learned early on (Elman, Sejnowski).
The input is constrained in particular ways to simplify the task.
Caregivers may (unconsciously) simplify the input,
tailor the input to the learner, or "teach" language in one way or another.
Problem: Attitudes towards language acquisition vary dramatically
from culture to culture. In some societies adults rarely address
young children at all or do not seem to simplify.
Learners are doing something other than hypothesizing grammars.
They are learning phrasal patterns of varying degrees of
schematicity and ways of combining these to form sentences.
They are learning higher-order transition probabilities between
words.
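To make "higher-order transition probabilities" concrete, here is a rough sketch that estimates the probability of a word given the preceding words by relative frequency. The three-sentence corpus is invented for illustration, and the function name and the order parameter are assumptions, not part of any particular model discussed above.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Context = Tuple[str, ...]


def transition_probabilities(
    sentences: Iterable[str], order: int = 2
) -> Dict[Tuple[Context, str], float]:
    """Estimate P(next word | previous `order` words) from raw counts."""
    context_counts: Counter = Counter()
    ngram_counts: Counter = Counter()
    for sentence in sentences:
        words = sentence.split()
        for i in range(order, len(words)):
            context = tuple(words[i - order:i])
            ngram_counts[(context, words[i])] += 1
            context_counts[context] += 1
    return {
        (context, word): n / context_counts[context]
        for (context, word), n in ngram_counts.items()
    }


# Invented toy corpus.
CORPUS = ["the dog barks", "the dog sleeps", "the cat sleeps"]

probs = transition_probabilities(CORPUS, order=2)
print(probs[(("the", "dog"), "barks")])
print(probs[(("the", "cat"), "sleeps")])
```

Setting order=1 gives ordinary word-to-word (bigram) transitions; larger values condition on longer contexts, at the cost of needing much more input to estimate each probability.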