2018 Friday Poster 6744

Friday, November 2, 2018 | Poster Session I, Metcalf Small | 3pm

Syntactic distributions affect the emergence of nouns in the earliest stages of syntactic acquisition
N. Lester, F. Moscoso del Prado Martin

Children, even infants, are remarkably sensitive to the distributional properties of the linguistic input they receive. For example, direct experimentation has revealed that children as young as eight months track statistical regularities at the level of phonemes (Saffran et al., 1996). These effects have been extended to regularities in morphological distributions on the basis of adults' subjective age-of-acquisition judgments: words that show more diverse distributions across inflectional variants are reported to have been learned earlier (Baayen et al., 2006). Corpus research confirms these findings for languages with highly complex inflectional paradigms: children begin to produce more words from inflectionally complex paradigms only once their speech shows adult-like inflectional distributions (Stoll et al., 2012). However, less research has examined how the syntactic distributions of words in adult speech affect how early those words are acquired.

In the present study, we address this gap by testing whether syntactic (i.e., fully abstract) distributions likewise support acquisition in the earliest stages of language development. We study how two aspects of these distributions, diversity and atypicality, influence the first appearance of nouns in naturally produced child speech. We base these measures on the distributions of words across the set of syntactic dependencies that are codified in a dependency grammar formalism (Honnibal & Johnson, 2015). Dependency grammar treats syntax as typed binary relationships between words. For example, "the cat" consists of a single dependency, det, that binds "the" (as modifier) to "cat" (as head). We define diversity and atypicality with respect to these dependency relations. Diversity is defined using the information-theoretic measure of conditional entropy. Conditional entropy allows us to measure the information carried by dependency relations given (i.e., after removing) any contributions of the lexical co-distributions (which are known to capture a good deal of confounding semantic information). Atypicality is measured using the Jensen-Shannon divergence, which captures how far the syntactic distribution of a given noun departs from the aggregate behavior of all nouns.
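The two measures described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes, purely for illustration, that each observation of a noun is a (dependency relation, co-occurring word) pair, that diversity is the entropy of the relation conditioned on the co-occurring word, and that atypicality is the base-2 Jensen-Shannon divergence between a noun's relation counts and the pooled counts of all nouns. The function names and the relation labels other than det are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(counts):
    """Shannon entropy (in bits) of a table of counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def conditional_entropy(pairs):
    """H(relation | co-word) for a list of (relation, co_word) observations.

    Assumption: conditioning on the co-occurring word stands in for
    'removing the contribution of the lexical co-distributions'.
    """
    by_word = defaultdict(Counter)
    for relation, co_word in pairs:
        by_word[co_word][relation] += 1
    n = len(pairs)
    # Weighted average of the per-word relation entropies.
    return sum((sum(c.values()) / n) * entropy(c) for c in by_word.values())


def js_divergence(noun_counts, all_noun_counts):
    """Jensen-Shannon divergence (base 2, range 0..1) between two count tables."""
    keys = set(noun_counts) | set(all_noun_counts)
    p_total = sum(noun_counts.values())
    q_total = sum(all_noun_counts.values())
    p = {k: noun_counts.get(k, 0) / p_total for k in keys}
    q = {k: all_noun_counts.get(k, 0) / q_total for k in keys}
    m = {k: 0.5 * (p[k] + q[k]) for k in keys}  # mixture distribution

    def kl(a, b):
        return sum(a[k] * math.log2(a[k] / b[k]) for k in keys if a[k] > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical observations for one noun: (relation, co-occurring word).
observations = [("det", "the"), ("det", "the"),
                ("nsubj", "sees"), ("dobj", "sees")]
diversity = conditional_entropy(observations)
atypicality = js_divergence(Counter(r for r, _ in observations),
                            Counter({"det": 50, "nsubj": 25, "dobj": 25}))
```

In this toy case the noun's relation counts have the same proportions as the pooled counts, so its atypicality is zero, while its diversity is nonzero because the relation remains uncertain after conditioning on the word "sees".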

We compute these measures for the nouns that appear for the first time in the speech of children in a densely sampled corpus of child speech, estimating the syntactic distributions from a subsample of the British National Corpus. We use a time-stratified Cox proportional hazards regression (Smolik, 2014) to model the likelihood of the emergence of words over time. We find that words with more diverse syntactic distributions are produced earlier from the ages of 1;8 to 2;2. Independently of the diversity effect, atypical nouns are produced later from the ages of 1;10 to 2;7. The time courses of the effects of diversity and atypicality overlap with a slight offset: the latter follows, but endures beyond, the former.

These findings provide the first evidence that the de-lexicalized syntactic distributions of words affect the acquisition of those words. Moreover, they suggest that typicality effects emerge only once a threshold of syntactic diversity has been surpassed. Thus, lexical items may only enter the arena of syntactic categorization once they have been finely discriminated via experience with variable syntactic contexts (see Baayen et al., 2011).

References

Baayen, R. H., Feldman, L. B., & Schreuder, R. (2006). Morphological influences on the recognition of monosyllabic monomorphemic words. Journal of Memory and Language, 55, 290-313.

Baayen, R. H., Milin, P., Filipović-Đurđević, D., Hendrix, P., & Marelli, M. (2011). An amorphous model for morphological processing in visual comprehension based on naive discriminative learning. Psychological Review, 118, 438-482.

Honnibal, M., & Johnson, M. (2015). An improved non-monotonic transition system for dependency parsing. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 1373-1378). Lisbon: Association for Computational Linguistics.

Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928.

Smolik, F. (2014). Noun imageability facilitates the acquisition of plurals: Survival analysis of plural emergence in children. Journal of Psycholinguistic Research, 43, 335-350.

Stoll, S., Bickel, B., Lieven, E., Paudyal, N. P., Banjade, G., Bhatta, T. N., Gaenszle, M., Pettigrew, J., Rai, I. P., Rai, M., & Rai, N. K. (2012). Nouns and verbs in Chintang: Children’s usage and surrounding adult speech. Journal of Child Language, 39, 284-321.