This lab is due October 14. To do this lab you will need access to the CHILDES data and the CLAN analysis program. Both can be downloaded freely. CLAN is available for Mac and Windows computers. On this page, you will find information on:
(The design of this lab assignment is due to Martha McGinnis, U. Calgary)
The lab assignment comes in six parts. Items highlighted in RED are the things you should hand in.
Part 1: Use CLAN to determine MLU.
Use the mlu command to determine the MLU for Nina's transcripts. You can do this with the following CLAN command, which will save the results in a file called "mlu-nina.txt" in the Output directory.
    mlu +t*CHI nina* > mlu-nina.txt
Part 2: Record Nina's age and MLU for each file 01-19.
Open the file "mlu-nina.txt" and print it out. Printing at least 2-up is recommended, since there's a lot of wasted space. Open each transcript file from 01 through 19. (Note: there is no file nina08.cha.) Nina's age is recorded at the top of each file. Write down Nina's age for the transcript next to the MLU for that transcript on your printout.
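If you want to line the ages up against MLU, or against Valian's age groups later, it can help to convert each age to a single number. The sketch below assumes the age is written in the CHAT style years;months.days (for example 2;03.14) and counts a month as 30 days. The function name and the sample age are made up for illustration; they are not taken from Nina's files.

```python
import re

# Assumed CHAT-style age: years;months.days, where the days part may be missing.
AGE_PATTERN = re.compile(r"^(\d+);(\d{1,2})(?:\.(\d{0,2}))?$")

def age_in_months(age):
    """Convert an age such as '2;03.14' to months, counting a month as 30 days."""
    match = AGE_PATTERN.match(age.strip())
    if match is None:
        raise ValueError(f"unrecognized age format: {age!r}")
    years, months, days = match.groups()
    # days is None or '' when the age gives only years and months.
    return int(years) * 12 + int(months) + int(days or 0) / 30

# Made-up example age, not a value from the Nina transcripts.
print(round(age_in_months("2;03.14"), 2))
```

The 30-day month is only an approximation, which is good enough for sorting transcripts or matching them to an age range.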
Part 3: Use CLAN to determine word frequencies.
For two representative samples, we will use CLAN to determine how often each word in the transcript appears. To do this, we use the freq command, which works very much like the mlu command described above. We will run freq on nina10.cha and nina19.cha, using the following commands.
    freq +t*CHI nina10.cha > freq-nina10.txt
    freq +t*CHI nina19.cha > freq-nina19.txt
After having done this, you will have two lists of words and numbers (one from file 10, one from file 19). We will look at each, and pick a regular verb that occurs very often in each file.
I found that eat and see seemed to be equally popular verbs in the nina10.cha file. Somewhat arbitrarily, we'll look at eat (see is complicated by the fact that it often occurs as "See?", which properly lacks a subject). I discounted have because it can be an auxiliary (and auxiliaries behave differently), which is an unnecessary complication for what we are trying to do.
In the nina19.cha file, I picked get as the verb to look at. It's a common verb, not as popular as irregular go, but go, like have, has some auxiliary uses. Want would be a reasonable verb to pick as well, but it isn't as interesting to look at as get.
Part 4: Use CLAN to look at subject drop in a small sample of two files.
Part 4a. Search the transcripts for the examples.
Having picked a common verb from each file, we are going to look at each time the verb is used in the transcript and count how often it appears with a subject. To do this (you may want to look at the combo notes), use the following CLAN commands.
    combo +t*CHI -w2 +s"eat*" nina10.cha > selected-nina10.txt
    combo +t*CHI -w2 +s"get*" nina19.cha > selected-nina19.txt
Make sure you understand what each part of these commands does. This will give you two files (selected-nina10.txt and selected-nina19.txt), which contain the child utterances containing the verbs you've picked and the two lines preceding each.
Part 4b. Count up the totals.
Now, go through each example and decide which of the following categories it falls under. Be sure to read the "exclusion" criteria carefully.
Part 4c. Describe what you found.
Create a 2 x 3 table of results (2 rows and 3 columns) like the one below. Fill in the overt and null subject numbers for each file. For the third column, add together the number of overt and null subjects (the sum of the first two columns), then divide the number of overt subjects by that total and multiply by 100.
Describe your results. Does the percentage of dropped subjects decrease as Nina gets older?
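The arithmetic for the third column can be sketched as follows. Note that the column as described gives the percentage of overt subjects, so the percentage of dropped subjects asked about above is 100 minus that value. The function name and the counts are placeholders for illustration, not results from the Nina files.

```python
def subject_percentages(overt, null):
    """Return (percent overt, percent null) out of all counted utterances."""
    total = overt + null
    if total == 0:
        raise ValueError("no utterances were counted")
    percent_overt = 100 * overt / total
    return percent_overt, 100 - percent_overt

# Placeholder counts for illustration only; substitute your own tallies.
overt_pct, null_pct = subject_percentages(overt=30, null=10)
print(f"overt: {overt_pct:.1f}%  null: {null_pct:.1f}%")
```

Run it once per file (nina10 and nina19) with your own counts to fill in each row.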
Part 5: Use CLAN to study Nina's use of subject drop in wh-questions.
Search Nina's transcripts 01 through 19 for occurrences of the following wh-words: who, what, where, when, how, why, whose, which. You should create two output files, one for transcripts 01 through 09, and one for transcripts 10 through 19.
Go through your two output files in detail. For each output file, tally up and record how many utterances fall into each of the following four classes:
Create a 2 x 3 table of results (2 rows and 3 columns) like the one below. Let the first row represent Nina's early transcripts (01-09) and the second row represent her later transcripts (10-19). This works just like the table from before. Let the first column represent the number of utterances in class C for each set of transcripts, and the second column represent the number of utterances in class D for each set of transcripts.
For the third column of your table, calculate the percentage of these (non-subject wh-word) utterances that have a dropped subject, by adding the class C and class D amounts for each set of transcripts together, then dividing the class C amount by the result and multiplying by 100 (that is, 100 * C / (C+D)). Put the resulting percentage of dropped subjects for each set of transcripts in the third column. Describe your results. Does the percentage of dropped subjects decrease as Nina gets older?
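Written out, with C the number of class C utterances and D the number of class D utterances in one set of transcripts, the third column is:

```latex
\[
\%\,\text{dropped subjects} = 100 \times \frac{C}{C + D}, \qquad C + D > 0
\]
```

Unlike the third column in Part 4c, which divides the overt count by the total, this value is already the percentage of dropped subjects. Keep that difference in mind when you compare the two tables.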
Part 6: Discuss the comparison with Valian's (1991) results.
Consider the tables below, from O'Grady (1997), based on data from Valian (1991). Valian (1991) reports on percentages of dropped subjects in general, not just in wh-questions. First, describe how your results on subject drop for eat and get (in part 4) compare with Valian's results (shown below). Pay particular attention to the group of children whose age and/or MLU matches the transcript you are looking at. Did you find more or less what Valian found?
Now, let's compare the overall subject drop rate with what you found to be Nina's rate of subject drop in wh-questions (in part 5). Are subjects dropped more often or less often in wh-questions? Does this comparison support the hypothesis that Topic Drop accounts for some cases of subject drop in child English? Respond, and explain your answer.
O'Grady, William (1997). Syntactic Development. Chicago: University of Chicago Press.
Valian, Virginia (1991). Syntactic subjects in the early speech of American and Italian children. Cognition 40:21-81.
Comments on combo:
CLAN includes a relatively powerful searching tool called combo. I will outline a couple of points here, although you should probably refer to the CLAN manual for more information. An example of the combo command is given below:
    combo +t*CHI +w2 -w2 +s"what^my" nina* > whatmy.txt
This command says:
This will look for "what" immediately followed by "my" in any of the nina files, returning something like this:
    *** File "Moxie:CLAN:suppes:nina19.cha": line 254.
    *CHI: I want to play with you here .
    *CHI: look what my got .
    *CHI: look (1)what (1)my got .
    *MOT: I see what you got .
    *MOT: what did you get ?
You can see that we used the "^" character in the search string. This character means "immediately followed by", so what we searched for was "what" immediately followed by "my". In these search strings there are several other special characters that you can use.
You can combine these in various ways to get useful effects. A couple of common things you might use are:
Some example combo commands are:
Instead of typing in the thing you are searching for each time, you can also use a "search" file. This is a text file that contains the things you want to search for. An example search file might look like this (searching for first person pronouns).
If you save this file as "search-1pron.txt" in your Working directory, you can run the search with the following combo command, where the @ tells combo to look in your file for the list of things to search for.
    combo +t*CHI +w2 -w2 +s@search-1pron.txt nina* > pron1-nina.txt