# Assignment 3
rm(list=ls(all=TRUE)) # clear memory
# (1) In the Brown corpus of written English, the words "horrible", "horrifying", and "horrid" occur 13 times, 3 times, and 1 time respectively. The same words occur 9 times, 4 times, and 6 times respectively in the LOB corpus of British English. Test whether the frequencies observed in the Brown corpus differ from the ones expected on the basis of the LOB corpus.
# (a) Formulate the text and statistical hypotheses for this study.
# (b) Explore/summarize the data and represent them graphically.
# (c) Compute the required statistical test and briefly summarize the result.
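# A minimal sketch for (1b) and (1c), not a model answer: a chi-squared
# goodness-of-fit test of the Brown frequencies against proportions derived
# from the LOB frequencies. The object names brown, lob, and test.1 are
# hypothetical choices made for this illustration.
brown <- c(horrible=13, horrifying=3, horrid=1) # observed frequencies (Brown)
lob <- c(horrible=9, horrifying=4, horrid=6)    # frequencies the expectations are derived from (LOB)
# relative frequencies of the three words in both corpora, side by side
barplot(rbind(Brown=brown/sum(brown), LOB=lob/sum(lob)), beside=TRUE, legend.text=TRUE, ylab="Relative frequency")
test.1 <- chisq.test(brown, p=lob/sum(lob)) # H0: the Brown frequencies follow the LOB proportions
test.1$expected # one expected frequency is below 5, which is why R warns about the approximation
test.1
# Given the low expected frequency, chisq.test(brown, p=lob/sum(lob), simulate.p.value=TRUE)
# may be a safer alternative.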
# The file <201_03-04_uh(m).csv> contains data on disfluencies.
# - column 1: the number of the data point
# - column 2: the sex of the speaker
# - column 3: which disfluency was produced
# - column 4: whether the disfluency was produced in a monolog or a dialog
# - column 5: how long the disfluency was (in ms)
# - column 6: in which position in the sentence the disfluency showed up
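# One possible way to load the file described above; a sketch under
# assumptions: the function name load.disfluencies is hypothetical, the file
# is assumed to have a header row, and the field separator is not stated
# here, so it is left as an argument (the tab default is only a guess).
load.disfluencies <- function(path, sep="\t") {
   x <- read.table(path, header=TRUE, sep=sep, comment.char="", quote="")
   stopifnot(ncol(x) == 6) # the description above lists six columns
   x
}
# e.g.: UHM <- load.disfluencies(file.choose()); str(UHM)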
# (2) You want to test whether the lengths of the disfluencies are normally distributed.
# (a) Formulate the text and statistical hypotheses for this study.
# (b) Explore/summarize the data and represent them graphically.
# (c) Compute the required statistical test and briefly summarize the result.
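# A minimal sketch for (2b) and (2c), assuming the lengths from column 5 are
# available as a numeric vector; the function name check.normality is
# hypothetical. It uses the Shapiro-Wilk test, whose H0 is that the data are
# normally distributed.
check.normality <- function(lengths) {
   stopifnot(is.numeric(lengths))
   hist(lengths, freq=FALSE, xlab="Length (in ms)", main="") # histogram on the density scale
   lines(density(lengths))                                   # smoothed density curve
   shapiro.test(lengths) # a small p-value speaks against normality
}
# e.g., if the file has been read into a data frame called UHM: check.normality(UHM[,5])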
# (3) You want to test whether the frequencies of the three disfluencies in the sample allow you to say that they are equally frequent in general.
# (a) Formulate the text and statistical hypotheses for this study.
# (b) Explore/summarize the data and represent them graphically.
# (c) Compute the required statistical test and briefly summarize the result.
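# A minimal sketch for (3b) and (3c), assuming the disfluencies from column 3
# are available as a vector; the function name test.equal.freqs is
# hypothetical. H0: the three disfluencies are equally frequent.
test.equal.freqs <- function(disfluencies) {
   freqs <- table(disfluencies)  # observed frequency of each disfluency
   barplot(freqs, ylab="Observed frequency")
   abline(h=mean(freqs), lty=2)  # expected frequency of each level under H0
   chisq.test(freqs) # without the argument p, chisq.test assumes equal probabilities
}
# e.g., if the file has been read into a data frame called UHM: test.equal.freqs(UHM[,3])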
# (4) In a sample of a corpus, the frequencies of the disfluency markers uh and uhm were counted before content words and function words. The following distribution was obtained:
#        before content word   before function word
# uh                      19                     32
# uhm                     38                     15
# (a) Formulate the text and statistical hypotheses for this study.
# (b) Explore/summarize the data and represent them graphically.
# (c) Compute the required statistical test and briefly summarize the result.
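# A minimal sketch for (4b) and (4c), not a model answer: a chi-squared test
# of independence on the 2x2 table above. The object names uh.uhm and test.4
# and the dimension names are hypothetical choices made for this illustration.
uh.uhm <- matrix(c(19, 32, 38, 15), ncol=2, byrow=TRUE,
   dimnames=list(DISFLUENCY=c("uh", "uhm"), NEXT.WORD=c("content", "function")))
addmargins(uh.uhm)         # observed frequencies with row and column totals
prop.table(uh.uhm, 1)      # row percentages: what follows each disfluency
mosaicplot(uh.uhm, main="")
test.4 <- chisq.test(uh.uhm, correct=FALSE) # H0: disfluency and following word class are independent
test.4$expected            # all expected frequencies should be 5 or larger
test.4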
# (5) You are not satisfied with the result in Table 29 and you want to do it again but this time provide the subjects with a third possible answer, "intermediately acceptable".
# (a) Formulate the text and statistical hypotheses for this study.
# (b) Compute the required statistical test and briefly summarize the result.