Tuesday, December 29, 2015

Word Association - A Study of Russian Language Learners

My paper for the Lexis module of MA TESOL. Do contact me if you are interested in the actual responses from learners or my references.


1.0 Introduction
“Alright, alright,” said my advanced student after long debate, “If you say it’s ‘to fulfil an ambition’ then I’ll write ‘to fulfil an ambition’, but I just don’t get it. An ambition is something that, by definition, just cannot be fulfilled.”
Teaching lexis is much more than simply teaching words. We are instructed to present new items to our students in terms of meaning, pronunciation and form, and specifically in that order, but ‘knowing’ a word by no means ends there. It would be virtually impossible to present all the possible meanings of a word at the same time, and even if we could, it would certainly only confuse our students with much unnecessary information, but even then we would be omitting a huge aspect of what it means to ‘know’ a word: the associations connected to it.
In the case of my advanced student quoted above, the difficulty in learning this collocation is that the word ‘ambition’, as he is translating it from his native Russian, is always something in the future and therefore ceases to be an ambition as soon as it has been fulfilled. This is an interesting example of how associations in the mother tongue may carry over to items in the target language, and thus provide a justification for teaching connotation alongside meaning, pronunciation and form.
I actively encourage my students to make their own associations with lexis to help them memorise new items, so this project was very interesting to me as a means to discover which links students actually use to retrieve their knowledge.

2.0 The Mental Lexicon
At what point can it be said that a student ‘knows’ a word? Definitions may differ, but what many agree on is the necessity for the knowledge of associations with the target word (Richards in Meara 2009; Deese in Coleman 1964; Nation in Schmitt and Meara 1997).
1. The spoken form of a word.
2. The written form of a word.
3. The grammatical behavior of the word.
4. The collocational behavior of the word.
5. How frequent the word is.
6. The stylistic register constraints of a word.
7. The conceptual meaning of a word.
8. The associations a word has with other related words.
 (Nation 1990:31 quoted in Schmitt and Meara 1997)
Three of the eight points on Nation’s list (3, 4 and 8) are related to associations: grammatical, collocational and connotational. If a teacher is to ensure that a word is truly ‘known’, then he has to make sure target vocabulary is both presented and learnt in relation to these factors.
Additionally, once a word has been ‘learnt’, what evidence is there that it will be actively used? Previously, theorists have spoken of a receptive-productive cline and of moving words from the passive to the active; however, Paul Meara (2009) proposes an alternative account of vocabulary storage with his graph theory, which suggests that words are stored as part of a complex network in which one word provides (or likewise, fails to provide) ‘access’ to another, rather than being at a particular point on a continuum.
Of course, we cannot expect a learner to be able to make all the associations that a native speaker may. As McCarthy states, ‘The language learner has a formidable task in emulating the complexities of L1 storage in an L2 and, naturally, the L2 mental lexicon will have to develop from a few initial strands to the goal of labyrinthine connections between words’ (1990:42).
Nonetheless, as teachers, we have to ensure that we provide and utilize those ‘initial strands’ because ‘seeing patterns in associations can help teachers present new vocabulary and evaluate student comprehension’ (Sökmen 1993:135).
Besides the role that associations play in knowing a word, they also contribute to rapid retrieval. McCarthy makes this clear when he states that, from the point of view of the learner, there can be great frustration in failing to recall the exact word required (it’s on the tip of my tongue!), and also from the point of view of the teacher:
‘If a language learner cannot actively use a particular word when it is needed, without too much mental searching, then we might feel that we are dealing with an incomplete knowledge of the word, or at the very least we will want to distinguish between receptive knowledge and productive knowledge’ McCarthy (1990:43).
But how can this ability be assessed?

3.0 The Word Association Test
If the mental lexicon is seen as a network, then ‘one of the most accessible and most easily understood methods of studying the structure of semantic relationships in bilingual lexicons is the use of word associations’ (Meara 1980:13). Meara points out that such studies are particularly useful to teachers of English, due to the high levels of associational stereotypy, ie. native speakers exhibit a tendency to produce the same response to a given stimulus.

Word association tests (WATs) may take different forms, but typically involve the student responding to an aural stimulus word by writing down the first word which comes to mind. Alternatives include the procedure used by Coleman (1964), who provided 5 context words before the association was to be made, tests requiring multiple responses, and even interlingual associations.
Sökmen (1993:137) neatly summarises the conclusions drawn from previous studies.
‘Past word association research with second language learners leads us to expect that nouns are most likely to solicit nouns (Ludwig, 1984), and verbs will get more varied responses (Ruke-Dravina, 1971). Previous research also indicates that beginners have fewer primary responses because their lexicons are small and less organized (Meara, 1978). Advanced students have more synonyms and contrast words (Soudek, 1981). Regarding age and education, Riegel's (1968) study shows that older and more educated students have fewer primary responses.’

Both Sökmen and Meara criticize WATs for focusing on words as individual items and thus dealing with vocabulary breadth, rather than depth, or a combination of the two. As Meara states, tests on the word-level are a case of ‘not being able to see the wood for looking at the trees’ (2009:74). Nevertheless, they remain the most popular method of testing, perhaps due to their ease of administration.
Besides variation in the administration of the test, responses have also been classified in different ways. Typically, and for the purposes of this study, responses are classified as paradigmatic, syntagmatic, phonological (or clang) and encyclopaedic.

3.1 Paradigmatic
Paradigmatic, or choice, associations involve the replacement of one word by another of the same word class. They can further be classified as co-ordination (words of the same level of generality, eg. husband – wife), hyponymy (superordinates and subordinates eg. vegetable – cabbage) and synonymy (near synonyms, eg. beautiful - pretty).

3.2 Syntagmatic
Syntagmatic associations refer to both collocations and colligations, and seem to be the association of choice for advanced learners. ‘Since they have more words in their lexicons and more detailed word clusters, it appears they are less likely to rely on a contrast association’ (Sökmen 1993:146). Schmitt and Meara (1997:18) suggest that learners master grammatical and semantic associations but may never master the connected collocational aspects.

3.3 Phonological Associations
These involve a response which has neither a semantic nor a grammatical link to the cue-word, but which bears some phonological similarity to it. Meara describes this category as ‘phonetically motivated rather than semantically motivated’ (1980:17) and claims that such responses occur much more often at low levels. High occurrences of phonological, or clang, associations could be due to instruction using the keyword technique, which encourages learners to associate target vocabulary with similar-sounding words in the L1, albeit with an unrelated meaning (Meara 1980). Fitzpatrick (2007) reports assumptions in studies that as proficiency rises, there is a move away from phonological associations.

3.4 Encyclopaedic Associations
These refer to ‘personal knowledge about the world’ (Khazaeenezhad and Alibabaee 2013:110), including affective associations. Previous studies show that the response is dependent not only on the cue-word, but also, naturally, on the respondent himself. Sökmen (1993:146) found that beginners were unlikely to give affectual responses.
An interesting aspect of this category is interlingual associations, which, according to Meara, are ‘associations which are made in a language which is not the same as the one in which the stimulus word was presented’ (1980:17). Meara was referring to participants recording their results in a language other than that of the cue-words, but I believe this also applies to an L2 cue which evokes an L1 association, which is then translated into the L2 and recorded. This could be because the student does not know that the L2 cue does not have the same connotation, or because the L1 association with this word is simply so strong that it overrides any other associations.

4.0 The Study
4.1 Choice of Cue-Words
According to McCarthy’s stipulations, 8 items of vocabulary were chosen to include a variety of parts of speech, as shown in the table below (Table 1). I had intended to include one adverb, but was restricted by the size of vocabulary of my A1 level participants. All 8 cue-words necessarily appear in the A1 textbook so I could be sure students were familiar with these words, but in selection I tried to choose items which had multiple contexts and collocations in order to give higher level students as much freedom as possible in their associations.
Table 1 also shows the frequency of the items based on the number of results on the Internet using the Google search engine. For the purposes of this study, pineapple was chosen as the required low-frequency word. According to the Collins Cobuild online dictionary, 18 out of the 26 languages given translate pineapple as ananas, and four further languages have a very different-sounding translation, leaving very little chance of L1 influence in understanding this word in comparison with other common fruit such as orange or banana. Therefore, in this case just as with the other seven test words, students need to understand the word in order to retrieve an association, and if the association is phonological, then it will not simply be the translation of pineapple in the L1. Studies mentioned in Fitzpatrick (2007) suggest that high-frequency cue-words result in a narrower range of responses than low-frequency items. In this study, we can expect that pen and blue will yield less variety than pineapple.
Traditionally cue words have been chosen for tests based on frequency. The most famous example of a frequency-based word association list is that used in the Kent-Rosanoff test of 1910, but as Schmitt and Meara note, ‘word frequency is not a reliable index’ (1997:25). McCarthy (1980) states that there has been little agreement on which cues to use. 
As already stated, all eight test items are common words which even students at A1 are familiar with. These are presented without a context and thus without an ‘anchor word’ to help direct meaning. McCarthy quotes Moore and Carling’s (1982:196) definition of anchor words as ‘words of low semantic variability which {…} narrow down the meaning options’ (1990:44). The absence of such anchors allows complete freedom to the retrieval process, thereby ensuring as far as possible unbiased results.
Sökmen (1993:137) names previous studies which indicate that nouns evoke nouns, while verbs result in a wider variety of responses. Both run and stand were deliberately chosen because they belong to both parts of speech, ie. they are both verbs and nouns. At lower levels, however, learners are likely to perceive them as verbs.

Table 1: 8 Cue-Words

No. | Classification | Cue-word | Justification | Frequency
--- | --- | --- | --- | ---
1 | Function word | in | Usually the first preposition learnt, preposition of place, also features highly in collocations, idioms, phrasal verbs. Can be mistaken orally for ‘inn’. | 25,270,000,000
2 | Common object (noun) | table | A word used in virtually every English lesson, yet also representing other objects besides being a synonym for desk. | 1,990,000,000
3 | Common object (noun) | pen | Taught in the first lessons of English, yet may take many different forms. | 469,000,000
4 | Adjective | blue | Common colour deliberately chosen because of the many associations connected to the Russian word ‘голубой’. | 14,970,000,000
5 | Verb / noun | run | Common verb, especially familiar to young learners, also part of phrasal verbs, literally and metaphorically. Also a noun. | 2,080,000,000
6 | Verb / noun | stand | Every student is familiar with the phrasal verb ‘stand up’, yet it also features as a noun at higher levels and is part of numerous other phrasal verbs. | 1,510,000,000
7 | Adjective | strong | Initially learnt as an adjective describing physical strength, but can be used in many contexts where the Russian ‘сильный’ cannot. | 1,550,000,000
8 | Low frequency word | pineapple | Tropical fruit often encountered in early stages of textbooks yet less common in itself in Europe. | 83,500,000

4.2 Choice of Participants
I was primarily interested in the mental lexicon of my own learners, and so 25 students were chosen at each of the CEFR levels I currently teach: A1, B1 and C1. A further 25 non-native teachers volunteered to take part in the test, totalling 100 participants. All the A1 students are children between 6 and 14; 9 of the B1 level participants and 17 of the C1 participants are teenagers.

4.3 Conducting the Test
The WAT was carried out among the students as part of a regular lesson, while teachers were interviewed individually. Stimuli were given orally only, even when students asked for clarification. ‘Blue’ and ‘in’ could be interpreted as ‘blew’ and ‘inn’ due to the lack of context, and I was interested in seeing which word came to the participants’ minds first. Respondents recorded their answers in a designated form (Appendix A).

4.4 Limitations of the study
The inclusion of an adverb as a stimulus word would have been interesting as I encourage the use of adverbs wherever possible and it would be seen whether students have taken this on board. Adverbs collocate with both adjectives and verbs, so this would have been a further area of investigation. However, my A1 students do not know any adverbs except ‘very’, which I felt was so general a word as to not reveal any patterns at all, and so an adverb was omitted.
During testing it became clear that my numbering system (from left to right) went against the natural logic of most of my students, who filled in their boxes from top to bottom, then realized their mistake and had to re-number their responses. In future, I would have a single row of 8 boxes to avoid confusion.
Classification of responses is problematic, as there is a good deal of overlap.  Respondents were not asked why they had chosen their association; indeed, it is doubtful whether they would be able to say. Where there seemed to be no obvious reason for the response, the encyclopaedic category was chosen, as it was assumed that the learners had some personal association which the researchers could not know about.
It must be remembered that the choice of stimuli was subjective and based on my knowledge both of my students and of their L1. Fitzpatrick warns us that ‘the findings of any word association study are {…} dependent to a great extent on the selection of cue words’ (2007:323). She also states that we must ‘be wary of concluding that word association behaviour is independent of task language’ (2007:320). Therefore, any results can only be considered relevant to these groups of Russian-speaking participants learning English.

5.0 Results
5.1 Trends by Cue-Word
Predictably, the greatest number of emotive associations resulted from the word pineapple, which has little semantic variability, few collocations, no colligations, and for the majority of people has a strong positive association.
The most frequent response to pen was pencil, and here it is difficult to say whether the association was that of a lexical set (paradigmatic) or phonological, so results have been included twice to show both possibilities. In general, it is assumed that the link is paradigmatic, but scores in brackets show the number of phonological links if pencil is categorised as such.
Results do not confirm the hypothesis that higher-frequency words result in more homogenous responses, as pineapple, the lowest-frequency word in this study, produced a narrower range of responses (39) than commoner words such as table (43).
It can be seen that nouns produced more paradigmatic responses than other parts of speech, with the greatest number coming from the word pen. Blue gave the highest number of syntagmatic responses.
Finally, there is no evidence to suggest that nouns result in a narrower range of responses than verbs.
Table 3 shows the two commonest responses by cue-word and level. Only frequencies higher than 2 were included in the table; those instances where only one response is present indicate that all other responses were present only once or twice.
The most significant result is the homogeneity between levels. It seems to suggest that, whatever the student’s proficiency, the main associations are nonetheless very similar.

Table 2: Responses by Cue-word

Cue-word | Range of responses | Paradigmatic Responses | Syntagmatic Responses | Phonological Responses | Encyclopaedic Responses
--- | --- | --- | --- | --- | ---
In | 40 | 51 | 40 | 11 | 3
Table | 43 | 49 | 37 | 1 | 12
Pen | 27 | 65 | 28 | 44 (41 = pencil) | 4
Blue | 25 | 22 | 71 | 1 | 6
Run | 45 | 34 | 56 | 1 | 7
Stand | 43 | 28 | 57 | 2 | 6
Strong | 44 | 21 | 65 | 1 | 3
Pineapple | 39 | 39 | 38 | 14 | 19

Table 3: Top Responses by Cue-word

Cue-word | A1 | B1 | C1 | NNS
--- | --- | --- | --- | ---
In | On (5), box (3) | Out (8), on (4) | Out / home (4) | Out (8), room (4)
Table | Chair (5), desk (3) | Chair (6), school (3) | Chair (7) | Chair / desk (6)
Pen | Pencil (9), school (4) | Pencil (15) | Pencil (8) | Pencil (9), write / blue (3)
Blue | Sky (8), sea (4) | Sky (9), pen (3) | Sky (11) | Sky (11), red (3)
Run | Jump (5), fast (3) | Fast (4), walk (3) | Away (6), park (3) | Fast (7), jump (5)
Stand | Up (11), sit (4) | Sit (9) | By (3) | Up (10), sit (4)
Strong | Man (9), enough (3) | Weak (8), hard (3) | Muscles (4), man (3) | Weak (13), man (3)
Pineapple | Fruit (4), eat / yellow (3) | Apple (8), fruit (4) | Yellow / apple (3) | Yellow (6), juice (4)

5.2 Trends by Language Proficiency
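Fitzpatrick (2007:327) suggests that the more proficient a learner, the greater the range of responses. This study supports this statement only in part: the widest range of responses came from C1 (137), but A1 (105) produced a wider range than B1 (99), and the non-native teachers produced the narrowest range of all (98).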
Syntagmatic responses account for just over half of all given, with the greatest number among C1 level students, and, in stark contrast, the fewest among B1, who gave well over half their responses as paradigmatic associations.
The high frequencies of syntagmatic and encyclopaedic responses at C1 suggest a high level of creativity and semantic associations among these learners.

Table 4: Responses by Proficiency

Level | Range of responses | Paradigmatic Responses | Syntagmatic Responses | Phonological Responses | Encyclopaedic Responses
--- | --- | --- | --- | --- | ---
A1 | 105 | 74 | 103 | 5 (+ 9 = pencil) | 14
B1 | 99 | 118 | 67 | 10 (+ 15 = pencil) | 11
C1 | 137 | 53 | 119 | 14 (+ 8 = pencil) | 26
NNS | 98 | 75 | 113 | 4 (+ 9 = pencil) | 9
Total | 306 | 320 | 402 | 33 (+ 41 = pencil) | 60
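As a rough check on the proportions described in 5.2, the counts in Table 4 can be turned into percentages. The sketch below is only illustrative: the names TABLE_4, shares and dominant_category are my own labels for this example, pencil is counted as paradigmatic only, and because the classified counts per level do not sum to exactly 200, each percentage is taken relative to the classified responses for that level.

```python
# Counts copied from Table 4. "pencil" is treated as paradigmatic here,
# so it is not added to the phonological column.
TABLE_4 = {
    "A1":  {"paradigmatic": 74,  "syntagmatic": 103, "phonological": 5,  "encyclopaedic": 14},
    "B1":  {"paradigmatic": 118, "syntagmatic": 67,  "phonological": 10, "encyclopaedic": 11},
    "C1":  {"paradigmatic": 53,  "syntagmatic": 119, "phonological": 14, "encyclopaedic": 26},
    "NNS": {"paradigmatic": 75,  "syntagmatic": 113, "phonological": 4,  "encyclopaedic": 9},
}


def shares(counts):
    """Return each category as a percentage of the classified responses."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


def dominant_category(counts):
    """Return the category with the highest count for one level."""
    return max(counts, key=counts.get)


for level, counts in TABLE_4.items():
    print(level, dominant_category(counts), shares(counts))
```

On these figures B1 is the only group whose dominant category is paradigmatic; the other three groups are dominated by syntagmatic responses.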


5.3 Noteworthy Responses
Some responses worthy of discussion are, firstly, those associated with the word blue. The Russian translation of this word (голубой) is slang for homosexual, and we can see that the L1 association is so strong that it carries over into the L2 and gives us gay, Viktor (a homosexual student in the group) and guy, which I believe to be a misspelling or misunderstanding of the word gay. There are no such occurrences among the teachers, but there are five among the students, spread across the levels, suggesting a powerful L1 association and confirming Carter (1998), who states that lexical choice marks the evaluation of the speaker.
One teacher who participated in my test had been lamenting the lack of blu-tac in Russia just before we administered the test, so her response to blue was tac, showing that fresh associations may occur first. Participating on another day, or without my presence, her response would almost certainly be different. This reinforces Fitzpatrick (2007), who notes that the respondent brings just as much to the task as the stimulus offers in itself, and also Schmitt and Meara (1997), who claim that there is no direct connection between the frequency of the item and the chances of it being ‘retrieved’.

6.0 The Task
6.1 Mental Links
Some of the most frequent responses may be categorised as what Schmitt and Meara call ‘classroom English’ (1997:32):
Pen – pencil, school
Pen – red (from teachers only!)
Table – desk
Stand – up
This illustrates both the beneficial result of repeated, frequent exposure to vocabulary and the strength of links made when an item is presented visually. No doubt pen and table were initially introduced by pointing at them in front of a student’s eyes, rather than in the context of a text, or by definition, or by any other common presentation method.

We also see that lexical sets help students to retrieve words:
Pen (school object) – pencil, paper
In (preposition) – out, on
Pineapple (fruit) – apple, mango
Run (verb of motion) – jump, walk

Where possible, the superordinate was named, eg. pineapple – fruit. Perhaps a failing of the choice of cue-words is that there was little opportunity for students to give a subordinate and so there are only isolated examples such as pen – fountain (pen). This shows that words are arranged in the mental lexicon in their lexical sets, as opposed to a random list or alphabetical order. It also suggests that cue-words elicit a response at the same level of generality or a more general one (the superordinate) and are less likely to result in the subordinate. Meara’s evidence supports this in his example:
Sleet – rain
but Rain ≠ sleet (Meara 2009:61)

Fitzpatrick claims that ‘high-frequency cues will lead to more homogenous responses’ (2007:323). This, however, is not borne out in my research. As seen in Table 2, the highest-frequency word, in, resulted in 40 different responses, while the low-frequency pineapple resulted in 39. The scattered nature of the responses indicates that each learner makes his own association, be it grammar-based, meaning-based or affective.
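The comparison between frequency and range can be laid out side by side. The following is a minimal sketch, assuming the Google hit counts from Table 1 and the ranges from Table 2; the names CUES, by_frequency and narrowest_and_widest are hypothetical helpers for this illustration only.

```python
# Each cue-word maps to (frequency from Table 1, range of responses from Table 2).
CUES = {
    "in": (25_270_000_000, 40),
    "table": (1_990_000_000, 43),
    "pen": (469_000_000, 27),
    "blue": (14_970_000_000, 25),
    "run": (2_080_000_000, 45),
    "stand": (1_510_000_000, 43),
    "strong": (1_550_000_000, 44),
    "pineapple": (83_500_000, 39),
}


def by_frequency(cues):
    """Return the cue-words ordered from most to least frequent."""
    return sorted(cues, key=lambda word: cues[word][0], reverse=True)


def narrowest_and_widest(cues):
    """Return the cue-words with the smallest and largest range of responses."""
    ranges = {word: data[1] for word, data in cues.items()}
    return min(ranges, key=ranges.get), max(ranges, key=ranges.get)


# Print the cue-words in descending order of frequency with their ranges.
for word in by_frequency(CUES):
    frequency, response_range = CUES[word]
    print(f"{word:<10}{frequency:>16,}{response_range:>5}")
```

Read in frequency order, the ranges rise and fall with no steady trend, which is the sense in which the responses here are scattered.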
Finally, Fitzpatrick names research showing that native speakers vary wildly in their response type and asks whether, with increasing proficiency, learners are ‘moving towards a response profile which represents a “native-speaker norm”’ (2007:327). The results of this study indicate the widest variety of responses at C1, yet with A1 also demonstrating quite a range of responses (Table 4). Regardless of range, the top two responses for most cue-words nonetheless show many similarities (Table 3).

6.2 Phonological Similarities
Results do not confirm the hypothesis that low-level learners make more phonological associations than more proficient students. In this study, A1 students made almost the same number of phonological associations as the non-native teachers, while the greatest number were by C1 students (Table 4). If we assume that the cue-response pair pen – pencil is a phonological rather than a paradigmatic link, B1 overtakes C1, but A1 and the teachers still give the fewest, so there is little change to the overall trend.
In two cases, the NNS misunderstood the cue-word ‘in’ as ‘inn’. I suggest that the greater the size of the listener’s vocabulary, the greater the possibility of phonological associations. Low-level learners may simply not know enough similar-sounding words to be able to make this link.
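The effect of reclassifying pen – pencil can be checked directly against the phonological column of Table 4. This is a small sketch under that one assumption; PHONOLOGICAL, PENCIL and ranking are names invented for the example.

```python
# Phonological counts from Table 4, with the bracketed pencil counts kept apart.
PHONOLOGICAL = {"A1": 5, "B1": 10, "C1": 14, "NNS": 4}
PENCIL = {"A1": 9, "B1": 15, "C1": 8, "NNS": 9}


def ranking(include_pencil):
    """Rank the levels by phonological responses, highest first."""
    totals = {
        level: PHONOLOGICAL[level] + (PENCIL[level] if include_pencil else 0)
        for level in PHONOLOGICAL
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


print("pencil as paradigmatic:", ranking(include_pencil=False))
print("pencil as phonological:", ranking(include_pencil=True))
```

Under either classification, A1 and the non-native teachers sit at the bottom of the ranking, so the choice does not produce the low-level peak that the hypothesis predicts.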

6.3 Characteristic Responses
The overwhelming result for pen was pencil, but it is difficult to decide whether the classification of this response should be paradigmatic (co-ordination) or phonological. In Aitchison’s Reith lecture, she mentions the bathtub effect, whereby the beginning of a word is easier to recall than the middle segment, which could be playing a role here.
Meara’s (1980) statement that English has high associational stereotypy and therefore a limited range of responses is not clearly proven. The Kent-Rosanoff test also used table as a stimulus, and found that 78% of respondents gave chair. In this study, we see only 24% giving this response, although it is nonetheless the most frequent.
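The 24% figure for chair follows from the per-level counts in Table 3 and the 100 participants described in 4.2. The sketch below shows that arithmetic for three cue-words whose top response was the same at every level; TOP_RESPONSE_COUNTS and primary_response_share are illustrative names, not part of the original analysis.

```python
PARTICIPANTS = 100  # 25 at each of A1, B1, C1, plus 25 non-native teachers

# Counts of the most frequent response per level, copied from Table 3.
TOP_RESPONSE_COUNTS = {
    ("table", "chair"): {"A1": 5, "B1": 6, "C1": 7, "NNS": 6},
    ("pen", "pencil"): {"A1": 9, "B1": 15, "C1": 8, "NNS": 9},
    ("blue", "sky"): {"A1": 8, "B1": 9, "C1": 11, "NNS": 11},
}


def primary_response_share(counts_by_level, participants=PARTICIPANTS):
    """Percentage of all participants who gave the top response."""
    return 100 * sum(counts_by_level.values()) / participants


for (cue, response), counts in TOP_RESPONSE_COUNTS.items():
    print(f"{cue} -> {response}: {primary_response_share(counts):.0f}%")
```

Even the most stereotyped pairing in this data, pen – pencil, stays well below the 78% reported for table – chair in the Kent-Rosanoff test.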
We certainly see that, among paradigmatic responses, co-ordination is the most frequent response type, with answers such as blue - black.

7.0 Conclusion
There are very few useful conclusions which can be drawn from a study of this small scale. When only 8 cue-words can produce over a hundred different responses, any inference can, at best, apply only to these test subjects. As Sinclair explains, ‘lexical patterns are very difficult to observe because they are realized by a large vocabulary of infrequent words, and so it is not easy to work out the recurrent patterns’ (2004:165).
The wide variety of responses in many cases supports Fitzpatrick’s assertion that individuals have a preferred word association response profile (2007:328), and we may also agree with her tentative suggestion that prediction is virtually impossible since each learner has thus to be analysed separately.
Nonetheless, one significant pattern is the tendency among low-level learners to give paradigmatic rather than syntagmatic responses. Widdowson (1989) mentions that a speaker is deemed ‘competent’ by his knowledge of lexical chunks. The implication for teachers of low-level students is therefore that words should be taught in phrases as soon as possible in order to enable fluency. ‘Overconcentration on learning single words may hinder the development of the L2 phrasal lexicon and deny the opportunities this gives for rapid retrieval and fluent, connected speech’ (McCarthy 1990:45).

Overall, the literature suggests that ‘seeing patterns can help teachers present vocabulary’ (Sökmen 1993:135), so there is clearly a call for large-scale research in the future.