Second Language Vocabulary Acquisition, linguistic context and vocabulary task design
Rob Waring.

Part 1: A quick review of some Second Language Vocabulary Acquisition Research.
All words are not equal.
Research has shown that the vocabulary learning task facing the second language learner is not as insurmountable as many learners and instructors may think. A second language learner with a vocabulary of 5000-6000 words could be labelled an advanced learner. Of the 54,000 word families in a large English dictionary, fewer than 20,000 are known by the average adult native speaker (Goulden, Nation and Read, 1990). Moreover, of these 54,000 word families only 3,000 cumulatively cover 95-98% of any general text, leaving 51,000 low frequency word families which our learners do not need to worry about so much. Clearly the more of these low frequency words they know the better off they are, but the return for learning the 3000 most frequent and most useful word families is substantial.

The 100 most frequent words account for about 50% of all the words one will ever meet, the first 1000 for about 70% and the first 2000 for about 80-85%, depending on the text. However, the Longman Lancaster Corpus suggests that 3000 words are needed to cover 80% of written text. These figures differ because of sampling differences and differences in how one defines a 'word'. Researchers working with the British National Corpus Spoken Corpus cite recent research showing that a 600 word vocabulary will cover 80% of spoken text (Summers, 1995).
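For readers who want to check coverage figures of this kind against their own texts, here is a minimal sketch in Python. It is an illustration only, not the method used in any of the studies cited: it counts surface word forms rather than word families (one of the differences in defining a 'word' mentioned above), and the function name and the toy sentence are my own assumptions.

```python
from collections import Counter

def cumulative_coverage(tokens, sizes):
    """Return {n: share of all tokens covered by the n most frequent word forms}.

    Assumption: a 'word' here is a surface form, not a word family.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tokens supplied")
    # Frequencies sorted from most to least frequent.
    ranked = [freq for _, freq in counts.most_common()]
    return {n: sum(ranked[:n]) / total for n in sizes}

# Toy text (hypothetical); a real check would use a large corpus.
tokens = "the cat sat on the mat and the dog sat on the rug".split()
coverage = cumulative_coverage(tokens, [1, 3, 5])
for n, share in coverage.items():
    print(f"top {n} word forms cover {share:.0%} of the text")
```

With a large corpus in place of the toy sentence, the same function could be called with sizes such as 100, 1000 and 2000 to compare against the percentages quoted above.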

The necessity of building a startup vocabulary quickly.
One of the major problems facing second language learners is that they understand very little, especially at the beginning stages. Often this is a result of insufficient vocabulary knowledge (Laufer, 1992). What our beginners need is a sizable startup vocabulary from which they can decipher the patterns of the language and which enables language use. This in turn enables vocabulary knowledge to grow, which reactivates the cycle of enabling language use. Clearly one without the other will not benefit our learners. Other evidence for this position comes from research into L1 and L2 discourse analysis which shows that learners move from the holophrasic level to phrases and quasi-sentences only after several hundred words have been learned. Whether this is a cause and effect phenomenon is uncertain, however. In addition, coping strategies such as paraphrasing, restructuring, rephrasing and so on will be easier with more vocabulary to work with. Research into the Longman Corpus of learner errors shows that learners avoid making lexical errors, tending to use familiar words rather than experiment with less familiar and possibly more appropriate words. Summers (1995, p. 27) suggests this may be due to insufficient lexical knowledge. If this is so, a wider and larger vocabulary would go some way to alleviating this problem. Finally, my experience tells me that learners who wish to master a language expect to learn vast numbers of new words. It would seem best to capitalize on this from the start.

So how do we approach the task of teaching a startup vocabulary?

There are several options. One is to allow our learners to learn incidentally through context, which I shall return to later. Another is to teach all the startup vocabulary in class word by word, but this method would take up so much time that it could only ever hope to introduce a few hundred words in a single year. Another option is to use one of the many commercially available vocabulary workbooks. This seems fine until one looks carefully and sees that they are rarely available at the beginner levels and mostly do not recycle the vocabulary previously learned, leaving a cemetery of dead words behind. In addition, it seems that the authors have not read the vocabulary acquisition literature. Clearly we need a program which takes up minimum class time but allows for out of class study, and which combines fast and effective learning of new words with the development of fluency. One such method could be to use word cards made from word lists in tandem with extensive listening and reading, with most of the work done out of class.

Although direct, intentional learning of words out of context is an 'unnatural' way to study and is seemingly based on word to word correspondence and translation, it still has many advantages. Direct learning, especially in conjunction with mnemonic techniques, is a very efficient way to learn many new words in a short space of time. In the dozens of studies comparing learning from context with direct learning, direct learning (even rote learning) was superior not only in speed of learning but also in retention and in the number of words learned (see Nation 1990 for a review). Only one study showed gains from context close to those of direct learning, and even there retention was better for direct learning. Research has shown (Meara, 1995) that an average learner using direct learning techniques can start to learn 30-40 new words in an hour and good learners up to 70 or more (of course we cannot learn all the associations, spelling, pronunciation, shades of meaning and so on at one meeting). Such mnemonic devices include the keyword method (see Pressley, Levin and Delaney, 1982; Paivio and Desrochers, 1981; for comprehensive reviews) and the hookword method (Kelly, 1989). These are only two of several direct learning strategies that may be employed. Other possible direct learning techniques include form and meaning associations, analysis of affixes and roots and so on. Learning intentionally from 'pregnant' contexts, currently being researched in Holland, also shows promising signs. Pregnant contexts are contexts rich enough to allow a learner to guess the word meaning on the first meeting. The reader may note that rote learning is one of the least successful of the direct learning techniques.

One common way to order, sequence and recycle the learning is by using word cards written by the learner herself. These are small pieces of paper with the word or phrase to be learned on one side and a definition, translation, picture or whatever on the other. The idea is to look at one side and remember the other. Nation (1990) suggests this is a very useful tool and very popular among learners as their learning is personalized, organized and quantifiable. In addition they can map their progress and see if they have met their learning targets. Note that I do not propose that these word cards be put in a 'WordBank' or envelope as a way to tackle only new words met in class. I propose that they could be used for words from word lists as well as words met in the standard course textbook. WordBank systems will not suit all learners or learning styles, but whatever system of direct learning is adopted must be principled and systematic, with vocabulary learning targets set along the way.

What about learning from context? Isn't that useful?
Some researchers say that most words are learned from context (Nagy, Herman and Anderson, 1985; Sternberg, 1987) and it is generally accepted as a very useful and productive way to learn words. The problem is that beginners need a basic vocabulary before they can even start to learn from context, as they have insufficient knowledge and the text is too dense with unknown and partly known words. Laufer (1989) demonstrated that to guess new words effectively from context there has to be at least 95% coverage. Nation and Coady (1988) suggest a figure of one unknown word in fifty, or 98% coverage. I would therefore recommend that guessing from context be left as a later strategy for when the learner has a sufficient knowledge base from which to work. This level may be at around 1000 word families or even later. This would mean starting with an emphasis on direct learning and moving on to strategies based more on incidental learning such as guessing from context. Learning from word lists is a conscious intentional strategy, whereas learning from context is usually incidental to the task at hand and seeks to aid learners in deepening their knowledge of already known words. They are different animals and serve different purposes.
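The coverage thresholds above can be checked for a particular learner and text with a few lines of Python. This is a rough sketch under my own assumptions, not a procedure taken from the studies cited: the learner's vocabulary is represented as a simple set of known word forms, tokenizing is crude, and the names text_coverage and ready_to_guess are hypothetical.

```python
import re

COVERAGE_95 = 0.95        # the 95% figure cited above
COVERAGE_98 = 1 - 1 / 50  # one unknown word in fifty, i.e. 98%

def text_coverage(text, known_words):
    """Share of the words in `text` that appear in `known_words`."""
    # Crude tokenizer (an assumption): lower-case runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        raise ValueError("text contains no words")
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)

def ready_to_guess(text, known_words, threshold=COVERAGE_95):
    """True if coverage is high enough for guessing from context."""
    return text_coverage(text, known_words) >= threshold

# Toy example (hypothetical learner and text).
known = {"the", "cat", "sat", "on"}
text = "The cat sat on the mat"
print(f"coverage: {text_coverage(text, known):.0%}")
print("ready to guess at 95%:", ready_to_guess(text, known))
```

In the toy example one word in six is unknown, so the text is far too dense for this learner by either threshold.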

Is that all?
Certainly not. Words learned in isolation will remain that way until the knowledge is connected to prior knowledge and can be retrieved fluently and efficiently. Parallel to the direct learning of words would be a principled program of extensive reading and listening, to be done out of class or in the Self Access Centre, to allow the learner to meet the words in context and deepen their understanding of the partly learned words. It would also give opportunities to develop lexical access speed, allowing the learner to move from the word-bound level to the 'chunking' level where they not only process words faster but also comprehend more.

What do we do with less useful words?
Basically, leave them until the important words are well learned. This does not mean ignoring them altogether, but the reward for learning an average low frequency word is not as great as that for a higher frequency or more useful word. Therefore these words could be glossed or at least not dwelt on. Of course, low frequency words which the learner will meet again in, say, an EAP context should not be ignored.

What will this startup vocabulary look like and where can I get it?
First and foremost it will have to match the needs of your learners. A good number of the items would probably be of general service with the remainder catering for specific needs. The more learnable the words the better - a good source is loan words borrowed into the L1 from English. Compilers should choose words which are of the widest use, for example go covers walk, run, fly and so on. Common sense reigns here.

Many word lists and frequency databases are available. One of the more recent is the latest edition of the COBUILD dictionary (1995), which now ranks words according to frequency, although there is no list of these words in the dictionary. Words in bands 5 and 4 are of the most importance and should be learned first. The COBUILD words are helpful as they are taken from written and spoken text. Another list would be West's General Service List (1953) which, although dated and based only on a written corpus, is still one of the best. Hindmarsh's Cambridge English Lexicon (1981) categorizes words into 7 bands of usefulness. Although based on teacher intuition and other word list studies, it is not a bad guide for practicing teachers. From these lists, word study lists can be made to give to your learners to study at home, say 20 words per class hour at the initial stages, which will be regularly tested and recycled.
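As a rough illustration of how such study lists might be drawn up, the sketch below splits a word list into sets of 20 new words and recycles the previous set alongside each new one. The recycling rule (review only the set immediately before) is my own assumption for the example, as are the function name study_plan and the placeholder words; any principled recycling scheme could be substituted.

```python
def study_plan(words, new_per_session=20, review_back=1):
    """Split `words` into sessions of new words plus recycled earlier words.

    Assumption: each session recycles the `review_back` sets before it.
    """
    if new_per_session < 1:
        raise ValueError("new_per_session must be at least 1")
    # Cut the list into consecutive sets of new words.
    batches = [words[i:i + new_per_session]
               for i in range(0, len(words), new_per_session)]
    plan = []
    for index, batch in enumerate(batches):
        start = max(0, index - review_back)
        # Gather the words from the earlier sets that are due for recycling.
        review = [w for earlier in batches[start:index] for w in earlier]
        plan.append({"new": batch, "review": review})
    return plan

# Placeholder list of 50 items standing in for a real word list.
words = [f"word{i}" for i in range(1, 51)]
plan = study_plan(words)
for number, session in enumerate(plan, start=1):
    print(f"session {number}: {len(session['new'])} new, "
          f"{len(session['review'])} to recycle")
```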

So what exactly should we be doing?
In appendix 1 there is a summary table which may be of assistance in terms of summarizing the position taken here.

Some problems and solutions?
A problem with this is that many people just see it as too radical - too far from what they do now. Some teachers say 'It won't work with my learners, they are just too lazy.' That's fine as long as they understand that they cannot say or understand much until this vocabulary is known, especially if the teacher uses only English in class. Other teachers say it's too much work to set up. True, but you only need to set it up once and it's always there to call on. Probably your DTO already has a Self Access Centre and the word lists are easy to find.

I feel it would be more effective if this learning is organized and structured at the DTO level rather than at the initiative of one single teacher so that uniformity in the DTO could be maintained. This does not rule out experimenting with the idea however. I'd be very happy to hear from any DTO that wishes to experiment with this.

Part 2: Research into linguistic context in Kyoto.

The acquisition of a word depends not only on the number of times a word is met, but also on the quality and type of task design as well as the quality of processing of the input by the learner. This input must involve meeting the word in many different contexts, both linguistic and topical. It must also focus the learner on processing the words deeply by making connections and associations with previous vocabulary knowledge.

Basic research question.
The study looked at whether meeting a word the same number of times in the same form (same exposure condition) leads to better retention than meeting the same base word the same number of times in different linguistic contexts (derivative/inflectional condition). To put it another way, will a word such as 'helped' met three times be retained better than three derived or inflected forms such as helping, unhelpful and helpless?
18 learners at Kyoto DTO and 76 learners from a university in Japan were used as subjects for this study. As Kyoto is a small DTO there was an insufficient sample of subjects who were able to help in the experiment, therefore the research was also done with learners who were being taught in the 'British Council way' and using BC recommended texts. 34 learners took the lower-intermediate tests and the posttest and 38 took the advanced tests and posttest.


Three separate sets of work sheets were used for learners at two differing levels of ability - lower intermediate and advanced. Each set comprised an initial and post vocabulary quiz, the difference between which served as one measure of any gains that had been made. These tests were constructed to check if they knew the synonyms or meanings from a list of 6 items following Nation's vocabulary test design (1990). Distractors were included from previous work in the text. The words to be tested were selected from a section of their text which the learners would probably not know. This was important as the research question can only be tested if the learners do not know the words. The initial test served as a check to see if any word knowledge had been gained. The posttest (a surprise test taken 2 weeks later) served to see how much knowledge was retained.

In each group there were two instruments. Half the class received one and half the class received the other. One half of each group at each level received a text where the tested words appeared in the same form and part of speech three times, for example value, value and value. The other half of the group received an identical text except that the tested words were met in different linguistic forms, for example value, valuable and valueless. Very small changes to the text were made to allow for these words to be used. A complete list is in appendix 2. Comprehension of the text was checked. The learners had to write their definition of the target words and try to write 2 or 3 sentences using the word in a context different from that in the text. Instructions were given for the subjects to try to make sentences that explicitly showed their knowledge. No time limit was imposed. These sentences and definitions were rated by 2 raters (and a third on disagreements) for native-like use and demonstration of understanding of the words. Target words known in the pretest were ignored in the analysis. The posttest had two parts: a vocabulary recognition task similar to the pretest, and two production tasks (definitions and use in sentences).

Data and discussion.
Data analysis revealed that there was a slight effect for the same exposure condition for the lower-intermediate learners but there was a significant effect for the derivative/inflectional condition for the advanced learners on the initial test. In the posttest taken 2-3 weeks later the same slight significant differences between the two conditions were found. This means that the lower-intermediate learners posted better scores on the tests if they met the new word in the same context than in different contexts. The advanced learners however did the opposite. They showed a significant effect for a wider view of the base word.

Several points can be made from this. From this data it seems that more advanced learners can take advantage of their morphological knowledge in order to guess new words and can more readily produce derivatives and inflections than less advanced learners. Secondly, it seems that the lower level learners cannot as easily see connections with other members of the word family and still see each word as a single entity rather than as a member of a word family. More advanced learners are able to take advantage of their morphological knowledge and tend to see words in word families more often than lower-intermediate learners. Thirdly, the learning of word parts may aid lower level learners to overcome this word-boundedness. Advanced learners also receiving word part instruction may benefit even more although this should be left until a further study before we can draw any conclusions. Fourthly, while we can to some extent determine the vocabulary the learners will meet, we cannot always determine the vocabulary they acquire (although see Joe's research mentioned below). From the present study we could focus the lower-intermediate learner on repetitions of the same linguistic form, but require a more advanced learner to use a greater range of words from the word family.

The study undertaken was small in scale: it tested only a few words with only 3 repetitions and a few derivatives and inflections, and the results were only barely significant statistically. Further research is needed to confirm these results before the above can be viewed with any surety.

Part 3: Some parting sundry comments on vocabulary task design.

Compilers of word lists should be aware of the research into interference theory. This states that learning new words from the same semantic group discourages learning rather than encourages it (Higa 1963; Tinkham, 1993 and Waring, 1995 forthcoming). The reason is that the learners not only have to learn the words, they must also learn to keep them apart. Deepening vocabulary knowledge through associations and semantic maps does not seem to cause a problem, however. It seems that most vocabulary textbook writers have not read this research.

Joe's (1993a, 1993b) research in New Zealand has shown that we can to some degree determine the words that our learners will use if we require them to use them in tasks. If the words are optional to the task then usually learners tend to avoid them and prefer words they know well.

Exercises designed to build knowledge of word parts are very useful but should not be misused. Teachers should be aware of the dangers of asking learners to use word parts to guess the meanings of words, as more often than not they are misled. A better strategy is to get the learner to use context clues first, decide on a possible meaning and then use word parts to confirm the guess. The same caution applies to asking learners to guess from multiple choice lists. More often than not learners mischoose items.

Goulden, R., I. S. P. Nation, and J. Read. 1990. How large can a receptive vocabulary be? Applied Linguistics 11 (4): 341-363.

Hindmarsh, R. 1980. Cambridge English Lexicon. Cambridge University Press. Cambridge.

Joe, A. 1993a. What effects do text based tasks and background learning have on incidental vocabulary learning? English Language Institute, Victoria University, Wellington, New

Joe, A. 1993b. Text based tasks and incidental vocabulary. English Language Institute Victoria University Wellington, New Zealand.

Kelly, P. 1989. Utilization of the hookword method for the learning of Polish vocabulary: A personal investigation. ITL: Review of Applied Linguistics. 85 (86): 123-142.

Laufer, B. 1992. How much lexis is necessary for reading comprehension? In Arnaud, P. J. L. and H. Bejoint (Eds.). Vocabulary and Applied Linguistics. Macmillan, London.

Meara, P. 1995. The importance of an Early Emphasis on L2 Vocabulary. The Language Teacher. 19 (2): 8-10.

Nagy, W. E., P. Herman and R. C. Anderson. 1985. Learning words from context. Reading Research Quarterly. 20: 233-253.

Nation, I. S. P. 1990. Teaching and learning vocabulary. New York: Newbury House Harper Row.

Nation, I. S. P. and J. Coady. 1988. Vocabulary and reading. In Carter and McCarthy (Eds.) Vocabulary and Language Teaching. Longman, London.

Nation, I. S. P. and R. Waring. In preparation. The goals of Vocabulary Learning: Quantitative studies. To appear in McCarthy, M. and N. Schmitt (Eds.) Second Language Vocabulary. Cambridge University Press, Cambridge.

Paivio, A. and A. Desrochers. 1981. Mnemonic techniques in second-language learning. Journal of Educational Psychology. 73 (6): 780-795.

Pressley, M., J. R. Levin and H. Delaney. 1982. The mnemonic keyword method. Review of Educational Research. 52 (1): 61-91.

Sternberg, R. J. 1987. Most vocabulary is learned from context. In McKeown, M. G. and M. E. Curtis. (Eds.). The Nature of Vocabulary Acquisition. Lawrence Erlbaum Associates. Hillsdale: New Jersey. 89-105.

Summers, D. 1995. Vocabulary learning. Do dictionaries really help? The Language Teacher. 19 (2): 25-28.

Waring, R. (Forthcoming) Semantic groups and Interference theory. KIYO.

West, M. 1953. A General Service List of English Words. Longman, Green & Co. London.
Rob Waring has been a teacher at Kyoto DTO since 1991.

Appendix 1

Appendix 2

The words tested.

The words in the first column were repeated three times in the same exposure condition. All the words were tested in the derivative/inflection condition. You may note the high percentage of inflections compared to derivatives.


Huddle Huddled Huddling
Frequent Frequenting Frequented
Mourn Mourning Mourned
Scale Scaled Scaling
Lavish Lavishly Lavishingly
Dismantle Dismantling Dismantled
Hoist Hoisted Hoisting
Scrounged Scrounger Scrounging
Impenetrable Impenetrably Impenetrating

Lower Intermediate

passion passionless passionate
deal with dealt with dealings
concept conception conceptualize
isolate isolating isolated
vanish vanishing vanished
courteous incourteous courteousness
manage managed managing
slam slamming slammed
charming charmed charms

Contact Info:
Rob Waring
Notre Dame Seishin University, 2-16-9 Ifuku-cho, Okayama, Japan 700
Tel 086 252 1155 Fax 255 7663 Home 086 223 0341
Email: waring @
