Notre Dame Seishin University, Okayama, Japan.



In this journal Tinkham (1993) in two experiments found that learning words grouped in semantic sets interferes with the learning of words. Tinkham found that if learners are given words which share a common superordinate concept (such as words for clothes) in list form, they are learned slower than words which do not have a common superordinate concept. This finding suggests that we should not give wordlists to our learners which have words that come from the same semantic set, but should be asking them to learn words semantically unrelated to each other. The present study, a close replication of Tinkham's, used Japanese words paired with artificial words and found a main effect against learning semantically related words at the same time, replicating Tinkham's findings. It can be tentatively concluded from these two papers that presenting students with wordlists of new words in semantic clusters, rather than in unrelated word groups, can interfere with learning. Following a discussion of the research design and some of its limitations, there is some comment on current research methodology.




It is common practice in many current second language coursebooks to introduce words in semantic groups. For example, learners are asked to learn 'parts of the body' in Fast Forward 1 Unit 6 (Black, McNorton, Malderez and Parker, 1986); 'clothes' in The New Cambridge English Course 1 Unit 9 (Swan and Walter, 1990); 'foods' in Headstart Beginner Unit 5 (Beaven, 1995); 'jobs' in Headway Elementary Unit 3 (Soars and Soars, 1993) and so on. More discussion and examples of this can be found in Tinkham (1993). These words are often presented as a set (a semantic cluster) and share a common superordinate (headword). There seems to be a pervasive belief among coursebook writers that doing so will aid vocabulary building and lexical associations in particular. This belief appears to be founded on methodology rather than on research.

In a first language study, McGeoch and McDonald (1931) found that if the words to be learned were too similar, the similarity interfered with learning. The poorest performance occurred with synonyms (which were the subject of that experiment). This finding, and others, led to the formation of 'Interference Theory'. This theory states that when words are being learned at the same time, but are too 'similar' or share too many common elements, these words will interfere with each other, thus impairing their retention. The degree of interference increases as the interfering material becomes more similar to the material already learned. Anecdotal evidence for this can be found in many classrooms. For example, some learners confuse the days of the week or mix up the months, as they can 'brother' and 'sister', 'twelve' and 'twenty', or other words from closely related semantic groupings. This phenomenon may be a result of learning these semantically related words at the same time.

Extensive research into interference theory (see Baddeley, 1990 pp. 246-253, for a review; see also Higa 1963, 1965) has shown that memory traces often compete with each other. 'Interference Theory' suggests that if new words are to be presented to learners, they should not be presented in groups that share a common headword or superordinate concept. For example, 'clothes' words such as jacket, shirt and sweater should not be presented to learners as a group because the learning load is increased. The learner not only has to learn the new words but, as the words are so similar (they share the same superordinate concept), will often confuse them and will additionally have to learn to keep the words apart, thus increasing the learning effort required. Instead, words should be presented in unrelated sets (words which do not have a common superordinate), such as frog, car and rain. Frog has the superordinate 'animal / amphibian', car has the different superordinate 'types of transport' and rain has 'types of weather'. This will be taken up again in the discussion section.

Tinkham (1993), in this journal, investigated interference effects for word learning in two studies. In both studies, subjects listened to lists of English words paired with imaginary words, which comprised a set of words. The English half of each word pair was presented in a mixed order and the subject had to remember and say the imaginary (L2) half of the pair within a set time. Their task was to try to learn the word pairs in as few trials as possible. The criterion for learning was met when a set of words had been learned (I shall return to the criterion for learning in the discussion section). The present study followed the same procedure as the Tinkham study, which is detailed later. The first of Tinkham's 2 studies found that 3 semantically related words, mixed with 3 semantically unrelated words in a list of six words, were learned more slowly than the unrelated words. The second study found that 6 unrelated words took fewer trials to learn than 6 related words.

The study demonstrates that presenting new words in semantically related sets interferes with learning and thus that it is better not to present words which are semantically linked at the same time. This finding goes against the generally accepted opinion that learning words in semantic sets benefits rather than interferes with learning. It was my intention therefore to check Tinkham's findings by replicating the study. The present study was undertaken for 3 reasons. Firstly, I wanted to check Tinkham's findings on Interference Theory in relation to list learning. Secondly, I wanted to see if the same effects occurred with Japanese subjects tested in Japanese, rather than in English as Tinkham had done. If the same effects were found for subjects with a different L1, then the results may demonstrate some generalizability to other languages. Finally, as there is a natural tendency for researchers not to fully disclose all the nitty gritty details and problems that occurred along the way with their studies, I wanted to replicate the study to learn about the replication process, as there are so few replications to learn from in our field, and to pass on some of my findings.


The present study involves two experiments which very closely followed Tinkham's procedure. The first experiment checks the learning of 3 related word-pairs (clothes) mixed with 3 unrelated word-pairs, making one test of 6 pairs. The second experiment consists of two tests, both of 6 word-pairs. One test involved 6 related word-pairs (fruit) and the other 6 unrelated word-pairs. In both experiments the intention was to find out which of the two sets was learned first, the related or the unrelated words. It is assumed that the more trials it takes to learn a given set of word-pairs, the more difficult that set is to learn.
Some problems following the exact format of the original study.
The selection of L1 words. To be faithful as a replication, the format of Tinkham's study was followed as closely as possible using the L1 words he selected for his study. The original words / concepts Tinkham used in Experiment 1 were kept in the present study except that they were translated into Japanese. Experiment 2 presented more of a problem as the words Tinkham had chosen in the 'fruit' category were not all prototypically equal (see Aitchison, 1994 chapter 5 for a fuller discussion in this area). That is, the fruits chosen for the original study are not typical examples of fruit found in Japan, where the present study was conducted, and some of the fruit words had to be changed to more typical examples so as to not favour some words as opposed to others. For example, many Japanese have never heard of nectarines. If the subjects had heard this word, they may not have known the Japanese word for nectarine, or because of frequency effects, may have recalled it slower than a more typical example of a fruit found in Japan such as apples, bananas and oranges. Therefore my intention was to find words which would be equally easy to access in Japanese and control for prototypicality effects. Several native speakers of Japanese were interviewed to determine the most typical examples of fruits one would find in Japan and Tinkham's list of fruit was revised in light of this. The list of words for both studies is outlined in the appendix. The words in bold type indicate those changed from the original study.

The selection of L2 words. Some of the artificial (L2) words in the original study were very close in spelling and pronunciation to existing Japanese words, and some in fact were so close that they could have been easier to remember than others. For example, 'kaisher' sounds like the Japanese word 'kaisha' meaning 'company'. Therefore, in the present study new words were made up following the guidelines in the original study concerning syllables, consonant clusters, stress and so on, in order to make the words more equal in their ease or difficulty of learning.
Experiment 1.
Intention. 20 subjects were required to learn six word-pairs in Experiment 1. The word-pairs were Japanese nouns matched with imaginary words. Three of the words shared a common superordinate concept of 'clothes' - jacket, shirt and sweater ('jyaketo', 'shaatsu' and 'seeta') and are labelled the 'related words'. The other three words did not share a common superordinate concept and are labelled the 'unrelated words' (frog, car and rain - 'kaeru', 'kuruma' and 'ame'). The intention of this experiment was to determine which of two sets of words were learned faster.
The subjects.
18 native speaking Japanese and 2 non-natives with advanced proficiency in Japanese were used as subjects in the experiment. The subjects were either of my acquaintance or were studying at educational institutions in Japan. All subjects were volunteers. Almost all have a university level education and range in age from 18 to the mid sixties. The diversity of the subjects was not a factor in the experiment as it was a within-subjects design.
The instrument. A trials-to-criterion test was administered to see which of the two sets was learned completely before the other: the related or the unrelated words. The L1 Japanese words to be tested were each assigned a corresponding artificial L2 word. For example, 'seeta' (sweater) was assigned 'blaikel'. Several criteria were used by Tinkham for the choice of these artificial words so as to maintain phonological variation. These guidelines were followed in this study.
- All have two syllables.
- Three words received initial stress and the others final stress.
- Three words end in a vowel.
- Two words begin with a vowel.
- One word contains a consonant cluster.
- One word contains a vowel diphthong.


Form. So that the results would not reflect some artificial L2 words being easier to learn than others, half the subjects received the word-pairs in the original pairing, shown in the appendix as Form A, and the other half received reversed word-pairs on Form B. That is, the artificial words paired with the related words in Form A were assigned to the unrelated words in Form B. Similarly, the artificial words paired with the unrelated words in Form A were assigned to the related words in Form B. The forms were alternated between subjects, with equal numbers receiving each form.
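The counterbalancing of artificial words across the two forms can be sketched as follows. This is only an illustration: the L1 words are those named in the text, but the artificial words are placeholders (A1 to A6) rather than the items in the appendix, and the function names are hypothetical.

```python
# Illustrative sketch only. A1..A6 are placeholder labels, not the
# artificial words actually used in the study.
RELATED = ["jyaketo", "shaatsu", "seeta"]   # jacket, shirt, sweater
UNRELATED = ["kaeru", "kuruma", "ame"]      # frog, car, rain


def make_forms(related, unrelated, artificial):
    """Return (form_a, form_b), each mapping an L1 word to an artificial word.

    Form A pairs the first half of `artificial` with the related words and
    the second half with the unrelated words. Form B swaps the halves, so
    every artificial word is used in both conditions across subjects.
    """
    n = len(related)
    assert len(unrelated) == n and len(artificial) == 2 * n
    first, second = artificial[:n], artificial[n:]
    l1_words = related + unrelated
    form_a = dict(zip(l1_words, first + second))
    form_b = dict(zip(l1_words, second + first))
    return form_a, form_b


def assign_form(subject_index):
    """Alternate forms between subjects so each form is used equally often."""
    return "A" if subject_index % 2 == 0 else "B"
```

With 20 subjects numbered from 0, `assign_form` gives Form A to ten subjects and Form B to the other ten.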

Word order. Random word presentation order was not assigned for each of the trials. This was done to offset the possibility that the subject would meet a word-pair twice in succession, which would give that word-pair a learning advantage over the others. Therefore, the order of the words was changed at every trial so that the word-pair at the end of one trial was not repeated at the start of the next trial. Similarly, the order of the pairs was changed from trial to trial so as not to create a serial effect. Therefore, a word-pair would sometimes be first on one trial and fourth on the next trial, and so on.
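For researchers who prefer to generate such orders automatically, the two constraints described above can be sketched as a constrained shuffle. This is a swapped-in technique (rejection sampling), not the way the orders in this study were produced, and the function names and the seed are hypothetical. It needs at least three items.

```python
import random


def next_order(items, previous=None, rng=random):
    """Return a presentation order for one trial.

    A shuffle is rejected if it (a) starts with the word-pair that ended
    the previous trial or (b) repeats the previous order exactly.
    """
    assert len(items) >= 3, "the constraints cannot be met with fewer items"
    order = list(items)
    while True:
        rng.shuffle(order)
        if previous is None:
            return order
        if order[0] != previous[-1] and order != list(previous):
            return order


def build_trials(items, n_trials, seed=0):
    """Build a sequence of trial orders that respects both constraints."""
    rng = random.Random(seed)
    trials, previous = [], None
    for _ in range(n_trials):
        previous = next_order(items, previous, rng)
        trials.append(previous)
    return trials
```

The sketch only enforces the two conditions stated in the text. Stricter controls for serial effects, such as forbidding a word-pair from keeping the same position on consecutive trials, would need an extra check.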

Data collection. The test was administered individually and orally and followed Tinkham's original. The learners first heard all the L1 - artificial L2 word-pairs before the trials commenced. During the trials the L1 word was said, then there was a gap of three seconds before the sound of a bell, and then the artificial L2 word corresponding to the other half of the word-pair was said. Thus the subject heard each word-pair and was given a chance to learn it at each trial. There was a 2 second pause between items within trials and a 5 second pause between trials. For example, the subjects heard 'ame' (the L1 word), then there was a gap of three seconds during which the subject was required to remember and say the L2 artificial word (in this case 'uchen'), then they heard the sound of the bell, and then the corresponding L2 artificial word 'uchen' was given. Then there was a gap of two seconds before the next pair, until all 6 word-pairs had been met. This set of six word-pairings constituted one trial. The sequence then commenced again with a different word-pairing order as outlined above.

The criterion for learning the word-pair was to correctly say the word corresponding to the L1 word before the bell. Exact pronunciation was not demanded as some of the subjects could not easily distinguish some sounds. For example, if a subject said 'ithpa' instead of 'ifpa' it was accepted if their pronunciation was consistently close to the original over 2 trials. The criterion for learning a set of words (either the related or unrelated words) was met when the subject was able to say all the words in a set before the bell in a single trial. When the criterion for one set, either related or unrelated had been met, the number of trials taken was silently recorded. The test continued until the second set of words had all been learned and that number of trials was recorded. The subjects were then instructed that the test had finished.
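The scoring rule for a set can be sketched in code. This is a minimal illustration under stated assumptions: each trial is represented as the set of L1 words answered correctly before the bell, and the function name and the sample responses are hypothetical, not data from the study.

```python
def trials_to_criterion(trial_results, word_sets):
    """Return the trial on which each set first met the criterion.

    trial_results: list of sets, one per trial (trial 1 first), each holding
        the L1 words answered correctly before the bell on that trial.
    word_sets: dict mapping a set label to the L1 words in that set.
    The value for a label is the 1-based number of the first trial on which
    every word of that set was correct, or None if that never happened.
    """
    reached = {label: None for label in word_sets}
    for number, correct in enumerate(trial_results, start=1):
        for label, words in word_sets.items():
            if reached[label] is None and set(words) <= set(correct):
                reached[label] = number
        if all(value is not None for value in reached.values()):
            break  # the test ends once both sets have met the criterion
    return reached


# Hypothetical responses from one subject over four trials.
sets = {
    "related": ["jyaketo", "shaatsu", "seeta"],
    "unrelated": ["kaeru", "kuruma", "ame"],
}
responses = [
    {"ame"},
    {"ame", "kaeru", "kuruma"},
    {"jyaketo", "seeta", "kaeru"},
    {"jyaketo", "shaatsu", "seeta"},
]
print(trials_to_criterion(responses, sets))
# {'related': 4, 'unrelated': 2}
```

Note that a set is scored as learned the first time it is wholly correct, even if some of its words are missed on a later trial, as happens to the unrelated set on trial 3 of the example.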

Before the test, the subjects were told that it was the words that were being tested and not the person. They were given an example and a practice test of three words so that they could become comfortable with the procedure. No discussion of the reason for the experiment was given until after the test was administered.

The means from experiment 1 are presented below in Table 1.

Table 1. Trials taken to learn a set completely.

           Semantic relatedness
Form       Related       Unrelated
Form A     11.8 (5.0)    6.4 (2.5)
Form B     10.3 (3.4)    7.9 (3.0)
Overall    11.3 (4.2)    7.2 (2.8)

* Standard deviations are in parentheses.

Data for the two variables (form and relatedness) were analyzed using a 2 (relatedness vs unrelatedness) x 2 (Form A vs Form B) multivariate analysis of variance (MANOVA) with relatedness as a within-subjects variable and form as a between-subjects variable. A significant main effect was found for relatedness, F (1, 18) = 18.9, p < .001. That is, unrelated words required fewer trials to learn than did related words. No effect was found for form, F (1, 18), p = .85, or for the interaction between form and relatedness, F (1, 18), p = 0.21. The results indicate that the subjects learned three word-pairs that were related more slowly than three word-pairs that were unrelated. Very roughly, it was half as difficult again to learn a related set compared to an unrelated set. The results will be discussed below.

Experiment 2.

Experiment 1 consisted of 2 sets of 3 word-pairs. Experiment 2 consisted of two sets of 6 word-pairs to be learned (see the appendix). One set of the words shared a common superordinate concept (fruit) and were labelled the 'related words'. The other six words did not share a common superordinate concept and were labelled the 'unrelated words'. The intention of this experiment was to determine which of the two sets of words were learned faster.

The subjects. The same subjects were used as in Experiment 1.

The instrument. The trials to criterion test outlined in Experiment 1 was administered to see which of the two sets were learned first. The same conditions for pronunciation, form and word ordering within trials were also met as in Experiment 1.

Controls. As in Experiment 1, word-pairs in one set were reversed in the other set, thus two forms were used. The word-pairings in Experiment 2 were also altered so they did not appear consecutively nor in the same order. To ensure there was no effect for order (that is the related word before the unrelated words) the two sets were alternated. The related test was administered first on half the occasions and was administered second on the other half.

Data collection. The same data collection procedures were used as in Experiment 1. After all the experiments were over, the subjects were interviewed about their learning strategies and asked for their comments on the experimental design.

The descriptive statistics are presented in Table 2. Data for the three variables (form, order and relatedness) were analyzed using a 2 (relatedness vs unrelatedness) x 2 (form A vs form B) x 2 (related words first vs related words second) multivariate analysis of variance (MANOVA) with relatedness as a within-subjects variable and form and order as between-subjects variables.

Table 2. Experiment 2 - Trials taken to learn a set completely.

                  Semantic relatedness
Order             Related       Unrelated
Related First     8.0 (3.0)     6.5 (2.6)
Related Second    11.2 (5.0)    6.5 (2.8)

* Standard deviations are in parentheses.

A significant main effect was found for relatedness, F (1, 16) = 9.3, p < 0.01. No significant effect was found for form, F (1, 16) = 0.88, p = 0.36, or for order, F (1, 16) = 1.8, p = 0.19. The results indicate that the related words took significantly more trials to learn than did the unrelated words. As in Experiment 1, it was nearly half as difficult again to learn the related set as compared to the unrelated set.
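The 'half as difficult again' comparisons can be checked from the means reported in Tables 1 and 2. The short calculation below is only a sketch. For Experiment 2 it assumes that the two order groups were of equal size, as the description of the controls suggests, so that the overall mean is the simple average of the two group means.

```python
def ratio(related_mean, unrelated_mean):
    """How many times more trials the related set took than the unrelated set."""
    return related_mean / unrelated_mean


# Experiment 1: overall means from Table 1.
exp1 = ratio(11.3, 7.2)

# Experiment 2: average the two order groups from Table 2
# (assumption: equal numbers of subjects in each order).
exp2_related = (8.0 + 11.2) / 2
exp2_unrelated = (6.5 + 6.5) / 2
exp2 = ratio(exp2_related, exp2_unrelated)

print(f"Experiment 1: {exp1:.2f}")  # Experiment 1: 1.57
print(f"Experiment 2: {exp2:.2f}")  # Experiment 2: 1.48
```

On these figures the related sets took roughly one and a half times as many trials as the unrelated sets in both experiments.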


In this section there will be a discussion of the findings from the experiments followed by a discussion of the learning styles employed by the learners. Following some comment on the limitations of the study, there will be comment on the experimental design.

Findings from the data.

The results of both experiments show that presenting new words that share a common superordinate in a set of words to learn does interfere with learning. This holds whether the words are learned together in a mixed list (Experiment 1) or given as two separate lists (Experiment 2). EFL teachers, and coursebook writers in particular, should take note of these results to ensure that learners do not meet new words that have been grouped semantically.

It should be noted that some learners took a considerable amount of time to learn the words, while others took less. There was considerable variation within individuals. In the first experiment, one subject learned all 3 unrelated words in one trial but took 6 trials to learn the related words. In the second experiment she took 23 trials to learn the related ones. There was also considerable variation between subjects, with some subjects taking approximately the same number of trials to finish, whereas others took double or triple the time to finish. Several subjects commented on how difficult it was to learn the related words as they felt that they were all jumbled up. It was not difficult for them to remember the artificial words, but it seemed difficult to remember which words formed a pair.

A common finding in Experiment 1 was the production of incorrect words taken from within the same set rather than from outside it. That is, if 'seeta' (sweater) was paired with 'kawvas', 'shaatsu' (shirt) with 'nalo' and 'jyaketo' (jacket) with 'uchen', the wrong word would come from the same group. For example, 'seeta' was given as 'uchen' rather than as a word from the unrelated set. This phenomenon occurred 25% of the time for the 'related' set, but only 5% of the time for the 'unrelated' set.

Learning style / strategies
After the experiments were over, each subject was interviewed about their learning style and their thoughts on the experimental design. Most of the subjects reported using a mnemonic device, visual imagery or a phonetic connection to try to remember the words. Some learners found the unrelated set of words easier to learn because many of the objects were in the room or outside the window of the room where the data were being collected. One learner said it was easy to learn these words as she only had to look at the object - shoe, television, sky, mountain, rain (it was raining at the time). Interestingly, no one commented on the ease of learning of the 'clothes' words despite the ready availability of sweaters, jackets and shirts. In further experiments of this nature we should be cognizant of having the objects that are being tested in the room or in the line of vision.

Some limitations.
There are several limitations to the generalizability of the effects found. Firstly, there seem to be some limits on interference itself. Researchers have found that the occurrence of interference depends on the type of stimulus material. When meaningful passages are used rather than lists of words or nonsense syllables, no interference effects are found (Haberlandt, 1994 p. 211). In the present study the words were learned in lists and this effect may not hold for words learned from, say unintentional learning such as from reading. This would need to be confirmed experimentally.

Secondly, very few words were tested. Additionally, the words were learned aurally and the effect may not hold for learning from written information. This can easily be tested by asking subjects to do the same experiments using a computer program which would allow the subject to see the words for a limited time. A variation on this could be to require the subjects to write the words rather than only say them.

Thirdly, the testing was on the productive use of the words - the subjects were given the L1 word and had to produce the artificial L2 word. Another experiment could test if the effect also occurred receptively - that is, hearing the L2 word first and having to say the L1 word. In this case it would probably take fewer trials as the target word is already known and therefore the number of word pairs may have to be increased to make the statistical analysis reliable (for example, 10 word-pairs instead of 6).

Fourthly, there are limits on the trials-to-criterion method whereby a condition was met when all the words in a semantic set had been produced correctly in one trial. In the first experiment it was often found that one set would be successfully provided (say the unrelated words) and as the learner was trying to learn the other set, some of the first set which had already been checked by the researcher as learned, were forgotten temporarily. This calls into question whether the words had in fact been learned as it seems proactive interference was taking place. It could have been that the learner was guessing when the set was successfully said. Sometimes the subjects were surprised when the session ended with all the words being checked as learned, as they had not felt they knew all the words properly and may have needed one or more trials to be sure. It would have been better to conform to the standard procedure used in the psychological literature of 2 wholly correct consecutive trials.
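The stricter criterion of two wholly correct consecutive trials can be sketched as follows. As before, this is an illustration only: each trial is represented as the set of L1 words answered correctly before the bell, and the function name and sample responses are hypothetical.

```python
def trials_to_strict_criterion(trial_results, words, consecutive=2):
    """Return the 1-based number of the trial that completes a run of
    `consecutive` wholly correct trials for `words`, or None if no such
    run occurs."""
    target, run = set(words), 0
    for number, correct in enumerate(trial_results, start=1):
        # Extend the run on a wholly correct trial, otherwise reset it.
        run = run + 1 if target <= set(correct) else 0
        if run == consecutive:
            return number
    return None


# Hypothetical responses: the set is correct on trial 1, partly forgotten
# on trial 2, then correct again on trials 3 and 4.
unrelated = ["kaeru", "kuruma", "ame"]
responses = [
    {"kaeru", "kuruma", "ame"},
    {"kaeru"},
    {"kaeru", "kuruma", "ame"},
    {"kaeru", "kuruma", "ame", "seeta"},
]
print(trials_to_strict_criterion(responses, unrelated, consecutive=1))  # 1
print(trials_to_strict_criterion(responses, unrelated, consecutive=2))  # 4
```

In the example, a single lucky trial would count as learning under the one-trial criterion, whereas the two-trial criterion is not met until trial 4.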

Fifthly, the artificial words that were used were strictly controlled. This means that one cannot generalize these findings to natural languages without qualifications being made. This is especially true because the words were chosen by Tinkham to counterbalance effects for word shape. However, this created a problem in that the artificial words are less homogenous, in terms of graphotactics and length distribution, than the Japanese ones. In future studies where artificial words are being used, researchers should strive to create words with a similar consonant - vowel structure to the L1 against which they are being tested, to avoid these problems. It would be possible, of course, to use a natural language of which the subjects had no knowledge. This in turn creates problems and a language would need to be found where the learning for each of the words would be similar, in terms of words with different stress patterns, the number of syllables, the prototypicality of these words in the L1 and so on. This would be a difficult task to undertake especially for related words, but would need doing to balance the words against each other for learnability effects. Nevertheless the attempt should be made to offset the artificiality of the words used in these experiments.

Sixthly, it seems that there is no clear definition of what semantic relatedness might mean. In this experiment words were chosen to show unrelatedness; however, this may not be as simple to do as it first seems. It would not be too difficult to come up with a plausible scenario linking 'frog', 'rain' and 'car', which were words used in Experiment 1. It is clear that some words fit neatly into 'closed' sets, such as days of the week, the months and so on. On the other hand, more 'open' sets such as 'kitchen utensils' or 'vegetables' can have rather looser borders, where it is disputed whether items should or should not be classified together. For example, does one classify an electric mixer, a knife sharpener, an egg slicer or a cutting board as 'kitchen utensils'? In future studies of this sort there is a need for clear definitions of terms before research is commenced.

Seventhly, it is not clear that this trials to criterion measure is so straightforward. The number of learning trials in Experiment 2 is fewer than for Experiment 1 despite the subjects having to learn the same number of words in both. It may be that task learning effects were affecting the data. That is, the subjects got better at doing this kind of task as the experiments progressed.

Lastly, it is not clear whether these same effects will hold for learners who already have part of the semantic set being tested. For example, if learners already knew 10 words from the 'clothes' semantic group and were being asked to learn some more words they would be adding to, rather than setting up, a new semantic set in the L2 (assuming the 'clothes' semantic network in the L1 was unavailable in the L2). That is, the learner does not already have a target language network set up to add the new words to. The effects found in this study therefore may be restricted to beginning learners rather than intermediate ones as the beginning learner has to set up semantic and vocabulary knowledge networks in the L2 into which the words must be put. An intermediate learner would probably already know many words from the semantic groups and when presented with new words may only need to add new words to an existing store, rather than create a new one from scratch. This question awaits a future study.

Comment on the experimental design.
The experimental design was somewhat stressful for the subjects as they were constantly under time pressure and their thinking was interrupted by a bell. Several of the subjects commented on this: some enjoyed the challenge, while others found that it interfered. The dislike of it by some subjects and the indifference to it by others may have been a factor in explaining individual variation in performance. A set of 6 words took the average subject 5-8 minutes to complete. When added to explanations and practice tests, this made an average total of over 40 minutes per subject of intense, stressful concentration. Larger word sets of, say, 10 or 12 words would have taken too long to administer in one experiment. Researchers wishing to use this procedure may be better advised to make it into two separate experiments held on different occasions. Alternatively, the learning could be done some other way, such as by word cards (flashcards), measuring the time taken to learn the sets completely.

The researcher was faced with several problems. The first was consistency of marking. The researcher had considerable trouble assessing whether a word had been 'correctly' produced. Secondly, there was very little time for the researcher to assess a word and score it correctly, as assessment was done concurrently. Thirdly, it was often difficult to determine whether a word had been correctly supplied before or after the bell as sometimes both occurred simultaneously. All these problems for the researcher left large margins for error.

Some subjects had mastered 5 words rather quickly, but took several more trials to get the sixth. One subject took 10 additional trials to get the sixth word. This may go some way in explaining the huge variations between individuals as it was not clear whether it was the word itself or interference that was restricting the learning of the sixth word. Was that last word so important? A more flexible design would have measured the learning of these 5 items through a more sensitive measure earlier than was done here.


Some words form closed semantic sets such as the numerals, days of the week, the months and so on. It may be impractical to ask our students to learn words from these sets one at a time as learners probably expect to learn them as a set. The more open semantic sets, by contrast, are often grouped together in coursebooks into such groups as the colours, patterns, foods, vegetables, words for emotions and so on which learners are expected to learn together. It is these semantically linked open sets that are a potential problem for teaching and learning. Given the results of these two studies, teachers should try to avoid presenting learners new words in semantically related sets.

One would naturally expect a lesson in a coursebook on shopping for clothes to have a list or pictures of clothes. Clearly we cannot prevent all semantically related words from appearing in coursebooks together. This is because a list of words in a particular unit for some learners may be a trigger for recognition or recall from previous learning, or for other learners it may constitute a list of new words to be learned. However given the above, it might be advisable to mix these words into a thematic rather than semantic arrangement instead. For example, sweater, changing room, try on, cash register, wool, navy blue, striped and so on may not show the same interference effects as scarf, tie, coat, pants and skirt. The teacher could present the words so that they were met in several different contexts over several lessons. The list of words in the coursebook could then be used as a revision list rather than as a starting point.

Alternatively, small numbers of words from several lists that appear in a coursebook could be list learned in a series of sessions leading up to the unit in question. To do this the teacher could make a master list of the words to be learned that appeared in the coursebook (some coursebooks already have such lists at the back of their book). The master list of words then could be broken into several lists with words from different units and semantic sets mixed together which would be learned as one list at a time for say, homework. These mixed lists would be programmed in such a way that the learner will have met all the words from a given unit before starting the unit. The words listed or grouped in the unit would therefore act as a revision or summary list facilitating recognition and recall.
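The splitting of a master list into mixed study lists can be sketched in code. This is a minimal illustration with hypothetical word sets and a hypothetical function name. It guarantees only that no study list contains two words from the same semantic set, and it does not attempt to schedule the lists so that every word of a unit has been met before that unit begins.

```python
from itertools import zip_longest


def mixed_lists(master):
    """Split a master list into study lists that mix semantic sets.

    master: dict mapping a semantic set (or unit) name to its list of words.
    List k of the result holds the k-th word of every set, so no study
    list contains two words from the same set.
    """
    columns = zip_longest(*master.values())
    return [[word for word in column if word is not None] for column in columns]


# Hypothetical master list drawn from three coursebook units.
master = {
    "clothes": ["scarf", "tie", "coat"],
    "fruit": ["apple", "banana", "orange"],
    "body": ["arm", "leg"],
}
for study_list in mixed_lists(master):
    print(study_list)
# ['scarf', 'apple', 'arm']
# ['tie', 'banana', 'leg']
# ['coat', 'orange']
```

A teacher would still need to order the resulting lists against the syllabus by hand, since the sketch ignores when each unit is taught.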


The findings from these two papers will surprise the many EFL teachers who believe that presenting words in semantic groups will benefit learners, as it will help them to build semantic networks and relationships. Certainly, semantic relationships, thematic relationships and word associations are important and are worth developing, but in doing this we should be careful not to present words to our learners in a way that creates an environment for interference effects, especially at the initial stages of learning languages. This point does not seem to have been acknowledged by many language teachers, course designers and coursebook writers, despite the clear results of 4 years ago. Indeed, the current crop of coursebooks, some of which were mentioned in the introduction and were published since the original article, still have a tendency to introduce words for the first time in semantic sets.

Despite the clear results found here, a word of caution must be sounded. The experimental design of these studies had its problems: it was tightly controlled to benefit the researcher, not the learner, and this dilutes the real-world application of the results. While there are benefits to doing tightly controlled studies, we should be aware that the more tightly controlled a study is, the greater the possibility that its results will not fully apply to the dynamic classroom.

In addition, we must be aware of the research tradition within which we work. Ochsner (1979) made the distinction between nomothetic and hermeneutic traditions of inquiry. The dominant experimental or nomothetic pre-paradigm that exists in SLA attempts to control for variables and to explain the classroom, learners and learning from the results of experiments. The intention of experimental or quantitative research is to look for a single reality or truth. The qualitative or hermeneutic research tradition seeks to understand the classroom in naturalistic, interpretive or qualitative terms, allowing for multiple realities. This is not a polar distinction, but a continuum. Tinkham's study fits squarely into the former tradition. However, because the variables were so tightly controlled in this experiment, its results are somewhat ungeneralizable to our classrooms. Therein lies its weakness.

It is clear from this replication that we can replicate studies and find similar results if the same procedures, instruments and designs are used, as was done here, especially if the studies are tightly controlled. However, what does this tell us? Does it tell us that, because the same results have been found, they are to be applied immediately to our classrooms? Or does it tell us to look further and deeper to find other evidence, from different perspectives, to support our conclusions? As Meara (1996, pp. 38-9) has pointed out, due to the lack of generally accepted guidelines for original research and replications, we are in danger of fragmenting our efforts so that we cannot collect them into a coherent whole. Meara exemplifies the nomothetic view by saying that we need a 'challenging combination of real-world constraints and rich theory'. I agree, but I would go further and add that we need not only acceptable guidelines and agreed-on standards of measurement within the nomothetic tradition, but also guidelines for qualitative studies looking at the same aspects of language from a wider interpretive view within the hermeneutic tradition. More importantly, what we need is balance. The dominant nomothetic tradition is the SLA default and the hermeneutic tradition is seen very much as the poor relation. Experimental data without its interpretive complement does not make a coherent view, and without a coherent view we cannot move forward as Meara has suggested (see Ochsner, 1979 and Markee, 1994 for further discussions in this area). If we can come up with explanations of what happens in SLA and come to a general consensus, we might be able to form a workable paradigm to work within. Then the coursebook writer might listen to the researcher. At the moment our work is often fragmented and its parts do not complement each other; therefore the coursebook writer does not know who to listen to.

Nevertheless, one of the cardinal principles of research is replicability, as it is one of the prime means of establishing credibility for the work we do, whether qualitative or quantitative. Research is made replicable by keeping records of exactly what was done, and why, at each phase. If another researcher follows this recipe then he or she ought to come up with the same or similar results. In so many studies, replication is impossible due to poor reporting. Tinkham's study was easy to replicate as it detailed the procedure and all facets of the study were there for me to follow. It is not hard to find studies that are irreplicable due to a lack of procedural information and a lack of good solid reporting. We cannot go forward along the nomothetic half of the path towards a general theory of SLA unless we establish credibility for our work within this tradition by reporting and detailing our studies so that others may follow in our footsteps if need be. The same goes for the hermeneutic tradition. We must provide more than a toehold for others to follow and lead the way for those new to the field and those who are learning to do research by replicating the work of others.

I am not calling for all research to be replicated in order to generate credibility for the work we do. As Markee (1994, pp. 97-98) remarks, we do not need to open a statistically significant number of hearts to discover that they all pump blood; just one will do. I am calling for detailed reporting of all studies and especially those studies which could benefit from replication. Tinkham's study was a prime example of a study that needed to be replicated. His results challenged the generally accepted view that introducing words in semantic sets benefitted the learner. The study needed to be replicated to check that it was not an aberration. It was not, its limitations notwithstanding.

It is not uncommon in any field of study for there to be conflicting results from experimental work and even from replications of original research. But why is this so? Does this arise from attempts to interpret the results from studies that were not similar in the first place, or were irreplicable anyway because of poor reporting? Or is this because we confuse the tradition to which they belong and interpret them with the wrong glasses on?

The author would like to thank Paul Meara and the University of Wales, Swansea Vocabulary Research Group, Paul Nation and an anonymous reviewer for their comments on earlier drafts, and Thomas Tinkham for assisting with the initial setting up of the research.


AITCHISON, J. (1994) Words in the mind: An introduction to the mental lexicon. Oxford: Blackwell.
BADDELEY, A. (1990) Human Memory. Hillsdale, N.J.: Lawrence Erlbaum.
BEAVEN, B. (1995) Headstart Beginner. Oxford: Oxford University Press.
BLACK, V., McNORTON, M., MALDEREZ, A. and PARKER, S. (1986) Fast Forward 1. Oxford: Oxford University Press.
HABERLANDT, K. (1994) Cognitive Psychology. Needham Heights, Mass.: Allyn & Bacon.
HIGA, M. (1963) Interference effects of intralist word relationships in verbal learning. Journal of Verbal Learning and Verbal Behavior 2, 170-175.
HIGA, M. (1965) The psycholinguistic concept of "difficulty" and the teaching of foreign vocabulary. Language Learning 15, 167-179.
MARKEE, N. (1994) Toward an ethnomethodological respecification of second-language acquisition studies. In Tarone, E., Gass, S. and Cohen, A. Research Methodology in Second Language Acquisition. Hillsdale, N.J.: Lawrence Erlbaum.
MEARA, P. (1996) The classical research in L2 vocabulary acquisition. In Anderman, G. and Rogers, M. Words words words. Clevedon: Multilingual Matters.
McGEOCH, A. and McDONALD, W. T. (1931) Meaningful relation and retroactive inhibition. American Journal of Psychology 43, 579-88.
OCHSNER, R. (1979) A poetics of second language acquisition. Language Learning 29, 53-80.
SOARS, L. and SOARS, J. (1993) Headway Elementary. Oxford: Oxford University Press.
SWAN, M. and WALTER, C. (1990) The New Cambridge English Course. Cambridge: Cambridge University Press.
TINKHAM, T. (1993) The effect of semantic clustering on the learning of second language vocabulary. System 21 (3), 371-380.


Below are the words tested, in the original word order. Words in bold type indicate a change from the original study. The reasons for these changes were outlined above. The underline indicates stress.

Experiment 1

Original study          Present study
L1          L2          L1                    L2
shirt       moshee      shatsu (shirt)        kilme
jacket      umau        jyaketo (jacket)      ifpa
sweater     blaikel     seeta (sweater)       blaikel
rain        achen       ame (rain)            uchen
car         nalo        kuruma (car)          nalo
frog        kawvas      kaeru (frog)          kawvas

Experiment 2

Unrelated words
Original study          Present study
L1          L2          L1                    L2
mountain    awnai       yama (mountain)       ejaut
shoe        tosel       kutsu (shoe)          tostrel
flower      manzeek     hana (flower)         padeen
mouse       kunop       nezumi (mouse)        kunop
sky         efoo        sora (sky)            efoo
television  chengee     terebi (television)   chengee

Related words
Original study          Present study
L1          L2          L1                    L2
pear        okess       meron (melon)         ijos
apple       nuga        ringo (apple)         denga
apricot     beloot      ichigo (strawberry)   esmek
plum        kaisher     budoo (grape)         pairnya
peach       eckly       momo (peach)          uldon
nectarine   depai       mikan (orange)        nakew

Contact Info:
Rob Waring
Notre Dame Seishin University, 2-16-9 Ifuku-cho, Okayama, Japan 700
Tel 086 252 1155 Fax 255 7663 Home 086 223 0341
