Chapter 2
Literature review
2.1 What is observation?
2.1.1. Observation in scientific research
Observation is repeatedly referred to in the literature as a method of data collection and as a process involving representations and recordings in which reality is depicted. Techniques of observation are not themselves new: they have long been used in scientific research to study the behaviour of people and animals. Anthropologists, sociologists and psychologists were concerned primarily with describing ‘observable behaviours and activities’ (Seliger and Shohamy 1989:118), with the ‘systematic recording in objective terms of behaviour in the process of occurring’ (Jersild and Meigs 1939), and with describing these in their entirety from beginning to end.
One could treat observation as a familiar and natural phenomenon that does not need any definition. Hutt and Hutt (1970) give no definition of observation in their book ‘Direct Observation and Measurement of Behaviour’. A general definition of observation is given by Wright (1960:71): ‘research methods… rest upon direct observation as a scientific practice that includes observing and recording and analysis of naturally occurring events and things’. According to Wright (1960:71), observation is direct because no arrangements stand between the observer and the observed, and the records are usually compiled immediately after the observation. In a review article, Weick (1968:360) defines an observational method more elaborately as ‘the selection, provocation, recording and encoding of that set of behaviours and settings concerning organisms “in situ” which is consistent with empirical aims’.
In summary, I define the characteristic features of observation as a scientific method as follows: a limited amount of information should be collected; the data should be recorded systematically and analysed over a period of time; the data should be congruent with the aims; the observation session must be planned; and, finally, the observation and analysis must be objective.
2.1.2. Approaches to observation in language classroom studies
Observation in the language classroom is treated either as a research procedure for in-service professional development or as a learning tool for pre-service teachers. Hargreaves (1980:212) suggests that the 1970s were a ‘notable decade’ for classroom studies thanks to the number of projects and the wide range of methodological approaches, and he identifies ‘three great traditions’ of studying classrooms: systematic observation, ethnographic observation and sociolinguistic studies. Sociolinguistics studies the connections between language and society. These aspects are not of prime interest for pre-service classroom observation, which is why I do not dwell upon this approach in this paper.
Hammersley (1986:47) observes that systematic observation and ethnography are treated as ‘self-contained and mutually exclusive paradigms’. The description of both approaches that follows supports this idea. Croll (1986:5) identifies some fundamental aspects of systematic observation: explicit purposes are worked out before data collection; explicit and rigorous categories and criteria are used for classifying phenomena; data are presented in quantitative form so that they can be analysed with statistical techniques; and any observer should record a particular event in an identical fashion to any other. The ethnographic approach involves a complete cycle of events that occur within the interaction between the society and its environment. Lutz (1986:108) defines ethnography as ‘a holistic, thick description of the interactive process involving the discovery of important and recurring variables in the society as they relate to one another, under specific conditions, and as they affect or produce certain results and outcomes in the society’. Thus systematic observation is described as a highly eclectic study of an event using pre-specified categories, with detailed analysis presented in a quantitative manner, whereas ethnography describes and interprets events holistically in their naturally occurring contexts. More detailed characteristics of the systematic and ethnographic approaches are provided in Chapter 2.3.
2.2. Observation as a problem
2.2.1. Classifications of errors in the process of observation
There is always the possibility of error in the observation process. Fassnacht (1982:43) reviews Campbell’s (1958) classifications of errors in representing data in psychological and social studies. Some of these errors frequently occur when making judgements and primarily concern language behaviour:
a) error of central tendency
b) error of leniency or generosity
c) primacy or recency effect
d) halo effect
e) logical error
The first error occurs when using a rating scale. Hollingworth (1910) called the effect ‘central tendency’: in a series of judgements about objectively quantifiable stimuli, the large stimuli are underestimated and the small ones overestimated.
The second error, that of leniency or generosity, can arise when favourable verbal judgements are made using personality scales. Fassnacht (1982:40) clarifies that in personality scales a number of questions relating to one particular personality trait are drawn together, and the answers to these questions are given in the form of ‘yes’, ‘no’, ‘sometimes’ or ‘often’, which might not reflect objective reality.
The third error arises from the order in which perceptual events happen. In behaviour testing the first impression can distort later data collection and thus lead to errors. Bailey (1990:218) notes that in diary keeping, events that are embarrassing or painful when they occur ‘often lose their sting after weeks of reflection’.
The fourth error, the halo effect, is described by Mandl (1971) as occurring when the evaluator ‘has the tendency when judging a personality trait to be influenced by a general impression or a salient characteristic’.
The fifth error, logical error or error of theory, arises from the theoretical assumptions of the observer. It is now widely accepted that observation is always ‘theory-laden’ (Phillips 1993:62). Phillips continues that observations cannot be ‘pure’, free from the influence of background theories, hypotheses, or personal hopes and desires. Ratcliffe (1983:148) supports this assumption, stating that ‘most research methodologists are now aware that all data are theory-, method-, and measurement-dependent’. Bailey (1990:226) suggests that in conducting ‘pure research’ it is better to avoid reading the research literature in the field, to keep from biasing the results.
2.2.2. The problem of ‘observable’ items
The term ‘observable’ in the definition given by Seliger and Shohamy (1989:118), mentioned above, raises the problem of which items are to be treated as observable in a classroom setting. Smith and Geoffrey (1968) make valid assertions in criticising systematic observation systems:
The way the teacher poses his problems, the kind of goals and sub-goals he is trying to reach, the alternatives he weighs … are aspects of teaching which are frequently lost to the behaviourally oriented empiricist who focuses on what the teacher does to the exclusion of how he thinks about teaching. (Smith and Geoffrey 1968:96)
McIntyre and Macleod (1986:14) generalise the problem of observable items and the limitations of data obtained through systematic observation, claiming that there is ‘no direct evidence on the actions of participants which are not overt’. A detailed criticism of systematic observation is given in Chapter 2.6.2.
2.2.3. Data recording problems
The problem of accurate recording
Data collection and description procedures face problems of accuracy and explicitness of records. ‘The crucial problem is to be able to render interpretable the process of events and behaviour as it occurs naturally’ (McKernan 1996:60).
Hutt and Hutt (1970:34) emphasise the difficulty of describing behaviour accurately. They point to the problem of vocabulary choice: there are many thousands of words which describe motor and language behaviour but ‘unfortunately, the words are injunctive concepts, learned by usage rather than by definition’ (Hutt and Hutt 1970:34). In addition, some definitions are over-encompassing in that they cover patterns of behaviour for which ordinary language has two or more terms. Lofland and Lofland (1995:93) recommend employing behaviouristic and concrete vocabulary rather than abstract adjectives and adverbs, which are based on paraphrase and general recall.
The problem of objective recording
Another problem with written commentary is that of objectivity. Researchers widely agree that such data are often subjective, inferential and interpretative, and reflect personal impressions. Events may not be viewed in the same way by different observers. ‘It is common to find that witnesses to an accident give differing accounts of what happened’ (Lofland and Lofland 1995:127).
Eisner (1993:49) defines objectivity as being ‘fair, open to all sides of the argument’. He considers that to reduce subjectivity the observer must achieve correspondence not only in what s/he perceives or understands but also in how s/he represents it. Schaffer (1982:75) continues the discussion of vocabulary choice, saying that there are some aspects of reality which can be described fairly objectively and others which can only be described subjectively, and that ‘it is difficult to know where the borderline between objectivity and subjectivity lies’. Scheurich (1997:161) doubts ‘the very existence of gross material reality’. He claims that research mainly addresses the interpretation of meaning or constructions of ‘reality’.
To sum up the problems with data recording, an observer may describe and interpret an event subjectively because of personal bias or theoretical assumptions, and may have difficulty both in choosing an object or behaviour to observe and in finding words to record an event accurately and explicitly.
2.2.4. The choice of an approach to observation
An observer faces a dilemma in choosing between the systematic and ethnographic approaches. The main problem of the ethnographic approach lies in its very nature: it is so broad that it demands a highly trained observer to carry out a competent and reliable observation. ‘An untrained observer may be overwhelmed by the complexity of what goes on and not be able to focus on important events in the classroom’ (Day 1990:44). Pre-specified coding systems in systematic observation, by contrast, are exclusively concerned with ‘what can be categorized or measured’ (Simon and Boyer 1974). Thus they may distort or ignore the qualitative features which they claim to investigate. At the same time, limiting the attention of the observer can help to improve reliability.
2.3. Reliability and Validity
2.3.1. Types of reliability
Reliability and validity are the most important criteria for assuring the quality of data collection procedures. The criterion of reliability provides information on whether the data collection procedure is ‘consistent and accurate’ (Seliger and Shohamy 1989:185). Researchers suspect that observers may unintentionally impose their own biases and impressions on the observed situation. Seliger and Shohamy (1989:185) claim that different types of reliability are relevant to different types of data collection procedure. For the ethnographic approach they identify the following types:
a) inter-rater reliability (to examine the extent to which different observers agree on the data collected from the observation);
b) test-retest reliability (to check the stability of data collection over time);
c) regrounding (to repeat the data collection and compare both results);
d) parallel form (to examine the extent to which two versions of the same data collection procedure are really collecting the same data).
To assure reliability, methodologists suggest involving at least two observers, either to carry out a ‘sequential analysis’ (Becker 1970:79) or to achieve ‘inter-observer agreement’ (Croll 1986:150). The idea of the former procedure is to carry out the analysis concurrently with data collection, in the sense that ‘one may “step back” from the data, so as to reflect on their possible meaning’ (Fielding 2001:158). Subsequent data gathering will then direct the observer either to abandon or to pursue the original hypothesis. In the latter procedure two observers look at the same events from different locations, categorise these events and compare the outcomes. Using systematic schemes with pre-specified categories, they refine, or ‘index’ (Fielding 2001:159), the definitions and categories of observation by ‘applying in a consistent manner the procedures for data selection, collection, grouping, inclusion, exclusion etc.’ (Simpson and Tuson 1995:65).
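The comparison of two observers' outcomes can be illustrated with a short calculation. The following is a minimal sketch, not a procedure taken from the sources cited above: the category labels, the sample codings and the function names are hypothetical. It assumes that both observers have coded the same sequence of classroom events with the same pre-specified categories. Besides simple percentage agreement, it computes Cohen's kappa, which is not named in the sources above but is one commonly used measure that corrects agreement for chance.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of events that both observers placed in the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both observers must code the same non-empty event sequence.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: probability that both observers pick the same
    # category if each coded independently at their own category rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of ten classroom events by two observers.
observer_1 = ["question", "question", "explanation", "feedback", "question",
              "explanation", "feedback", "explanation", "question", "feedback"]
observer_2 = ["question", "question", "explanation", "feedback", "explanation",
              "explanation", "feedback", "explanation", "question", "question"]

agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(f"Percentage agreement: {agreement:.0%}")
print(f"Cohen's kappa: {kappa:.2f}")
```

In this invented example the observers disagree on two of the ten events. The kappa value is lower than the raw agreement because some matches would be expected even if the two observers coded at random.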
2.3.2. Types and evidences of validity
Just as there are different types of reliability, Seliger and Shohamy (1989:102) suggest that there are different types of validity, each of which provides ‘evidence’ for validity. Their typology of ‘evidences’ of validity comprises:
a) evidence on content validity which demonstrates appropriateness of data collection against the content to be measured;
b) criterion validity which provides an indication as to whether the instrument can be measured against some other criterion and compared with the previous results (concurrent validity), and whether the procedure is capable of foretelling certain behaviour (predictive validity);
c) construct validity which examines whether the data collection procedure is a good representation of and consistent with current theories underlying the variable being measured.
Chaudron (1988:24) gives content validity another name, ‘treatment validity’, which relates to the process component of a process-product study and demonstrates that the treatment was in fact implemented and that it was identifiably different from whatever it was being compared with.
For the results of second language research, Seliger and Shohamy (1989:104) distinguish internal and external validity. They propose that a study has internal validity if the outcomes of the observational data can be directly and unambiguously attributed to the treatment that is applied to the observed group, and if the interpretation of these data is not dependent on the subjective judgement of an individual researcher. Internal validity in this sense relates to three areas: ‘representativeness, retrievability, and confirmability of the data’ (Seliger and Shohamy 1989:104). External validity concerns the extent to which the findings of a study can be generalized and applied to another situation; in this respect studies are categorised as basic, applied, or practical.
To obtain evidence of validity, the items or questions of an instrument must be analysed in the process of data collection. A researcher or observer should obtain information on whether the items are ‘low-inference’ or ‘high-inference’ (Long 1980), whether they are too easy or too difficult, and whether they are clearly phrased and easily understood by the respondents. It is recommended that all these aspects be examined in the pilot phase of the research, where they can be supported by evidence from a variety of sources, such as additional questionnaire data from pupils or teachers, interviews and surveys. Another way of examining the validity of observation is to ask colleagues to study the categories and to define the purpose of the observation. Simpson and Tuson (1995:65) treat this method as a useful check on face validity. Thus, to achieve reliable and valid observation, an evaluator should take into account the spatial location of the observer, engage more than one observer, use ‘low-inference’ categories that do not require complex interpretation, and check agreement of key aspects against independent studies.
2.4. Items of observation
2.4.1. The importance of items
Language classroom observation ‘does not simply mean watching classes’ (Wallace 1991:123). An observer may record either very narrowly defined data, such as a specific speech act, or more general kinds of language learning activity, such as turn-taking or group work.
Any scientific research or observation is characterised by such terms as ‘structured’, ‘organised’, ‘methodical’, and ‘systematic’. To meet these characteristics, any data collection is given a structure or format and is guided by some questions or variables. Croll (1986:55) defines a variable as a basic unit that represents the process by which a concept of interest is turned into a set of working definitions, whereby the results of observation or some other data collecting process can be categorized and measured.