Technical Report 322, 1988, Basser Department of Computer Science, University of Sydney
Dance by its nature is about people, emotions and aesthetics. Any data about dance is, in the first instance, bound to be subjective.
A variety of techniques have been developed in science for assessing information in ways that enhance the objectivity of the results. These can be applied to data from dance. They include the design of sampling methods, the design of data collection methods, statistical analysis, and pilot studies. By improving the reliability and accuracy of the results, these techniques improve their objectivity.
The data for study must be obtained from some group or population of subjects. Depending on the study, the population may be everyone in the world, or the students in some particular class, or any appropriate group of people. If that group is small enough, all of the subjects can be studied; but sometimes the group is too large, or the time available too short, and then a subset of the population must be selected for study (Barr, 1953, 158).
A difficulty then arises: choosing subjects who are representative of the population. Indirect relationships between the procedure used to select the subjects and the information being sought are easily overlooked.
To avoid this, two techniques may be useful.
One is to select the subjects from a complete list of the population by a random procedure. This can be done using tables of random numbers (Abramowitz, 1965, 991). Alternatively, a pseudo-random number generator can be used (Pollard, 1979, 236).
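As a minimal sketch of random selection with a pseudo-random number generator, the following uses Python's standard `random` module. The function name `random_sample`, the class list of 30 numbered students, the sample size and the seed are all assumptions made for illustration; none of them comes from the cited sources.

```python
import random

def random_sample(population, m, seed=None):
    """Choose m distinct subjects from the complete list at random."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return rng.sample(population, m)

# Hypothetical complete list: 30 students identified by number.
class_list = list(range(1, 31))
chosen = random_sample(class_list, 6, seed=1988)
print(sorted(chosen))
```

Recording the seed alongside the results allows the same selection to be reproduced and checked later.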
The other technique is systematic selection. If the size of the population is 'n', and 'm' subjects are required for study, then every 'p'th member of the complete list is chosen, where p = n/m (Barr, 1953, 163).
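Systematic selection can be sketched in the same way. The function name `systematic_sample` and the example list are hypothetical. Two details are assumptions not stated in the source: when n is not a multiple of m the interval p is taken as the integer part of n/m, and the starting position within the first interval is left to the caller.

```python
def systematic_sample(population, m, start=0):
    """Choose every p-th member of the complete list, where p = n // m."""
    n = len(population)
    if not 0 < m <= n:
        raise ValueError("m must be between 1 and the population size")
    p = n // m  # assumption: integer part of n/m when n is not a multiple of m
    if not 0 <= start < p:
        raise ValueError("start must lie within the first interval")
    return population[start::p][:m]

# Hypothetical complete list: 30 students identified by number.
class_list = list(range(1, 31))
print(systematic_sample(class_list, 6))  # prints [1, 6, 11, 16, 21, 26]
```

If the list itself has a periodic structure, for example if it alternates between two groups, the interval p can interact with that structure, which is one of the indirect relationships that are easily overlooked.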
A further problem arises when some chosen subject cannot be reached. Leaving incomplete coverage or choosing a replacement are both particularly likely to lead to bias (Barr, 1953, 163).
Four basic methods are available for obtaining data:
(1) The learned and critical literature can provide valuable data for research on dance; indeed for studies extending back beyond living memory, there is little alternative. Any study attempting to provide a baseline of more than a few decades will need to use the literature.
(2) Questionnaires can be useful if the subjects can read and write (Burroughs, 1975, 106). The preparation of questionnaires has many traps for the unwary. One problem is simply that of persuading the subjects to answer and return them. The likelihood of this can be improved by making the questionnaire short, by personal contact with the subjects, by using a suitable covering letter, by making the questionnaire look attractive, and by enclosing a return paid envelope (Burroughs, 1975, 106).
The questions themselves need careful wording. It is important to avoid complexity, ambiguity, bias, and causing offence (Barr, 1953, 67).
(3) Interviews have similar problems to questionnaires: the wording of questions must again avoid complexity, ambiguity, bias and causing offence. The personal contact in an interview can encourage subjects to be more forthcoming than when using a questionnaire, but interviews are more time-consuming. The interviewer can also clarify questions if they are not understood, but care must be taken not to invalidate comparison of answers between the different subjects (Burroughs, 1971, 105).
(4) Observation can derive data from either controlled or natural situations. Experiments in controlled situations may appear to give the more objective data, as they most closely approach the techniques of those most objective of sciences: physics and chemistry. However, in dance, where both the subjects and the observer are people, complex interactions can occur, which can distort the results. Even if the observer is obscured by, for example, a one-way mirror, the results can be distorted by the nervousness or other emotional reactions of the subjects.
The effect on the subjects of knowing that they are being observed can be reduced by using natural situations for the observations. This, however, raises the ethical question of whether it is right to observe people without informing them that their privacy is being invaded (Burroughs, 1971, 99).
Four different types of data are obtained by any of the above methods of data collection:-
Data of type (d) must be converted into one of the other types for analysis. This is done by summarising it into categories, which must be exhaustive, mutually exclusive, and independent (Burroughs, 1975, 44).
Data of types (a) and (b) may often be treated the same way, in which case it may be described as ordered.
Measured data is typically affected by a number of parameters, only a few of which are themselves available for measurement. The effects of the unmeasured parameters are conveniently assumed to be random and are called noise. Statistical analysis consists of trying to discern relationships between measured parameters despite the noise. The four basic methods used for statistical analysis are:
(i) Significance tests are useful when an unranked but accurate parameter 'q' affects the population. The dependence of the population on 'q' can be judged by a chi-squared test (Burroughs, 1975, 266). The dependence on 'q' of another parameter 'p' that is ranked but noisy can be judged using the means and standard errors of 'p' for each value of 'q'. These can be compared by 't' and 'F' tests, which give the probability of obtaining differences at least as large as those observed if the null hypothesis is true, i.e., if 'q' is irrelevant (Burroughs, 1975, 167).
(ii) Regression coefficients are useful when one or more ordered but noisy parameters (dependent variables) are to be correlated with one or more ordered but accurate parameters (independent variables) (Barr, 1953, 257). The regression does not have to be linear in terms of the independent variable(s), but does have to be linear in the coefficients. The procedure depends on fitting a mathematical form to the data which minimises the sum of square deviations of the noisy values from the curve.
(iii) Correlation coefficients are useful when there are two parameters which are both ordered and noisy affecting the population (Barr, 1953, 283). The coefficient is a measure of the angle between regression lines obtained by assuming each parameter in turn is noise free. A coefficient near 1.0 or -1.0 between two parameters typically means that they are either a cause and effect pair, or that both have a common cause. The decision about which is cause and which is effect requires the elucidation of the mechanisms of dependence, which is quite a different study. With noisy data, this can be impossible to determine objectively.
(iv) Analysis of variance is a generalisation of the correlation coefficient to the case where more than two ordered noisy parameters are measured (Burroughs, 1975, 168). It studies how combinations of parameters interact with the population.
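The first three methods can be illustrated with a short sketch that uses only the Python standard library. It is not taken from the cited texts: the function names and all of the data are invented for illustration, and the chi-squared function returns only the statistic and its degrees of freedom, which would then be compared with tabulated critical values.

```python
import math

def chi_squared(table):
    """Chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

def least_squares_line(us, ys):
    """Coefficients (a, b) minimising the sum of squared deviations
    of the noisy values ys from the line a + b*u."""
    n = len(us)
    mean_u = sum(us) / n
    mean_y = sum(ys) / n
    s_uu = sum((u - mean_u) ** 2 for u in us)
    s_uy = sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys))
    b = s_uy / s_uu
    return mean_y - b * mean_u, b

def correlation(xs, ys):
    """Product-moment correlation coefficient of two noisy parameters."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xy / math.sqrt(s_xx * s_yy)

# (i) Invented counts: rows are two classes (the unranked parameter),
# columns are the numbers of students preferring each of two dance styles.
statistic, dof = chi_squared([[30, 10], [20, 40]])

# (ii) Invented noisy measurements ys at accurate values xs. Fitting
# y = a + b*x**2 is non-linear in x but linear in the coefficients a and b.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.9, 10.2, 16.8, 26.0]
a, b = least_squares_line([x ** 2 for x in xs], ys)

# (iii) Invented paired scores, both treated as noisy.
first_scores = [1, 2, 3, 4, 5]
second_scores = [2, 1, 4, 3, 5]
r = correlation(first_scores, second_scores)
_, slope_second_on_first = least_squares_line(first_scores, second_scores)
_, slope_first_on_second = least_squares_line(second_scores, first_scores)
# The product of the two regression slopes equals r squared, so the two
# regression lines coincide only when r is +1 or -1.
slope_product = slope_second_on_first * slope_first_on_second

print(statistic, dof, a, b, r, slope_product)
```

For the invented table the statistic is about 16.7 with one degree of freedom, which exceeds the tabulated 5% critical value of 3.84, so with these counts the hypothesis that class is irrelevant to preference would be rejected at that level.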
When some issue needs to be investigated, and the methods of sampling, data collection and analysis have all been tentatively decided upon, it is still desirable to perform a pilot study. This can reveal unexpected problems, and allow redesign of the investigation to overcome them. For example, in one study of how students rate teachers versus how fast students advanced, a correlation coefficient of 0.05 was obtained (Barr, 1953, 321). This was unexpected.
Another advantage of doing a pilot study is that by appropriate scaling, the amount of effort required for the full study can be more accurately assessed. A comparison between available effort and this assessed effort may indicate a need for the re-design of the study.
Abramowitz, M., and Stegun, I.A.,
Handbook of Mathematical Functions, Dover, New York, 1965.
Barr, A.S., Davis, R.A., and Johnson, P.O.,
Educational Research and Appraisal, Lippincott, Chicago, 1953.
Burroughs, G.E.R.,
Design and Analysis in Educational Research, Educational Review, Birmingham, 2nd Edn, 1975.
Pollard, J. H.,
A Handbook of Numerical and Statistical Technique, Cambridge University Press, 1979.
(updated 14 October 2003)