I've just read Ben Goldacre's paper on RCTs in education. An interesting response by Rebecca Allen seems broadly in favour of it, but I'm interested in a couple of points raised, which I'll bring up along with my own reaction here.
In her blog, Rebecca asks why it should be necessary for someone like Ben Goldacre to say any of this at all. Why isn't this obviously the way forward? That question takes me back to a course I did on educational research through the Open University, where Goldacre's position would be broadly characterised as positivist: moving away from beliefs based in tradition or superstition, or held simply because of the charisma of the speaker, and towards positions based on evidence gained through scientific procedures, principally RCTs, since they provide objectivity and replicable procedures.
This position was said to have been in and out of favour over time, and was not presented to me as being as solid as Goldacre suggests. However, despite going over the reading again and again, I could never quite work out why not. What is the argument against RCTs?
Interpretivism was presented as the opposing viewpoint to positivism. It argues that people cannot be studied in the same way as natural phenomena (e.g. the effectiveness of drugs on the body) because whatever we observe is at best indicative of what we are interested in measuring. As a mathematics teacher, I am trying to teach children to understand mathematics, not how to pass exams, yet it's the exams we use to measure their achievement. So interpretivism says that we need to do more than just observe these objective kinds of measures; we need to make sense of how people understand and interpret their world, so that we can get closer to seeing what is really going on. To aid the understanding of these interpretations, interpretivists argue we should have more rich, qualitative research instead of simple quantitative measures.
A simple example of this is as follows: young children were given a test which required them to identify "the animal which can fly" from an elephant, a bird, and a dog. Lots of children ticked 'elephant'. This would suggest that they hadn't achieved whatever outcome we were hoping for, but in fact, talking to the children afterwards, it was found that they were referring to Dumbo.
I'm not sure whether this strikes me as an argument against RCTs altogether, or just a requirement that we design tests properly. I don't think an RCT advocate would say that we couldn't check that children's understanding of a test coincided with our own before completely trusting the results.
It also seems tough to draw firm conclusions just from qualitative data. One interpretivist strategy would be to interview subjects with as few fixed boundaries as possible, so that the researcher is not imposing their reality on the subject. But this seems self-defeating: we are actually introducing more layers of possible misinterpretation, since the researcher has their own perception of things, as well as the subject. We would presumably be looking for common themes in responses, but Goldacre makes a point in Bad Science that if you do an experiment, generate the data, and then look through it for whatever trends you can observe, you'll always find something of apparent statistical significance.
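Goldacre's point about post-hoc trend-hunting is easy to demonstrate with a quick simulation (a hypothetical sketch of my own in Python, not anything from his book): if a study measures twenty unrelated outcomes where every true effect is zero, and then scans the results for anything significant at the 5% level, most studies will still "find" something.

```python
import math
import random

random.seed(1)

def t_stat(a, b):
    """Two-sample t statistic (equal-variance form)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / (sp * math.sqrt(1 / na + 1 / nb))

def study_finds_something(n_outcomes=20, n=30, critical=2.0):
    """One null study: compare n_outcomes unrelated measures between a
    'control' and 'treatment' group drawn from the SAME distribution,
    so every real effect is zero by construction. Returns True if any
    comparison looks significant (|t| > ~2, roughly p < 0.05)."""
    for _ in range(n_outcomes):
        control = [random.gauss(0, 1) for _ in range(n)]
        treatment = [random.gauss(0, 1) for _ in range(n)]
        if abs(t_stat(control, treatment)) > critical:
            return True
    return False

trials = 1000
hits = sum(study_finds_something() for _ in range(trials))
print(f"{hits / trials:.0%} of null studies report a 'significant' finding")
```

With 20 independent comparisons each at the 5% level, the chance that at least one crosses the line is about 1 - 0.95^20, i.e. roughly 64%, and the simulation should land near that. The fix, of course, is to specify the outcome measure before collecting the data, which is exactly what a pre-registered RCT does.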
Although I can't see them as insurmountable, there are certainly problems in trying to measure things like educational achievement in a way that works with RCTs. But not all interventions are of this nature. As a mathematics teacher I'm most interested in test scores as the outcome, but there are also plenty of measures which are more objective - school attendance rates, teenage pregnancies, incidence of self-harm and so on - which would be relevant to other interventions, such as those concerning student wellbeing.
So this is all very epistemological, and I'm still not sure I fully understand the interpretivist criticism of positivism. There are some other more practical issues with RCTs raised.
Rebecca's post states, quite validly I think, that you need some way to devise good interventions that are likely to work. This, I think, is where qualitative evidence would be useful - interviewing relatively small numbers of subjects in a less structured manner. But I see no reason why this should not lead to the design of an intervention that can then be tested using an RCT.
One such reason given is that of external validity - can you be sure that an intervention which works in one set of circumstances will work in another? Rebecca's post states that "the challenge of external validity cannot be underestimated in educational settings...validity declines as we try to implement the policy in different settings and over different time frames." There seems to me to be quite an assumption here, one that may apply to some kinds of intervention more than others. Her example regarding student motivation is quite believable, but I wonder how much the way children learn algebra really changes across time and culture? I would certainly want to see some evidence that it does before dismissing what could be a very powerful way to assess the effectiveness of different interventions. We can't just use external validity to dismiss any empirical evidence out of hand.
I've made some other comments on Rebecca's blog, most of which I think relate to practical difficulties rather than fundamental, theoretical ones. I'm still bothered that I don't think I've fully grasped the interpretivist critique of positivism.