Monday, March 27, 2006

Paper Problems

One of the more annoying types of papers is what I like to refer to as the "What The Heck Were They Thinking" (WTHWTT) paper. These papers feature bizarre protocols, an odd selection of controls (both in what types they use and where they apply them), and far-reaching claims based on their data.

The most annoying parts are:
A: They still get published in prestigious journals.
B: You finish reading the paper and realize that as frustrating as the paper is, they're probably still right.

Tonight's example--a paper I am reading in class this week--will not be named, because that would be rude. I am still a lowly undergrad, and I must freely admit that these people DO have a better idea of what they're doing than I do. Nonetheless, I must confess to being irritated by a few aspects of their approach. I'm at the point where I've read enough papers on drug addiction that I know what the protocols look like. Also, this paper involves a technique that my lab does a lot of, although I am forced to admit that I have never performed these experiments myself.

First off, they utilize two control groups: an untreated group and a sham group (a group which undergoes the same treatment the experimental group does, with saline replacing the drug in question). That's fine: especially since the procedure involves a survival surgery, it's sensible--and commendable, since it's not always done--to make sure that the surgery itself is not responsible for any of the changes they monitor. However, although the untreated group is used as a control for all of their experiments, the sham group is only used in a few seemingly random experiments. This is the reverse of how it's usually done: one usually compares the experimental group directly to the sham group, and then uses an untreated group for a few key experiments (if at all) to demonstrate that the sham group doesn't differ from an untreated one. I'm forced to conclude that I missed out on the great saline shortage of the late 90s.

Even though they've previously demonstrated a difference in response between electrical stimulation and application of an agonist for the receptor in question, they use only electrical stimulation in this experiment and interpret their results as if agonist application would yield the same data--except where they assume it wouldn't. They claim that further confirmation of these presumptions is impossible with their current protocol--with no more detailed explanation, even though the claim makes no sense--and then confirm one of them using agonist application in one of the half dozen or so experiments they perform. Furthermore, if their protocol prevents them from getting the data that's actually important to their hypothesis, they should either use a different protocol or restrict their claims to what they've actually demonstrated. That's why papers have a "discussion" section, in which you "discuss" potential extrapolations of your results and future avenues of research.

Almost all of their data comes from brain slices in baths containing a few ingredients (added only to the experimental groups) that I haven't seen added in similar papers, and that I'm pretty sure we don't use ourselves. It's entirely possible that I'm just not as up on how this is done as I think I am, or that it was simply done differently at the time than it is now. However, they do only one experiment comparing the effect of these ingredients when added to the baths for untreated slices, and they never perform an experiment showing results without the ingredients in the bath. Again, in their defense, if they're right and I'm wrong, leaving the ingredients out would in fact throw off the data.

They use one of their experimental groups first for experimental data, and then claim it as a control for their previous experimental data. It's possible this is thoroughly acceptable, but it feels like a bit of a no-no.

But their worst offense is going to be difficult to explain. The first experiment performed demonstrates that their primary experimental group and the untreated group differ in the magnitude of the basic response being studied. What does this mean, in your human English?

It means that statistical analysis indicates that the two groups are intrinsically different in terms of this response. It means that when you look at the data from these two groups--measure the magnitude of a response--you can safely say that you're dealing with two distinct populations with respect to this response, and that the differences are not due to chance. Now, the further experiments performed are based on comparing how the two groups respond to an inhibition of the response, or to an enhancement of it. That's pretty safe, since the differences between the groups in those experiments are--shall we say--"more different" than the two groups are without modulation.
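For what it's worth, here's a minimal sketch of the kind of comparison I mean. This is my own toy example in Python with invented numbers (nothing to do with the paper's actual data or analysis); it just illustrates what "the difference is not due to chance" amounts to in practice:

```python
# Toy illustration only: response magnitudes for two hypothetical groups
# (numbers invented for the example; not the paper's data).
from scipy import stats

untreated = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3]  # control slices
treated = [1.8, 2.1, 1.7, 2.0, 1.9, 2.2]    # drug-treated slices

# Independent-samples t-test: how likely is a difference in means this
# large if the two groups were really drawn from the same population?
t_stat, p_value = stats.ttest_ind(untreated, treated)

if p_value < 0.05:
    print(f"p = {p_value:.4f}: the groups differ in their baseline response")
else:
    print(f"p = {p_value:.4f}: no evidence the groups differ")
```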

But then they perform an experiment in which the slices are sorted into groups based on the basic magnitude of their responses, which they then compare. And I'm forced to go: huh? Didn't they just demonstrate that these two groups are not, in fact, directly comparable in this respect? And to top it off, half of the groups have responses 1.5-2.5 times greater in magnitude than the highest magnitude achieved in the initial experiment, with no indication that the protocol was somehow different, or even any acknowledgement that this might seem odd.

So, yeah: it's been a frustrating night.

By the way, I should make the obvious caveats clear at this point:

  • I'm not claiming the researchers are incompetent. I assume, in fact, that they are much more competent than I am--although perhaps not very good writers. It's possible that I'm just being overly paranoid about their approach, and that these are all acceptable approaches I simply haven't encountered or don't understand. I do, in fact, intend to bring up my questions in class tomorrow and attempt to find out what I'm missing.

  • What they're doing is very very hard. I happen to have it on good authority that the type of experiment performed is one of the hardest to be found in neuroscience. I've been told that even people who have been great at it for years will suddenly hit dry patches where they can't get any data out of it for a month. I can understand the urge to cut some corners.

  • I have a fair amount of confidence in their data. I know that much of this has been confirmed by later research. Much of the rest seems to be consistent with other results I've seen in similar areas. I just think it comes across as more than a bit sloppy, leaving a fair number of loose ends to be tied up.
