Wednesday, April 28, 2010

Filters: Part III

So, we’ve come to Part Three and I still haven’t come to the identity of the mystery researcher. That’s because of two things. One, biomedical research is complicated, and I’m trying to bring you, the lay reader, up to speed on a lot of topics. Two, that initial quote isn’t the only thing that makes mainstream researchers look askance at the mystery researcher, and I’m going to come back to all these topics when I deconstruct the arguments our mystery researcher makes in Parts Four and Five.

So what are the final topics I think you need to have been exposed to in order to parse the totality of what my mysterious medic posits as a unified theory?

Three things. One, you need a basic understanding of how science is published, including the process of peer review. Two, you need to understand how modern medicine evaluates data in clinical trials. Three, you need a basic overview of how the FDA approves and regulates drugs that are marketed to you.

So let’s say you’re a researcher who has done enough experiments to come to some conclusions and write a paper. The first hurdle you have to jump in order to get your new theory accepted is the peer reviewers at the best journal you think would even consider publishing your stuff. Oh yes, there is a hierarchy of journals, and the better the journal, the tougher the peer review.

Just what is peer review? Peer review is the process by which the obvious chaff is first sorted from the wheat for publication or grant funding. An editor or grant administrator sends out a submission for anonymous review by known experts in the field. They are not looking to fact check the experiments themselves; they are looking to see if there are basic math or statistics mistakes, mistaken conclusions based on a misunderstanding of the literature, and other basic hurdles that keep the real effluvia out of science journals. Reviews come back, and in the case of the funding agencies, proposals are ranked. In the case of journals, which is most relevant to the topic at hand, reviewers check the methodology and basic science knowledge (including the pattern of references).

The reviews are anonymous because sometimes the editor thinks that a younger guy with less prestige actually knows more about an older researcher’s work than some other of the old guard. The editor wants an honest review, but the older guy who’s on all the funding committees can sink the younger guy’s career. Yeah, some scientists are idiots that way. So we try to get the most honest reviews we can.

Despite peer review, papers that contain results or conclusions that are just plain wrong do get published. A lot. Especially in the lower tier journals. All that the peer review system does is make a first-pass attempt to ensure that real garbage (math mistakes, sloppy methodology, half-baked hypotheses from researchers who can't be bothered to keep up with the literature, etc.) does not choke the lawn. We've got enough weeds as it is.

This brings us to an important point that the alternative medical people trot out every time a major case of scientific fraud gets exposed. “See?” they claim, “peer review does not work.” Let me be very clear here. Peer reviewers look for elementary errors in knowledge, logic, and math. In the better journals, there is also a judgement call on the part of the reviewers as to whether the paper is of high enough quality to meet the journal’s standards. It is a very basic, minimum filter. Peer reviewers do not repeat the experiments in the paper when reviewing it. Therefore PEER REVIEW IS NOT DESIGNED TO CATCH FRAUD. That kind of review would take years. Science would slow to a halt. Science relies on blacklisting to deal with fraud (i.e., if you get caught, no more grant money for you, you’ll be using a soda straw and a magnifying glass to conduct experiments until you quit and go home).

If someone comes to you with a therapy and complains about the “old boys network” and how peer review is useless, ask them this – why did the reviewers reject the work? Ask to see the written reviews. If they won’t show the reviews to you, the chances are their mistakes are so basic and so fundamental, that the review was something like the one written for a famous physics paper that was later published in a rock-bottom quality Chinese journal and then exposed as fraud:

It is difficult to describe what is wrong in Section 4, since almost nothing is right. … The remainder of the paper is a jumble of misquoted results from math and physics. It would take up too much space to enumerate all the mistakes: indeed it is difficult to say where one error ends and the next begins.

In conclusion, I would not recommend that this paper be published in this, or any, journal.

That is pretty amusing. I’ve only had the opportunity to write a review that scathing once in my academic career. I’m willing to bet that the reviews on most alternative medicine look like that, though.

But let’s return to mainstream science. It bears repeating: a lot of what is published in the medical literature – by good researchers who are NOT perpetrating fraud – is wrong. Does that surprise you? In fact researcher John Ioannidis has put forward some detailed analysis tracking initial hypotheses over 20 years, and has shown that most of these hypotheses turn out to be wrong.

Like Steve Novella, I am not surprised by this. When we operate on the cutting edge of human knowledge, we conduct experiments and come to conclusions based on those experiments. Even in the absence of outright faking of data, sometimes those conclusions are right, and sometimes they are wrong. More often than not they are wrong – we’re making guesses about nature based on limited data. But we have to prove that the wrong guesses are wrong, via the mechanism of the scientific method, in order to be sure that our reasoning about the correct hypotheses is complete.

In this way scientific research is like landing a spaceship on an alien planet. If alien sociologists landed in the middle of an Amish village in Lancaster PA they would write up their findings in a paper that might accurately describe the Amish way of life, but if they didn’t venture outside the village, they would get a hugely skewed idea of American life, and any conclusions they made about America in general would be dead wrong. If another ship landed in NY, that bunch of sociologists would be fighting with the first bunch like the blind men with the elephant. Only after landing a number of ships in the South, Midwest, East and West coasts would an accurate picture emerge.

What does this mean for the bozo filter I’m trying to help you build? Well, the first thing it should instill in you is a healthy skepticism of any one researcher. This drives us scientists nuts when the lay press come calling, because they pick the most photogenic and / or glibbest researcher and quote him or her to death when writing a story. Dr. X says this, Dr. X believes that. Just who the heck is Dr. X? What we want in science is a pattern of conclusions from many researchers at various institutions all pointing in one direction. We especially want that current research should not significantly contradict research from the past, if a pattern is to be established.

As David Gorski noted (and I urge you to read his post in full):

First, it is important to realize that confident medical judgments or conclusions rarely emerge from single studies – confidence requires a pattern of evidence over many studies. The typical historical course for such evidence is first to begin with clinical observations or plausible hypotheses that stem from established treatments. Based upon this weakest form of evidence preliminary or pilot studies are performed by some interested researchers to see if a new treatment has any potential and is at least relatively safe. If these early studies are encouraging then larger and larger studies, with more tight designs are often performed.

In this early phase of research results are often mixed (some positive, some negative) as researchers explore different ways to use a treatment, different subsets of patients on which to use the treatment, varying doses of medication, or other variables. Outcomes are also variable – should a hard biological outcome be used, or more subjective quality of life outcomes? What about combinations with existing treatments? Is there an additive effect, or is the new treatment an alternative.

It takes time to sort out all the possible variables of even what sound initially like a simple medical question – does treatment A work for condition X. Often different schools of thought emerge, and they battle it out, each touting their own studies and criticizing the others’. This criticism is healthy, and in the best case scenario leads to large, well-designed, multi-center, replicable consensus trials – trials that take into consideration all reasonable concerns where both sides can agree upon the results.

This poses a problem for the layman reading papers in a one-off fashion.

Allow me to let you in on another dirty little secret of science. When the socially inept geek smelling of cheese shows up to our party drunk and throws up in the potato chips, we just ignore him. We usually don’t call the cops or throw him out ourselves. In other words, we may suspect someone was a sloppy researcher and his or her paper is a piece of garbage that should be retracted, but we rarely come out directly and say “j’accuse”. So the sloppy researcher’s work just sits there in the scientific literature. Ignored to be sure, but it still just sits there. Sits there for the unsuspecting layman to stumble across and shout “Eureka”.

Sometimes people who should know better go on touting some obscure researcher’s work. Before the internet, the layman had a hard time sorting wheat from chaff in this regard. But this is the internet age. The first thing you should do when presented with “but so and so SHOWED this to be true back in 1969” is to go to the scientific equivalent of Snopes: Google Scholar. Google Scholar has a “cited by” field. Use it. If a given paper in the biomedical literature (this rule of thumb differs for different disciplines) is more than 10 years old and has fewer than 20 cites, it’s likely that not many people found it useful, or (even more worrisome) that not many people could reproduce the results.

Remember that journals tend not to publish studies with negative conclusions (i.e. you don’t see a lot of papers with “tried this guy’s stuff, didn’t work, he’s an idiot” in the Conclusions section). So the medical literature is rife with stuff that practicing researchers don’t believe. We scientists get a good idea of the lay of the land from our thesis advisors very early in graduate school. Your advisor will tell you “don’t bother reading so-and-so.” However, unless you’re in the club it’s hard to learn who exactly the cheese-smelling geeks are, except that Google Scholar now gives you a big old clue.

Google the subject of the paper you’ve just been handed (in Google Scholar), and see which names pop up most often. Check how many times they’ve been cited. Check how often a year they publish. Then compare it to the paper and author you have doubts about. This will go a long way in giving you a clue about the ability level of the researcher in question. If the paper and / or author in question has been cited a lot (and if still alive, is still actively publishing), it’s a good bet that your doctor won’t roll his or her eyes when you mention it. If it’s got 7 cites in 20 years, your doctor is a lot more likely to say: Who? What? Huh?

Stuff that’s been orphaned in the literature is almost always junk. More than 99 times out of 100, it’s junk. You can take that to the bank.
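If you like to see a rule of thumb written down as a checklist, here is a minimal sketch of the “cited by” test in Python. The function name and the way the numbers get in are my own inventions for illustration; you would read the year and the “cited by” count off Google Scholar by hand, since this sketch does not talk to Google Scholar at all. The thresholds are the ones given above for the biomedical literature and would need adjusting for other disciplines.

```python
from datetime import date

# Thresholds from the rule of thumb in the text (biomedical literature only).
MIN_AGE_YEARS = 10
MIN_CITES = 20


def looks_orphaned(year_published, cited_by, current_year=None):
    """Return True if a paper is old enough to have been noticed but is rarely cited.

    Hypothetical helper: both inputs are typed in by hand after looking
    the paper up; nothing here queries a citation database.
    """
    if current_year is None:
        current_year = date.today().year
    age = current_year - year_published
    return age > MIN_AGE_YEARS and cited_by < MIN_CITES


# A 1969 paper with 7 cites, checked in 2010: old and ignored.
print(looks_orphaned(1969, 7, current_year=2010))   # True
# A two-year-old paper has not had time to gather cites, so the rule stays silent.
print(looks_orphaned(2008, 3, current_year=2010))   # False
```

A True here is a reason to be suspicious, not a verdict. It only tells you which papers deserve the comparison against the busy, well-cited names in the same field.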

Next, how does modern medicine evaluate published evidence from clinical trials? As I said above, getting through peer review is just the first step to wide acceptance. Researchers then look at the data, the way it was collected, and how it fits into the larger pattern of evidence from other researchers, and decide if the paper is worth talking about and using as a reference in their own papers.

Human beings are subject to all kinds of biases. The medical profession has developed a number of methods to cut down on the effects of those biases. The most famous is, of course, the double-blind placebo-controlled trial (DBPCT). If neither the test subject nor administering doctor knows if the patient is getting a sugar pill or the real deal, they can’t knowingly skew the results. There are other methods, though.

Let me say this again, it bears repeating. Real researchers realize that EVERYONE is biased. Clinical trial designs and instruments are constructed to reduce those biases. Anyone who comes along and tells you that he or she is not biased and doesn’t need to follow the conventions of good research is a fool, a charlatan, or both. No exceptions.

One method that is very important in separating anecdote from real evidence (and I can’t say often enough that the plural of anecdote is NOT data) is the development of validated “instruments” which measure the severity of disease. Since Kelly asked me to write this, I’ll use RA as an example. The standard measures in RA are the American College of Rheumatology (ACR) criteria and the Disease Activity Score (DAS). The ACR is generally favored by Americans and the DAS is generally favored by Europeans, but both scales have their plusses and minuses.

However, both scales take into account markers of inflammation that can be measured directly (C reactive protein levels, sedimentation rates, etc.). You can’t fake those, and they should not be responsive to placebo. Unfortunately those biomarkers also bounce up and down due to factors unrelated to RA, so they are only part of either score. The number of tender (meaning they hurt if you push on them) and swollen joints are also included. But wait. At what level of pain is a joint declared “tender”? That is why we conduct double-blind trials – as objective as that “tender joint count” sounds, it is a binary measure (yes, it hurts or no, it doesn’t), and if either party knows which treatment is being given, they could skew the results (the patient by being more or less stoic, the doctor by pressing harder or more gently).

Both disease scales have been extensively studied and are reproducible, which is why we use them in clinical trials. We don’t rely solely on the “patient global assessment” (i.e. asking the patient “how are you doing”), though that IS a part of either measure. We have to ask the patient what’s going on, but if we rely only on the patient, we often run into the Monty Python scenario: She turned me into a newt! Well, I got better…
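To make the idea of a composite instrument concrete, here is a short sketch of one commonly cited form of the DAS, the 28-joint version that uses the sedimentation rate (usually written DAS28-ESR). The coefficients are the published ones as I know them; the function name and the made-up patient are mine, and none of this is a tool for scoring yourself at home. The point is only to show how the pieces are weighted.

```python
import math


def das28_esr(tender_joints, swollen_joints, esr_mm_per_hr, patient_global_0_100):
    """Combine joint counts, a lab marker, and the patient's own rating into one score.

    tender_joints, swollen_joints: counts out of 28 examined joints
    esr_mm_per_hr: erythrocyte sedimentation rate (must be greater than zero)
    patient_global_0_100: the patient's overall rating on a 0 to 100 scale
    """
    return (
        0.56 * math.sqrt(tender_joints)
        + 0.28 * math.sqrt(swollen_joints)
        + 0.70 * math.log(esr_mm_per_hr)
        + 0.014 * patient_global_0_100
    )


# Made-up patient: 9 tender joints, 4 swollen joints, ESR of 20, global rating of 50.
score = das28_esr(9, 4, 20, 50)
print(round(score, 2))  # 5.04

# The same patient if the global rating were ignored entirely.
print(round(das28_esr(9, 4, 20, 0), 2))
```

Notice that the patient’s own “how are you doing” rating moves the total, but it cannot carry the score alone. The lab value and the examined joints have to agree before the number changes much.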

So when you see some alternative practitioner coming down the pike with a fistful of anecdotal reports, be skeptical. We do not accept anecdotal evidence in real science because we don’t trust even ourselves. We’ve all let our biases run away with us at times, it’s human nature.

What complicates research in autoimmune diseases such as RA is that the disease waxes and wanes, often for no discernable reason. A large number of cases of sarcoidosis, for example, resolve on their own.

The natural course of pulmonary sarcoidosis is highly variable. Unlike most other interstitial lung diseases, remission and resolution are common so that many patients are best served by avoiding use of potentially toxic therapy.

The disease comes for some unknown reason, we try to treat the symptoms, then it leaves for some other unknown reason, and we’re left scratching our heads. RA, unfortunately, has a much, much lower reported rate of spontaneous remission than sarcoidosis, but it does wax and wane, the so-called flaring that many, but not all, patients experience.

In a clinical trial these remissions, temporary or not, look like a drug response or a placebo response, but they are part of the natural history of the disease. If we’re not careful, we can look only at the good data, ignore the bad, and selectively perceive our way into a wrong conclusion. That is why both the FDA and Big Pharma pay a small army of statisticians to go over clinical trials, and why the Agency (and reputable pharma companies) also frowns on any anecdotal evidence whatsoever.

You cannot just look at the people who respond to a drug in your statistical analysis. Why did the people who did not respond have a bad experience? Did the drug just not work? Or was it working and they dropped out due to side effects? Those are important, vital questions to ask when deciding the risk / benefit balance of any therapy. There’s a name for the logical fallacy of looking only at the responders: cherry picking.

By looking at the people who didn’t respond but were on drug, and the people who did respond, but were on placebo, you start to get an idea of the real potential of the treatment. If we are to stick to the premise that we should do no harm (and the FDA does, along with any doctor worth his or her salt), we have to reject anecdotal evidence, except as the start of a careful and non-anecdotally based plan of research. This is very hard for the human brain to do, because evolution has hard-wired us to learn from the anecdotal experience of our elders. But you have to push that back into the reptilian part of your brain when you look at scientific evidence. If someone’s been treating people for a good while and all they’ve got to show for it is anecdotes, you, the patient, are now in the other Monty Python scenario: bravely run away, and run away fast.
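Here is a small worked example of why all four groups matter. The numbers are invented purely for illustration and do not come from any real trial, and the function name is mine. It compares what a pile of success stories looks like against what the placebo arm says about the same data.

```python
def summarize_trial(drug_responders, drug_enrolled, placebo_responders, placebo_enrolled):
    """Return response rates for both arms and the difference between them."""
    drug_rate = drug_responders / drug_enrolled
    placebo_rate = placebo_responders / placebo_enrolled
    return {
        "drug_rate": drug_rate,
        "placebo_rate": placebo_rate,
        "difference": drug_rate - placebo_rate,
    }


# Invented numbers: 200 patients per arm.
result = summarize_trial(
    drug_responders=90, drug_enrolled=200,
    placebo_responders=60, placebo_enrolled=200,
)

# The cherry-picked view: 90 people who took the drug and got better.
print("Success stories on drug:", 90)
# The controlled view: how many got better anyway, and what is left over.
print(f"Response on drug:    {result['drug_rate']:.0%}")
print(f"Response on placebo: {result['placebo_rate']:.0%}")
print(f"Difference:          {result['difference']:.0%}")
```

In this made-up case, two thirds of the “success stories” would have happened on a sugar pill. A real analysis would also ask whether the leftover difference is bigger than chance alone would produce, and what happened to the people who dropped out, which this sketch does not attempt.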

So let’s say our hypothetical intrepid researcher has discovered something so good that a representative of the evil pharma empire wants to make a drug out of it. Now we come to the public health portion of this backgrounder.

Most laypeople operate under a very mistaken impression of what the FDA can and can not do. Even most doctors aren’t very good at parsing this, because they get little to no training about it in medical school.

If you want to sell something via a real pharmacy that purports to be a medicine, the FDA has vast regulatory powers. They can dictate the terms of the clinical trials that they will accept in order to prove that a medicine is efficacious and reasonably safe (there is NO SUCH THING as a perfectly safe drug or therapy), and they can dictate the means of manufacture, the impurity levels in the finished product, and a lot of other things. Most of the cost of the pills and injections you buy is not in the drug substance itself – any reasonably competent chemist or biologist can make the “Active Pharmaceutical Ingredient”. The cost of a marketed drug is in all the clinical trials and manufacturing oversight that gives your doctor confidence he or she is not going to do you any harm in putting this “stuff” in your body. Do no harm. All competent doctors put that first. And the tradeoffs they make in real life keep them up nights.

What is the normal process one goes through in getting FDA approval for a new treatment? First the sponsor goes to the FDA with all the pre-clinical (animal!) data that supports the hypothesis that whatever you’re doing will be effective and reasonably safe. Then the sponsor needs to find investigators who can enroll patients. These might be academic clinicians or doctors in private practice, but the drug company does not enroll patients directly, they subcontract.

There is a requirement that an independent body called an Ethics Committee or Institutional Review Board, composed of people who have no dog in this fight, look the experimental design over and pronounce it kosher before those independent investigators put patients into the trial and give their data to the sponsor.

As a potential enrollee in a clinical trial, you should never, NEVER step into research that has not been cleared by an IRB. I’d suggest you ask for the IRB’s comments if you are approached about a trial and have concerns, but at the very least you should be told which IRB approved the clinical protocol. If you enroll at an academic research center, there will likely be 2 IRBs involved, one global one mandated for the drug company by the FDA and EMEA (the European FDA) and one local one just for the institution your doctor belongs to, because most universities maintain their own Ethics Committees, and those bodies don’t just take the word of the drug company’s IRB, they double-check it.

Then it’s on to clinical research for regulatory approval.

Drug approval the world over comes in three phases. Phase I is usually short term dosing in perfectly healthy people. At this point you’ve done lots of animal studies, and you have some clue what might go wrong. But rats aren’t humans. (Remember what I said about human data trumping rat data in the in vivo / in vitro section of Part Two?) So you put the drug in once and observe the healthy people carefully. Then you work your way up to multiple doses.

When you and the FDA are comfortable that nothing weird is going on in around 100 or 200 healthy people, you go on to patients in Phase II. Not a lot of patients to be sure. Usually Phase II consists of two relatively short (a few months) duration trials in about 400 – 800 patients. If those patients seem to be getting better, you have a meeting with the FDA to outline the Phase III plans. If anything unexpected showed up in Phase II, you might have to do another small trial, or the FDA might require that an outside board of independent docs break blind every so often to make sure everyone in Phase III is safe (called a Data Safety Monitoring Board), or they might require more than the minimum, basic two replicate trials in Phase III to look for potential problems in special populations (for example, people with weak kidneys for a drug that is excreted via the urine, or people with liver problems if the drug is metabolized by the liver).

Phase III, in chronic disease, is comprised of at least 2 trials with about 1000 patients each for about a year’s duration. Usually Phase III is larger than this of course, but that is about the minimum. The FDA wants two replicate trials because it does sometimes happen that the results of the first Phase III trial are not repeated in the second one. Sometimes, even with a large statistical sample, your spaceship lands in the Amish village.

Even after approval, there are requirements for “post-marketing” or “Phase IV” studies to make sure that nothing slipped through the cracks in Phase III. If a serious side effect, let’s say liver failure, happens only in one patient out of a million, or even 1 in 100,000, statistically you’re unlikely to see that event in a 2000 to 6000 patient sample in Phase III. Post marketing studies are designed to try to capture those rare events.
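The arithmetic behind that statement is short enough to sketch. Assuming every patient carries the same small risk and that patients are independent of each other (both simplifications of mine), the chance of seeing at least one case in a trial is one minus the chance that every single patient escapes. The function name is made up for this example.

```python
def chance_of_seeing_event(rate, patients):
    """Probability of at least one event among `patients` independent people."""
    return 1 - (1 - rate) ** patients


for rate in (1 / 100_000, 1 / 1_000_000):
    for patients in (2000, 6000):
        p = chance_of_seeing_event(rate, patients)
        print(f"1 in {round(1 / rate):,} side effect, {patients} patients: "
              f"{p:.1%} chance of seeing even one case")
```

Under those assumptions, a 1 in 100,000 side effect has only about a 6% chance of showing up even once among 6000 patients, and a 1 in a million side effect has well under a 1% chance. And one case, if it did show up, would be very hard to tell apart from bad luck.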

This is a very basic overview, of course. I didn’t even talk about regularly conducted non-efficacy trials such as drug-drug interaction studies to make sure a drug that’s metabolized in a certain way doesn’t raise the blood concentrations of other drugs metabolized the same way to unsafe levels.

Finally, if the drug is approved, the FDA writes an approved label (that package insert of fine print most patients just throw away) based on what the Agency thinks is the best evidence. The sponsor may have conducted a trial that the FDA thinks is sloppy or needs to be repeated, and the evidence from that trial will NOT go into the label. Everything a drug company sales rep says to a doctor has to be “on label”. If he or she says something about that trial the FDA didn’t like? Well, let’s just say that the industry has recently had its wrist slapped a lot for stuff like that.

The extreme scrutiny in the prescription market often leads laypeople to think that “they” (I assume the “they” so many laypeople often refer to is the FDA) are watching everything with the same level of control.

Not so.

First of all, nutritional supplements play by different rules from prescription drugs thanks to the DSHEA. Note the wording in that last link. Read it carefully. Is there any mention of "efficacy" on that page? No, there is not. The FDA has no authority to require that a nutritional supplement actually works.

Linus Pauling is responsible for that, having lent his reputation to outright quackery in the arena of Vitamin C supplements. In the course of that legal battle, the FDA’s hands have been tied for non-drug supplements. As long as they don’t cross certain lines, “nutraceutical” companies can hint around that they help, or even cure some ailment, and then throw the boilerplate disclaimer in the fine print, and get away with it.

Have you ever read the boilerplate disclaimer on a nutraceutical? You should, it should be an integral part of your bozo filter. I found similar language on several sites that are making what appear on the surface to be health-related claims:

The products and the claims made about specific products on or through this site have not been evaluated by _____________ or the FDA, and are not approved to diagnose, treat, cure or prevent disease.

The information provided on this site is for informational purposes only and is not intended as a substitute for advice from your physician or other health care professional or any information contained on or in any product label or packaging.

You should not use the information on this site for diagnosis or treatment of any health problem or for prescription of any medication or other treatment.

You will also find the “claims” of efficacy to be extremely vague on nutraceuticals.

“Helps promote a healthy _____”. What the blue blazes does “promote” mean? Scientists use specific language such as: “has been shown to prevent the progression of joint destruction in patients who have failed anti-TNF therapy”. THAT is a specific statement that can be proven or disproven by the use of statistics. Anything else is a religious discussion among people of different faiths.

If you have never paid attention to these disclaimers and weasel words on your echinacea, start doing so. “They” have very restricted powers in the US, which is one reason that “prescription” pharmaceuticals are referred to as “ethical” pharmaceuticals. Not to say that the pharma industry has a spotless ethics record, but they are accountable to the FDA when they don’t conform to the minimum standards. Nutraceuticals don’t even have to meet this minimum standard. Neither does your garden variety General Practitioner.

What did I mean by that last crack? A doctor, by dint of having a medical license, can prescribe any medication for any reason, or no reason at all. There is a good reason for this. The FDA’s purpose in asking for clinical trials is to describe the efficacy and safety of an agent in a broad spectrum of the patient population. Your individual doctor can look at the clinical evidence and say “This trial had 90% Caucasians in it and it is metabolized by the liver. You’re Asian, and Asians have a different pattern of liver enzymes, so let’s try this out at a dose that’s lower than the label.” A drug rep could NEVER suggest that (legally), but it’s a perfectly valid reason to go “off label”. There are many other such reasons, and your doctor went to school for many years and put in a lot of hard work in order to have the right to look at the evidence and toss it out the window (judiciously, of course).

Sometimes a side effect is so rare it doesn’t show up to the statisticians who measure such things until a drug has been in millions of people. That is why the FDA requires “post marketing surveillance” as well as the three major Phases of clinical development. A doctor needs to keep this in mind before going off-label.

Your doctor should not go off label cavalierly, especially when the dose is higher than recommended, or he or she’s using the medicine for a disease that the FDA has not approved it for yet. Absence of evidence (of harm in this case) is not evidence of absence. Doctors can and do abuse their privilege to prescribe off-label.

GPs are the worst in this regard, because their training is less scientific than a specialist’s training. In fact GPs are not scientists at all. Their memorize and regurgitate style of learning in medical school covers scientific topics, but it is perfectly possible to become a GP and hold the most unscientific opinions. Science Based Medicine has plenty of documentation on GPs who get into what we in science call “woo”. If you see an alternative therapy that has a few MDs endorsing it, and those MDs are all GPs, watch out. Not to say that specialists don’t ever fall off the deep end, but those cases are fewer and farther between.

Finally, there is one more way to avoid the scrutiny of the FDA. Use a regimen of drugs that are approved for another purpose, but don't sponsor a clinical trial. I know you're scratching your head at that one. What you can do is sell information - books, CDs, DVDs, but do NOT sell a drug, device or any treatment - then you would come under the authority of the FDA. To further cover yourself, recommend that an MD read your literature and administer the treatment under his or her own license. Remember what I said about GPs and the limitations of their training? It is certainly possible to find a GP willing to do this. There are doctors who sell nutritional supplements in their office and who practice homeopathy, so there are certainly doctors willing to try a scientific-sounding treatment out on a desperate patient, especially if that doctor does not have deep training in the scientific method and specialist literature.*

If you are trying to evade FDA oversight, you can even ask that people voluntarily send you their medical history. This has the appearance of a clinical trial BUT IT IS NOT A CLINICAL TRIAL by the legal definition that gives the FDA oversight on human testing.

From a scientific perspective, this voluntary gathering of data is less than useless. There is no way to check the quality of the data coming from the individual doctors, and there is no enforced consistency in record keeping or in measuring the outcomes - did the doctor really do a 28 joint count on the patient, or did he or she just ask the patient a few questions in the exam? In other words, this is a collection of anecdotes. Once again, let's all say it together: the plural of anecdote is NOT data!

With this, I think you will have all the tools to parse the statements of our mystery researcher. Let’s go, shall we?

*I don't mean this criticism to be a blanket condemnation of GPs, here. Most, the vast majority, of GPs are good people who do a major amount of good in the world. It's just that assuming that the holder of an MD degree is always a practitioner of science based medicine can adversely affect your health and your wallet.
