AHRQ Report – Excluding Progress? The Exclusionary Factors and Missing Studies

by Cort Johnson | Oct 15, 2014

Was the AHRQ too hard on the ME/CFS studies?

My assumption has been that the AHRQ knows what it’s doing and is doing its job well. That assumption is based on the fact that the AHRQ appears to be thriving as an institution.

The AHRQ notes in its report, however, that “divergent and conflicted opinions are common” even among experts in these reports. I have been told that at least one very divergent opinion has been privately expressed by a respected ME/CFS expert who was consulted for this report and whose suggestions were not taken. I contacted another ME/CFS researcher who felt that overly strict inclusion criteria prevented the inclusion of many potential biomarkers.

With that in mind, let’s take a look at what is probably the most controversial aspect of the report – the small number of studies included in the final analysis. First let’s ground ourselves in the first question the AHRQ was asked and how important it was.

Key Question 1. What methods are available to clinicians to diagnose ME/CFS and how does the use of these methods vary by patient subgroups?

At least one of the ME/CFS experts invited to review the work was not happy…

Diagnosing ME/CFS – The current dogma is that an ME/CFS diagnosis is primarily based on identifying symptoms and ruling out other disorders. No laboratory tests are accepted diagnostic tools for ME/CFS. An AHRQ report stating that some lab tests could be used to diagnose ME/CFS would be big news, and that was clearly one of the hopes of the ME/CFS community. That didn’t happen.

Subgroups – Identifying subgroups was another important question. An AHRQ finding that x, y, or z laboratory tests could be used to diagnose specific types of patients would be another very valuable finding for this field. Since many studies have examined laboratory and other variables in ME/CFS, we might have expected that the AHRQ report would have something positive to say about finding subsets. It didn’t.

Getting positive answers to those questions, of course, would have required the requisite studies to make it into the final analysis, and they didn’t. The AHRQ included only four potential diagnostic biomarker studies, out of possibly hundreds of candidates, in their analysis. That was a stunning loss.

Missing Studies

The loss was bigger than the report suggested. The AHRQ panel obviously needed to do a complete search in order to fully assess the state of diagnosis and treatment in Chronic Fatigue Syndrome.

Over 90% of the studies reviewed did not make it to the final analysis… plus some studies were not reviewed at all.

Their search of the Ovid Medline database for 1988 to November Week 3 2013 brought up 5,902 potentially relevant articles, of which 914 were selected for full-text review. Almost 6,000 citations is a lot of citations, but the panel’s search using the word “fatigue” was guaranteed to bring up an enormous number of citations.

As Erica Verrillo has pointed out, Ovid Medline is a big database, but it’s not as big as the PubMed database. A search for Chronic Fatigue Syndrome on the PubMed database brings up 6400+ citations. Many of these would have been excluded because of their peripheral connection to ME/CFS, but they also appear to include studies that the Ovid Medline search missed (or that were ignored).

Scanning the 60 plus pages of excluded and included studies in the AHRQ appendices suggested some important studies might be missing. A subsequent search for some prominent ME/CFS researchers indicated that many important studies were not just missing from the final analysis, but had apparently not been reviewed at all. They simply did not appear anywhere in the AHRQ’s analysis.

A 2010 study, for instance, examining immune functionality in viral vs non-viral subtypes that appeared to be a perfect study for examining subgroups was not reviewed. Remarkably, none of Shungu’s (NIH funded) brain lactate studies (which did use a disease control group) even made it to the point where they could have been excluded. Nor was Baraniuk’s stunning cerebrospinal fluid proteome study found anywhere in the report. Sometimes the omissions were hard to understand; one of the Light gene expression studies was included, but four others were not. (See Appendix for list.)

A Cook exercise study that seemingly did what the AHRQ wanted – compare ME/CFS and FM patients – didn’t make the first cut. Nor was the AHRQ seemingly aware of several studies showing reduced gray matter/ventricular volume by Lange. They missed Newton’s muscle pH and peripheral pulse biomarker studies. The exciting Schutzer study (another expensive NIH-funded study) showing different proteomes in the cerebrospinal fluid of ME/CFS and Lyme disease patients was nowhere to be seen.

This list is nowhere near complete – it simply covers some researchers I thought to look for. It suggests that possibly many studies that were never assessed for this report should have been.

Excluded studies

Only 64 of the 800 plus studies included in the review made it to the final analysis

A central question is how so many studies, many of them done by good researchers (with major NIH grants), could have been excluded from the final analysis. The exclusionary criteria prevented the AHRQ from assessing dozens of possible diagnostic biomarker studies. Natural killer cells, cerebral blood flow, exercise studies, pathogen studies, hormones, etc. – many important studies – all went by the wayside. A positive AHRQ report on any one of these physiological biomarkers would have been helpful given the behavioral headwinds ME/CFS still has to confront. Not one made the cut.

I don’t have the expertise to determine if the AHRQ’s inclusion and exclusion criteria were too strict, not strict enough, or just right. Judging from the results, the bar for admission was a high one indeed.
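To put a rough number on how high that bar was, here is a minimal Python sketch using only the figures quoted in this article (5,902 citations, 914 full-text reviews, 64 studies in the final analysis). The variable names are my own, and the 914 figure is used as the denominator for "reviewed" studies; that choice is an assumption, since the report's counts can be sliced in other ways.

```python
# Figures quoted in this article; the variable names are illustrative only.
citations_found = 5902      # Ovid Medline citations flagged as potentially relevant
full_text_reviewed = 914    # articles selected for full-text review
final_included = 64         # studies that reached the final analysis


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of citations that were pulled for full-text review.
pct_sent_to_full_text = percent(full_text_reviewed, citations_found)

# Share of full-text-reviewed articles that survived to the final analysis.
pct_kept_after_review = percent(final_included, full_text_reviewed)

# Everything else was dropped at the full-text stage.
pct_dropped_after_review = 100.0 - pct_kept_after_review

print(f"Sent to full-text review: {pct_sent_to_full_text:.1f}% of citations")
print(f"Kept after full-text review: {pct_kept_after_review:.1f}%")
print(f"Dropped after full-text review: {pct_dropped_after_review:.1f}%")
```

Whichever denominator is used (914, or the "800 plus" cited above), the share of reviewed studies that were dropped stays above 90%.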

The AHRQ provided a list of exclusionary factors as well as a list of excluded studies and the reason for their exclusion. I went through about half of the excluded studies – about thirty pages worth – and divided them into potential treatment and diagnostic biomarker studies. The list of excluded potential diagnostic biomarker studies stretched to ten pages!

Exclusionary Criteria

The basic exclusionary criteria are below.

Code	Reason for exclusion
2, 3, 4	Excluded because the study does not address a Key Question or meet inclusion criteria, but full text pulled to provide background information
5	Wrong population
6	Wrong intervention
7	Wrong outcomes
8	Wrong study design for Key Question
9	Wrong publication type
10	Foreign language
11	Not a human population
12	Inadequate duration

The AHRQ’s rigorous exclusion criteria knocked over 90% of the reviewed studies from the final analysis, but the AHRQ was vague in explaining what they were defining as exclusion criteria. Given how important the exclusion criteria ended up being, their laxity in this area was surprising.

Exclusion criterion #2, for instance, was the most commonly used reason for excluding a diagnostic study, but it’s impossible to tell what it (or reason 3 or 4) refers to. The report simply states that exclusion criteria 2, 3, and 4 refer to studies that did not address a Key Question or did not meet inclusion criteria, without saying how the three codes differ.

Some vague exclusion criteria made it difficult to analyse the analysis

Similarly, exclusion criterion #9, “wrong publication type,” was the most common reason for rejecting a treatment study, but I can find no explanation in the report or the appendices of what a “wrong publication type” is. (A search of the 300 plus pages of their “methods manual” was fruitless.)

The list above, however, is only the beginning of the exclusionary factors. Studies that didn’t meet the AHRQ’s “inclusion criteria” were also excluded, and among those inclusion criteria are more exclusionary factors as well.

Studies rejected because of “wrong study design”, for instance, include “non-systematic reviews, letters to the editor, before and after studies, case-control studies, non-comparative studies, reviews not in English, and studies published before 1988.”

Inclusion Criteria – for Diagnostic Studies

Then there are the inclusion criteria which are absolutely required for the inclusion of “diagnostic studies” in the analysis.

These criteria required that comparators of diagnostic accuracy and concordance be done – a process that appears to require that at least one of the following diagnostic outcome measures be assessed: sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio, negative likelihood ratio, C statistic (AUROC), net reclassification index, concordance, any potential harm from diagnosis (such as psychological harms, labeling, risk from diagnostic test, misdiagnosis, other).

Studies that did not include some of these statistical analyses were excluded from the review.
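For readers unfamiliar with these measures, here is a minimal Python sketch of how several of them fall out of a simple two-by-two table of test results. The counts are hypothetical, made up purely for illustration; they do not come from any ME/CFS study or from the AHRQ report, and the function and field names are my own.

```python
from dataclasses import dataclass


@dataclass
class TestCounts:
    """Two-by-two table for a hypothetical diagnostic test."""
    true_pos: int   # patients the test flags as positive
    false_neg: int  # patients the test misses
    false_pos: int  # controls the test wrongly flags
    true_neg: int   # controls the test correctly clears


def diagnostic_measures(c: TestCounts) -> dict:
    """Compute standard diagnostic accuracy measures from the counts."""
    sensitivity = c.true_pos / (c.true_pos + c.false_neg)
    specificity = c.true_neg / (c.true_neg + c.false_pos)
    ppv = c.true_pos / (c.true_pos + c.false_pos)  # positive predictive value
    npv = c.true_neg / (c.true_neg + c.false_neg)  # negative predictive value
    # Likelihood ratios; LR+ is unbounded when specificity is perfect.
    lr_pos = float("inf") if specificity == 1 else sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical example: 50 patients and 50 controls.
counts = TestCounts(true_pos=40, false_neg=10, false_pos=5, true_neg=45)
measures = diagnostic_measures(counts)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Note that all of these need only the counts of patients and controls above and below a test cutoff. A study that reports just a group difference (say, a significant p-value between patients and healthy controls) does not supply any of them.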

Missing statistical analyses appear to have doomed many studies

This requirement appeared to be a deal breaker for many of the putative biomarker studies. Most studies that have assessed potential biomarkers – whether NK cell functioning or cerebral blood flow – are what the AHRQ referred to as “etiological studies” looking for the cause of ME/CFS. They generally did statistical analyses that differentiated ME/CFS patients from healthy controls, but they did not do the types of analyses the AHRQ believes are needed to validate them as diagnostic tools.

The fact that many good researchers, some of whom have gotten significant grants from the NIH (indicating their study designs are solid), didn’t do these analyses suggests that these may be specialized types of analyses they’re not familiar with or don’t think are necessary.

Common Reasons for Exclusion

Now we look at a couple of the more common reasons for excluding a study from review.

Exclusion Criterion #2

Occurring in 45 of the 119 studies I pulled, exclusion criterion #2 was the most common reason cited for not including a possible diagnostic biomarker study. Since I examined about half the excluded studies, it’s possible that this one criterion was responsible for over 100 studies being excluded from the final report.

Unfortunately, it’s impossible to determine whether these studies were excluded because they didn’t address a Key Question or because they didn’t meet inclusion criteria.

Here are some examples taken randomly from potential diagnostic biomarker studies that failed based on exclusion criterion #2:

A recent immunological biomarker study found distinct differences between NK cells and immune factors in ME/CFS patients and healthy controls, but stopped there. None of the statistical analyses the AHRQ was looking for (sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio, etc.) were done.

A 1997 markers of inflammation study did include both chronic fatigue and chronic fatigue syndrome patients, but did none of the statistical analyses the AHRQ was looking for.

A 2005 serotonin receptor binding study (containing all of ten patients) found differences between healthy controls and CFS patients but did no further analyses.

An oft-cited 1991 cortisol study examined the differences between healthy controls and ME/CFS patients and found them – but did no further analyses.

A 2012 study that did indeed find impaired cardiac functioning in ME/CFS (in 10 patients and 12 healthy controls) didn’t do any of the diagnostic analyses the AHRQ was looking for.

Kerr’s gene expression subtype study did look at a variety of clinical phenotypes but did not include any of the statistical analyses the AHRQ was looking for.

Snell’s study, which used a classification analysis to show that a two-day exercise test was 95% accurate in distinguishing ME/CFS patients from healthy controls, also did not meet the inclusion criteria despite the extra statistical analyses it included.

Studies that Should Have Been Included?

Doing a PubMed search, I found several studies that did appear to meet the AHRQ’s statistical analysis requirements. All the studies below used an ROC analysis that produced sensitivity and specificity data. Some of these studies (plasma cytokines, energy metabolism, peripheral pulse) do not appear to have been reviewed at all.

Biomarkers in chronic fatigue syndrome: evaluation of natural killer cell function and dipeptidyl peptidase IV/CD26.

Plasma neuropeptide Y: a biomarker for symptom severity in chronic fatigue syndrome

Were some studies that did the right statistical analyses missed?

Plasma cytokines in women with chronic fatigue syndrome. Fletcher MA, Zeng XR, Barnes Z, Levis S, Klimas NG.

Impaired blood pressure variability in chronic fatigue syndrome–a potential biomarker

Chronic fatigue syndrome and impaired peripheral pulse characteristics on orthostasis–a new potential diagnostic biomarker. Allen J, Murray A, Di Maria C, Newton JL.

Decreased expression of CD69 in chronic fatigue syndrome in relation to inflammatory markers: evidence for a severe disorder in the early activation of T lymphocytes and natural killer cells. Mihaylova I, DeRuyter M, Rummens JL, Bosmans E, Maes M.

The assessment of the energy metabolism in patients with chronic fatigue syndrome by serum fluorescence emission. Mikirova N, Casciari J, Hunninghake R. Altern Ther Health Med. 2012 Jan-Feb;18(1):36-40.

A measure of heart rate variability is sensitive to orthostatic challenge in women with chronic fatigue syndrome
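To show what an ROC analysis of the kind used in the studies listed above involves, here is a minimal Python sketch. The biomarker values are invented for illustration and are not taken from any of these papers; the sketch also assumes that higher values point toward ME/CFS, and the function names are my own.

```python
def roc_points(patient_values, control_values):
    """Return (cutoff, sensitivity, specificity) for every candidate cutoff.

    A result counts as 'positive' when the value is >= the cutoff.
    """
    cutoffs = sorted(set(patient_values) | set(control_values))
    points = []
    for cutoff in cutoffs:
        sensitivity = sum(v >= cutoff for v in patient_values) / len(patient_values)
        specificity = sum(v < cutoff for v in control_values) / len(control_values)
        points.append((cutoff, sensitivity, specificity))
    return points


def area_under_curve(patient_values, control_values):
    """AUROC as the chance a random patient scores above a random control.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    wins = 0.0
    for p in patient_values:
        for c in control_values:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_values) * len(control_values))


# Hypothetical biomarker readings (arbitrary units).
patients = [7, 8, 9, 6, 5]
controls = [3, 4, 5, 6, 2]

for cutoff, sens, spec in roc_points(patients, controls):
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
print(f"AUROC: {area_under_curve(patients, controls):.2f}")
```

Each cutoff trades sensitivity against specificity, and the AUROC summarizes that trade-off in one number. This is the sort of output the AHRQ's inclusion criteria appear to have been asking for.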

Exclusionary Criterion #8

Study design was a commonly cited factor

With 25 citations, exclusionary criterion #8, “Wrong Study Design,” was the next most commonly cited reason to exclude studies from the final analysis. Reasons for not accepting a study based on study design included “non-systematic reviews, letters to the editor, before and after studies, case-control studies, non-comparative studies, reviews not in English, and studies published before 1988.”

Using case-control designs ended up being a big factor in the randomly picked studies I pulled out.

The use of a case-control design (a prohibited study design) appears to have doomed a study finding an increased incidence of severe life events three months prior to getting ME/CFS.
The case-control nature of the Jammes study finding high levels of oxidative stress and HSP factors after exercise in ME/CFS appears to have eliminated it from consideration as well.
Case-control problems popped up again in Newton’s recent study finding increased acidosis after exercise.
It was not clear what study design issues stopped a 1998 study finding reduced cognition after exercise from being included.

Conclusion

Stiff exclusionary criteria resulted in over 90% of ME/CFS studies not making the final cut, leaving possibly hundreds of biomarker studies out of the final analysis. A significant number of studies appear not to have been reviewed at all. Several diagnostic studies that were excluded from the analysis also appeared, at least to this layman, to fit their inclusion criteria. The AHRQ’s inability to explain several important exclusion criteria in sufficient detail made it impossible to tell exactly why many studies were excluded.

Including the missing studies might have provided more meat for the AHRQ to work with, but it’s hard to escape the conclusion that the stiff inclusion/exclusion criteria were the primary reason for the report’s paltry findings. Including studies that the panel missed or which did the ROC analyses the AHRQ appeared to be looking for might, however, have changed the complexion of the report.

Several ME/CFS experts have privately expressed concern at the AHRQ’s findings. We’re also waiting on the SolveME/CFS Association’s review of the report.

Appendix

Studies Missed (Not Included or Excluded in the Review)

A Comparison of Immune Functionality in Viral versus Non-Viral CFS Subtypes. Porter N, Lerch A, Jason LA, Sorenson M, Fletcher MA, Herrington J. J Behav Neurosci Res. 2010 Jun 1;8(2):1-8.

Cytokine expression profiles of immune imbalance in post-mononucleosis chronic fatigue. Broderick G, Katz BZ, Fernandes H, Fletcher MA, Klimas N, Smith FA, O’Gorman MR, Vernon SD, Taylor R. J Transl Med. 2012 Sep 13;10:191. doi: 10.1186/1479-5876-10-191.

Exercise responsive genes measured in peripheral blood of women with chronic fatigue syndrome and matched control subjects. Whistler T, Jones JF, Unger ER, Vernon SD.

A Chronic Fatigue Syndrome – related proteome in human cerebrospinal fluid. Baraniuk JN, Casado B, Maibach H, Clauw DJ, Pannell LK, Hess S. BMC Neurol. 2005 Dec 1;5:22.

Differences in metabolite-detecting, adrenergic, and immune gene expression after moderate exercise in patients with chronic fatigue syndrome, patients with multiple sclerosis, and healthy controls. White AT, Light AR, Hughen RW, Vanhaitsma TA, Light KC.

Genetics and Gene Expression Involving Stress and Distress Pathways in Fibromyalgia with and without Comorbid Chronic Fatigue Syndrome. Light KC, White AT, Tadler S, Iacob E, Light AR.

Severity of symptom flare after moderate exercise is linked to cytokine activity in chronic fatigue syndrome. White AT, Light AR, Hughen RW, Bateman L, Martins TB, Hill HR, Light KC.

Moderate exercise increases expression for sensory, adrenergic, and immune genes in chronic fatigue syndrome patients but not in normal subjects. Light AR, White AT, Hughen RW, Light KC. Psychosom Med. 2012 Jan;74(1):46-54. doi: 10.1097/PSY.0b013e31824152ed. Epub 2011 Dec 30.

Cerebral vascular control is associated with skeletal muscle pH in chronic fatigue syndrome patients both at rest and during dynamic stimulation. He J, Hollingsworth KG, Newton JL, Blamire AM.

Clinical characteristics of a novel subgroup of chronic fatigue syndrome patients with postural orthostatic tachycardia syndrome. Lewis I, Pairman J, Spickett G, Newton JL.

Chronic fatigue syndrome and impaired peripheral pulse characteristics on orthostasis–a new potential diagnostic biomarker. Allen J, Murray A, Di Maria C, Newton JL. Physiol Meas. 2012 Feb;33(2):231-41. doi: 10.1088/0967-3334/33/2/231. Epub 2012 Jan 25.

Increased d-lactic acid intestinal bacteria in patients with chronic fatigue syndrome. Sheedy JR, Wettenhall RE, Scanlon D, Gooley PR, Lewis DP, McGregor N, Stapleton DI, Butt HL, De Meirleir KL. In Vivo. 2009 Jul-Aug;23(4):621-8.

Responses to exercise differ for chronic fatigue syndrome patients with fibromyalgia. Cook DB, Stegner AJ, Nagelkirk PR, Meyer JD, Togo F, Natelson BH. Med Sci Sports Exerc. 2012 Jun;44(6):1186-93. doi: 10.1249/MSS.0b013e3182417b9a.

Exercise and cognitive performance in chronic fatigue syndrome. Cook DB, Nagelkirk PR, Peckerman A, Poluri A, Mores J, Natelson BH. Med Sci Sports Exerc. 2005 Sep;37(9):1460-7.

Regional grey and white matter volumetric changes in myalgic encephalomyelitis (chronic fatigue syndrome): a voxel-based morphometry 3 T MRI study. Puri BK, Jakeman PM, Agour M, Gunatilake KD, Fernando KA, Gurusinghe AI, Treasaden IH, Waldman AD, Gishen P.

Unravelling intracellular immune dysfunctions in chronic fatigue syndrome: interactions between protein kinase R activity, RNase L cleavage and elastase activity, and their clinical relevance. Meeus M, Nijs J, McGregor N, Meeusen R, De Schutter G, Truijen S, Frémont M, Van Hoof E, De Meirleir K. In Vivo. 2008 Jan-Feb;22(1):115-21.

Detection of herpesviruses and parvovirus B19 in gastric and intestinal mucosa of chronic fatigue syndrome patients. Frémont M, Metzger K, Rady H, Hulstaert J, De Meirleir K. In Vivo. 2009 Mar-Apr;23(2):209-13.

Impaired natural immunity, cognitive dysfunction, and physical symptoms in patients with chronic fatigue syndrome: preliminary evidence for a subgroup? Siegel SD, Antoni MH, Fletcher MA, Maher K, Segota MC, Klimas N. J Psychosom Res. 2006 Jun;60(6):559-66.

Gray matter volume reduction in the chronic fatigue syndrome. de Lange FP, Kalkman JS, Bleijenberg G, Hagoort P, van der Meer JW, Toni I. Neuroimage. 2005 Jul 1;26(3):777-81. Epub 2005 Apr 7.

Objective evidence of cognitive complaints in Chronic Fatigue Syndrome: a BOLD fMRI study of verbal working memory. Lange G, Steffener J, Cook DB, Bly BM, Christodoulou C, Liu WC, Deluca J, Natelson BH. Neuroimage. 2005 Jun;26(2):513-24. Epub 2005 Apr 7.

Quantitative assessment of cerebral ventricular volumes in chronic fatigue syndrome. Lange G, Holodny AI, DeLuca J, Lee HJ, Yan XH, Steffener J, Natelson BH. Appl Neuropsychol. 2001;8(1):23-30.

Distinct cerebrospinal fluid proteomes differentiate post-treatment lyme disease from chronic fatigue syndrome. Schutzer SE, Angel TE, Liu T, Schepmoes AA, Clauss TR, Adkins JN, Camp DG, Holland BK, Bergquist J, Coyle PK, Smith RD, Fallon BA, Natelson BH. PLoS One. 2011 Feb 23;6(2):e17287. doi: 10.1371/journal.pone.0017287.
