Design and Analysis of Controlled Trials in Naturally Clustered Environments: Implications for Medical Informatics
Abstract
In medical informatics research, study questions frequently involve individuals who are grouped into clusters. For example, an intervention may be aimed at a clinician (who treats a cluster of patients) with the intention of improving the health of individual patients. Correlation among individuals within a cluster can lead to incorrect estimates of the sample size required to detect an effect and inappropriate estimates of the confidence intervals and the statistical significance of the intervention effects. Contamination, which is the spread of the effect of an intervention or control treatment to the opposite group, often occurs between individuals within clusters. It leads to an attenuation of the effect of the intervention and reduced power to detect a difference. If individuals are randomized in a clinical trial (individual-randomized trial), then correlation must be taken into account in the analysis, and the sample size may need to be increased to compensate for contamination. Randomizing clusters rather than individuals (cluster-randomized trials) can eliminate contamination and may be preferred for logistical reasons. Cluster-randomized trials are generally less efficient than individual-randomized trials, so the tradeoffs must be assessed. Correlation must be taken into account in the analysis and in the sample-size calculations for cluster-randomized trials.
Medical informatics interventions that are designed to improve the care of patients are frequently aimed not directly at patients but at clinicians who care for multiple patients. For example, a clinical decision support system may give advice to clinicians, yet the important outcome is the patients' health. When such interventions are studied, determining the appropriate randomization method, sample size, and approach to analysis can be challenging. Even when the intervention is aimed at the individual, natural clustering into clinicians' practices, families, outpatient practices, hospital wards, schools, and communities can affect the results. Clustering may be nested1; for example, patients may be clustered by physicians and physicians may be clustered by multiphysician practices.
In individual-randomized trials, a researcher assigns individuals to a control or intervention group without regard to clustering, or at least in such a way that control patients and intervention patients may be mixed within a cluster. Thus, a given clinician may treat control patients and intervention patients, receiving computer-generated advice on some patients but not on others. If the researcher randomizes individuals but ignores the natural clustering of patients in the analysis, erroneous conclusions may result because of correlation among patients within clusters. Furthermore, contamination due to indirect benefit from the intervention to patients in control groups may affect the results.
In cluster-randomized trials, investigators randomize intact units such as clinicians, families, practices, or other clusters into intervention or control groups. For example, in a study by McDonald et al.,2 27 practice teams were randomly allocated to use a clinical decision support system or to serve as controls. In another example, McDowell et al.3 randomized 822 families to receive no influenza vaccine reminder (control group) or to receive a reminder from their clinician, by telephone, or by letter. If the researcher randomizes by cluster, then the power of the study to detect differences between groups will decrease and, if the analysis is not done correctly, erroneous conclusions may be drawn.
A review of evaluations of clinical decision support systems4 revealed that 24 of 61 studies randomized clusters of patients. Of those 24, only one mentioned sample size calculations, and it did not account for clustering, and only 14 accounted for clustering in the analysis. In this paper, we review the issues surrounding the clustering of patients, covering randomization, sample size, and approaches to analysis.
Addressing the Pitfalls of Naturally Clustered Environments
In the discussion that follows, we refer frequently to an example in which the patients are subjects and are clustered by virtue of seeing the same clinician or attending the same clinic. The discussion applies more broadly, however, to students in classrooms, clinicians in practices, hospitals in networks, and others. The clustering of subjects into groups can lead to two major effects—correlation and contamination.
Correlation
Two variables are said to be correlated when they change together, in either the same or opposite directions.5 In a naturally clustered environment, correlation can occur when patients in a cluster have outcomes influenced by some common factor. In most medical informatics studies, correlation will be positive: Patients within groups will tend to have similar outcomes.
Correlation within clusters has several sources.6–8 First, patients often self-select; that is, they choose the clusters to which they belong. For example, patient characteristics such as age, gender, ethnic group, location, or insurance plan may influence their choice of clinicians, leading to similarities across the patients who see a given clinician.
Second, important cluster-level attributes may affect all cluster members in the same way. For example, the rates of clinicians' compliance with the reminders generated by a clinical decision support system may vary systematically across clinicians. Thus, patients who receive care from the same clinician may be more likely to receive similar treatment than are patients who receive care from different clinicians.
Third, individuals within a cluster often influence each other. For example, the transmission of attitudes and behavior among a cluster of clinicians can lead to all members of the cluster displaying similar behavior. Thus, all clinicians who work on a specific ward or at a specific hospital may have a similar rate of compliance with reminder messages.
Correlation within clusters may affect the results of individual-randomized trials and cluster-randomized trials differently. In individual-randomized trials, most clusters have both intervention and control subjects. Assuming positive correlation, subjects within clusters will look more like each other than subjects in other clusters. Ideally, one would like to compare intervention subjects with control subjects within each cluster (because subjects are most like each other within clusters) and then average the effect across clusters.9
If, instead, clustering is ignored, then all the intervention subjects are pooled and all the control subjects are pooled and the two groups are compared. The variance due to differences between clusters (which is relatively large) is mixed with variation between subjects within clusters (which is relatively small because of the positive correlation), resulting in unnecessarily large standard deviations, larger p values, wider confidence intervals, and false-negative results.9 By accounting for clustering, the researcher can remove a significant source of variance (the clusters) and create more precise estimates.
In cluster-randomized trials, failure to account for clustering may even lead to false-positive results with erroneously small p values. In this case, each cluster has either intervention subjects or control subjects but not both. Comparing intervention with control subjects therefore requires comparisons across clusters, and the variance due to differences between clusters contributes to the variance of the estimates (rather than being something to be factored out).
Ignoring clustering in the analysis will mix cluster variance (relatively large) with variance between subjects within clusters (relatively small because of the positive correlation) and lead to an underestimate of the overall variance, with inappropriately small p values and narrow confidence intervals. (For example, if the correlation is high, then adding patients to existing clusters will add little information to the data set, yet an analysis that ignores clustering will count each new patient as an independent observation and exaggerate the precision of its estimates.) A proper analysis that accounts for clustering will correctly estimate the variance of the estimates. In either form of randomization, proper analysis can account for correlation.
Contamination
Contamination is the spread of the effect of an intervention from one group to another. Contamination occurs when the control group members are exposed to the experimental intervention or the intervention group members are exposed to the control treatment.10 Contamination is a concern in medical informatics studies. For example, in a study of clinical decision support systems with patients as the randomization factor, clinicians may gain knowledge from use of the system in intervention patients and apply it to control patients.11 Similarly, in a trial for the prevention of coronary heart disease, participants in the control group may learn about the experimental intervention and adopt it themselves because intervention and control participants are in close proximity and information may be shared.10
Contamination leads to the attenuation of the treatment effect, because the control group and the intervention group look more alike. The result is a reduced ability to detect an effect (false-negative result). The researcher can compensate for contamination by increasing the sample size; the effect will be underestimated, but it may still be significantly different statistically from no effect.
The researcher can eliminate the contamination by using a cluster-randomized design, thus separating the control subjects from the intervention subjects. McDonald et al. used a cluster-randomized design to reduce contamination: by randomizing practice teams, the investigators could ensure that the clinicians caring for the control subjects would be isolated from the effects of experience with the clinical decision support system.2 Similarly, in the study by McDowell et al.,3 use of families as the randomization unit reduced potential contamination. If an individual were randomized to serve as a control, but the spouse was randomized to the intervention group, then a reminder issued to the spouse might have an influence on the vaccine-seeking behavior of the control subject.
Randomization Method: Making a Choice
Neither an individual-randomized design nor a cluster-randomized design is superior in all circumstances, so the researcher must make a choice. Assuming that the data will be analyzed properly to account for natural clustering (see subsequent sections), the choice depends on efficiency, contamination, and practical considerations such as ethics, cost, and feasibility.10
If contamination is low, then individual-randomized trials tend to be more efficient than cluster-randomized trials and thus tend to require fewer subjects. If individuals are randomized, then each cluster will usually have individuals both in the control and in the intervention groups, so that control and intervention subjects are in effect matched by cluster, making it easy to remove the effect of intercluster variability. If clusters are randomized, then each cluster has only control subjects or only intervention subjects. Intercluster variability can still be estimated and accounted for, but the analysis is less efficient.
If contamination is likely to be significant, then a cluster-randomized design that eliminates contamination can improve the probability of detecting an effect, and this improvement may be greater than the relative loss of efficiency due to correlation within clusters. In terms of total sample size, cluster-randomized trials may be more efficient than individual-randomized trials when contamination is greater than 30 percent.12
Generally, individual-randomized trials should be considered first if an intervention is delivered to individual patients directly. Both correlation within clusters (patients being treated by the same clinician) and contamination are still possible, but as long as clusters are accounted for in the analysis, individual randomization is likely to be more efficient.
If an intervention is delivered to clinicians, a cluster-randomized trial may be superior because contamination is difficult to quantify and may in fact be very high.10 McDonald et al. argued, however, that some interventions that are aimed at clinicians but that involve complex calculations may be immune to contamination because the clinicians cannot apply the experience to the control subjects without the help of the information system.13 The justification for the use of the clusters as the units of allocation should be reported explicitly.14
Logistical or ethical issues may favor a cluster-randomized design. For example, women invited to participate in a breast cancer screening program may want to discuss their options and the associated risks and benefits with their clinician before making a decision. If only half of a particular clinician's patients have been invited to participate, patients not invited may feel resentment.15 For this reason, many trials of cancer screening programs have adopted cluster-randomized designs.
Design and Analysis of Individual-randomized Trials
Handling Correlation in the Analysis of Individual-randomized Trials
In this section, we consider the two main approaches to accounting for correlation within clusters—fixed effects models and random effects models. Both types of models separate the variability between clusters from the effect of the intervention, thus improving the precision of the estimate of the treatment effect.
In a fixed effects model, the effects of different clusters on the outcome are regarded as fixed but unknown quantities to be estimated.16 This model assumes that the clusters observed in the experiment are the only clusters of interest. If subjects are clustered by academic department, for example, and if all the departments in an institution (or all the departments of interest) are included in the study, then department can be treated as a fixed effect. Fixed effects models estimate regression coefficients (using linear regression or logistic regression, for example) to embody the effect of each cluster. Strictly speaking, conclusions drawn from the experiment apply specifically to those clusters that were included in the study, although it can be inferred that similar results might be obtained in similar clusters.
Given a well-defined question, fixed effects models usually lead to more precise results, and such models are the only realistic option if there are very few clusters. True random sampling of clusters is rare in clinical trials.
Random effects models assume that the clusters are a random sample drawn from a larger population of potential clusters.16 For example, a study may include a limited number of clinical practices that constitute a random sample drawn from some larger number of practices. The results of the random effects analysis apply not just to the clusters that were observed but also to the larger population from which they were drawn. The results are thus more generalizable than those for the fixed effects models. Such models offer a more realistic representation of the true uncertainties due to the sampling of clusters, and this is represented by wider confidence intervals for the results.17 The choice between fixed or random effects models may depend on the sampling situation and on the intended scope of the inferences.9,17–20
In practice, a given model may have many terms, some of which are random and some of which are fixed. For example, the cluster term may be a random effect, but the intervention term, which takes on only two values (“intervention” or “control”), will be fixed. Such models may be referred to as “mixed” effects models or simply as random effects models. The important issue in this discussion is whether the cluster term is a fixed or random effect.
To illustrate the two models and the consequence of not accounting for clustering, we simulated a data set for an individual-randomized trial in a naturally clustered environment. In the simulation, 100 patients were randomized to treatment (placed on an antihypertension guideline) or to control (no guideline). Patients attended one of ten clinics, and correlation due to self-selection within clinics was expected. Randomization did not take clinics into account.
The data set contained four variables—patient age (a continuous variable), clinic identifier (a nominal variable), treatment group (a dichotomous variable), and systolic blood pressure (the continuous outcome variable). The intracluster correlation coefficient for systolic blood pressure within clinics was 0.470 after adjustments for age and for treatment group; thus there was a significant positive correlation in the data set. We used linear regression (GLM procedure in SAS) for the fixed effects models and linear mixed models (MIXED procedure in SAS) for the random effects model.21,22 (The data sets and SAS code are available at http://www.dmi.columbia.edu/homepages/chuangj/cluster/.)
The first model (Table 1) ignored clustering (the clinic variable) in the analysis, resulting in a wide confidence interval for the treatment effect and a nonsignificant p value (p=0.136). The second model accounted for clustering as a fixed effect, resulting in a narrower confidence interval and a statistically significant p value (p=0.013). The third model accounted for clustering as a random effect, resulting in a confidence interval that was only slightly wider than that for the corresponding fixed effects model and a similar p value (p=0.015). Clearly, the decision to account for clustering (by any method) was most important. The choice between a fixed effects model and a random effects model to account for clustering led to only a subtle difference in this example.
Table 1. Models fitted to the simulated data set

| Model | Description | Fixed Effects | Random Effects | p Value of Difference | Treatment Difference (95% CI) |
|---|---|---|---|---|---|
| 1 | Fixed effects model that ignores clustering (incorrect model) | Age, treatment | – | 0.136 | 4.5 mmHg (−1.4 to 10.4) |
| 2 | Fixed effects model that accounts for clustering | Age, treatment, clinic | – | 0.013 | 6.0 mmHg (1.3 to 10.7) |
| 3 | Random effects model that accounts for clustering | Age, treatment | Clinic | 0.015 | 5.8 mmHg (1.1 to 10.5) |
When the outcome variable is binary rather than continuous, a number of approaches may be taken to analyzing individual-randomized trials in naturally clustered environments. The tutorial by Agresti and Hartzel, for example, compared Mantel-Haenszel methods (FREQ procedure in SAS), logistic regression (GENMOD procedure in SAS), and a generalized linear mixed model (NLMIXED procedure in SAS).19
Compensating for Contamination in Individual-randomized Trials
To make up for the attenuation of the treatment effect due to contamination, the researcher may increase the sample size by the factor $1/(1 - \text{contamination})^{2}$, where “contamination” is the proportion of the treatment effect that is attenuated.10,23 For example, if the treatment effect is reduced by 10 percent because of contamination, then the sample size should be increased by approximately 25 percent ($1/0.9^{2} \approx 1.23$). The degree of attenuation is difficult to estimate, however, and the calculation of adjusted sample sizes is often impractical in medical informatics studies. In these cases, a cluster-randomized design may be better.
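The arithmetic behind this adjustment can be sketched in a few lines of Python. This is only an illustration: the function names are ours, and the factor is assumed to take the form 1/(1 − contamination)^2, which follows from the required sample size varying inversely with the square of the detectable effect.

```python
import math


def contamination_inflation(contamination: float) -> float:
    """Sample-size multiplier, assuming the form 1 / (1 - contamination)^2."""
    if not 0.0 <= contamination < 1.0:
        raise ValueError("contamination must be in [0, 1)")
    return 1.0 / (1.0 - contamination) ** 2


def inflated_sample_size(n: int, contamination: float) -> int:
    """Round the inflated sample size up to a whole number of subjects."""
    return math.ceil(n * contamination_inflation(contamination))


# A 10 percent attenuation gives a factor of about 1.23,
# i.e. roughly a quarter more subjects.
factor = contamination_inflation(0.10)
print(f"inflation factor: {factor:.3f}")
print("subjects needed instead of 100:", inflated_sample_size(100, 0.10))
```

Because the factor grows quickly as contamination approaches 1, even a modest error in the assumed attenuation changes the required sample size substantially, which is one reason the adjustment is hard to apply in practice.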
Design and Analysis of Cluster-randomized Trials
Choice of Study Design
When researchers randomize clusters rather than individuals, several common designs are available: completely randomized, stratified, and matched-pair.14 In a completely randomized cluster study, each cluster is assigned with a predefined probability to one of the possible intervention groups, and assignments are made independently of each other. In a stratified randomization cluster study, clusters are presorted into strata according to characteristics that are likely to be associated with the outcome, and then clusters within strata are randomly assigned to intervention groups. Completely randomized and stratified designs may also employ a blocking strategy to ensure that the number of clusters per treatment group is approximately balanced.23–25 In a matched-pair cluster design, pairs of clusters are matched on the basis of characteristics that are likely to be associated with the outcome, and then one randomly chosen cluster in each pair is assigned to the intervention group and the other is assigned to the control group.
Matched-pair and stratified designs can reduce the potential for baseline imbalance (imbalance in factors that are likely to be correlated with outcomes), especially in a small study.26 Just as in individual-randomized trials, stratification and matching should be done only on cluster-level variables that are known to be highly correlated with outcome.14,27 Frequently used stratification factors include cluster size, geographic area, socioeconomic indicators, and characteristics of clinicians and hospitals.28–31
Sample Size Estimation in Cluster-randomized Trials
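A matched-pair allocation can be sketched as follows. This is a simplified illustration rather than a recommended procedure: clusters are matched on a single cluster-level variable by sorting and pairing neighbors, and the practice names, sizes, and the function `matched_pair_randomize` are hypothetical.

```python
import random


def matched_pair_randomize(clusters, key, seed=None):
    """Pair clusters with adjacent values of `key`, then randomize within pairs.

    Returns a dict mapping cluster name -> "intervention" or "control".
    Matching on one sorted variable is a simplifying assumption.
    """
    rng = random.Random(seed)
    ordered = sorted(clusters, key=key)
    if len(ordered) % 2:
        raise ValueError("a matched-pair design needs an even number of clusters")
    assignment = {}
    for i in range(0, len(ordered), 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # coin flip deciding which member gets the intervention
        assignment[pair[0]["name"]] = "intervention"
        assignment[pair[1]["name"]] = "control"
    return assignment


# Hypothetical practices, matched on cluster size.
practices = [
    {"name": "A", "size": 20}, {"name": "B", "size": 120},
    {"name": "C", "size": 25}, {"name": "D", "size": 110},
    {"name": "E", "size": 60}, {"name": "F", "size": 66},
]
arms = matched_pair_randomize(practices, key=lambda c: c["size"], seed=1)
for name in sorted(arms):
    print(name, arms[name])
```

Each pair contributes exactly one cluster to each arm, so the arms are balanced on the matching variable by construction; a stratified design would instead randomize freely within larger strata.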
Applying standard sample size formulas to cluster-randomized trials may lead to underestimation of the required sample size. For a completely randomized cluster design, we multiply standard sample size estimates by a design effect (Deff) term given by

$$\mathrm{Deff} = 1 + (m - 1)\rho$$

where $m$ denotes the estimated average cluster size (average number of individuals per cluster, assuming that all clusters are of a similar size) and $\rho$ denotes the intracluster correlation coefficient. Methods for sample-size estimation for stratified and matched-pair designs are presented elsewhere.14,32,33
The intracluster correlation coefficient represents the ratio of between-cluster variability to total variability, where $0 \le \rho \le 1$:

$$\rho = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2}$$

where $\sigma_b^2$ denotes the between-cluster variance component and $\sigma_w^2$ denotes the within-cluster variance component.24 If $\rho$ equals 0, then individuals within the same cluster are no more alike than individuals in different clusters. If $\rho$ equals 1, there is no variability within a cluster, and individuals within the same cluster respond identically.23,27 The correlation coefficient can be obtained from previously published studies or from the analysis of pilot data. A random effects model (for example, MIXED procedure in SAS) can supply the estimates of the variance components ($\sigma_b^2$ and $\sigma_w^2$).
An example is taken from Kerry and Bland,34 in which a behavioral intervention trial was proposed to reduce smoking rates among patients seen by primary care practice groups. The clusters were patients seen at individual practices, and the researchers used completely randomized clusters. The goal was to estimate the sample size needed to achieve 90 percent power at the two-sided 5 percent level of significance for detecting a 5 percent prevalence difference in smoking. The smoking rate in the control group was 22 percent and the between-practice variance ($\sigma_b^2$) of the smoking rate was 0.0014, which was obtained from an earlier study.35
Based on a standard sample size equation for an individual-randomized trial, the total number of patients required per intervention group was 1,318. The mean of the smoking rates ($\bar{P}$) between the control (22 percent) and intervention (17 percent) groups was 19.5 percent. The value of within-practice variance ($\sigma_w^2$) was 0.1570, which was estimated from $\bar{P}(1 - \bar{P})$. Based on $\sigma_b^2$ and $\sigma_w^2$, the intracluster correlation coefficient was 0.0088.
If the number of patients per practice, m, was 50, then the design effect was 1.43, and the number of patients required per intervention group rose to about 1,900 (1,318 × 1.43 ≈ 1,885, rounded up to whole practices of 50), implying 38 practices in each intervention arm (rather than 27 practices in each arm if the original sample size of 1,318 had been used). Failing to account for clustering might have led to a negative result due to insufficient sample size.
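The worked example can be retraced with a short script. It is a sketch under stated assumptions: the source does not say which "standard sample size equation" was used, so the two-proportion formula below is one common textbook choice that happens to reproduce the reported 1,318; the function names are ours, and the design effect is taken as 1 + (m − 1)ρ.

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p_control, p_intervention, alpha=0.05, power=0.90):
    """Per-group n for two proportions under individual randomization.

    Assumed formula (not stated in the source):
    n = (z_a * sqrt(2*p_bar*q_bar) + z_b * sqrt(p1*q1 + p2*q2))^2 / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_intervention) / 2
    term_null = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_beta * math.sqrt(
        p_control * (1 - p_control) + p_intervention * (1 - p_intervention)
    )
    return math.ceil((term_null + term_alt) ** 2 / (p_control - p_intervention) ** 2)


def intracluster_correlation(var_between, var_within):
    """rho = between-cluster variance / total variance."""
    return var_between / (var_between + var_within)


def design_effect(m, rho):
    """Deff = 1 + (m - 1) * rho for clusters of average size m."""
    return 1 + (m - 1) * rho


m = 50                                      # patients per practice
n_individual = n_per_group_two_proportions(0.22, 0.17)
p_bar = (0.22 + 0.17) / 2
var_within = p_bar * (1 - p_bar)            # about 0.1570
rho = intracluster_correlation(0.0014, var_within)
deff = design_effect(m, rho)
practices_per_arm = math.ceil(n_individual * deff / m)
practices_if_ignored = math.ceil(n_individual / m)

print(n_individual, round(rho, 4), round(deff, 2))
print(practices_per_arm, "practices per arm vs", practices_if_ignored, "if clustering is ignored")
```

Rounding up to whole practices is what turns roughly 1,885 patients into 38 practices (1,900 patients) per arm.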
Handling Correlation in the Analysis of Cluster-randomized Trials
To avoid false-positive or otherwise misleading conclusions, it is critical to account for correlation within clusters in cluster-randomized trials. There are two overall approaches to the analysis: cluster-level analyses and individual-level analyses that account for clustering. (Note that the level of the randomization may be different from the level of analysis. One may randomize clusters but still analyze by individuals as long as this is done correctly.)
Cluster-level analyses aggregate the individual observations using cluster means, proportions, or log odds, resulting in a single value per cluster. Standard statistical methods are then applied. In general, cluster-level analyses can be used for completely randomized, matched-pair, and stratified cluster designs. For example, a standard or a weighted two-sample Student t test or a Wilcoxon rank-sum test (a nonparametric approach also known as the Mann-Whitney U test) can be applied to cluster means36,37 in a completely randomized design. The weighted t test is preferred to the standard t test if cluster sizes vary significantly. A paired t test, a weighted paired t test, or a signed-rank test can be applied at the cluster level in matched-pair and in stratified designs. Although cluster-level analyses address the problem of correlation within clusters, individual-level covariates cannot be analyzed because the data have been aggregated.
Individual-level analyses preserve the individual observations but still account for the correlation within clusters. Random effects models and generalized estimating equations (GEEs)38 are commonly used. Individual-level analyses can be applied to any of the randomized cluster designs. They account for both individual and cluster-level variation and thus may produce more precise estimates than cluster-level analyses. In addition, individual-level analyses can be applied to nonrandomized designs because they support the control of baseline differences between groups; nevertheless, randomized studies are preferred when they are possible.
The choice between individual and cluster-level analyses in cluster-randomized trials depends on several factors.14,27 In the absence of adjustment for individual-level covariates (that is, if the individual-level information is ignored, other than its contribution to the cluster mean), the two approaches will produce similar results. When the focus of the inference is at the cluster level (for example, clinicians' compliance with clinical guidelines), cluster-level analyses are preferred because the units of intervention are the same as the units of evaluation. Furthermore, when the number of clusters per intervention group is small, it may be better to use a cluster-level analysis. Individual-level analyses that account for clustering may be more appropriate when the individual outcomes (for example, the status of diabetic control) are the primary focus. In circumstances where individual covariates need to be adjusted, individual-level analysis should be used.
According to our previous review,4 11 of 14 clinical decision support system studies that did account for clustering in their analyses used cluster-level analyses.2,29–31,39–45 They most frequently used the two-sample t test and analysis of variance. The other three studies used individual-level analyses.46–48 Of these, one used a random effects model and two used generalized estimating equations.
We used data from Kerry and Bland36,37 to illustrate the analysis of cluster-randomized studies. They studied the effect of a clinical guideline on the appropriateness of radiology referrals in 34 general medical practices (Table 2). Intervention practices received a copy of the guideline and control practices did not. Practices (clusters) were of widely varying size. The outcome was dichotomous: whether or not the patient's care conformed to the guideline. We used the SAS System22 for all analyses. The results are shown in Table 3.
Table 2. Conforming radiology requests by practice

| | Intervention Group: No. of Conforming Requests/Total | Intervention Group: % | Control Group: No. of Conforming Requests/Total | Control Group: % |
|---|---|---|---|---|
| | 20/20 | 100 | 7/7 | 100 |
| | 7/7 | 100 | 33/37 | 89 |
| | 15/16 | 94 | 32/38 | 84 |
| | 28/31 | 90 | 23/28 | 82 |
| | 18/20 | 90 | 16/20 | 80 |
| | 21/24 | 88 | 15/19 | 79 |
| | 6/7 | 86 | 7/9 | 78 |
| | 5/6 | 83 | 19/25 | 76 |
| | 25/30 | 83 | 90/120 | 75 |
| | 53/66 | 80 | 64/88 | 73 |
| | 4/5 | 80 | 15/22 | 68 |
| | 33/43 | 77 | 52/76 | 68 |
| | 32/43 | 74 | 14/21 | 67 |
| | 16/23 | 70 | 83/126 | 66 |
| | 44/64 | 69 | 14/22 | 64 |
| | 4/6 | 67 | 21/34 | 62 |
| | 10/18 | 56 | 4/10 | 40 |
| Mean (SD) | – | 81.5 (12.0) | – | 73.6 (13.1) |
| Total* | 341/429 | 79.5 | 509/702 | 72.5 |

Source: Kerry SM, Bland JM.37 © 1998 BMJ Publishing Group. Used with permission.

* The percentage conforming based on the raw total does not equal the mean of the percentage conforming for each cluster, because clusters are of unequal size.
Table 3. Results of the analyses of the data in Table 2

| Statistical Test | Test Statistic (df) | p Value | Effect Size | 95% CI |
|---|---|---|---|---|
| Ignoring clustering in the analysis (incorrect approach): | | | | |
| Chi-square test | χ^{2}=6.95 (1 df) | 0.008 | PD=7% | 2% to 12% |
| | | | OR=1.5 | 1.1 to 2.0 |
| Cluster-level analyses: | | | | |
| Two-sample t test | t=1.84 (32 df) | 0.074 | MD=8% | −1% to 17% |
| Weighted t test | t=2.09 (32 df) | 0.044 | WMD=7% | 0% to 14% |
| Individual-level analyses that account for clustering: | | | | |
| Random effects model | t=2.00 (1,097 df) | 0.046 | OR=1.5 | 1.0 to 2.3 |
| Generalized estimating equations | z=2.06 | 0.040 | OR=1.5 | 1.0 to 2.1 |

Abbreviations: PD indicates proportion difference; MD, mean difference; WMD, weighted mean difference; OR, odds ratio; CI, confidence interval; df, degrees of freedom.

▪ Ignoring clustering in the analysis (incorrect approach). If clustering is ignored, a simple 2×2 contingency table is formed and a chi-square test (FREQ procedure in SAS) can be used to compare the two proportions. The difference in conformance (7 percent) appeared to be highly significant (p=0.008). This naïve approach must not be used in practice, because it will count every individual as an independent observation despite positive correlation within clusters. The result will be artificially precise estimates and artificially low p values.

▪ Two-sample t test (cluster-level analysis). For a simple cluster-level analysis, we used a two-sample t test (TTEST procedure in SAS) with the percentage of patients conforming per practice as the observations. The mean difference in conformance (8 percent) was not significant (p=0.074).

▪ Weighted t test (cluster-level analysis). The number of patients per practice varies widely in this example, so a weighted t test may be preferred.27 We used a t test weighted by the number of patients per cluster (TTEST procedure with WEIGHT statement in SAS).36 The weighted mean difference (7 percent) was statistically significant (p=0.044).

▪ Random effects model (individual-level analysis). We used the SAS GLIMMIX macro to implement a generalized linear mixed model. In this example, the random effects variable was the practice (cluster); the other variables were fixed effects. The odds ratio (1.5) was significantly different from 1 (p=0.046). (With Version 7 of SAS, the NLMIXED procedure can be used to fit a similar model.)

▪ Generalized estimating equations (individual-level analysis). Generalized estimating equations have become important tools for analyzing longitudinal and other correlated data.14,27,38 They generate estimates similar to those from standard logistic regression but adjust for the effect of clustering. The structure of the correlation can be estimated from the data. We used the GENMOD procedure with REPEATED statement in SAS to implement generalized estimating equations. The odds ratio (1.5) was significantly different from 1 (p=0.040).
In this example, the naïve approach gave a deceptively low p value, far lower than any other approach tried. A cluster-level analysis using a simple t test did not reach statistical significance, which we attribute to the widely varying cluster sizes. All other approaches had similar results, with p values ranging from 0.040 to 0.046. Therefore, the single most important decision is to account for clustering in the analysis. Failure to do so can lead to incorrect results. The differences among the various methods of accounting for clustering can be subtle, although a poor choice can result in some loss of power.
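The contrast between the naïve test and the simplest cluster-level test can be retraced from the counts in Table 2. The sketch below is not the SAS code used for the paper: it uses only the Python standard library, the function names are ours, and it assumes an uncorrected Pearson chi-square and an unweighted pooled-variance t statistic.

```python
import math
from statistics import mean, variance

# (conforming requests, total requests) per practice, transcribed from Table 2.
intervention = [(20, 20), (7, 7), (15, 16), (28, 31), (18, 20), (21, 24),
                (6, 7), (5, 6), (25, 30), (53, 66), (4, 5), (33, 43),
                (32, 43), (16, 23), (44, 64), (4, 6), (10, 18)]
control = [(7, 7), (33, 37), (32, 38), (23, 28), (16, 20), (15, 19),
           (7, 9), (19, 25), (90, 120), (64, 88), (15, 22), (52, 76),
           (14, 21), (83, 126), (14, 22), (21, 34), (4, 10)]


def naive_chi_square(a_rows, b_rows):
    """Pearson chi-square (1 df, no continuity correction) on pooled counts.

    This treats every request as independent, which is the incorrect approach.
    """
    a_yes = sum(c for c, _ in a_rows)
    a_n = sum(n for _, n in a_rows)
    b_yes = sum(c for c, _ in b_rows)
    b_n = sum(n for _, n in b_rows)
    a_no, b_no = a_n - a_yes, b_n - b_yes
    total = a_n + b_n
    chi2 = (total * (a_yes * b_no - a_no * b_yes) ** 2
            / (a_n * b_n * (a_yes + b_yes) * (a_no + b_no)))
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p


def cluster_level_t(a_rows, b_rows):
    """Unweighted two-sample t statistic with one proportion per cluster."""
    a = [c / n for c, n in a_rows]
    b = [c / n for c, n in b_rows]
    df = len(a) + len(b) - 2
    pooled = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / df
    se = math.sqrt(pooled * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / se, df


chi2, p_naive = naive_chi_square(intervention, control)
t_stat, df = cluster_level_t(intervention, control)
print(f"naive chi-square = {chi2:.2f}, p = {p_naive:.3f}")
print(f"cluster-level t = {t_stat:.2f} on {df} df")
```

The pooled analysis works with 1,131 requests, whereas the cluster-level analysis works with 34 practice proportions, which is why the same data look far less conclusive once the unit of analysis matches the unit of randomization.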
Conclusions
In many medical informatics studies, individuals are naturally grouped into clusters. Because of correlation within clusters and contamination between intervention groups within clusters, false-positive or false-negative results may be obtained if clustering is not accounted for in sample size estimates and in the analysis. Cluster-randomized designs can eliminate contamination and may sometimes be appropriate for logistical reasons, but they must be analyzed appropriately.
Acknowledgments
The authors thank Lyn Dupré for her editing.
Footnotes

This work was supported in part by grant R01 LM06910 from the National Library of Medicine and grant R01 HL65365 from the National Institutes of Health. Dr. Chuang was supported in part by the Medical Foundations in Memory of Dr. ChiShuen Tsou and Dr. Albery LyYoung Shen, Taiwan.
 American Medical Informatics Association