When data from periodontal clinical trials are reported, clinicians should not be left to wonder about the clinical relevance of statistically significant results. As new procedures are developed, they need to be evaluated to determine whether they provide statistically and clinically significant benefits. If both are demonstrated, the therapeutic modality can be considered for incorporation into treatment regimens. Currently, there is too great a reliance on statistical significance testing, or hypothesis testing, to detect a statistically significant difference between therapies, which then often is used to infer that a therapy supplied a clinically meaningful result.16 This is problematic because a procedure can provide a statistically significant improvement that is not clinically significant.
Accordingly, there is a need to define a set of criteria related to clinical parameters that would reflect important therapeutic changes that have clinical relevance to the practitioner. In this literature review, I address the limitations of statistical significance testing and the advantages of identifying criteria related to periodontal clinical parameters used to define clinical significance. These two related subjects are important because they can influence how clinicians interpret the results of clinical trials and how patients subsequently are treated.
GOAL OF PERIODONTAL THERAPY
The primary purpose of periodontal therapy is to attain periodontal health and retain the dentition. This goal includes restoration of form and function, esthetics and prevention of further disease progression. The absence of clinical inflammation (for example, redness or bleeding on probing), stable or decreasing probing depths, unchanging or improving clinical attachment and bone levels, unvarying or decreasing mobility patterns, tooth retention, comfortable function and patient satisfaction indicate health and reflect successful outcomes of periodontal therapy.7,8
THE PROBLEM OF DEFINING CLINICAL SIGNIFICANCE
A dilemma concerning the interpretation of data from periodontal clinical trials exists because the profession has been reluctant to establish standards to quantify clinical significance. This has resulted in using arbitrary statistical standards to define the merits of therapeutic techniques. Feinstein9 stated that "there is an entrenched reluctance to judge the familiar" (use of routine clinical parameters) "and docile conformity in accepting the unfamiliar" (hypothesis testing). This problem could be resolved if changes representing meaningful results were defined for clinical parameters in diverse situations, thereby facilitating hypothesis testing regarding relevant clinical findings. This task is daunting; therefore, no consensus has been reached in defining the magnitudes of change for specific parameters that could reflect clinically significant improvements.
Periodontal diseases are site specific, and various types of defects and diseases may require different definitions of clinical significance for their responses to therapy.6 In this regard, it would be advantageous to define important clinical improvements that previously have been referred to with different terms such as "clinically significant," "clinically meaningful," "clinically relevant," "quantitatively significant," "of biological distinction" or "substantively important."10 Identifying important criteria related to particular periodontal parameters would help clinicians select the most appropriate therapy for specific problems. No one criterion applies to all situations. Nevertheless, the need to quantify clinical significance becomes very apparent when hypothesis testing's shortcomings are delineated.
STATISTICAL SIGNIFICANCE TESTING'S SHORTCOMINGS
There are four steps in the development of an analytical investigation: establishing the research hypothesis (the association between exposure and outcomes believed to exist), study design, data collection and statistical assessments that include hypothesis testing.11 Statistical significance testing, also called hypothesis testing, is an extension of the scientific method. It provides a yes or no answer to the question of whether there are differences between variables, or outcome measures, in test and control groups. The term "statistical significance," however, denotes only that the association between tested variables was unlikely to have occurred by chance. The term "statistically significant at an α level of .05" means that the null hypothesis (that there is no relationship regarding specific variables between test and control groups) was rejected and that the probability of the association arising by chance alone was small (5 percent or less, or 1 chance in 20).11-13 When the null hypothesis is rejected erroneously, it is referred to as a Type I error; when the null hypothesis is accepted incorrectly, it is called a Type II error or β error.11,12 Issues related to the inability of hypothesis testing to provide clarity regarding detecting clinically relevant periodontal changes follow.
Statistical rarity.
Hypothesis testing assesses whether the difference between means of a variable in test and control groups can be attributed to chance. However, the level of statistical significance (α level) of .05, which often is used in clinical trials, is selected arbitrarily. It initially was recommended by Fisher, who noted that measures of 1.96 standard deviations on either side of the mean of a Gaussian curve would include 95 percent of the data.9 He concluded that the outermost 5 percent of the data were unusual and significant in their divergence. This standard was selected because it was convenient, and it has become entrenched as scientific dogma. Fisher subsequently wrote that this fixed level of significance was "absurdly academic" and that it should be flexible based on the evidence.9 For example, if a large clinical effect is detected that has a P value of .06 or .07 (not statistically significant if the α level was set at .05), it is not prudent to ignore a potentially clinically significant finding. The failure to reach statistical significance in such a case probably could be attributed to inadequate planning of the sample size needed to detect a clinically relevant difference with a reasonably high probability.
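The link between 1.96 standard deviations and the conventional .05 level can be checked numerically. The following is a minimal sketch using only Python's standard library; the function names are illustrative and are not taken from the cited sources.

```python
from statistics import NormalDist

# Standard Gaussian curve (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)

def central_coverage(z: float) -> float:
    """Fraction of a Gaussian distribution lying within z SDs of the mean."""
    return std_normal.cdf(z) - std_normal.cdf(-z)

def two_sided_cutoff(alpha: float) -> float:
    """Number of SDs that leaves a total of alpha in the two tails."""
    return std_normal.inv_cdf(1.0 - alpha / 2.0)

coverage_196 = central_coverage(1.96)   # close to 0.95
cutoff_05 = two_sided_cutoff(0.05)      # close to 1.96
cutoff_10 = two_sided_cutoff(0.10)      # a more lenient, equally arbitrary choice

print(f"Coverage within 1.96 SD: {coverage_196:.4f}")
print(f"Cutoff for alpha = .05: {cutoff_05:.3f} SD")
print(f"Cutoff for alpha = .10: {cutoff_10:.3f} SD")
```

Choosing a different level simply moves the cutoff, which illustrates why the .05 threshold is a convention and not a property of the data.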
Magnitude of the effect.
Statistical significance testing does not reflect the magnitude of the effect, and the term "statistically significant difference" does not denote that the difference between a test and control group was clinically meaningful with regard to a desired outcome. Furthermore, it should not be presumed that a small P value reflects a large effect. A small P value indicates only that a result at least this extreme would be unlikely to occur by chance if the null hypothesis were true.
Study population.
A homogeneous population will provide statistically significant relationships more readily than will a less homogeneous population. Furthermore, results of hypothesis testing are affected by the size of the study. For example, if the study sample size is large, a small difference between test and control groups could be statistically significant even though the difference may represent a "small," meaningless clinical result. Conversely, a large clinically significant result may not be statistically significant if the study sample size is small. When large, clinically relevant differences are found that are not statistically significant, post hoc power analyses should be conducted to show that the small sample sizes used gave the study little chance of detecting a clinically relevant difference as statistically significant. As indicated previously, this situation can be avoided by planning the study better.
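The dependence of statistical significance on sample size can be illustrated with a short calculation. This is a sketch under stated assumptions: two independent groups of equal size, a known common standard deviation and a large-sample normal (z) approximation. The probing depth differences, standard deviations and group sizes are hypothetical values chosen for illustration, not data from any trial.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(mean_diff: float, sd: float, n_per_group: int) -> float:
    """Two-sided P value for a difference between two group means.

    Assumes equal group sizes, a known common SD and a normal approximation.
    """
    standard_error = sd * sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(z))

# Hypothetical: a 0.2 mm difference in probing depth reduction (SD 1.0 mm).
p_small_effect_small_n = two_sided_p_value(0.2, sd=1.0, n_per_group=20)
p_small_effect_large_n = two_sided_p_value(0.2, sd=1.0, n_per_group=1000)

# Hypothetical: a 1.0 mm difference (SD 1.5 mm) in a very small study.
p_large_effect_tiny_n = two_sided_p_value(1.0, sd=1.5, n_per_group=8)

print(f"0.2 mm, n = 20 per group:   P = {p_small_effect_small_n:.3f}")
print(f"0.2 mm, n = 1000 per group: P = {p_small_effect_large_n:.6f}")
print(f"1.0 mm, n = 8 per group:    P = {p_large_effect_tiny_n:.3f}")
```

Under these assumptions, the same 0.2 mm difference crosses the .05 threshold only in the large study, while the 1.0 mm difference does not cross it in the small one. The P value therefore says little about the size of the effect.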
Incongruity with the response to therapy.
Statistical significance testing provides a yes or no answer when therapies are compared using differences between mean values in test and control groups. Determining mean values for the efficacy of therapy, however, does not provide any specific information about what occurs at a particular site, as the results there could be greater or less than the mean values. Group means also provide limited information about how treatment responses vary from person to person, and they do not indicate the proportion of patients who attained health or whose health improved, remained the same or deteriorated. Therefore, when clinicians interpret the potential benefit of a therapeutic measure based on the mean value of a variable, they need to extrapolate this information to a particular patient and site, keeping in mind the severity of the defect and the desired treatment outcome. It would be helpful for clinicians if researchers reported additional summary statistics (such as the standard deviation, median, mode, minimum and maximum values, and frequency distributions) because they convey more information than mean data alone.
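As an illustration of the additional summary statistics mentioned above, the following sketch reports them for a small set of site-level results. The attachment gain values are invented for illustration and do not come from any study.

```python
from statistics import mean, median, mode, stdev

# Hypothetical clinical attachment gains (mm) at 12 treated sites.
# Negative values indicate attachment loss. Invented for illustration.
gains_mm = [2, 3, 0, -1, 4, 2, 1, 0, 2, -2, 3, 2]

summary = {
    "mean": mean(gains_mm),
    "sd": stdev(gains_mm),      # sample standard deviation
    "median": median(gains_mm),
    "mode": mode(gains_mm),
    "min": min(gains_mm),
    "max": max(gains_mm),
}

# Proportion of sites that improved, stayed the same or deteriorated.
n_sites = len(gains_mm)
proportions = {
    "improved": sum(g > 0 for g in gains_mm) / n_sites,
    "unchanged": sum(g == 0 for g in gains_mm) / n_sites,
    "deteriorated": sum(g < 0 for g in gains_mm) / n_sites,
}

for name, value in {**summary, **proportions}.items():
    print(f"{name:>12}: {value:.2f}")
```

In this invented data set the mean gain is positive, yet some sites lost attachment, which is the kind of information a mean alone conceals.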
Type II error or β error.
Failure to reject the null hypothesis does not mean that no true difference exists between the test and control groups. Chance or too few subjects being enrolled in the study can prevent the finding of a statistically significant difference when one exists. Researchers can avoid the problem of including too few subjects by performing a power analysis to determine the appropriate sample size that needs to be evaluated.12
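A power analysis of the kind described above can be sketched with a common textbook approximation for comparing two means: n per group = 2(z₁₋α/₂ + z_power)²σ²/δ², where δ is the smallest clinically relevant difference and σ is the common standard deviation. The formula choice, the function names and the numeric inputs below are assumptions for illustration. They are not taken from the cited reference, and a real trial would need a statistician's input.

```python
from math import ceil, sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution

def sample_size_per_group(delta: float, sd: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects per group needed to detect a mean difference delta.

    Assumes two equal groups, a common SD and a two-sided z test.
    """
    z_alpha = _Z.inv_cdf(1.0 - alpha / 2.0)
    z_power = _Z.inv_cdf(power)
    return ceil(2.0 * ((z_alpha + z_power) * sd / delta) ** 2)

def approximate_power(delta: float, sd: float, n_per_group: int,
                      alpha: float = 0.05) -> float:
    """Approximate power of the same test for a given group size."""
    z_alpha = _Z.inv_cdf(1.0 - alpha / 2.0)
    standard_error = sd * sqrt(2.0 / n_per_group)
    return _Z.cdf(abs(delta) / standard_error - z_alpha)

# Hypothetical inputs: SD of 1.0 mm; clinically relevant difference 0.5 or 1.0 mm.
n_half_mm = sample_size_per_group(delta=0.5, sd=1.0)
n_one_mm = sample_size_per_group(delta=1.0, sd=1.0)
power_small_study = approximate_power(delta=0.5, sd=1.0, n_per_group=10)
power_planned = approximate_power(delta=0.5, sd=1.0, n_per_group=n_half_mm)

print(f"n per group to detect 0.5 mm: {n_half_mm}")
print(f"n per group to detect 1.0 mm: {n_one_mm}")
print(f"Power with 10 per group: {power_small_study:.2f}")
print(f"Power with planned n:    {power_planned:.2f}")
```

Halving the difference to be detected roughly quadruples the required sample size, which is why the clinically relevant difference needs to be defined before the study is planned.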
DEFINING CLINICAL SIGNIFICANCE
The definition of clinical significance varies depending on the specific clinical field being addressed, the size of the effect, the measurement used to evaluate a therapy and the clinical importance of the findings.14 The following are various definitions of clinical significance found in the literature.
Hollon and Flick15 suggested that "the minimal unit of clinical significance should be defined in terms of the smallest of reliable changes of interest to some, but not necessarily to all interested parties."
Lindgren and colleagues14 indicated that "when two treatment methods are compared, the smallest difference between therapies with respect to an important outcome variable that would result in a decision to modify treatment denotes clinical significance."
LeFort10 mentioned that the term "clinical significance" reflected "the extent of change, whether the change makes a real difference to subjects' lives, how long the effects last, consumer acceptability, cost-effectiveness and ease of implementation."
Hujoel and colleagues2 suggested a working definition for clinical significance as "statistically significant difference in a clinically important outcome identified in a definitive or phase III clinical trial."
Kingman16 stated that statistical significance should become a necessary condition for clinical significance and that both statistical and clinical significance should coincide. He indicated that to accomplish this, a definition of clinical significance is required. To meet the requirement, a consensus among recognized experts may be needed to define the term.
Killoy17 indicated that clinical significance is a subjective evaluation of significance by the clinician and that before a finding can be clinically significant, it must have achieved statistical significance.
The clinical importance of the data needs to be interpreted by the clinician before making therapeutic decisions. There is, however, no precise way to define clinical relevance regarding how small an improvement is meaningful in every situation. Therefore, taking into consideration the previous definitions, I suggest the following definition of clinical significance: clinical significance denotes a change that may alter how a clinician will treat a patient, and this value judgment varies depending on the situation. Clinicians, researchers, industry representatives, third-party payers and so forth may interpret clinical relevance differently, as they may place emphasis on different outcomes (for example, size of effect, cost, time needed for therapy, ease of implementation, duration of results and consumer acceptability). To arrive at a conclusion that a result is clinically significant, the finding must be clinically meaningful and statistically significant.
PERSPECTIVES ON THE DEFINITION OF CLINICAL SIGNIFICANCE
The following are perspectives that groups interested in defining clinical significance as it pertains to periodontal therapy may consider important. This listing is incomplete, and there are overlapping aspects of defining clinical significance in each category. I am including this information to emphasize that different viewpoints may affect the interpretation of clinical significance.
Clinician.
The clinician plays a critical role in defining what is a clinically meaningful result. The clinician needs to relate various monitored clinical parameters to the goals of therapy. The clinician also may be interested in the size of the effects, time needed for therapy, ease of implementation, cost, side effects, duration of results, consumer acceptability and so forth.3,6,7,10
Patient.
The patient's perspective of successful therapy and clinically significant changes is important. Reduction of specific symptoms that are considered relevant to the treating clinician, however, might not be deemed important by the patient. In these situations, educating the patient about the meaningfulness of specific desired outcomes is important for attaining patient satisfaction. With regard to clinical significance, patients are interested in quality-of-life issues such as retention of their teeth, comfort, good function of their teeth, lack of side effects and resolution of their problems.18,19
Researcher.
The researcher may consider small statistically significant changes to be clinically important because they demonstrate a benefit that did not occur by chance. These small improvements could be considered windows of opportunity that may provide avenues for additional research. On the other hand, one researcher has said that "to determine clinical significance, the overall purpose of the study, its design and size of the effect that may matter to a patient first must be determined."20
Federal regulatory agencies.
Federal regulatory agencies, such as the U.S. Food and Drug Administration, or FDA, are concerned about both safety and effectiveness of treatment methods.21,22 In general, once a product is considered safe, federal regulatory agencies focus on results that are statistically significant, thereby providing statistical evidence that the results attained were unlikely to have occurred by chance. This helps bring products to market that are superior to standard controls or are equivalent to available products. This policy could be affected if the dental profession provided criteria that ought to be attained to reflect clinically significant changes.
Industry.
Companies developing dental products search for statistically significant differences between therapies since it is easier to attain this threshold than it is to try to achieve a level of clinical significance that has not been accepted universally by clinicians. Furthermore, the level of difference needed to achieve a statistically significant difference between products could remain small, thereby facilitating attainment of FDA approval.
Third-party payers.
Third-party payers usually are focused on reimbursement and accountability.23 They may want to see a large change in an important clinical parameter and its ability to prevent or reduce the potential of disease relapse by the patient. Third-party payers would like guidelines for consistency in the use of new therapies.
Public health.
The aim of a health care system is to provide adequate access to quality care at a reasonable cost.24 Large, costly improvements, however, may not be considered clinically relevant because they would benefit only a few people who could afford sophisticated therapies. On the other hand, small inexpensive benefits may be considered clinically important since they could be provided to many patients at a reasonable cost.
METHODS AND CRITERIA TO EVALUATE CLINICAL RELEVANCE
Clinical attachment levels, bone height and bone density are primary outcome variables commonly used to monitor periodontal patients.25 They are considered primary outcome variables since they are part of the tooth's attachment apparatus. Secondary outcome variables include probing depths, mobility patterns, bleeding on probing, and biochemical and microbiological assessments.25 With regard to reporting data for these outcomes, there is no universal agreement as to the best method to measure clinical relevance in periodontal therapy. The advantages and disadvantages of methods and criteria that could help define clinical significance are listed in the table.6,26-42 Ideally, recommendations to be used to identify clinically significant changes associated with periodontal therapy should be established at consensus workshops by experts. Until this occurs, defining clinical significance will depend on interpreting publications and personal experiences.