J Am Dent Assoc, Vol 134, No 5, 575-582.
© 2003 American Dental Association
RESEARCH
A comparison of clinicians' assessment versus a computerized tool
| ABSTRACT |
|---|
Methods. The authors assembled a group of 107 subjects and performed standard periodontal examinations. The authors entered the resulting information into the PRC and calculated risk scores for two and four years, assuming no treatment would be performed. Using the same subject records, three groups of expert clinicians assigned risk scores for years 2 and 4. The authors analyzed the data to reveal the extent of interevaluator variation and the level of agreement between expert clinician scores and PRC scores.
Results. The extent of variation among scores assigned by individual expert clinicians was greater than the authors had expected. Expert clinicians consistently assigned more subjects to PRC risk group 2 and fewer to risk group 5 than did the PRC. The authors observed very high heterogeneity in the risk scores expert clinicians assigned to patients in each of the PRC-assigned groups. Thus, expert clinicians varied greatly in evaluating risk and, relative to the PRC, they appeared to underestimate periodontitis risk, especially for high-risk patients.
Conclusions and Practice Implications. The authors' observations suggest that use of risk scores generated for individual patients by subjective expert clinician opinion about risk in periodontal clinical decision making could result in the misapplication of treatment for some patients and support the use of an objective tool such as the PRC. Use of the PRC over time may be expected to result in more uniform and accurate periodontal clinical decision making, improved oral health, reduction in the need for complex therapy and reduction in health care costs.
In the last several decades, periodontal diseases have been researched intensively, and our knowledge base and understanding have grown greatly. Studies have demonstrated that while bacteria are an essential cause of periodontitis, bacteria alone are insufficient; a susceptible host also is essential. Susceptibility and its quantitative measure, risk, vary greatly from one person to another.1-9 Several determinants of risk and susceptibility have been identified.10-14 Heredity alone appears to account for roughly 50 percent of the risk of developing periodontitis.15 Poor oral hygiene, tobacco smoking and certain systemic diseases and conditions, especially diabetes mellitus, are some of the most significant risk factors.11 These and other factors directly enhance or decrease a person's risk of experiencing periodontal deterioration.
Assessment and use of risk level in prevention and management of periodontitis is complex and difficult. Individual risk factors differ greatly in their importance in enhancing disease susceptibility, and multiple risks appear to be synergistic rather than additive.8 Very little is known about relative weighting of individual factors or interactions among factors that may affect weighting when more than one factor is involved. Consequently, risk assessment and application of risk evaluation to the management of periodontitis remain in their infancy. Risk assessment for periodontitis remains subjective, empirical and variable from one clinician to another and from one patient to another. Development of methods for objective, accurate quantification of risk and susceptibility and application of the results would greatly facilitate patient care.
We have developed a computer-based tool for assessing a patient's risk of experiencing periodontal disease and for predicting disease onset and progression. The tool, called the Periodontal Risk Calculator, or PRC (Dental Medicine International, Philadelphia), is based on mathematically derived algorithms that assign relative weights to the various known risks that enhance a person's susceptibility to periodontitis. It is user-friendly and requires only information that is gathered during routine dental or periodontal examinations. Risk scores determined using the PRC are accurate and valid predictors of future periodontal deterioration, as measured by actual alveolar bone loss and tooth loss over a period of 15 years.16
The purpose of the study reported here was twofold:
Compared with general dentists, periodontists have more training and experience in managing periodontal diseases. We therefore tested the hypothesis that the variation among individual periodontists in assessing levels of risk would be relatively small and the agreement with the PRC-calculated risk scores would be strong, while among general dentists the variation would be greater and agreement with the PRC scores would be weaker. We performed and recorded the findings of oral examinations for a group of 107 subjects with a wide range of risk levels. On the basis of this information, we calculated a risk score for each subject using the PRC. Using the same records, two groups of periodontists and one group of general dentists assigned risk scores for each subject. We determined the extent of interevaluator and intergroup variation and agreement, as well as the extent of agreement between risk scores assigned by the groups of expert clinicians and scores calculated by the PRC.
Our other inclusion/exclusion criteria were that subjects must be 21 years of age or older; could not have had active periodontal or orthodontic therapy within the previous six months but could have had periodontal maintenance; could practice any type of daily oral hygiene, including use of antimicrobial oral rinses; must give informed consent; and must be willing and able to come to the Regional Clinical Dental Research Center at the University of Washington School of Dentistry in Seattle for one screening visit and one two-hour examination.
We obtained potential subjects through advertisements in local newspapers and on radio stations. We interviewed respondents by phone; we appointed those who appeared to qualify and performed screening examinations on them. We enrolled subjects who were qualified and gave informed consent and performed full examinations on them.
Periodontal examination and generation of risk scores.
We took full-mouth periapical radiographs with bitewings for each enrolled subject. We evaluated the films for hopeless teeth, periapical and carious lesions, extent of alveolar bone loss, vertical bone lesions, root calculus, and retained and fractured roots. One dental hygienist examiner performed full-mouth charting, including missing and carious teeth, gross occlusal abnormalities, gingival recession greater than 2 millimeters, probing pocket depth and clinical attachment level at six positions around each tooth, tooth mobility (recorded on a scale of 0 to 3 to indicate normal, slight, moderate and severe mobility), presence of any oral mucosal lesions and bleeding on probing. We recorded medical and dental histories, including any medications being taken and any systemic diseases and conditions. Clinical photographs (35-mm color slides) were taken with the teeth occluded from the facial anterior, left and right posterior and, using mirrors, lingual and palatal aspects. We arranged the record components in a standardized order in chart folders and checked them for completeness. All records were coded, and all other identifying information was removed. The information required by the PRC for calculation of risk scores has been reported previously.16 Using the PRC and information from the records, we calculated risk scores for each subject for years 2 and 4 from the baseline examination, assuming that no treatment would be performed. We expressed the level of risk on a scale of 1 through 5, with 5 representing the highest level of risk.
Expert periodontists and general dentists.
We assembled three groups of expert evaluators. Group A consisted of 10 periodontists, all of whom we assumed had a greater than average interest in and knowledge of periodontal risk assessment because of their participation in the development of the PRC. (R.C.P. and J.M. were among these periodontists.) Group B consisted of two internationally recognized full-time periodontal practitioners, both of whom were past presidents of the American Academy of Periodontology, two academic periodontists who were in part-time practice, one full-time periodontist practitioner from the U.S. military and one Swedish full-time periodontist who also is a recognized expert and author of numerous publications on risk assessment. Group C consisted of 36 general dentists, all of whom were in full-time practice and who were judged to be periodontally aware based on their records of referring patients for specialty periodontal care.
We randomly assembled subject records in batches of 26 to 28, 21 to 22 and five to six for groups A, B and C, respectively. We instructed the evaluators to assess subjects based on their risk of developing periodontal disease for those who did not have it, and the risk of experiencing future progression of periodontal diseases for those who already had it. We asked each evaluator to study the records and assign risk scores for two and four years, assuming no treatment was performed. We provided an overall description of the study design, but we gave no information or instructions to the evaluators other than that they were to evaluate the subject records provided and assign risk scores. There was no limit on the length of time expended by each evaluator on the evaluation. Evaluators could not discuss their evaluations and scoring with anyone, nor were they permitted to ask questions of the investigators or others associated with the study.
Each periodontist in both groups evaluated the records for all 107 study subjects. Each general dentist evaluated the records of 16 to 30 subjects (median = 27 subjects), and seven to 10 general dentists evaluated each subject (median = nine dentists). The evaluators recorded scores on the coded forms provided and transmitted them to the principal investigator, who entered the results into secure computer files and checked them for accuracy.
Data processing and statistical analysis.
We computed risk score frequencies for each evaluator and for the PRC. Within each evaluator group, we used the median risk score for each subject to define consensus or average risk scores, which we rounded to the nearest integer. We used the intraclass correlation coefficient to assess interevaluator reliability separately for the two periodontist groups and the dentist group based on a two-way analysis of variance model with random effects for evaluator and subject.17 We used a weighted κ statistic to quantify the agreement between the PRC and the consensus risk scores for each evaluator group. We used Cicchetti-Allison weights, in which ratings with the same score were given a weight of 1 and decreasing weight was given as the difference between ratings increased; ratings that differed by 4 received a weight of zero.18 We also used the weighted κ statistic to assess the agreement of each evaluator with the group consensus and PRC risk scores, and the Spearman rank correlation coefficient to describe the association between the group consensus and PRC risk scores.
METHODS
Subject population.
Our recruitment goal was to assemble a study population of approximately 100 subjects who represented a wide range of risk of experiencing periodontal deterioration. We designed the recruitment so that the final study population would have specific proportions of subjects who
RESULTS
All 107 subjects who enrolled in the study completed it. The group was 44 percent male, had a mean age of 49.6 (± standard deviation of 2.6) years and had an average of 26.2 teeth. Twenty-three of 35 postmenopausal female subjects were taking hormonal replacement or alendronate therapy. About 75 percent of the subjects had early-to-severe periodontitis, and 6 percent reported having had some form of periodontal therapy. Fifty-two percent were current or former smokers, and 45 percent reported some level of alcohol consumption. Ten percent reported having a history of diabetes, and 17 percent reported having a history of heart disease. Based on the PRC-calculated risk scores for year 2 (Figure 1), the subject population was well-distributed among the five risk groups, although possibly weighted somewhat toward the higher risk scores.
Weighted κ values for agreement of the consensus scores among the three evaluator groups and with the PRC scores are shown in the table. The consensus scores for years 2 and 4 exhibited good but not excellent agreement among the evaluator groups (0.59 to 0.70), but only fair agreement with the PRC (0.44 to 0.49). The consensus scores for agreement between group A and B periodontists and the general dentists in Group C for years 2 and 4 were similar, but the range for general dentists was greater than for periodontists. The Spearman rank correlation coefficients indicated a somewhat stronger relationship than the κ statistic. Rank correlation for individual evaluators with their group consensus scores ranged from 0.76 to 0.88 for years 2 and 4; rank correlation for group consensus scores and PRC scores ranged from 0.72 to 0.78 for year 2 and 0.61 to 0.75 for year 4.
The intraclass correlation coefficient, or ICC, provides a measure of interevaluator reliability for each evaluator group. The group A periodontist evaluators had the highest level of agreement (ICC of 0.67 to 0.70), followed by group B periodontists (ICC of 0.63 to 0.66); the general dentist evaluators in Group C had the lowest level of agreement (ICC of 0.53 to 0.55). The fraction of the variance due to systematic differences among evaluators for both periodontist groups was less than 0.07, compared with 0.20 for the general dentists. Interevaluator variability also was reflected in the range of weighted κ values for individual evaluators (Table).
The distribution of individual evaluator scores shown in Figure 1 and the ICC and weighted κ scores demonstrated substantial interevaluator variation between expert evaluator- and PRC-assigned scores, but they do not reveal other differences between the PRC and expert clinicians in assigning risk scores. Figure 2 displays the extent of heterogeneity of subjects in the five PRC risk groups based on the consensus risk scores assigned by the three groups of expert clinicians. Clearly, the PRC groups are highly heterogeneous. While more than one-half of subjects in PRC group 1 have an expert evaluator consensus score of 1, others have scores of 2, 3 and 4. Conversely, subjects with consensus scores of 1 also are found in risk groups 2, 3 and 4. Only a minority of subjects in PRC group 5 has an expert consensus score of 5, while the majority have consensus scores of 2, 3 and 4. Risk groups 2, 3 and 4 also are highly heterogeneous.
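The consensus scores and the breakdown of each PRC risk group by consensus score can be sketched in a few lines of Python. This is an illustration only, not the authors' analysis code: the function names and the sample scores are hypothetical, and because the article says only that medians were rounded to the nearest integer, the round-half-up tie rule used here is an assumption.

```python
import math
from collections import Counter
from statistics import median


def consensus_score(evaluator_scores):
    """Median of one subject's evaluator scores, rounded to an integer.

    Assumption: a median ending in .5 is rounded up, because the article
    does not state a tie rule.
    """
    return int(math.floor(median(evaluator_scores) + 0.5))


def crosstab(prc_scores, consensus_scores):
    """Count subjects for every (PRC group, consensus score) pair."""
    return Counter(zip(prc_scores, consensus_scores))


def share_matching(prc_scores, consensus_scores, group):
    """Fraction of subjects in one PRC group whose consensus equals that group."""
    in_group = [c for p, c in zip(prc_scores, consensus_scores) if p == group]
    if not in_group:
        raise ValueError("no subjects in this PRC group")
    return sum(1 for c in in_group if c == group) / len(in_group)


# Hypothetical data: one PRC score and three evaluator scores per subject.
prc = [1, 1, 3, 5, 5, 5]
ratings = [[1, 1, 2], [2, 3, 2], [3, 3, 4], [5, 4, 5], [3, 4, 3], [2, 3, 4]]

consensus = [consensus_score(r) for r in ratings]
table = crosstab(prc, consensus)

for (prc_group, score), count in sorted(table.items()):
    print(f"PRC group {prc_group}, consensus {score}: {count} subject(s)")
```

In this made-up sample, only one of the three subjects in PRC group 5 has a consensus score of 5, which is the kind of within-group heterogeneity the figure describes.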
| DISCUSSION |
|---|
To this end, we assembled a group of 107 study subjects, selected to manifest a broad range of risk, and performed standard dental examinations, periodontal examinations or both, including preparation of periapical radiographs and clinical photographs. We entered the resulting information into the PRC and obtained a risk score for each subject for years 2 and 4 hence. The sample size was sufficient to achieve the aims of the study as indicated by the relatively narrow confidence intervals for the weighted κ values (Table). With the given sample size, we were able to estimate within ± 0.1 with 95 percent confidence the reliability between the PRC and each evaluator group, as measured by the weighted κ statistic. The distribution of scores for the 107 study subjects among PRC groups 1 through 5 demonstrated that the study subjects manifested a wide range of risk of experiencing periodontitis.
The expert clinician evaluators were a very diverse group. Group A consisted of 10 practicing periodontists who, we reasoned, would have greater-than-average knowledge about assessment of risk of periodontitis because of their participation in the development of the PRC. Group B included experts from the practice community, academic periodontic departments and the U.S. military who were expected to have no special knowledge of risk assessment, and one Swedish periodontist who is a recognized expert. Group C consisted of 36 periodontally aware general dentists in full-time general practice who, we reasoned, would have less knowledge about periodontal risk assessment than the periodontists, but would have more knowledge than the average general practitioner.
We tested the hypothesis that group A periodontists and the Swedish periodontist would have the highest level of agreement with their group consensus and with the PRC-assigned scores, followed by group B periodontists, and that the general dentists in group C would have a lower level of agreement with their group consensus and with the scores assigned by the PRC. We then performed statistical analyses to test the extent of intra- and interevaluator group variation, as well as the extent of agreement with group consensus scores and with the risk scores assigned by the PRC. We determined the heterogeneity of each of the five PRC groups on the basis of the expert clinician evaluators' scores.
The weighted κ statistic is an index commonly used for measuring agreement with ordinal data. κ statistics have a range from -1 to 1, where 1 indicates perfect agreement and less than 0 indicates agreement less than expected by chance. In general, a weighted κ value greater than 0.75 indicates excellent agreement beyond chance, 0.40 to 0.75 indicates fair to good agreement and less than 0.40 indicates poor agreement. We used the weighted κ statistic to evaluate the extent of variation among evaluators and the extent of agreement of evaluators with their group consensus and agreement between risk scores calculated by the PRC and those assigned by expert evaluators. On the basis of the κ statistic, we found that the two groups of periodontists and general dentists did not differ in their risk assessments, although the range was greater for general dentists. Agreement between expert evaluator scores and PRC scores was only fair (0.44 to 0.49), mostly because a substantial proportion of subjects received a lower score according to the expert clinician group consensus than by the PRC.
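As a rough illustration of how a weighted κ with Cicchetti-Allison (linear) weights behaves on a five-point scale, the following Python sketch computes it from two lists of scores. The function names and the example scores are hypothetical and are not taken from the study; the weight formula 1 - |i - j| / (k - 1) is the standard linear form, which gives a weight of 1 to identical ratings and 0 to ratings that differ by 4 on a five-category scale.

```python
def linear_weight(i, j, k):
    """Cicchetti-Allison (linear) agreement weight for category indices i, j."""
    return 1.0 - abs(i - j) / (k - 1)


def weighted_kappa(scores_a, scores_b, categories=(1, 2, 3, 4, 5)):
    """Weighted kappa between two raters using linear agreement weights."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(scores_a)

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(scores_a, scores_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed agreement and weighted agreement expected by chance.
    p_obs = sum(linear_weight(i, j, k) * observed[i][j]
                for i in range(k) for j in range(k))
    p_exp = sum(linear_weight(i, j, k) * row[i] * col[j]
                for i in range(k) for j in range(k))
    if abs(1.0 - p_exp) < 1e-12:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical scores: the expert consensus sits one step below the
# calculated score for every subject except the lowest-risk one.
calculated = [1, 2, 3, 4, 5]
expert = [1, 1, 2, 3, 4]
print(f"weighted kappa = {weighted_kappa(calculated, expert):.2f}")
```

A consistent one-step underestimate like the one in this made-up example lowers κ well below 1 even though the two sets of scores rank the subjects in the same order.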
The ICC is an appropriate statistical approach to evaluate differences between individual groups of periodontist and general dentist evaluators. The ICC apportions the variance as that due to variation among subjects and that due to variation among evaluators. The ICC has a range of 0 to 1, in which 1 indicates that all of the variation observed is due to variation among subjects. A value of greater than 0.75 among subjects indicates excellent reliability in the sense that relative to the variation between subjects, the variation due to differences between evaluators or other sources is small. For periodontists, the proportion of variation assigned to subjects was high (0.63 to 0.70) and that assigned to evaluators was low (about 0.07); periodontist groups A and B were very similar to each other. Among general dentists, the proportion of variation assigned to subjects was lower (0.53 to 0.55) and that assigned to evaluators was almost threefold higher (0.20). Whereas there was no indication of large systematic differences among the evaluators in either of the periodontist groups (including the Swedish periodontist), a systematic difference among general dentists was apparent.
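The variance apportionment described above can be sketched with the usual mean-square estimators for a two-way random-effects layout. The sketch assumes a complete grid in which every evaluator scores every subject (true of the periodontist groups, but not of the general dentists, who each scored a subset), so it is an illustration of the idea rather than the authors' exact model. The function name and the ratings are hypothetical, and negative variance estimates are not clipped.

```python
def variance_fractions(ratings):
    """Apportion variance among subjects, evaluators and residual error.

    `ratings` is a list of rows, one row per subject and one column per
    evaluator. The "subject" fraction is the single-rating ICC for a
    two-way random-effects model.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(r) != k for r in ratings):
        raise ValueError("need a complete grid with >= 2 subjects and evaluators")

    grand = sum(sum(r) for r in ratings) / (n * k)
    subject_means = [sum(r) / k for r in ratings]
    evaluator_means = [sum(r[j] for r in ratings) / n for j in range(k)]

    # Sums of squares for the two-way layout without replication.
    ss_subject = k * sum((m - grand) ** 2 for m in subject_means)
    ss_evaluator = n * sum((m - grand) ** 2 for m in evaluator_means)
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_error = ss_total - ss_subject - ss_evaluator

    ms_subject = ss_subject / (n - 1)
    ms_evaluator = ss_evaluator / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Method-of-moments variance components.
    var_subject = (ms_subject - ms_error) / k
    var_evaluator = (ms_evaluator - ms_error) / n
    total = var_subject + var_evaluator + ms_error
    if total == 0:
        raise ValueError("all ratings are identical; fractions are undefined")
    return {
        "subject": var_subject / total,
        "evaluator": var_evaluator / total,
        "residual": ms_error / total,
    }


# Illustrative ratings only (six subjects, four evaluators); not study data.
demo = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
fractions = variance_fractions(demo)
print({name: round(value, 2) for name, value in fractions.items()})
```

In the demo grid the evaluators differ systematically (the second column is consistently low), so a large share of the variance is assigned to evaluators and the subject fraction, the ICC, is correspondingly small.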
The Spearman rank correlation is a measure of the trend for changes in one variable to be reflected by changes in the other variable. The rank correlation between group A and group B periodontist scores was high for both years 2 and 4, and the scores for general dentists were only slightly lower. The rank correlations for the three groups of evaluators with the PRC scores were considerably lower. Thus, the rank correlations indicate a higher level of association between clinician evaluators and the PRC than suggested by the κ statistic and ICC. This relationship most likely is owed to the better agreement among evaluators in the ordering of the risk scores than in the actual assignment of numerical scores.
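The distinction between agreement in ordering and agreement in the numerical scores can be seen in a small Python sketch of the Spearman coefficient, computed here as the Pearson correlation of average ranks so that tied scores are handled. The function names and the scores are hypothetical, chosen only to show a case in which the ordering matches perfectly while no individual score matches.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0  # mean of positions start..end, 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores: every expert score is exactly one step below the
# calculated score, so the ordering is identical but no score matches.
calculated = [2, 3, 4, 5, 5, 3]
expert = [1, 2, 3, 4, 4, 2]

rho = spearman_rho(calculated, expert)
exact_matches = sum(1 for c, e in zip(calculated, expert) if c == e)
print(f"rho = {rho:.2f}, exact matches = {exact_matches} of {len(expert)}")
```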
The extent of variation in scores assigned to subjects for years 2 and 4 by the two periodontist groups was large, but it was even larger for the general dentist group. Nevertheless, the consensus scores for the three groups of evaluators, which are independent of the range of variation of individual scores, clustered near one another for each of the five risk groups and near the PRC values for groups 3 and 4 and to a lesser extent for group 1, but not for groups 2 and 5. Relative to the PRC calculations, expert clinician opinion assigned a larger proportion of the subject population to PRC risk groups 1 and 2 and a much smaller proportion to group 5. These observations are notable. When looked at overall, if the PRC-calculated risk scores are correct as suggested by a previous study,16 not only is there an unexpectedly large variation among dentists and periodontists in assessment of risk for a given case, but also dentists and periodontists generally appear to underestimate risk. Many people in lower risk groups actually may belong in risk group 5, and many of the excessive numbers of subjects in groups 1 and 2 may belong in higher risk groups.
When the data are examined in greater detail, an additional feature is apparent. If expert evaluator consensus and PRC risk scores were in complete agreement, the group of bars for each of the five risk groups in Figure 2 would have a uniform color; clearly, that is not the case. The extent of deviation from uniform colors and the distribution of colors among the bar groups are an expression of heterogeneity or lack of agreement. The composition of the PRC risk groups is highly heterogeneous. For example, subjects with expert evaluator consensus scores of 2 or 3 are distributed throughout PRC risk groups 1 through 5, and only a minority of subjects in PRC risk group 5 actually have an evaluator consensus score of 5; the majority has risk scores of 2, 3 and 4. Subjects with a score of 1 based on expert opinion are distributed through PRC groups 1 through 4 and those with scores 2 and 3 are found in all five PRC groups. This pattern of heterogeneity is apparent for all three expert evaluator groups. For a given subject and a given expert, the probability of congruence of the expert opinion and PRC score is relatively small.
| CONCLUSION |
|---|
In general, on the basis of expert clinician consensus scores, we found that more than one-half of the subjects in PRC risk groups were dispersed throughout groups other than that assigned by the PRC. Dispersion would have been even greater had individual rather than group consensus scores been used. If the PRC risk scores are correct as indicated by a previous study,16 both dentists and periodontists appear to underestimate the risk of developing periodontitis. These observations suggest that risk scores generated for individual patients by subjective expert clinician opinion are highly variable and, when used in periodontal clinical decision making, could result in the misapplication of treatment for some patients. Use of a risk assessment tool over time may be expected to result in more uniform and accurate periodontal clinical decision making, improved oral health, reduction in the need for complex therapy and reduction in health care costs.19