The Journal of the American Dental Association

J Am Dent Assoc, Vol 137, No 1, 30-41.
© 2006 American Dental Association

POINT/COUNTERPOINT

Do portfolio assessments have a place in dental licensure?



Del Hammond and Chad W. Buckendahl, PhD


   No, portfolio assessments should not be used in dental licensure.
 
In recent years, there has been unrest within the dental community—practitioners, educators and students—about using the clinical tests that assess candidates’ competence as a requirement for state licensure in dentistry. Concerns about accessibility to care, accreditation pressures and the number of seemingly diverse clinical testing programs have contributed to this unrest. Some states, such as New York and Minnesota, have responded to these pressures by allowing one year of postgraduate dental training in lieu of a clinical examination. Other states, such as Alaska and Kentucky, accept clinical test results from more than one of the regional test development/administration agencies. And in yet other states, such as Colorado and Missouri, any state board or regional agency test may be allowed. This greater acceptance of other states’ or regions’ results suggests comparability across testing programs that may not be supported by validity evidence.

In response to an increasing desire for professional mobility and greater standardization of the clinical testing process, some regional agencies and states have undertaken efforts to develop a national clinical test that could replace the multiplicity of tests currently available. Other commentators have suggested that dental school portfolios would be an appropriate replacement for the current clinical testing model for licensure.1 Although much of the debate surrounding this issue has occurred within the dental community, it is important to discuss these issues from a measurement perspective.

As in the profession of dentistry, the field of measurement involves advanced study in specific areas of a complex discipline. Measurement specialists typically are people with graduate-level training in research design, statistics, data analysis, psychometric theory, and developmental or cognitive science. Because the average person has taken a test at some point in his or her life, most of us have an intuitive sense about how a test is developed, scored and used. For many situations that do not involve critical decisions, this intuitive sense of testing may do no harm. However, because these intuitive perspectives often are inconsistent with professionally accepted testing practice, they can lead to problems in defending important decisions.2 Thus, we seek to contribute to the discourse on these topics by discussing issues that bear on the validity of licensure testing programs and the appropriateness of using portfolio assessments for this purpose.

LICENSURE REQUIREMENTS Present requirements for licensure by most states in this country include

– graduation from an accredited dental education program;
– satisfactory completion of the written National Dental Board Examinations (NDBE) Part I and Part II;
– satisfactory completion of a state or regional clinical examination;
– (in some states) additional testing, certification and/or other state-specific requirements.

Each of these provides unique information to the licensure decision.

An accredited training program carries with it the suggestion that curricular, instructional and clinical opportunities necessary for practicing dentistry are offered to candidates during the course of their program. Evidence of a good training process, though, does not ensure a good product. This is why many accreditation bodies (not just in dentistry) have added outcome assessment requirements as a strategy to gather and report evidence that empirically supports the assertion that graduates have met the institution’s goals. Because these outcome assessments are tied to accreditation requirements, institutions are unable to characterize these as independent verifications of the training programs. Thus, the licensure testing process often has been considered by institutions as an independent validity check on their assertions. Although to a layperson this may appear appropriate, the purpose of each of these assessment systems is different and should not be confused as interchangeable.

The NDBE and the clinical licensure examinations focus on different aspects of the dental profession as they relate to safe, entry-level practice. Parts I and II of the national examination concentrate on the knowledge that is required for an entry-level dentist, whereas a clinical examination is designed to measure the clinical judgments and skills of an entry-level dentist. The common analogy of the driver’s license test is appropriate here, as it usually requires both a written test of the knowledge of how to operate a vehicle safely and legally and a driving test to demonstrate the skill to operate a vehicle safely. Also, a driver’s completion of a recognized training program is a third piece of evidence that provides additional assurance of competence and allows insurance companies to offer reduced premiums to the driver. This is similar to the three-part system used in dental licensing decisions. Note that neither the written nor the performance component of the licensure test is administered by driving schools. More importantly, these tests are not intended to predict future performance or success, but only to verify that the candidate possesses the minimum skills necessary to obtain the license.

The need for this verification of knowledge and skill to be independent of a training or preparation program cannot be overstated. Downing and Haladyna3 characterized this independence as important to the credibility of the testing program as evidence of external validity. Other researchers (such as Ruch,4 Buros,5 Madaus6 and Buckendahl and colleagues7) also have suggested that for any accountability system, there is a need for a level of external quality control through independent review or verification. Discounting the need for independence can lead to events like those at Enron, where concerns were raised regarding the company’s use of the same firm for both accounting and the auditing of that accounting.7 The broader question is whether portfolios of clinical performance that depend on educator oversight, scoring or both should be used to support licensure decisions.

WHAT ARE PORTFOLIOS? Portfolios traditionally have been used in educational settings as a way for teachers to evaluate students’ progress, accomplishments and abilities developed in a course. A portfolio typically would consist of work, usually the best performances, completed by a student during a course of study. Haladyna8 discussed four types of portfolios for people with a variety of purposes and content. Of the four types, he identified the evaluation portfolio as the only one to consider for grading students. The contents would include a standardized collection of student work and could be determined by the teacher or, in some cases, by the student. The teacher could determine how portfolios should be scored and would use rating scales when evaluating products produced by the students.

However, because of the high-stakes nature of licensure decisions, portfolios historically have not been used in these instances. Portfolios have become a popular measurement tool for teacher certification, though Wilkerson and Lang9 expressed reservations about this use. Some of these concerns include variation in portfolio products, independence of the candidate’s work, and whether the products are representative of the domain and candidate’s abilities.10 When Haladyna8 summarized the use of portfolios, he wrote, "When it comes to accountability at the class, school, or state levels, enough questions have been raised and left unanswered regarding validity and reliability that it seems unwise to use a portfolio for anything but formative student evaluation and as one of the criteria for grading." Although grading may allow instructors to consider factors such as progress, effort, attitude and classroom behavior such as attendance or participation, these factors are not relevant for a licensure decision.

THE VALIDITY OF PORTFOLIO ASSESSMENTS FOR LICENSURE IN DENTISTRY A portfolio used for dental licensing has been defined as "the best collection of evidence, with multiple replicates from diverse sources."1 The portfolio could include "two endodontic test cases and two crown placements, three comprehensive periodontal management cases, a removable prosthesis and a surgery case, three randomly selected chart reviews and defense of three completed cases with predetermined parameters." Under this scenario, candidates would be given as many opportunities as needed to produce their best work for the portfolio to be evaluated. This system would introduce additional variation in the number of attempts, the variability of cases and the independence of the work. A candidate who demonstrates minimally acceptable performance on a given procedure only once in five attempts may be capitalizing on chance simply through the number of replications. Therefore, it is reasonable to expect a level of on-demand performance as a job-related skill.
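The concern about capitalizing on chance can be made concrete with a short calculation. The sketch below is illustrative only: it assumes that attempts are independent and that each has the same probability of an acceptable result, and the value 0.3 is a hypothetical figure, not one taken from any examination data.

```python
def prob_at_least_one_success(p_single: float, attempts: int) -> float:
    """Chance of at least one acceptable result in `attempts` independent tries.

    Assumes every attempt has the same success probability `p_single`
    (a simplifying assumption made for illustration).
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # Complement of failing on every single attempt.
    return 1.0 - (1.0 - p_single) ** attempts


if __name__ == "__main__":
    # Hypothetical candidate who produces acceptable work on 30% of attempts.
    for n in (1, 3, 5):
        print(n, round(prob_at_least_one_success(0.3, n), 3))
```

Under these assumptions, a candidate who succeeds on fewer than one attempt in three would still place at least one acceptable case in a portfolio about 83 percent of the time when given five attempts.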

It also has been proposed that candidates could assemble their portfolios in the final six months of their dental school education, but that candidates who are not judged competent by a board may need standardized performance testing or other special evaluations.1 Although the suggestion appears intuitive, an obvious challenge is that any parallel testing system would need to be equivalent. Maintaining two very different, yet equivalent formats would increase the costs of the testing program for all candidates. If the formats are dramatically different, there is a greater risk of legal challenge by candidates who observe or perceive inequity in the system.

Because the purpose of licensure testing is to have candidates demonstrate that they are at least minimally competent to enter safe, independent dental practice, portfolio assessments are likely a mismatch with state board requirements: they provide educational verifications that are not relevant to the information needed. Although this information serves an institution’s accreditation needs, it does not necessarily inform the licensure decision. The question should not be what a student accomplished in school with consultation and educational guidance, but what quality of work a candidate can demonstrate independently at a time near that when he or she wishes to enter practice. Questions about school accomplishments should be satisfied by the candidates’ degree requirements and school accreditation requirements. Note that these purposes are distinctly different from licensure’s need to satisfy the public’s demand for ensuring that only competent candidates are allowed to enter professional practice.

This brings up two issues for portfolio assessments within the educational process. The first is independence. Except for their final clinical evaluations in the competencies required by schools, students are not expected to be totally independent in educational settings. Educators should be expected to provide guidance, teaching students correct process and reasoning. If in existing clinical programs sufficient time is not allocated to testing students within patient treatment plans, additional clinic time would be required for portfolio assessments. Using existing patient cases, earmarked for student training, to evaluate candidates either would limit instructor input during this process severely or could result in an evaluation of instructors’ input rather than assessments of student licensure candidates’ ability.

If in the latter stages of the educational process students were to provide dental care that is totally independent from any consultation with any other dentist or dental educator, the students would need careful monitoring to ensure their independence. Close monitoring of the students’ portfolio work and control of the settings in which they work would depend on the universities at which the students are trained and the staff in those universities. To standardize the training and evaluation of the candidates with the diversities inherent in a broad range of university locations could be difficult and costly. It also contributes to a greater level of construct-irrelevant variance (a type of error in scores), which damages the validity of the system. This type of monitoring would require observation and evaluation by independent, external professionals. Candidates accepting advice relative to portfolio work would be considered to be cheating. This source of error variance has been identified as a great concern when portfolios are used for high-stakes evaluations.8,9

The second related issue is time frame. If demonstrations of competence are evaluated at some time in the past, as would be the case with portfolio assessments, the relevance of those demonstrations to current abilities may need to be verified. The validation of current competence also includes demonstration of the ability to provide a collection of services, requiring an evaluation of a current range of competence instead of evaluating individual competencies immediately after the student has perfected each specific skill. Candidates may have reached a level of competency at some time in the past when they had recent, repeated practice on specific tasks that were assessed. It is quite a different matter to maintain proficiency in a representative number of tasks and to be able to demonstrate competency in several tasks at the time they are about to enter dental practice.

RELIABILITY Conceptually, reliability is an estimate of the amount of measurement error in test scores. The greater the error, the lower the confidence we have in the results or the decisions made using the results. Although it is a necessary element of validity, reliability evidence alone is insufficient to support valid score interpretations. More importantly, there are different sources of error that may exist within a testing system. Priority should be given to the sources of error that can most threaten the validity of the results. Estimating these different sources of error requires evaluating different methodologies that target these questions appropriately. Some of these sources of error may include content sampling from the larger domain, the number of observations of performance, the number of occasions candidates have to demonstrate performance and the number of examiners rating a candidate’s performance.

Some criticisms of current clinical licensure reliability focus on predicted internal consistency reliability evidence.1 Internal consistency evaluates the intercorrelations among items/tasks and is appropriate for objectively scored tests where there is little or no variation in agreement on the correct response. Calculations such as coefficient alpha11 often are used to estimate these interrelationships among items. However, in terms of best practice, there are three problems with calculating internal consistency for the clinical licensure tests in dentistry. First, because the performance tasks are scored subjectively using scoring guides or rubrics, the educator raters who score candidates’ performance have the potential to introduce a major source of error in the scores that is unrelated to the internal consistency of the test. It is the raters who make the score judgments, with the associated introduction of a type of error that is technically referred to as "construct-irrelevant variance."

Second, when one uses internal consistency methods for reliability estimation, one assumes that there is a unidimensional relationship among all items. This assumption is likely to be unsupported on dental clinical tests. This means that there may not be a strong relationship among a candidate’s performances on the restorative, endodontic, periodontal or other sections on the clinical test because these disciplines represent different dimensions of dentistry that contain unique skill sets. Knowing that the dental profession has specialties in each of these areas, this may not be surprising, as the field recognizes that proficiency in one aspect of dentistry does not necessarily ensure competence in another.

Third, a licensure test is designed to focus its measurement on the point of minimum entry-level competence. As a result of this moderate level of difficulty relative to the abilities of well-prepared candidates, candidates’ scores typically fall within a narrow range. Consequently, there may be insufficient variance to estimate an accurate internal consistency reliability value. Variability in scores is a basic assumption of internal consistency methods and has been noted as a problem in calculating correlations for other criterion-referenced testing programs.12 However, a lack of internal consistency reliability evidence does not suggest that users should not have confidence in the scores or decisions.
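To illustrate the internal consistency calculation discussed above, the following minimal sketch computes coefficient alpha from a table of candidates by tasks. The function name and the tiny data sets are hypothetical and are not drawn from any licensure examination. The sketch uses the standard formula, alpha = k/(k − 1) × (1 − sum of task variances / variance of total scores), where k is the number of tasks.

```python
from statistics import pvariance
from typing import List, Sequence


def coefficient_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Coefficient alpha for a candidates-by-tasks table of scores.

    `scores` holds one row per candidate and one column per task.
    """
    if len(scores) < 2:
        raise ValueError("need at least two candidates")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two tasks")
    if any(len(row) != k for row in scores):
        raise ValueError("every candidate needs a score on every task")

    # Variance of each task across candidates.
    task_vars: List[float] = [
        pvariance([row[j] for row in scores]) for j in range(k)
    ]
    # Variance of candidates' total scores.
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        # No spread in total scores: alpha cannot be estimated.
        raise ValueError("total scores have no variance")
    return (k / (k - 1)) * (1 - sum(task_vars) / total_var)


if __name__ == "__main__":
    consistent = [[1, 1], [2, 2], [3, 3]]          # tasks rank candidates identically
    unrelated = [[1, 1], [1, 2], [2, 1], [2, 2]]   # tasks are uncorrelated
    print(coefficient_alpha(consistent))
    print(coefficient_alpha(unrelated))
```

The second data set mimics the dimensionality problem: when performance on one task says nothing about performance on another, alpha falls to zero even if each task is scored accurately. The `ValueError` raised for zero variance corresponds to the extreme case of the third problem, in which scores are too tightly clustered for the statistic to be estimated.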

Ratings assigned by examiners are more relevant sources of error in performance tests like a clinical licensure examination. Thus, estimates of interrater agreement are good indicators of the confidence in scores and decisions. Because of the subjective nature of these ratings, examiners have additional responsibilities when using this scoring system. First, clearly stated criteria that define acceptable and unacceptable performance are needed. Second, examiners must participate in training that provides both a conceptual understanding of the scoring criteria and practice opportunities and testing to ensure that they are capable of applying the criteria. Training activities should include multiple examples of performances that clearly relate to the judgments that examiners are asked to provide in the operational setting. Third, at least two examiners would independently rate each candidate’s performance using the scoring criteria in which they were trained. In some models, if there is a disagreement between the raters, a third rater often is called in as an independent adjudicator whose ratings would serve as a tie-breaker judgment for the decision. In other models, three independent examiners rate all performances and use the judgment that is reached by at least two of the three examiners. Each of these models uses examiners’ judgments as multiple independent observations, relying on the preponderance of evidence for the decision rule. Finally, ongoing monitoring of the examiners’ application of the scoring criteria is needed to maintain score accuracy. Examiner consistency information should be collected at different points throughout the examination to evaluate the quality of the examiners’ performance.
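The two decision models and a simple agreement index described above can be sketched as follows. This is an illustration under stated assumptions, not any agency’s scoring procedure: ratings are reduced to pass-fail values, and all function names are hypothetical.

```python
from typing import Callable, Sequence


def percent_agreement(rater_a: Sequence[bool], rater_b: Sequence[bool]) -> float:
    """Share of candidates on whom two examiners gave the same pass-fail rating."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def two_of_three_decision(ratings: Sequence[bool]) -> bool:
    """Three independent examiners; the decision is the rating at least two share."""
    if len(ratings) != 3:
        raise ValueError("this model expects exactly three ratings")
    return sum(ratings) >= 2


def adjudicated_decision(
    first: bool, second: bool, adjudicator: Callable[[], bool]
) -> bool:
    """Two examiners rate; a third is consulted only when they disagree."""
    if first == second:
        return first
    # Disagreement: the independent adjudicator's rating breaks the tie.
    return adjudicator()


if __name__ == "__main__":
    a = [True, True, False, True]
    b = [True, False, False, True]
    print(percent_agreement(a, b))
    print(two_of_three_decision([True, False, True]))
    print(adjudicated_decision(True, False, lambda: False))
```

Simple percent agreement does not correct for agreement expected by chance, so monitoring programs often pair it with a chance-corrected index.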

FAIRNESS IN TESTING Another component of validity is fairness, which Wilkerson and Lang9 identified as a major concern when using portfolios for certification and licensing. They also wrote that portfolios used for certification and licensure must meet the psychometric standards that would be applied to any other high-stakes test. They discussed portfolio psychometric issues based on the Standards for Educational and Psychological Testing13 and the U.S. Equal Employment Opportunity Commission’s Uniform Guidelines,14 which, they wrote, are used consistently as the "standard" in legal disputes. Several of the guidelines in Guidance for Clinical Licensure in Dentistry15 also are directed at developing and administering tests that are equitable and free from bias.

Standardization in testing, including content, administration and scoring, is necessary to provide an equal opportunity for each candidate and is covered in extensive detail in these two documents. Several aspects of standardization present difficulties for portfolio assessments. Not all candidates for licensure are new graduates. Some are practitioners moving to private practice from military practice, while others may be current practitioners moving to states that do not offer licensure based on their current license. These candidates would not be able to participate in school-based portfolio assessments, and an alternative would be required. Equivalence of the two assessments would need to be proven. Of equal significance, methods must be developed to ensure standardized administration and freedom from bias across schools and within schools to promote equivalent content and scoring. Even one small component of this standardization, such as calibrating examiners by employing standardized training, standardized examiner testing and a standard set of criteria, would be difficult to establish in a large-scale assessment program.

Ensuring the anonymity of candidates helps control bias in scoring. If the educators are faculty members at a school at which they are raters of portfolios or of the work included in portfolios, the independence of the judgments and candidates’ anonymity both are lost. The faculty members would possess information about candidates and, in general, have interest in the student outcomes for that school. Institutional pressures and regional influences could introduce rating bias.3 Hiring dentist raters from outside sources might be problematic from both cost and scheduling perspectives. Development of methods to guard against rater bias would be desirable in either case.

IMPLEMENTING A PORTFOLIO ASSESSMENT Developing and implementing a testing program is not a quick process. Although some have suggested that this could be "accomplished over a few weekends,"1(p.180) this understates the importance of this critical first step in the test development process. Intuitively, developing a testing program may appear to involve simply writing a number of test questions or creating a number of performance tasks, but it is much more complex than that. Conducting a practice analysis to define the entry-level knowledge, skills, abilities and judgments needed for safe, independent practice typically involves multistep processes that span a number of months.16,17 Although this is the foundational evidence of a program’s validity, there are a number of additional technical and legal issues that are considered during the development and maintenance of a testing program for licensure.18

There are some important questions that must be addressed in a discussion of using a portfolio assessment system for clinical licensure in dentistry. With concerns about costs borne by candidates, who will fund the development and maintenance of a portfolio assessment system that meets the demands placed on other high-stakes assessments? Because licensure remains a state function, how would an educational institution–administered assessment monitor the independence of candidates’ work? Given the need for equity and standardization of conditions, who will train staff members to accomplish these tasks in accordance with applicable standards? What quality control processes will be used to ensure that raters are applying the scoring criteria consistently, and what decision rules will be used to eliminate raters from the process if they do not? Would all educational institutions accept the additional responsibilities and quality expectations required to maintain a portfolio assessment system? Given the limited assessment standardization within and among schools, what measures of external quality control could be implemented to ensure the technical and legal defensibility of the testing program? These are nontrivial topics that high-stakes testing programs need to be able to defend.

CONCLUSIONS Portfolios may be appropriate for the evaluation of a candidate’s performance in an educational setting. However, the use of portfolios for licensure decisions, in addition to being a difficult system to develop and administer, could present several threats to validity. Possible major problems include conflict of interest for educational institutions, time frame, administration standardization, equitable testing and scoring for all candidates, ensuring independent work by candidates, rater standardization and the potential for rater bias.

A portfolio assessment may not address the question of a candidate’s current competence but instead may provide additional information that would validate his or her compliance with dental education requirements. As an alternative to using portfolios for licensure assessments, schools should be encouraged to use portfolios more extensively to provide greater assurance that graduation requirements are met. This might enhance the value of the dental degree and further reduce the chances that unqualified dentists will have the opportunity to enter dental practice. However, this programmatic information is not a substitute for the independent verification of entry-level skills that is at the heart of licensure testing programs for many professions.


   FOOTNOTES
 

Mr. Hammond is a measurement specialist, Western Regional Examining Board, Phoenix.


Dr. Buckendahl is the director, Buros Institute for Assessment Consultation and Outreach, Buros Center for Testing, 21 Teachers College Hall, University of Nebraska–Lincoln, Lincoln, Neb. 68588-0353, e-mail "cbuck1{at}unl.edu". Address reprint requests to Dr. Buckendahl.


The authors would like to acknowledge the American Association of Dental Examiners for its support of this article.


   REFERENCES
 

  1. Chambers DW. Portfolios for determining initial licensure competency. JADA 2004;135:173–84.

  2. Braun HI, Mislevy R. Intuitive test theory. Phi Delta Kappan 2005;86:489–97.

  3. Downing SM, Haladyna TM. Model for evaluating high-stakes testing programs: why the fox should not guard the chicken coop. Educ Meas: Issues Pract 1996;15:5–12.

  4. Ruch GM. Minimum essentials in reporting data on standard tests. J Educ Res 1925;12:349–58.

  5. Buros OK. Mental measurements yearbook. Highland Park, N.J.: Gryphon; 1938.

  6. Madaus GF. An independent auditing mechanism for testing. Educ Meas: Issues Pract 1992;11:26–31.

  7. Buckendahl CW, Plake BS, Impara JC. A strategy for evaluating district assessments for state accountability. Educ Meas: Issues Pract 2004;23:17–25.

  8. Haladyna TM. Writing test items to evaluate higher order thinking. Boston: Allyn and Bacon; 1997.

  9. Wilkerson JR, Lang WS. Portfolios, the Pied Piper of teacher certification assessments: legal and psychometric issues. Educ Policy Analysis Arch 2003;11(45). Available at: "http://epaa.asu.edu/epaa/v11n45/v11n45.pdf". Accessed Nov. 18, 2005.

  10. Gearhart M. Whose work is it? A question for the validity of large-scale portfolio assessment (CSE technical report 363). Los Angeles: Center for the Study of Evaluation, National Center for Research on Evaluation, Standards, and Student Testing (CRESST), Graduate School of Education, University of California, Los Angeles; 1993.

  11. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297–334.

  12. Popham WJ, Husek TR. Implications of criterion-referenced measurement. J Educ Meas 1969;6:1–9.

  13. American Educational Research Association, American Psychological Association, National Council on Measurement in Education, Joint Committee on Standards for Educational and Psychological Testing. Standards for educational and psychological testing. Washington: American Educational Research Association; 1999.

  14. U.S. Equal Employment Opportunity Commission. Uniform guidelines on employee selection procedures. Fed Regist 1978; 43:38290–315.

  15. American Association of Dental Examiners. Guidance for clinical licensure examinations in dentistry. Chicago: American Association of Dental Examiners; 2003.

  16. Knapp JE, Knapp LG. Practice analysis: building the foundation for validity. In: Impara JC, ed. Licensure testing: Purposes, procedures, and practices. Lincoln, Neb.: Buros Institute of Mental Measurements; 1995:93–116.

  17. Smith J, Crawford L. Report of the findings from the 2002 RN practice analysis update. Natl Counc State Boards Nurs Res Brief 2002;10:9–41.

  18. Impara JC. Licensure testing: Purposes, procedures, and practices. Lincoln, Neb.: Buros Institute of Mental Measurements; 1995.


 


Richard R. Ranney, DDS, MS and Ronald Hambleton, PhD, MA


   Yes, portfolio assessments can be used successfully in dental licensure.
 
A large majority of leaders in dental education (88 percent of respondents to a survey conducted in 2003) agree that third-party evaluations of graduates are appropriate for licensure purposes.1 At the same time, they overwhelmingly feel (96 percent of respondents to the same survey) that it is important to bring about some changes in licensure procedures at the national level.1 There are several reasons for this unrest with the current model of clinical examinations for licensure. We will present those here, along with some arguments for initiating more research on the technical merits of portfolio assessments.

There already are several examples of successful portfolio assessments and objective structured clinical examinations in place that meet the requisite standards.

Examining agencies have published few or no data that would allow an assessment of the reliability or validity of their examinations.2–4 The technical data may exist, but the fact that little has been published to substantiate the reliability and validity of examination scores and associated pass-fail decisions is disappointing and should be of concern to all, both those in the profession and members of the public. Some data suggest that reliability of clinical licensure examinations has been far below the level of acceptability for high-stakes tests.5 Also, pass rates among testing agencies have varied significantly and, for at least one of the major regional agencies, also have varied significantly over time.6–10 These variable rates may reflect changes in the candidate population, but they also may reflect both unreliability and invalidity in the examination process. One study conducted over nine years found internal reliability among restorative and prosthodontics sections of a clinical licensure examination to be near zero. For example, the level of pass-fail decision consistency, corrected for chance agreement over the two sections, was reported to be .02 with a sampling error of .04, a totally unacceptable level for any licensing examination.11 Flipping a coin to make pass-fail decisions would have the same level of reliability.
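For readers unfamiliar with chance-corrected decision consistency, the sketch below computes Cohen’s kappa for pass-fail decisions on two examination sections. Kappa is one common index of this kind; the text above does not name the statistic used in the cited study, so its use here is an assumption, and the data are hypothetical.

```python
from typing import Sequence


def cohens_kappa(section_a: Sequence[bool], section_b: Sequence[bool]) -> float:
    """Chance-corrected agreement between two sets of pass-fail decisions."""
    n = len(section_a)
    if n == 0 or n != len(section_b):
        raise ValueError("need two non-empty decision lists of equal length")

    # Observed proportion of candidates with the same decision on both sections.
    observed = sum(a == b for a, b in zip(section_a, section_b)) / n

    # Agreement expected by chance, given each section's pass rate.
    pass_a = sum(section_a) / n
    pass_b = sum(section_b) / n
    expected = pass_a * pass_b + (1 - pass_a) * (1 - pass_b)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is total")

    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical decisions for four candidates on two sections.
    restorative = [True, True, False, False]
    prosthodontics = [True, False, True, False]
    print(cohens_kappa(restorative, prosthodontics))
    print(cohens_kappa(restorative, restorative))
```

A value near zero, as in the first call, means the two sections agree no more often than their pass rates alone would produce, which is the sense of the coin-flip comparison. A value of 1 indicates perfect consistency.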

We could find no published studies of predictive validity. Predictive validity is the extent to which performance on the examinations relates to performance in practice. We understand that the goal of these examinations is not to predict future success; still, studies of predictive value do provide additional validity evidence and often are available in other professions. For example, it has been common in medicine to link step 1 and 2 examination scores to physicians’ performance in their residency programs. Even more importantly, content validity evidence has not been compiled and reported. Recall that the intent of dental licensure examinations is to assess skills necessary for safe entry-level practice. But where is the published content validity evidence showing that the skills included in the current examinations fulfill that expectation? The proportion of dental practice covered by these examinations actually is low, resulting in many of the skills and values needed for good practice being omitted from the examinations.5 Attempts to establish concurrent validity of examination scores, that is, correlations with or comparisons to other measures (such as instructor ratings of candidate performance), have been inconsistent. Most of the published studies found no relationships with other measures (as hypothesized), so though a few examples of positive relationships exist, the studies collectively failed to establish concurrent validity.3,10–16 Finally, the greatest dissatisfaction with clinical licensure examinations among dental deans who participated in a survey involves ethical issues connected with the use of patients.1 Many others also have expressed such concerns.17–23 Ethical concerns about current examination practices further compound the problem with the examination. In sum, the reliability and validity evidence available to support the current dental examination process is disappointingly small, and this fact should be of major concern to everyone.


In a meeting with some dental deans that one of the authors (R.R.R.) attended in 2001, the president-elect of the American Association of Dental Examiners (AADE) presented the concept of portfolios for an alternative entry-level clinical licensure evaluation. The deans of six dental schools developed the idea further into a "straw man" proposal, dated January 2002. They proposed that an examining agency develop a pilot project based on that model but not necessarily including its specifics, and that that agency determine the system’s effectiveness in evaluating the competency of candidates for entry-level dental licensure. The American Dental Association (ADA) 114H-2001 Task Force endorsed the general model,24 and its Council on Dental Education and Licensure determined the model to be "best suited to assess a candidate’s competency, as it most closely parallels actual patient care" (L.A. Assael, DMD, written communication, 2003).

Late in 2001 the AADE also requested that the American Dental Education Association (ADEA) send representatives to its Innovative Testing Methodologies Committee, which became the AADE/ADEA Joint Committee on Innovative Testing and Educational Methodologies (ITEM). The ITEM committee discussed several alternative models, including the "straw man" portfolio proposal, objective structured clinical examinations (OSCEs) (as used in medical examinations in the United States, Canada and many other countries) and sophisticated simulators. But the committee met only three times without resolution on preferred model(s) before AADE withdrew from the joint activity with ADEA in March 2003. Subsequently, AADE’s ITEM committee, without ADEA representatives, rejected portfolios for use in testing.25

Before its dissolution, the joint ITEM committee unanimously approved criteria for an ideal clinical examination,1 namely that the clinical licensure process should

– be a process administered by independent third parties occurring within the educational process;
– allow for a comprehensive evaluation of the full continuum of a candidate’s competency;
– instill public confidence;
– evaluate candidates’ competence within the context of a treatment plan designed to meet the patient’s oral health care needs;
– provide valid data for outcomes assessments;
– be provided at a reasonable cost to the applicant.

We completely agree with these six criteria and believe they provide a sound basis for evaluating both the current examination process and any alternatives that may be suggested in the future.

Because they consist of cases (or portions thereof) of patients treated by students during their educational experience, portfolio assessments, in principle, would appear to be more consistent with the first four criteria above than is possible with the current testing model, which consists primarily of one-time observations of selected, predefined clinical procedures. A portfolio assessment could be and should be implemented by third-party agencies, potentially can sample a wider and richer set of clinical problems and procedures, and—if it was implemented correctly with substantial quality controls—likely would be received by the profession and the public in a more positive way than current examination practices. We recognize, however, that much work would be needed to ensure that the first four criteria are satisfactorily met with portfolio assessments. Remaining issues regarding portfolios are whether they can be practical, cost-effective and psychometrically sound.

The major theoretical advantages of portfolios include their ability to incorporate a number of different assessment tools for different competency domains and their ability to foster, as well as assess, reflective learning, a hallmark of professional behavior.26 Careful design is essential for portfolios to be useful in critical candidate evaluation, and, as with other assessments, their reliability and the validity of score inferences made from them would need to be established. Portfolio components can be developed by examination committees to cover as many of the competency domains within the practice of dentistry as evaluators wish, in contrast with the limited sampling that is done with current clinical licensure testing. The period for gathering data regarding performance can be made consistent with expected learning and curricula, can include multiple examples of the same skills and values, and can be defined to include contemporary performance, as well as dated, self-assessments. Performance data could come from predoctoral dental education, residency training or practice experiences, depending on the status of the candidate and the preparedness of examiners to deal reliably with those variations. Failures on portfolio assessments could be dealt with by requiring new portfolio demonstrations of inadequately demonstrated performance and, if appropriate to the circumstance, concurrent additional training. Quality controls of portfolio assessments can be implemented to eliminate many of the most egregious errors such as cheating.


The companion article by Mr. Hammond and Dr. Buckendahl27 does an excellent job of laying out many of the threats to valid portfolio assessments. Their concerns are important and would need to be addressed in the design and implementation of a portfolio assessment system. But we do not agree with them that the problems are insurmountable, and they fail to consider in their review that portfolio assessments do not have to be perfect—they simply need to be more reliable, valid and practical than the current testing practices to justify their use. Note, too, that we are not suggesting that portfolio assessments be implemented immediately. Rather, we are suggesting that they be studied seriously for operational use.

Construct-irrelevant variance (that is, sources of variability in the candidate scores that are not relevant to dental competence) is a concern with any testing format, and efforts to minimize it in portfolio assessments would be necessary as with any high-stakes testing format. Standardization and training of assessors, faculties and advisers would be demanding tasks for those conducting portfolio assessments, but they are, or should be, equally demanding under the current clinical examination model. Few data are available to support reliable and valid portfolio assessments in clinical dental evaluations, but inter-rater reliability of .81 recently was documented for evaluations in a dental hygiene program.4 This is an encouraging finding. While improvement on that level of reliability should be the goal, it still is higher than seems to be the case for the current assessment model in clinical licensure testing.5 Conditions for portfolios can be set, especially as facilitated by modern information technology, for ensuring the recording of independent candidate effort before any faculty or other adviser intervention.

A major source of concern regarding current dental licensure examinations is the unreliability of one-time clinical observations,28,29 on which all current licensing examinations rely. Experimental evidence showed that no increase in number of examiners can compensate for the much larger source of variance contributed by the candidate-by-trial interaction, most likely a function of the variability among the patients and the multidimensionality of the many possible cases for examinations.29 In fact, in the results of that study, adding an infinite number of examiners would not have produced as much improvement in score reliability as would adding a single additional trial. This is an important finding, and it should be considered seriously by the advocates of the current examination system.
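The comparison between adding examiners and adding trials can be made concrete with a small decision-study calculation in the style of generalizability theory. The sketch below assumes a fully crossed candidate-by-trial-by-examiner design and uses the standard generalizability coefficient for relative decisions. The variance components are hypothetical, chosen only so that the candidate-by-trial interaction dominates as the paragraph above describes; they are not the values estimated in the cited study.

```python
def g_coefficient(var_p, var_pt, var_pr, var_ptr, n_trials, n_raters):
    """Generalizability coefficient for a crossed p x t x r design.

    var_p   : variance among candidates (the signal)
    var_pt  : candidate-by-trial interaction variance
    var_pr  : candidate-by-examiner interaction variance
    var_ptr : residual (three-way interaction confounded with error)
    """
    relative_error = (
        var_pt / n_trials
        + var_pr / n_raters
        + var_ptr / (n_trials * n_raters)
    )
    return var_p / (var_p + relative_error)


# Hypothetical variance components with a dominant candidate-by-trial term.
components = dict(var_p=1.0, var_pt=2.0, var_pr=0.2, var_ptr=0.8)

one_trial_two_raters = g_coefficient(**components, n_trials=1, n_raters=2)
one_trial_infinite_raters = g_coefficient(**components, n_trials=1, n_raters=float("inf"))
two_trials_two_raters = g_coefficient(**components, n_trials=2, n_raters=2)

print(round(one_trial_two_raters, 2))       # 0.29
print(round(one_trial_infinite_raters, 2))  # 0.33
print(round(two_trials_two_raters, 2))      # 0.43
```

With these assumed components, no number of examiners can raise the coefficient for a single trial above one-third, because the candidate-by-trial term is divided only by the number of trials. A second trial with the same two examiners exceeds that ceiling.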

The practice of calling a third examiner for an opinion when there is disagreement between two raters, sometimes used in clinical licensure examinations, also is problematic from the perspective of bias. The third examiner would know a priori that one examiner already has decided on a failing performance, unlike the usual situation in which a third examiner is called in when the assigned rater scores differ by a certain amount (say, two points on a five-point scale). Perhaps the problem of bias might be ameliorated by calling in a third examiner on occasion even when the raters agree, but this feature is not used in practice. Certainly, steps need to be taken in practice to eliminate potential bias with the scoring of current examinations.


Unlike present clinical examinations, portfolios offer the advantage of multiple observations of the same or similar skills, an excellent way to increase reliability in clinical observation.30 And portfolio assessment can bypass some construct-irrelevant variance—that is, variance that has nothing to do with reliability of examiners but rather has to do with circumstances of the examination, such as patients not showing up as scheduled, equipment failures, illnesses and other logistical difficulties that arise in current processes. Educational institutions—at least predoctoral dental education institutions—make a significant investment of curricular time, faculty and staff effort and in use of facilities for purposes of preparing students for clinical licensure examinations and then in hosting the examinations. Portfolios offer the potential advantage of replacing those costs with costs more aligned with the direct and comprehensive patient care aspects of the dental curriculum.
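A rough sense of how much multiple observations can help comes from the Spearman-Brown formula, a standard psychometric result that is not part of the article itself. The sketch below assumes the observations are parallel, meaning equally reliable and interchangeable, which real portfolio entries would satisfy only approximately. The single-observation reliability of .30 and the target of .80 are hypothetical figures used for illustration.

```python
import math


def spearman_brown(single_reliability, n_observations):
    """Reliability of the average of n parallel observations."""
    r = single_reliability
    return n_observations * r / (1 + (n_observations - 1) * r)


def observations_needed(single_reliability, target):
    """Smallest number of parallel observations reaching the target reliability."""
    r = single_reliability
    return math.ceil(target * (1 - r) / (r * (1 - target)))


# Hypothetical: one observation has reliability .30; the goal is .80.
needed = observations_needed(0.30, 0.80)
print(needed, round(spearman_brown(0.30, needed), 2))
```

Under these assumptions, a one-time observation with reliability of .30 would need to be repeated about 10 times to support a decision at the .80 level, which is the kind of repeated sampling a portfolio is designed to allow.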

Some of our colleagues, such as Hammond and Buckendahl,27 clearly are opposed to the use of portfolio assessments in credentialing examinations for technical, practical and fairness reasons. At the same time, the concept of a portfolio assessment seems attractive and has had substantial success already in credentialing teachers and is thought of positively by some medical educators.31–33 What is needed at this time is not rejection of a promising idea, but more research on the topic. How should tasks for the portfolios be selected and scored to ensure the content validity of the examination? How many tasks would be needed, and how could scoring approaches be validated? What level of training would be needed to bring scorers up to the standard required for reliable, valid and fair examinations? What quality controls would need to be put in place? What predictive and concurrent validity studies might be carried out to verify the validity of portfolio assessments? How might cheating and inflation of scores that undermine validity and fairness be detected? How might performance expectations for competent candidates be set on portfolio assessments? These are some of the main questions requiring resolution before portfolio assessments are implemented. Other concerns and suggestions can be found in the article by Hammond and Buckendahl.

Twenty years ago, it would have been difficult to imagine that physicians would be required to pass a standardized patient examination before entering residency programs. These examinations involve substantial preparation of actors playing the role of patients, subjective scoring of physician notes, rating scales completed by the patients, new cases offered every day for candidates so as to minimize the value of advance information from other candidates and extensive quality control procedures. Today, such examinations are a routine part of the credentialing process for all medical students, and only rarely have there been challenges from candidates regarding validity and fairness.34 Perhaps portfolio assessments, too, will overcome the technical challenges ahead and will advance the validity of dental credential examinations, much as standardized patient examinations have in medicine and related fields.

Significant resources are being applied to the development of one or two national clinical licensure examinations. This conforms to the direction—national versus regional or local—favored by leaders in dental education.1 However, the proposed examinations are based on the current model of one-time clinical observation. Given the current interest in reforming the system, now is the time to apply resources to investigating which of several possible alternative models provide the most reliable, valid, fair and cost-effective process, either in total or as part of a licensure examination, not simply to extending the current model to the national level. In the end, as is currently the case, the cost of licensure examinations will be borne, directly or indirectly, principally by the candidates for licensure. However, it is in the interest of the dental profession and the public it serves to make the initial investment to determine the best strategy for the future. Of course it will not be easy. Things this important rarely are.

Dentistry in Canada has, for several years, used a different model for clinical licensure examinations than is the case in the United States. Canada’s examination, consisting of a combination of a written examination and an OSCE, was developed cooperatively by educators and examiners and has been validated concurrently against performance in that nation’s dental schools.35 In the United States, written examinations exist as developed and administered by the Joint Commission on National Board Examinations and do not need duplication or supplementation by additional written testing.36 Canada’s success with OSCEs (in both dentistry and medicine) should stimulate consideration of that approach along with others in the United States. Other professions in the United States, such as optometry and medicine, use an OSCE approach, including the use of standardized patients. The Educational Commission for Foreign Medical Graduates in the United States has been conducting an OSCE-like examination since 1998 for graduates of foreign medical schools seeking to apply to residency programs in the United States. The testing program involves many thousands of candidates each year. Dentistry remains alone among health care professions in requiring performance of irreversible procedures in patients in an attempt to detect incompetence.

CONCLUSIONS

The current model of clinical licensure examination in the United States suffers from low reliability, absence of evidence for validity of inferences from the testing and ethical concerns about the use of patients. In this time of reconsideration of licensure processes, other models such as portfolios and OSCEs should be evaluated to determine if they might improve clinical testing for licensure purposes. We repeat that we are in agreement with many of the potential problems of portfolios identified by Hammond and Buckendahl.27 Their article has laid out clearly the technical, fairness and practical standards that must be met with portfolio assessments, and in doing so, it serves the profession well. But we do disagree when they argue that the standards could not be met in practice with portfolio assessments serving as part of the licensure process in the dental profession. There already are several examples of successful portfolio assessments and OSCEs in place that do meet the requisite standards. It remains possible for the dental profession to do the same, but an investment in research is needed.


   FOOTNOTES 

Dr. Ranney is a professor emeritus, Dental School, University of Maryland. Address reprint requests to Dr. Ranney at 729 Larue Road, Millersville, Md. 21108, e-mail "pranney3@comcast.net".


Dr. Hambleton is a distinguished university professor, Center for Educational Assessment, University of Massachusetts, Amherst.


Support for the development of this article was provided by the American Dental Education Association (ADEA). The opinions expressed are those of the authors and do not necessarily reflect opinions or policies of ADEA.


   REFERENCES 

  1. Ranney RR, Haden NK, Weaver RW, Valachovic RW. A survey of deans and ADEA activities on dental licensure issues. J Dent Educ 2003;67(10):1149–60.

  2. Dugoni AA. Protection of the public. J Calif Dent Assoc 2003;31(11):801–3.

  3. Formicola AJ, Lichtenthal R, Schmidt HJ, Myers R. Elevating clinical licensing examinations to professional testing standards. N Y State Dent J 1998;64(1):38–44.

  4. Gadbury-Amyot CC, Kim J, Palm RL, Mills GE, Noble E, Overman PR. Validity and reliability of portfolio assessment of competency in a baccalaureate dental hygiene program. J Dent Educ 2003;67(9):991–1002.

  5. Chambers DW. Portfolios for determining initial licensure competency. JADA 2004;135(2):173–84.

  6. American Dental Association, Committee on the New Dentist. Dental boards and licensure information for the new graduate. Chicago: American Dental Association; 2001.

  7. American Dental Association, Council on Dental Education and Licensure. 2002 survey of clinical testing agencies. Chicago: American Dental Association; 2003.

  8. American Dental Association, American Student Dental Association. Dental boards and licensure information for the new graduate. Chicago: American Dental Association, American Student Dental Association; 2003.

  9. American Dental Association, American Student Dental Association. Dental boards and licensure information for the new graduate. Chicago: American Dental Association, American Student Dental Association; 2004.

  10. Ranney RR. What the available evidence on clinical licensure exams shows. J Evid Base Dent Pract (in press).

  11. Ranney RR, Gunsolley JC, Miller LS, Wood M. The relationship between performance in a dental school and performance on a clinical examination for licensure: a nine-year study. JADA 2004;135(8):1146–53.

  12. Casada JP, Cailleteau JG, Seals ML. Predicting performance on a dental board licensure examination. J Dent Educ 1996;60(9):775–7.

  13. Hangorsky U. Clinical competency levels of fourth-year dental students as determined by board examiners and faculty members. JADA 1981;102(1):35–7.

  14. Ranney RR, Wood M, Gunsolley JC. Works in progress: a comparison of dental school experiences between passing and failing NERB candidates, 2001. J Dent Educ 2003;67(3):311–6.

  15. Stewart CM, Bates RE Jr, Smith GE. Relationship between performance in dental school and performance on a dental licensure examination: an eight-year study. J Dent Educ 2005;69(8):864–9.

  16. Stewart CM, Vertucci FJ, Bates RE. Improving performance on the endodontic section of the Florida Dental Licensure Examination. J Dent Educ 2004;68(8):829–33.

  17. Anusavice KJ, Benn DK. Is it time to change state and regional dental licensure board exams in response to evidence from caries research? Crit Rev Oral Biol Med 2001;12(5):368–72.

  18. Buchanan RN. Problems related to the use of human subjects in clinical evaluation/responsibility for follow-up care. J Dent Educ 1991;55(12):797–801.

  19. Dugoni AA. Licensure: entry level examinations—strategies for the future. J Dent Educ 1992;56(4):251–3.

  20. Feil P, Meeske J, Fortman J. Knowledge of ethical lapses and other experiences on clinical licensure examinations. J Dent Educ 1999;63(6):453–8.

  21. Formicola AJ, Shub JL, Murphy FJ. Banning live patients as test subjects on licensing examinations. J Dent Educ 2002;66(5):605–9.

  22. Hasegawa TK. Ethical issues of performing invasive/irreversible dental treatment for purposes of licensure. J Am Coll Dent 2002;69(2):43–6.

  23. Meskin L. ‘Freshly washed little cherubs.’ JADA 2001;132(8):1078–82.

  24. American Dental Association. Report: study of role of patient-based examinations. ADA House of Delegates 2002;5067–77.

  25. Maitland R. President’s message. The Bulletin (American Association of Dental Examiners);Summer 2003:1.

  26. Carraccio C, Englander R. Evaluating competence using a portfolio: a literature review and web-based application to the ACGME competencies. Teach Learn Med 2004;16(4):381–7.

  27. Hammond D, Buckendahl CW. Do portfolio assessments have a place in dental licensure? No, portfolio assessments should not be used in dental licensure. JADA 2006;137:30–42.

  28. Chambers DW, Dugoni AA, Paisley I. The case against one-shot testing for initial dental licensure. J Calif Dent Assoc 2004;32(3):243–52.

  29. Chambers DW, Loos L. Analyzing the sources of unreliability in fixed prosthodontics mock board examinations. J Dent Educ 1997;61(4):346–53.

  30. Brennan RL. Generalizability theory. New York: Springer; 2001.

  31. Moss PA, Sutherland LM, Haniford L, Geist PK. Interrogating the generalizability of portfolio assessments of beginning teachers: a qualitative study. Educ Policy Analysis Arch 2004;12(32).

  32. Reckase MD. Portfolio assessment: a theoretical estimate of score reliability. Educ Meas: Issues Prac 1995;14(1):12–14, 31.

  33. Roberts C, Newble DI, O’Rourke AJ. Portfolio-based assessments in medical education: are they valid and reliable for summative purposes? Med Educ 2002;36(10):899–900.

  34. Boulet JR, McKinley DW, Whelan GP, Hambleton RK. Quality assurance methods for performance-based assessments. Adv Health Sci Educ Theory Pract 2003;8:27–47.

  35. Gerrow JD, Murphy HJ, Boyd MA, Scott DA. Concurrent validity of written and OSCE components of the Canadian dental certification examinations. J Dent Educ 2003;67(8):896–901.

  36. Ranney RR, Gunsolley JC, Miller LS. Comparisons of National Board Part II and NERB’s written examination for outcomes and redundancy. J Dent Educ 2004;68(1):29–34.




