ORIGINAL RESEARCH ARTICLE
Year: 2014 | Volume: 27 | Issue: 3 | Page: 269-276
A 360-degree evaluation of the communication and interpersonal skills of medicine resident physicians in Pakistan
Muhammad Tariq1, John Boulet2, Afaq Motiwala3, Nida Sajjad4, Syeda Kauser Ali5
1 Associate Professor and Section Head, Internal Medicine, Director Postgraduate Programs, Department of Medicine, Aga Khan University, Karachi, Pakistan
2 Research Supervisor, Vice President, Research and Data Resources, for the Foundation for Advancement of International Medical Education and Research, Assistant Vice President, Research and Evaluation, for the Educational Commission for Foreign Medical Graduates
3 Department of Medicine, Research Assistant, Aga Khan University Hospital, Karachi, Pakistan, Resident in Internal Medicine, University of Texas, Southwestern Medical Centre, Dallas, Texas, USA
4 Research Assistant, Department of Medicine, Aga Khan University Hospital, Karachi, Pakistan
5 Associate Professor, Department of Educational Development, Aga Khan University, Chair Curriculum and Assessment Committee for Postgraduate Medical Education, Karachi, Pakistan
Date of Web Publication: 26-Feb-2015
Associate Professor and Section Head, Internal Medicine, Director Postgraduate Programs, Department of Medicine, Aga Khan University, Karachi
Source of Support: None, Conflict of Interest: None
Background: To provide high-quality patient care, effective communication and interpersonal skills are necessary for physicians. A 360-degree evaluation of residents in the department of medicine was conducted to assess their interpersonal and communication skills. The measurement properties and utility of the multi-source ratings were investigated. Methods: A cross-sectional assessment of a cohort of Internal Medicine residents was conducted at the Aga Khan Medical University in Pakistan. Every resident (n = 49) was evaluated by eight raters, including physicians, nurses and unit staff. Each resident also completed a self-evaluation. Evidence to support the validity of the ratings was gathered by exploring performance differences amongst more- and less-experienced providers. Analysis of variance (ANOVA) was employed to test for differences in mean scores, both for rater type and experience (residency year). Generalizability theory was employed to estimate the reliability of the ratings. Results: We received 367/441 (83.2%) completed forms. There was a significant effect attributable to rater source (F = 5.2, P < 0.01). There were no significant differences in mean scores for residents at different levels of training. The mean resident self-assessment scores were significantly lower than those provided by faculty (P < 0.01). Based on eight raters, the reliability of the ratings was moderate (ρ² = 0.39). Discussion: The 360-degree evaluation technique can be used to measure the communication and interpersonal skills of residents. It can also provide important data to guide resident feedback. Health care providers and staff who interact with residents on a regular basis can, as a group, provide moderately consistent judgments of their abilities.
Keywords: Communication skills, interpersonal, multisource feedback, residents, ratings, 360-degree evaluation
How to cite this article:
Tariq M, Boulet J, Motiwala A, Sajjad N, Ali SK. A 360-degree evaluation of the communication and interpersonal skills of medicine resident physicians in Pakistan. Educ Health 2014;27:269-76
How to cite this URL:
Tariq M, Boulet J, Motiwala A, Sajjad N, Ali SK. A 360-degree evaluation of the communication and interpersonal skills of medicine resident physicians in Pakistan. Educ Health [serial online] 2014 [cited 2017 Sep 21];27:269-76. Available from: http://www.educationforhealth.net/text.asp?2014/27/3/269/152188
Background
A curriculum is not complete without a well-structured assessment system. Assessment plays an integral role in identifying and responding to learners' needs. In the recent past, various innovative methods have been used to enhance the quality of assessment to make scores more accurate, reliable and timely. To promote learning, assessment should be formative and educational, and should incorporate constructive feedback in the process. To assess the competency of physicians, assessment also needs to have a summative function, where specific criteria delineate adequate and inadequate performance. In the United States of America, the Accreditation Council for Graduate Medical Education (ACGME) proposed and implemented a competency-based framework in 1998, which includes resident communication and interpersonal skills (CIS) as major competencies to be taught and assessed. All assessment methods have strengths and weaknesses. Often, in an attempt to minimize measurement error and to obtain more valid indicators of ability, multiple assessment methods are employed to measure specific competencies. However, regardless of the assessment method chosen, the usefulness of the derived measures can be judged with respect to reliability, validity, impact on future learning, acceptability to stakeholders (including learners and teachers) and cost. Reliability refers to the reproducibility of the scores obtained from an assessment, and validity refers to whether an instrument actually measures what it claims to measure.
To assess CIS, multiple assessment methods can be employed. The 360-degree evaluation, also known as multisource feedback, is one of the promising methods of assessing competencies in the workplace. As per Rodgers et al., the successful implementation of a 360-degree evaluation instrument requires formal training of raters in educational testing and measurement, computer expertise, and pilot testing to determine reliability and validity. The success of 360-degree evaluation depends on the fairness of the raters, their ability to provide feedback and the ratees' ability to receive feedback. Extensive literature, both in healthcare and in industry, shows multisource feedback to be practical, valid and reliable. For instance, Joshi et al. found that a 360-degree instrument consisting of a 10-item questionnaire completed by nurses, faculty, allied health professionals, medical students, patients, residents and self yielded reliable evaluations of residents' competency in interpersonal and communication skills, and could effectively be used to guide formative feedback.
A diverse group of practitioners and staff who interact with residents, each with their own perspective, can provide individual evaluations. The evaluators may include faculty, fellow residents, medical students, nurses, ancillary staff, patients, families and the residents themselves through self-assessment. When ratings come from a single rater source, 10-11 peers, 5-15 nurses, 20-50 faculty or 50-147 patients are needed to get reliable estimates of residents' competency. In comparison, Ramsey et al., in their analysis of multisource ratings of 187 physicians through 3005 questionnaires, found that 10-11 responses per assessee were needed to achieve a generalizability coefficient of 0.7. Another study demonstrated the need for eight raters to give an intra-class coefficient of 0.8. More than 4000 residency programs in the USA and all foundation programs in the UK use 360-degree evaluations to assess trainees. This method of assessment has also been used for family doctors and surgeons in Canada and internists in the USA.
Because of the need for multiple evaluations by independent raters, and the need to gather data over time, the process of conducting 360-degree evaluation demands significant logistical support. However, if it is conducted with proper groundwork and with teaching and training of the potential evaluators, errors can be minimized, and the validity and reliability of the evaluations can be improved.
The goals of the study were to (i) document the feasibility of 360-degree evaluations of residents by hospital staff and (ii) gather preliminary data to support the reliability and validity of the 360-degree evaluation ratings.
Methods
The Aga Khan Medical University (AKU) Karachi, Pakistan is a philanthropic, not-for-profit, private teaching institution, chartered in 1983. The University comprises a medical college, a school of nursing and a postgraduate medical education department. AKU is committed to the development of human capacities through the discovery and dissemination of knowledge, and application through service. It seeks to prepare individuals for constructive and exemplary leadership roles, and for shaping public and private policies, through strength in research and excellence in education, all dedicated to providing meaningful contributions to society. Our undergraduate program was restructured and a problem-based curriculum was implemented in 2003. It is a hybrid curriculum that retains some traditional components. Our postgraduate program is undergoing structural change, and we are responding to current needs with a competency/outcome-based curriculum. The outcome-based assessments are already implemented and the curriculum is in the process of development. The following competencies, which the graduate will possess, have been identified: Medical Expertise, CIS, Professionalism, Practice-based Learning and Improvement (PBLI), Systems-based Practice (SBP) and Scholarship. We have 33 residency programs of 3-5 years' duration, 27 fellowship programs of 1-3 years' duration and an internship program. Candidates can apply from all over Pakistan.
At the Aga Khan University Hospital (AKUH), we have a 4-year structured residency program in internal medicine. Over the years it was observed that most of the complaints about our residents from nursing staff, other health care providers, patients and their families were related to attitude, communication and interpersonal skills. We therefore conducted a 360-degree evaluation of the interpersonal and communication skills of our residents.
This was a cross-sectional study conducted at the AKUH between November 2009 and March 2010. This study was a component of a larger project designed to look at the impact of structured verbal feedback on residents' performance. Initially, we designed a 360-degree evaluation form based on input from the authors, an extensive literature search and expert opinion. The evaluation form included a list of questions pertaining to various attributes, including residents' communication skills, teamwork skills, punctuality, reliability (dependability) and overall professional competence. Each item (dimension) on the instrument was rated on a unidirectional rating scale ranging from poor to excellent. A series of discussions was held with faculty and residents to review the form for contextual relevance and clarity. The 360-degree evaluation form was further modified and refined based on the discussions. We decided to add an 'unable to comment' option for cases where the evaluator did not feel able to make a judgment concerning skill level for a particular attribute. The modified 360-degree evaluation form was then pilot-tested with 15 evaluators, including residents and faculty. The opinions of the evaluators regarding the form were sought; they generally seemed satisfied with the form and suggested minor modifications and clarifications in the anchors, which were incorporated in the final form.
The final 360-degree evaluation form (appendix 1 [Additional file 1]) was then administered to all residents, in year 1 (R1) to year 4 (R4), in the department of internal medicine. There were more residents in the first two years of the core internal medicine residency because, as part of normal program progression, some residents transition into subspecialty programs. Each resident was rated by eight raters. A ninth rating, a self-rating, was also completed by the residents themselves using the same form. A list of potential raters was developed, including faculty, nursing staff, unit receptionists (URs), ward coordinators and peers with whom the resident had rotated/interacted in the last six months for a period of more than two months. The raters were then randomly selected by the research associate with the help of the chief residents. The evaluators included two nurses (one head nurse and one registered nurse), two faculty members, one UR, one ward service coordinator (ward manager), two peers (fellow residents) and self.
Informed consent was sought from all residents being studied and the selected raters prior to initiating the study. Meetings were held to ensure their full understanding and cooperation, and to facilitate a smooth implementation of the study. The 360-degree evaluation form was shared in the meeting and any queries regarding the items on the form clarified; however, structured rater training was not conducted. Ethics approval was obtained from the Ethical Review Committee (ERC) of the AKUH, Karachi, Pakistan.
The data were entered and analyzed using SAS version 9.1. The mean scores obtained by each resident for each domain, as well as the mean of all domains for each resident, were calculated and compared. The different categories of evaluators were also grouped together and compared with respect to their mean ratings. To gather evidence to support validity, the residents were grouped together based on their year of training, and their mean scores compared. The results were expressed as means (± standard deviation) and percentages. Analysis of variance (ANOVA) was employed to test for differences in mean scores, both for rater type and residency year. Based on the total score (average of dimension ratings), Generalizability theory was employed to estimate the reliability of evaluation ratings. The Generalizability study was conducted to look at the variability in ratings associated with the choice and number of raters. The variance components were used to estimate the reliability of the ratings procured from the 360-degree evaluation. The reliability of the evaluations based on eight raters was first calculated. Then, based on a decision study, using the estimated variance components, the number of raters needed to achieve a reliability coefficient of 0.80 was estimated.
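The logic of a decision study can be illustrated with a minimal Python sketch. It assumes the simplest possible design, in which each resident is scored by their own set of raters, so that the coefficient for a mean over n raters is the person variance divided by the person variance plus the residual variance over n. The variance components below are hypothetical placeholders, not the values estimated in this study, and the function names are illustrative only; the authors' actual design and SAS procedure may differ, so projections from this sketch need not match the figures reported here.

```python
def g_coefficient(var_person, var_residual, n_raters):
    """Generalizability coefficient for a mean score over n_raters ratings.

    Assumes a single-facet design: var_person is the variance between
    residents, var_residual pools rater and error variance.
    """
    if var_person <= 0 or var_residual < 0 or n_raters < 1:
        raise ValueError("variance components and rater count must be valid")
    return var_person / (var_person + var_residual / n_raters)


def raters_needed(var_person, var_residual, target, max_raters=1000):
    """Smallest number of raters whose mean reaches the target coefficient."""
    for n in range(1, max_raters + 1):
        if g_coefficient(var_person, var_residual, n) >= target:
            return n
    return None  # target not reachable within max_raters


if __name__ == "__main__":
    # Hypothetical variance components, chosen only for illustration.
    VAR_PERSON = 1.0
    VAR_RESIDUAL = 4.3
    print(round(g_coefficient(VAR_PERSON, VAR_RESIDUAL, 8), 2))
    print(raters_needed(VAR_PERSON, VAR_RESIDUAL, 0.80))
```

The search loop is used instead of a closed-form rearrangement so that the returned rater count is always a whole number that actually meets the target.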
Results
A total of 49 residents from the Internal Medicine residency program were evaluated using the 360-degree evaluations. Twenty-five (51%) of these residents were males and 24 (49%) were females. Seventeen (34.7%) of the residents were in year 1 of training, 13 (26.5%) in year 2, 10 (20.4%) in year 3 and 9 (18.4%) in year 4 of their training.
For purposes of analysis and comparison, the raters were divided into five broad categories: faculty, nurses, residents, unit staff (UR/service coordinators) and self-evaluators. We received a total of 367 evaluations, but 3 were omitted due to incomplete data. We received 62 evaluations from faculty, 94 from nurses, 95 from the residents, 84 from URs and 32 (of 49) self-evaluations.
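The response figures can be cross-checked with a short Python sketch. The received counts are those reported above; the expected counts are an assumption derived from the rater design described in the Methods (two faculty, two nurses, two peers, two unit staff and one self-rating for each of 49 residents), and the group labels and function name are illustrative.

```python
N_RESIDENTS = 49

# Raters per resident, taken from the design described in the Methods.
# Treating the UR and the ward coordinator together as "unit staff" is an
# assumption made here to match the five reporting categories.
RATERS_PER_RESIDENT = {
    "faculty": 2,
    "nurses": 2,
    "residents": 2,
    "unit staff": 2,
    "self": 1,
}

# Evaluations received per group, as reported in the Results.
RECEIVED = {
    "faculty": 62,
    "nurses": 94,
    "residents": 95,
    "unit staff": 84,
    "self": 32,
}


def response_rates(n_residents, raters_per_resident, received):
    """Return {group: (received, expected, percent)} for each rater group."""
    rates = {}
    for group, per_resident in raters_per_resident.items():
        expected = n_residents * per_resident
        got = received[group]
        rates[group] = (got, expected, 100.0 * got / expected)
    return rates


if __name__ == "__main__":
    rates = response_rates(N_RESIDENTS, RATERS_PER_RESIDENT, RECEIVED)
    for group, (got, expected, pct) in rates.items():
        print(f"{group}: {got}/{expected} ({pct:.1f}%)")
    total_got = sum(RECEIVED.values())
    total_expected = sum(e for _, e, _ in rates.values())
    print(f"overall: {total_got}/{total_expected} "
          f"({100.0 * total_got / total_expected:.1f}%)")
```

Under these assumptions the group counts sum to the 367 forms out of 441 stated in the abstract, and the per-group rates show where the missing forms are concentrated.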
Mean ratings for all residents, averaged over domains, were compared among rater types using ANOVA, and followed up, where appropriate, with post hoc tests (Scheffe). Based on the one-way ANOVA and the total score across dimensions, there were significant differences in mean scores by rater group (F = 5.2, P < 0.01). Based on post hoc Scheffe's tests, there were significant differences in mean ratings (P < 0.05) between the unit staff (M = 6.2, SD = 1.3) and self-evaluations (M = 5.4, SD = 1.0), and unit staff (M = 6.2, SD = 1.3) and nurses (M = 5.4, SD = 1.3).
Mean ratings provided by unit staff were the highest. Evaluations provided by nurses were, on average, relatively low, while those by faculty (M = 5.7, SD = 1.0) and other residents (M = 5.8, SD = 1.0) were higher. [Table 1] shows mean ratings by rater group.
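For readers unfamiliar with the test, the F statistic of a one-way ANOVA across rater groups can be sketched in plain Python as follows. The ratings in the example are toy values invented for illustration, not data from this study, and the function name is hypothetical; the study itself used SAS, and this sketch does not compute the P value or the Scheffe post hoc comparisons.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and spare degrees of freedom")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum(
        sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means)
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Toy ratings for three hypothetical rater groups (not study data).
    toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    f_stat, df_b, df_w = one_way_anova_f(toy_groups)
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F indicates that the group means differ by more than the within-group scatter would suggest, which is the sense in which the rater-source effect reported above is significant.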
[Table 2] shows the mean total ratings classified on the basis of level (year of residency). Level 2 residents were rated the lowest overall, whereas Level 3 residents were rated the highest. However, there were no significant differences in mean scores by residency level (P = 0.08).
Based on the estimated variance components, the generalizability of the total evaluation score (average of dimension ratings), with eight raters, was 0.39. A decision study indicated that it would take 24 raters to achieve a generalizability coefficient of 0.80.
Discussion
Three hundred and sixty-degree evaluation has proven to be an effective tool for professional assessment in business organizations, and recently it has been employed for the evaluation of healthcare workers. Having been recognized in the ACGME's Toolbox of Assessment Methods, the 360-degree evaluation is second only to standardized patients (SPs) during an Objective Structured Clinical Examination (OSCE) in evaluating the competency of residents.
During the four-year medicine residency, besides interacting with the clinical faculty, residents work in close association with nurses, unit coordinators and URs, peers including junior/senior residents, interns, medical students, patients and their families. Studies have shown that different groups of raters have unique perspectives on physician behavior, and that ratings among physicians, nurses and patients can vary. Raters from different disciplines can provide specific performance-relevant information to the ratee that may not be identified by supervisory ratings alone. Considering the complex workplace nature of the job and interpersonal relations, it is believed that a single supervisor's ratings may not be sufficient to provide a holistic picture of the work performance of a trainee. The 360-degree evaluation is a workplace-based assessment, where raters provide motivated social judgments of performance. Raters are active information processors who gather, interpret, integrate and retrieve information for judgment and decision making, which is influenced by their understanding of performance, personal goals and interactions with the ratee in the social context of the assessment process. Therefore, even though the 360-degree evaluation may not evaluate all aspects of competence, it plays an important role in the assessment of residents, as it can employ multiple observers and can be done repeatedly to assess any change in skills.
For our study, we designed an 8-dimension 360-degree evaluation instrument to assess the interpersonal and communication skills of residents. In comparison, previously published studies used more complex rating scales that required the evaluation of 26, 12 and 19 domains. Since getting evaluations from multiple sources requires significant logistic support, we tried to keep our rating instrument simple and user friendly. Analysis of our data showed that, while there were no significant differences in average ratings across residency years, the ratings provided by the faculty and peers were higher than those provided by the nurses. This affirms the concept that different raters can have different views and associated performance expectations. Lower nurse ratings were expected, as many of the incidents and complaints against residents related to interpersonal behaviors are raised by nurses. Other studies have also reported weak correlation between nurse and faculty evaluations of the same trainees. Nurses are a very important part of healthcare teams and they have unique opportunities to observe the day-to-day behaviors of the residents, particularly after hours, during the performance of procedures, and in managing emergencies. As per Ogunyemi et al., the evaluations by the nurses would be useful for formative feedback to residents for their professional development.
As part of this investigation, residents also completed self-evaluations. This allows for a 'gap analysis' between how individuals perceive themselves and how others perceive them. Self-assessment has been identified as a vital aspect of professional self-regulation. Generally, our results show that residents underestimate their ability compared with other assessors. Evidence from a large systematic review on the accuracy of physician self-assessment suggests that physicians have a limited ability to accurately self-assess. Three large studies demonstrated that self-ratings tend to be higher than multisource feedback ratings, particularly in the less highly rated individuals. The literature supports that individuals scored low by others tend to overestimate their behaviors when they score themselves, and that individuals scored high by others tend to self-rate lower than these others. Although students and residents as learners may not be accurate, their perceptions must be taken into account. As per Eva and Regehr, self-assessment functions as a mechanism for identifying one's weaknesses and strengths. Self-assessment is a very effective tool for assessment and feedback, and its structure and function need to be explored further.
While one might expect communication ability to increase with experience, we found that average ratings were highest for R3s and lowest for R2s. This finding could be related to the structure of our postgraduate program with respect to clinical rotations and the level of supervision. The residents are given graded responsibilities during their four years of training. In the first year they rotate mostly in general medicine in a team structure. The team is led by the senior resident (year 3/4) and the year-1 resident works under his/her umbrella. The year-1 residents interact with all the co-workers but they are always supervised. In year 2, residents mostly rotate in subspecialties under the indirect supervision of subspecialty fellows/residents. The year-2 residents are expected to see patients in the emergency room as first on-call. The senior medicine resident is not there to supervise them in dealing with any complex patient care issues. The idea behind such rotations is to provide exposure that enhances their ability to manage patients independently, although it leaves them more vulnerable and exposed with respect to their professional behavior. This may help explain why they are rated lowest (their patient care tasks are more difficult). However, more research is needed to validate these findings. When residents move to year 3, they assume the role of a senior resident, a team leader who has learnt to deal with more difficult situations, including communicating with team members and families. The year-4 residents are near the completion of their training and the expectations of them are high; therefore, they may receive relatively lower ratings. It is also possible that this particular cohort performed less well, and multiple evaluations over time would give us more appropriate results.
Despite all of the above explanations concerning the complexities of patient care and the expectations of the assessors, larger samples of residents are needed to more fully investigate the relationship between experience, case complexity, and communication ability. Moreover, across all residency cohorts, some individuals received poor evaluations, indicating that further communication skills assessment and remediation are necessary.
Based on eight evaluations of each resident, the generalizability of the overall communication score was only moderate (ρ² = 0.39). While this level of reliability would not be acceptable for summative assessment purposes, it may be adequate for the purposes of providing feedback to the residents, especially if verbal comments accompany the numerical scores. Unfortunately, given the relatively small number of resident evaluations and evaluators, it was not possible to look at the variance attributable to each of the rater groups. This would certainly be a worthwhile goal of a future investigation. Given larger numbers of evaluations by each group (e.g. nurses, physicians), and the availability of specific comments, one might be able to look more closely at what specific groups are expecting in terms of resident communication skills.
This study is a first step in evaluating the communication skills of residents in our program, and it has some limitations. First, it was a single-center study and the participants were internal medicine residents only; therefore, generalizability to other programs is not known. While the ultimate goal of the broader study is to look at the impact of feedback from the 360-degree evaluation, which necessitates repeated measurement of the residents, longitudinal data are not yet fully available across the cohorts. Second, because of logistical issues, we did not include patients and their families as raters. Ultimately, doctor-patient communication is probably best assessed by patients, and our future studies on 360-degree evaluation need to include patients as evaluators. Third, some of the items in our 360-degree instrument included more than one concept, such as communication and attitude, which might have made it difficult for the raters to assess the ratees. In addition, we did not conduct rigorous rater training on how to fill out the forms; modifying the instrument, with an improved explanatory guide and appropriate rater training, is one of our future goals. Lastly, we used a conventional rating scale, with ratings from 1 to 7, where 1 was poor and 7 was excellent. However, recent work on workplace-based assessment by Crossley et al. has shown greater utility of construct-aligned scales, which are more reliable and have greater validity. In construct-aligned scales, behavioral descriptors link the expertise of the raters and the trainee's developing ability to the construct of clinical independence, so that assessors can discriminate more widely between low- and high-performing individuals.
In addition, recent literature emphasizes the importance of using judgments by experts, rather than objective observations, for assessment in the workplace. One of our future research opportunities would be to study the feasibility and utility of construct-aligned rating scales in 360-degree evaluation.
In conclusion, the use of a 360-degree evaluation in our setting is feasible yet demands additional psychometric work. Gathering more evidence to support the validity of evaluation ratings is certainly warranted. The possibility that a single 360-degree evaluation form may not work for all types of raters, if they do not share similar opportunities to observe resident skills, should be explored further. Moreover, questions concerning the proper number and mix of assessors, the periodicity of the evaluations, and the quality and impact of feedback from 360-degree evaluations need to be answered.
Acknowledgments
The authors wish to acknowledge the faculty for the development of the questionnaire. The authors are also grateful to all the evaluators, including faculty, nurses, unit (ward) coordinators (managers), URs and residents, for taking the time to fill in the 360-degree questionnaire. The authors would like to thank the FAIMER faculty for providing guidance and help.
References
Epstein RM. Assessment in medical education. N Engl J Med 2007;356:387-96.
Wass V, Van der Vleuten C, Shatzer J, Jones R. Assessment of clinical competence. Lancet 2001;357:945-9.
Swing S, Bashook P. Toolbox of assessment methods. version 1.1. Evanston, USA: Accreditation Council for Graduate Medical Education/American Board of Medical Specialties; 2000:2015.
Batalden P, Leach D, Swing S, Dreyfus H, Dreyfus S. General competencies and accreditation in graduate medical education. Health Aff 2002;21:103-11.
Swing SR. The ACGME outcome project: Retrospective and prospective. Med Teach 2007;29:648-54.
Van Der Vleuten CP. The assessment of professional competence: Developments, research and practical implications. Adv Health Sci Educ Theory Pract 1996;1:41-67.
Lynch DC, Surdyk PM, Eiser AR. Assessing professionalism: A review of the literature. Med Teach 2004;26:366-73.
Hodges BD, Ginsburg S, Cruess R, Cruess S, Delport R, Hafferty F, et al. Assessment of professionalism: Recommendations from the Ottawa 2010 Conference. Med Teach 2011;33:354-63.
Wilkinson TJ, Wade WB, Knock LD. A blueprint to assess professionalism: Results of a systematic review. Acad Med 2009;84:551-8.
Lurie SJ, Mooney CJ, Lyness JM. Measurement of the general competencies of the accreditation council for graduate medical education: A systematic review. Acad Med 2009;84:301-9.
Rodgers KG, Manifold C. 360-degree Feedback: Possibilities for Assessment of the ACGME Core Competencies for Emergency Medicine Residents. Acad Emerg Med 2002;9:1300-4.
Atkins PW, Wood RE. Self-Versus others' ratings as predictors of assessment center ratings: Validation evidence for 360-degree feedback programs. Pers Psychol 2006;55:871-904.
Joshi R, Ling FW, Jaeger J. Assessment of a 360-degree instrument to evaluate residents' competency in interpersonal and communication skills. Acad Med 2004;79:458-63.
Ramsey PG, Wenrich MD, Carline JD, Inui TS, Larson EB, LoGerfo JP. Use of peer ratings to evaluate physician performance. JAMA 1993;269:1655-60.
Wenrich MD, Carline JD, Giles LM, Ramsey PG. Ratings of the performances of practicing internists by hospital-based registered nurses. Acad Med 1993;68:680-7.
Ramsey PG, Carline JD, Blank LL, Wenrich MD. Feasibility of hospital-based use of peer ratings to evaluate the performances of practicing physicians. Acad Med 1996;71:364-70.
Kaplan CB, Centor RM. The use of nurses to evaluate houseofficers' humanistic behavior. J Gen Intern Med 1990;5:410-4.
Butterfield PS, Mazzaferri EL. A new rating form for use by nurses in assessing residents' humanistic behavior. J Gen Intern Med 1991;6:155-61.
Wood L, Wall D, Bullock A, Hassell A, Whitehouse A, Campbell I. 'Team observation': A six-year study of the development and use of multi-source feedback (360-degree assessment) in obstetrics and gynaecology training in the UK. Med Teach 2006;28:177-84.
Overeem K, Wollersheim H, Driessen E, Lombarts K, Van De Ven G, Grol R, et al. Doctors' perceptions of why 360-degree feedback does (not) work: A qualitative study. Med Educ 2009;43:874-82.
Boulet JR. Generalizability Theory: Basics. Encyclopedia of Statistics in Behavioral Science. USA: John Wiley and Sons Ltd; 2005.
Shavelson RJ, Webb NM, Rowley GL. Generalizability theory. Am Psychol 1989;44:922-32.
Woolliscroft JO, Howell JD, Patel BP, Swanson DB. Resident-patient interactions: The humanistic qualities of internal medicine residents assessed by patients, attending physicians, program supervisors, and nurses. Acad Med 1994;69:216-24.
McLeod PJ, Tamblyn R, Benaroya S, Snell L. Faculty ratings of resident humanism predict patient satisfaction ratings in ambulatory medical clinics. J Gen Intern Med 1994;9:321-6.
Hoffman B, Lance CE, Bynum B, Gentry WA. Rater source effects are alive and well after all. Pers Psychol 2010;63:119-51.
Govaerts M, Van de Wiel M, Schuwirth LW, Van der Vleuten C, Muijtjens A. Workplace-based assessment: Raters' performance theories and constructs. Adv Health Sci Educ Theory Pract 2013;18:375-96.
Massagli TL, Carline JD. Reliability of a 360-degree evaluation to assess resident competence. Am J Phys Med Rehabil 2007;86:845-52.
Ogunyemi D, Gonzalez G, Fong A, Alexander C, Finke D, Donnon T, et al. From the eye of the nurses: 360-degree evaluation of residents. J Contin Educ Health Prof 2009;29:105-10.
Wood J, Collins J, Burnside ES, Albanese MA, Propeck PA, Kelcz F, et al. Patient, faculty, and self-assessment of radiology resident performance. Acad Radiol 2004;11:931-9.
Eva KW, Regehr G. Self-assessment in the health professions: A reformulation and research agenda. Acad Med 2005;80 (10 Suppl):S46-54.
Davis DA, Mazmanian PE, Fordis M, Van Harrison R, Thorpe KE, Perrier L. Accuracy of physician self-assessment compared with observed measures of competence: A systematic review. JAMA 2006;296:1094-102.
Mabe PA, West SG. Validity of self-evaluation of ability: A review and meta-analysis. J Appl Psychol 1982;67:280-96.
Fletcher C. The implications of research on gender differences in self-assessment and 360 degree appraisal. Hum Resour Manag J 1999;9:39-46.
Van der Heijden BI, Nijhof AH. The value of subjectivity: Problems and prospects for 360-degree appraisal systems. Int J Hum Resour Manag 2004;15:493-511.
Wood L, Hassell A, Whitehouse A, Bullock A, Wall D. A literature review of multi-source feedback systems within and without health services, leading to 10 tips for their successful design. Med Teach 2006;28:e185-91.
Violato C, Lockyer J. Self and peer assessment of pediatricians, psychiatrists and medicine specialists: Implications for self-directed learning. Adv Health Sci Educ Theory Pract 2006;11:235-44.
Kruger J, Dunning D. Unskilled and unaware of it: How difficulties in recognizing one's own incompetence lead to inflated self-assessments. J Pers Soc Psychol 1999;77:1121-34.
Davies H, Archer J, Bateman A, Dewar S, Crossley J, Grant J, et al. Specialty-specific multi-source feedback: Assuring validity, informing training. Med Educ 2008;42:1014-20.
Sargeant J, Mann K, Sinclair D, Van der Vleuten C, Metsemakers J. Challenges in multisource feedback: Intended and unintended outcomes. Med Educ 2007;41:583-91.
Crossley J, Johnson G, Booth J, Wade W. Good questions, good answers: Construct alignment improves the performance of workplace-based assessment scales. Med Educ 2011;45:560-9.
Crossley J, Jolly B. Making sense of work-based assessment: Ask the right questions, in the right way, about the right things, of the right people. Med Educ 2012;46:28-37.
[Table 1], [Table 2]