Endoscopic assessment and grading of Barrett's esophagus using magnification endoscopy and narrow-band imaging: Accuracy and interobserver agreement of different classification systems (with videos)
Background: Three different classification systems for the evaluation of Barrett's esophagus (BE) using magnification endoscopy (ME) and narrow-band imaging (NBI) have been proposed. Until now, no comparative, external evaluation of these systems in a clinical-like situation has been performed.
Objective: To compare and validate these 3 classification systems.
Design: Prospective validation study.
Setting: Tertiary-care referral center. Nine endoscopists with different levels of expertise from Europe and Japan participated as assessors.
Patients: Thirty-two patients with long-segment BE.
Interventions: From a group of 209 standardized prospective recordings of BE obtained by using ME combined with NBI, 84 high-quality videos were randomly selected for evaluation. Histologically, 28 were classified as gastric type mucosa, 29 as specialized intestinal metaplasia (SIM), and 27 as SIM with dysplasia/cancer. Assessors were blinded to the underlying histology and scored each video according to the respective classification system. Before evaluation, assessors carefully studied an educational set for each classification system. At each assessment, the same 84 videos were displayed, but in a different, random order.
Main Outcome Measurements: Accuracy for detection of nondysplastic and dysplastic SIM. Interobserver agreement for each classification.
Results: The median time for video evaluation was 25 seconds (interquartile range 20-39 seconds) and was longer with the Amsterdam classification (P < .001). In 65% to 69% of the videos, assessors reported being certain of their histology prediction. The global accuracy was 46% and 47% using the Nottingham and Kansas classifications, respectively, and 51% with the Amsterdam classification. The accuracy for identification of nondysplastic SIM ranged from 57% (Kansas and Nottingham) to 63% (Amsterdam). Accuracy for dysplastic tissue was 75%, irrespective of the classification system and assessor expertise level. Interobserver agreement ranged from fair (Nottingham, κ = 0.34) to moderate (Amsterdam and Kansas, κ = 0.47 and 0.44, respectively).
Limitation: No per-patient analysis.
Conclusions: All of the available classification systems could be used in a clinical-like environment, but with inadequate interobserver agreement. All classification systems based on combined ME and NBI revealed substantial limitations in predicting nondysplastic and dysplastic BE when assessed externally. This technique cannot, as yet, replace random biopsies for histopathological analysis.
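To illustrate how an interobserver agreement figure of this kind can be obtained and labeled, the following is a minimal sketch. The abstract does not state which κ statistic was used. The sketch assumes Fleiss' kappa for multiple raters and the commonly used Landis and Koch bands (0.21-0.40 fair, 0.41-0.60 moderate), which are consistent with the labels reported above. The function names and the toy ratings are hypothetical and are not study data.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings: list of lists; ratings[i] holds the category label each rater
             assigned to item i (assumption: equal rater count per item).
    categories: all category labels that could be assigned.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        totals.update(counts)
        # Proportion of agreeing rater pairs for this item.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_observed = agreement_sum / n_items
    # Agreement expected by chance, from the overall category proportions.
    p_expected = sum((totals[c] / (n_items * n_raters)) ** 2 for c in categories)
    return (p_observed - p_expected) / (1 - p_expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch verbal label."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical toy data: 4 videos, 3 assessors each (NOT the study's ratings).
CATEGORIES = ["gastric", "SIM", "dysplasia"]
toy_ratings = [
    ["gastric", "gastric", "SIM"],
    ["SIM", "SIM", "SIM"],
    ["dysplasia", "SIM", "dysplasia"],
    ["gastric", "SIM", "dysplasia"],
]
toy_kappa = fleiss_kappa(toy_ratings, CATEGORIES)
print(f"toy kappa = {toy_kappa:.2f} ({landis_koch(toy_kappa)})")

# Labels for the kappa values reported in the abstract.
for name, k in [("Nottingham", 0.34), ("Amsterdam", 0.47), ("Kansas", 0.44)]:
    print(name, k, landis_koch(k))
```

Under these assumed bands, the reported values 0.34, 0.47, and 0.44 map to fair, moderate, and moderate, matching the wording of the results.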