GI-RADS reporting system for ultrasound evaluation of adnexal masses in clinical practice: a prospective multicenter study
F. AMOR*, J. L. ALCAZAR, H. VACCARO*, M. LEON and A. ITURRA
*Centro Ecografico Ultrasonic Panoramico, Santiago, Chile; Department of Obstetrics and Gynecology, Clínica Universidad de Navarra, University of Navarra, Pamplona, Spain; Department of Obstetrics and Gynecology, Clínica Las Lilas, Santiago, Chile; Department of Obstetrics and Gynecology, Clínica Indisa, Santiago, Chile
ABSTRACT
Objective To assess the clinical usefulness of a structured reporting system based on ultrasound findings for management of adnexal masses.

Methods This was a prospective multicenter study comprising 432 adnexal masses in 372 women (mean age, 44.0 (range, 13–78) years) over a 36-month period. Ninety-three (25%) women were postmenopausal and 279 (75%) women were premenopausal. Patients were evaluated with transvaginal ultrasound by one of three examiners expert in gynecological ultrasound. Reporting was provided to referring clinicians according to the Gynecologic Imaging Report and Data System (GI-RADS) classification. A predetermined management protocol was offered to referral clinicians. It was suggested that patients classified as GI-RADS 2 be managed with follow-up scan, patients classified as GI-RADS 3 undergo laparoscopic surgery and patients classified as GI-RADS 4 or 5 be referred to a gynecologic oncologist. Definitive histologic diagnosis was available in 370 cases and 62 additional cases were considered as benign because of spontaneous resolution during follow-up. These outcomes were used as the gold standard for calculating the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+) and negative likelihood ratio (LR−) of GI-RADS classification for identifying adnexal masses at high risk of malignancy, considering GI-RADS 4 and 5 as being malignant.

Results Of the 432 tumors, 112 were malignant and 320 benign. The GI-RADS classification rate was as follows: GI-RADS 2, 92 (21%) cases; GI-RADS 3, 184 (43%) cases; GI-RADS 4, 40 (9%) cases; GI-RADS 5, 116 (27%) cases. Sensitivity for this system was 99.1% (95% CI, 95.1–99.8%), specificity was 85.9% (95% CI, 81.7–89.3%), LR+ was 7.05 (95% CI, 5.37–9.45) and LR− was 0.01 (95% CI, 0.001–0.07). PPV and NPV were 71.1% and 99.6%, respectively.

Conclusions The GI-RADS reporting system performed well in identifying adnexal masses at high risk of malignancy and seems to be useful for clinical decision-making. Copyright © 2011 ISUOG. Published by John Wiley & Sons, Ltd.
INTRODUCTION
Ultrasonography is currently considered the primary imaging modality for identifying and characterizing adnexal masses1. Several approaches have been proposed for their characterization using this technique, including the examiner's subjective impression2, simple descriptive scoring systems3, mathematically developed scoring systems4, logistic regression models5 and neural networks6. Subjective impression of an experienced examiner is currently believed to be the best approach and no other method has been proven superior7,8. However, the examiner's impression is entirely subjective and recent evidence has shown that this affects not only the performance of the method itself9, but also the examiner's confidence in providing a diagnosis10. Furthermore, a recent randomized study demonstrated that examiner experience affects performance and decision-making in clinical practice11.

Due to the subjective nature of the examiner's impression there is a need for a standardized nomenclature and definition for all tumor features evaluated by ultrasound. This was provided by the International Ovarian Tumor Analysis (IOTA) consensus12. Undoubtedly, this consensus has allowed a better, homogeneous description of adnexal masses. However, there is still significant variation in the reporting of ultrasound examination results for adnexal masses13. In fact, a recent consensus conference of the Society of Radiologists in Ultrasound concluded that 'investigation into structured reporting of adnexal cysts to allow for improved communication of results and recommendations for follow-up is needed'14. In 2009 we proposed a reporting system similar to that used for breast ultrasound (BI-RADS): the Gynecology Imaging Reporting and Data System (GI-RADS), developed to facilitate communication between sonologists/sonographers and referring clinicians15. This GI-RADS classification is based on ultrasound findings, representing a summarized standardized report of those findings and also providing an estimated risk of malignancy for a given adnexal mass.

The aim of this study was to assess prospectively the use of this reporting system for decision-making in clinical practice.

Correspondence to: Dr J. L. Alcazar, Department of Obstetrics and Gynecology, Clínica Universidad de Navarra, Avenida Pio XII, 36, 31008 Pamplona, Spain (e-mail: jlalcazar@unav.es)
Accepted: 17 March 2011
ORIGINAL PAPER
probability of malignancy, based on data from previous studies15–17. The reporting system includes five categories (Table 1) and the report includes a description of the mass as well as a final GI-RADS classification (Figure S1). During the examination, tumor volume was also estimated according to the prolate ellipsoid formula (length × width × height × 0.5233, expressed in mL), but this feature was not taken into consideration for assigning a GI-RADS classification.

The meaning and goal of GI-RADS classification were explained to referring clinicians in several clinical sessions before the study started. A management protocol was offered to referral clinicians with the aim of determining whether this reporting system could be useful for deciding patient management and in avoiding confusion for clinicians. However, while we followed up patients to determine how they were managed ultimately, we were not involved in clinical decision-making.

The suggested management protocol was based on risk of malignancy as estimated by GI-RADS classification. Those patients classified as GI-RADS 1 (e.g. normal ovaries at ultrasound) were excluded from the study and from further analysis. GI-RADS 2 patients were considered for expectant management by follow-up sonography on the basis that these lesions were assumed to be functional. GI-RADS 3 patients underwent surgery by general gynecologists on the basis that these lesions were considered to be probably benign and expected to persist over time. Laparoscopy was preferable, although the surgeon managing the patient made the final decision regarding surgical approach (laparoscopy or laparotomy). Patients classified as GI-RADS 4 and 5 were referred to gynecological oncologists for appropriate additional imaging techniques (computed tomography or magnetic resonance imaging) and surgical management, on the basis that these lesions were considered to be probably or very probably malignant.
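The prolate ellipsoid volume estimate mentioned in the methods can be sketched in a few lines. This is a minimal illustration, not the authors' software: the function and parameter names are hypothetical, and it assumes the three orthogonal diameters are measured in centimetres so that the result is in mL. The coefficient 0.5233 is the one quoted in the text (close to π/6).

```python
# Illustrative sketch only; names are hypothetical, not taken from the study.
PROLATE_ELLIPSOID_FACTOR = 0.5233  # coefficient quoted in the text (close to pi/6)


def tumor_volume_ml(length_cm: float, width_cm: float, height_cm: float) -> float:
    """Estimate mass volume with the prolate ellipsoid formula.

    Assumes the three orthogonal diameters are given in centimetres,
    so the product is in cm^3, which equals mL.
    """
    for value in (length_cm, width_cm, height_cm):
        if value <= 0:
            raise ValueError("diameters must be positive")
    return length_cm * width_cm * height_cm * PROLATE_ELLIPSOID_FACTOR


if __name__ == "__main__":
    # A hypothetical mass measuring 10 x 10 x 10 cm
    print(round(tumor_volume_ml(10, 10, 10), 1))
```

As in the study, a volume computed this way would be reported alongside the GI-RADS grade rather than used to assign it.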
When surgical removal of the tumor was performed, a definitive histologic diagnosis was obtained. Tumors were classified according to World Health Organization criteria18 and malignant tumors were staged according to FIGO criteria19. Borderline tumors were considered as malignant for analytic purposes. STARD guidelines were followed for designing and conducting the study20.

Table 1 Gynecologic Imaging Report and Data System (GI-RADS) classification system for adnexal masses

GI-RADS grade   Est. prob. malignancy   Detail
1               0%                      Normal ovaries identified and no adnexal mass seen
2               < 1%                    Adnexal lesions thought to be of functional origin, e.g. follicles, corpora lutea, hemorrhagic cysts
3               1–4%                    Neoplastic adnexal lesions thought to be benign, such as endometrioma, teratoma, simple cyst, hydrosalpinx, paraovarian cyst, peritoneal pseudocyst, pedunculated myoma, or findings suggestive of pelvic inflammatory disease
4               5–20%                   Any adnexal lesion not included in GI-RADS 1–3 and with one or two findings suggestive of malignancy*
5               > 20%                   Adnexal masses with three or more findings suggestive of malignancy*

*Thick papillary projections, thick septations, solid areas and/or ascites, defined according to IOTA criteria12, and vascularization within solid areas, papillary projections or central area of a solid tumor on color or power Doppler assessment5. Est. prob., estimated probability.
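To make the link between Table 1 and the suggested management protocol concrete, the sketch below encodes both as lookup tables. It is an illustration under stated assumptions rather than a clinical tool: all names are hypothetical, and the helper only covers the GI-RADS 4 versus 5 split, which Table 1 defines by the number of findings suggestive of malignancy. Grades 1 to 3 depend on the examiner's pattern recognition and are not derived here.

```python
# Illustrative sketch of Table 1 and the suggested protocol; names are hypothetical.
ESTIMATED_PROBABILITY = {1: "0%", 2: "< 1%", 3: "1-4%", 4: "5-20%", 5: "> 20%"}

SUGGESTED_MANAGEMENT = {
    1: "excluded from the study (normal ovaries, no adnexal mass)",
    2: "expectant management with follow-up sonography",
    3: "surgery by a general gynecologist (laparoscopy preferred)",
    4: "referral to a gynecologic oncologist",
    5: "referral to a gynecologic oncologist",
}


def grade_from_suspicious_findings(n_findings: int) -> int:
    """Return GI-RADS 4 or 5 for a lesion that does not fit GI-RADS 1-3.

    One or two findings suggestive of malignancy give grade 4,
    three or more give grade 5 (Table 1).
    """
    if n_findings < 1:
        raise ValueError("GI-RADS 4-5 require at least one suspicious finding")
    return 4 if n_findings <= 2 else 5


def is_high_risk(grade: int) -> bool:
    """GI-RADS 4 and 5 are treated as high risk in the analysis."""
    if grade not in ESTIMATED_PROBABILITY:
        raise ValueError("GI-RADS grade must be between 1 and 5")
    return grade >= 4


if __name__ == "__main__":
    grade = grade_from_suspicious_findings(3)
    print(grade, ESTIMATED_PROBABILITY[grade], SUGGESTED_MANAGEMENT[grade])
```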
Statistical analysis

Categorical variables were compared using the chi-square test and tumor volumes were compared using the Mann–Whitney U-test. We calculated the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+) and negative likelihood ratio (LR−) of the GI-RADS system for identifying adnexal masses at high risk of malignancy, considering GI-RADS 2 and 3 as low risk and GI-RADS 4 and 5 as high risk. The gold standard was histologic diagnosis (benign or malignant) or spontaneous resolution of the cyst during follow-up (benign).

To determine how useful they found the GI-RADS reporting system for understanding ultrasound findings and for making decisions regarding patient management, referral clinicians involved in patient clinical decision-making were asked to complete a simple survey. This survey consisted of a single question: 'How useful do you think GI-RADS reporting system is for understanding ultrasound findings and giving confidence in clinical decisions regarding your patient?' and there were five possible answers: (A) totally useful; (B) quite useful; (C) neither useful nor useless; (D) useless; (E) completely useless.

To assess interobserver reproducibility of GI-RADS classification, two examiners (J.L.A. and A.I.) performed a separate analysis in 60 consecutive women who were already included in the study. Both examiners performed a transvaginal scan, blinded to each other's results, and each one provided a GI-RADS report. To determine the concordance between examiners we used a weighted Kappa index.

RESULTS

A total of 372 women with adnexal masses were included in this study (279 from the Clínica Universidad de Navarra and 93 from Centro Ecografico Ultrasonic Panoramico). Their mean age was 44 (range, 13–78) years. Ninety-three (25%) women were postmenopausal and 279 (75%) were premenopausal. Sixty (16%) patients had bilateral tumors, giving a total number of 432 adnexal masses assessed. The prevalence of malignant tumors was 26% (112 malignant tumors in 87 patients). Malignant tumors were more frequent in postmenopausal women (43.2%) than in premenopausal women (13.2%) (P < 0.001).

Of the 432 masses assessed, 92 (21%) were classified as GI-RADS 2, 184 (43%) as GI-RADS 3, 40 (9%) as GI-RADS 4 and 116 (27%) as GI-RADS 5. Tumor volume was significantly smaller in GI-RADS 2 and 3 cases compared with GI-RADS 4 and 5 cases, while there was no difference in tumor volume between GI-RADS 2 and 3 cases or between GI-RADS 4 and 5 cases (Table 2).

Most referring clinicians managed their patients according to GI-RADS classification. Figure 1 summarizes the classifications, management and final outcomes of the study population, and final histological diagnoses are given in Table 3. There was no malignant tumor classified as GI-RADS 2. There was one such case classified as GI-RADS 3; this false-negative case was a 73-year-old woman with a 580 mL cyst diagnosed as benign serous cyst, but histology showed it to be a serous ovarian carcinoma, Stage Ia.

The sensitivity for the GI-RADS reporting system in predicting malignancy was 99.1% (95% CI, 95.1–99.8%), specificity was 85.9% (95% CI, 81.7–89.3%), LR+ was 7.05 (95% CI, 5.37–9.45) and LR− was 0.01 (95% CI, 0.001–0.07) (Table 4). The PPV and NPV were 71.1% and 99.6%, respectively.

All fifteen (six in Spain and nine in Chile) referring clinicians considered this reporting system to be quite useful or useful for clinical decision-making in adnexal masses. The interobserver agreement for GI-RADS classification of adnexal masses was very good (weighted kappa index = 0.846) (Table 5).

Figure 1 Flow chart showing classification by Gynecologic Imaging Report and Data System (GI-RADS), management and final outcome of the study group of 372 women with 432 adnexal masses.

Table 4 Diagnostic performance of Gynecologic Imaging Report and Data System (GI-RADS) reporting system in 372 women with 432 adnexal masses

                   Number of tumors classified as:
Final diagnosis    GI-RADS 2–3   GI-RADS 4–5   Total
Malignant          1             111           112
Benign             275           45            320
Total              276           156           432
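The diagnostic indices and the agreement statistic reported above can be recomputed from the counts in Tables 4 and 5. The sketch below is a hedged illustration, not the authors' analysis code. The function names are hypothetical. The Wilson score interval is an assumed method for the confidence limits; the paper does not name the one it used, although Wilson limits agree with the reported bounds for sensitivity and specificity to one decimal place. Linear weights are likewise an assumption for the weighted kappa, since the weighting scheme is not stated.

```python
from math import sqrt

# Table 4 counts, taking GI-RADS 4-5 as test-positive
TP, FN, FP, TN = 111, 1, 45, 275

# Table 5 agreement counts for GI-RADS 2, 3, 4 and 5 (totals omitted)
TABLE_5 = [
    [10, 1, 0, 0],
    [1, 21, 2, 0],
    [0, 3, 9, 2],
    [0, 0, 0, 11],
]


def diagnostic_indices(tp, fn, fp, tn):
    """Return the standard indices of a 2 x 2 diagnostic table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
    }


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed CI method)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def linear_weighted_kappa(table):
    """Cohen's kappa with linear disagreement weights (assumed scheme)."""
    k = len(table)
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the corners
            observed += weight * table[i][j] / n
            expected += weight * row_totals[i] * col_totals[j] / n**2
    return 1 - observed / expected


if __name__ == "__main__":
    for name, value in diagnostic_indices(TP, FN, FP, TN).items():
        print(f"{name}: {value:.4f}")
    print("sensitivity CI:", wilson_interval(TP, TP + FN))
    print("specificity CI:", wilson_interval(TN, TN + FP))
    print("weighted kappa:", round(linear_weighted_kappa(TABLE_5), 3))
```

With the Table 4 counts this gives sensitivity 99.1%, specificity 85.9%, NPV 99.6%, LR+ 7.05 and LR− 0.01, in line with the reported values, and PPV evaluates to 111/156 ≈ 0.7115. Linear weights on Table 5 give a kappa of about 0.86, which need not equal the reported 0.846 because the weighting used in the study is not stated.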
Table 3 Gynecologic Imaging Report and Data System (GI-RADS) classification according to specific histologic diagnosis in 372 women with 432 adnexal masses

                                 Number of tumors classified as:
Histologic diagnosis             GI-RADS 2   GI-RADS 3   GI-RADS 4   GI-RADS 5   Total
Functional cyst*                 71          2           0           0           73
Serous cystadenoma               5           36          7           0           48
Mucinous cystadenoma             0           10          4           3           17
Endometrioma                     6           78          2           0           86
Teratoma                         0           28          4           0           32
Paraovarian cyst                 0           3           0           0           3
Hemorrhagic cyst                 9           2           0           0           11
Cystadenofibroma                 1           5           3           2           11
Peritoneal cyst                  0           3           2           0           5
Fibroma                          0           3           6           3           12
Hydrosalpinx                     0           9           2           1           12
Tubo-ovarian abscess             0           2           0           1           3
Leiomyoma                        0           0           1           1           2
Brenner tumor                    0           2           1           2           5
Low malignant potential tumor    0           0           2           12          14
Primary ovarian cancer           0           1           5           75          81
Metastatic cancer                0           0           1           16          17
Total                            92          184         40          116         432

*Spontaneous resolution at follow-up. Five hemorrhagic cysts and the cystadenofibroma comprised the six GI-RADS 2 cases which underwent surgery following diagnosis due to pain symptoms. Two hydrosalpinges and one serous cystadenofibroma comprised the GI-RADS 4 cases which underwent laparoscopic surgery by a general gynecologist.

Table 5 Agreement analysis between two observers for assigning Gynecologic Imaging Report and Data System (GI-RADS) classification in 60 women with unilateral adnexal masses

                 Examiner A
Examiner B       GI-RADS 2   GI-RADS 3   GI-RADS 4   GI-RADS 5   Total
GI-RADS 2        10          1           0           0           11
GI-RADS 3        1           21          2           0           24
GI-RADS 4        0           3           9           2           14
GI-RADS 5        0           0           0           11          11
Total            11          25          11          13          60

DISCUSSION

Reporting in ultrasound evaluation of adnexal masses is an important issue. A recent study from Canada has shown that current reporting practices for ultrasound assessments in women with ovarian masses vary considerably and concluded that the use of a synoptic reporting system would be useful13. Inappropriate reporting may lead to unwarranted concern by the patient and referring clinician and could lead to unnecessary additional tests and surgery21. In fact, investigation into structured reporting of adnexal masses to allow for improved communication of results and recommendations for management has been advised recently14. For this reason we recently developed a simple reporting system based on the concept developed for breast imaging (the BI-RADS classification), which was originally developed for mammographic findings but has been applied successfully to breast ultrasound. As for BI-RADS, the lexicon of our new system is intended to provide a unified language for ultrasound reporting and to avoid confusion in the communication between the sonographer/sonologist and the clinician. We called this reporting system GI-RADS15.

In the present study we assessed prospectively the use of our GI-RADS reporting system for ultrasound evaluation of adnexal masses and clinical decision-making. A strength of the study is that the ultrasound examiners were not involved in the decision-making process.

The GI-RADS reporting system is based on the use of pattern recognition analysis of the tumor2 and the a-priori risk of malignancy of different tumor features15–17. Although one could argue that pattern recognition is a subjective assessment, there is evidence that this is the best method for characterizing adnexal masses7,8 and that pattern recognition is reproducible among expert examiners22–24.

In terms of diagnostic performance, this reporting system performed well, with a very high sensitivity and acceptable specificity. This is not surprising bearing in mind that it is based on IOTA criteria, which have been tested extensively in several multicenter studies and shown to be good criteria for discriminating between benign and malignant adnexal masses25–27. However, one possible selection bias in our study is the relatively high prevalence of malignant tumors, which could affect estimation of sensitivity and specificity. Notwithstanding, both PPV and NPV were high and these figures are not affected by disease prevalence.

Our data have shown that the GI-RADS classification system is useful for clinical decision-making and referral. Furthermore, all referring clinicians involved in patient management considered it to be useful. We therefore propose a standardized nomenclature for reporting ultrasound findings of adnexal masses, applying the same rationale as that of BI-RADS classification for breast ultrasound. While it is true that adequate referral may be achieved using logistic models such as the risk of malignancy index28,29, scoring systems30 or just pattern recognition analysis as does IOTA31, a standardized reporting nomenclature is lacking. To the best of our knowledge, this is the first such standardized reporting/classification system applicable to adnexal masses.
It is likely that this reporting system would not be needed in those institutions where ultrasound examiners and clinicians participating in clinical decision-making have good and direct communication and decisions about patient management are collegiate, or even in those practices where expert sonologists themselves decide about their own patients' management. However, this system could be useful in those settings in which clinicians managing patients do not perform ultrasound examinations, instead reading the report of the morphological description of the tumor. It could also be useful for small hospitals and for private practitioner-gynecologists who must refer patients with suspicious masses to tertiary care hospitals with gynecologic oncology facilities.

There were some limitations to the study. A possible bias is that expert examiners performed all ultrasound examinations; this is known to potentially affect diagnostic performance when using pattern recognition analysis9,32. Therefore, further research into how this reporting system performs when used by non-expert examiners is needed. Another bias of this study is that a management protocol according to GI-RADS classification was offered to referral clinicians before starting the study. This could have biased their decision as to how to manage the patients. An interesting issue regarding the suggested management protocol is the use of surgery in cases of GI-RADS 3. In fact, expectant management could also be offered safely to these patients33. A further weakness of this study is the fact that most GI-RADS 4 lesions were benign, although they were classified as being probably malignant. However, there was still a 20% risk of malignancy (8/40). One option for improving the predictive value of this group would be further classification into subgroups depending on degree of likelihood of malignancy according to the examiner's impression.

In conclusion, this prospective study has shown that GI-RADS classification performs well as a reporting system in adnexal masses and it seems to be useful for clinical decision-making.
REFERENCES
1. ACOG Practice Bulletin. Management of adnexal masses. American College of Obstetricians and Gynecologists. Obstet Gynecol 2007; 110: 201–214.
2. Valentin L. Pattern recognition of pelvic masses by gray-scale ultrasound imaging: the contribution of Doppler ultrasound. Ultrasound Obstet Gynecol 1999; 14: 338–347.
3. Granberg S, Wikland M, Jansson I. Macroscopic characterization of ovarian tumors and the relation to the histological diagnosis: criteria to be used for ultrasound evaluation. Gynecol Oncol 1989; 35: 139–144.
4. Alcazar JL, Mercé LT, Laparte C, Jurado M, Lopez-García G. A new scoring system to differentiate benign from malignant adnexal masses. Am J Obstet Gynecol 2003; 188: 685–692.
5. Alcazar JL, Errasti T, Laparte C, Jurado M, Lopez-García G. Assessment of a new logistic model in the preoperative evaluation of adnexal masses. J Ultrasound Med 2001; 20: 841–848.
6. Timmerman D, Verrelst H, Bourne TH, De Moor B, Collins WP, Vergote I, Vandewalle J. Artificial neural network models for the preoperative discrimination between malignant and benign adnexal masses. Ultrasound Obstet Gynecol 1999; 13: 17–25.
20. Irwig LM, Lijmer JG, Moher D, Rennie D, de Vet HC. Standards for Reporting of Diagnostic Accuracy. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Clin Radiol 2003; 58: 575–580.
21. Brown DL, Dudiak KM, Laing FC. Adnexal masses: US characterization and reporting. Radiology 2010; 254: 342–354.
22. Timmerman D, Schwarzler P, Collins WP, Claerhout F, Coenen M, Amant F, Vergote I, Bourne TH. Subjective assessment of adnexal masses with the use of ultrasonography: an analysis of interobserver variability and experience. Ultrasound Obstet Gynecol 1999; 13: 11–16.
23. Guerriero S, Alcazar JL, Pascual MA, Ajossa S, Gerada M, Bargellini R, Virgilio B, Melis GB. Intraobserver and interobserver agreement of grayscale typical ultrasonographic patterns for the diagnosis of ovarian cancer. Ultrasound Med Biol 2008; 34: 1711–1716.
24. Guerriero S, Alcazar JL, Pascual MA, Ajossa S, Gerada M, Bargellini R, Virgilio B, Melis GB. Diagnosis of the most frequent benign ovarian cysts: is ultrasonography accurate and reproducible? J Womens Health 2009; 18: 519–527.
25. Valentin L, Hagen B, Tingulstad S, Eik-Nes S. Comparison of 'pattern recognition' and logistic regression models for discrimination between benign and malignant pelvic masses: a prospective cross validation. Ultrasound Obstet Gynecol 2001; 18: 357–365.
26. Valentin L, Ameye L, Jurkovic D, Metzger U, Lécuru F, Van Huffel S, Timmerman D. Which extrauterine pelvic masses are difficult to correctly classify as benign or malignant on the basis of ultrasound findings and is there a way of making a correct diagnosis? Ultrasound Obstet Gynecol 2006; 27: 438–444.
27. Sokalska A, Timmerman D, Testa AC, Van Holsbeke C, Lissoni AA, Leone FP, Jurkovic D, Valentin L. Diagnostic accuracy of transvaginal ultrasound examination for assigning a specific diagnosis to adnexal masses. Ultrasound Obstet Gynecol 2009; 34: 462–470.
28. van den Akker PA, Aalders AL, Snijders MP, Kluivers KB, Samlal RA, Vollebergh JH, Massuger LF. Evaluation of the Risk of Malignancy Index in daily clinical management of adnexal masses. Gynecol Oncol 2010; 116: 384–388.
29. Raza A, Mould T, Wilson M, Burnell M, Bernhardt L. Increasing the effectiveness of referral of ovarian masses from cancer unit to cancer center by using a higher referral value of the risk of malignancy index. Int J Gynecol Cancer 2010; 20: 552–554.
30. Alcazar JL, Royo P, Jurado M, Mínguez JA, García-Manero M, Laparte C, Galvan R, Lopez-García G. Triage for surgical management of ovarian tumors in asymptomatic women: assessment of an ultrasound-based scoring system. Ultrasound Obstet Gynecol 2008; 32: 220–225.
31. Yazbek J, Helmy S, Ben-Nagi J, Holland T, Sawyer E, Jurkovic D. Value of preoperative ultrasound examination in the selection of women with adnexal masses for laparoscopic surgery. Ultrasound Obstet Gynecol 2007; 30: 883–888.
32. Van Holsbeke C, Daemen A, Yazbek J, Holland TK, Bourne T, Mesens T, Lannoo L, De Moor B, De Jonge E, Testa AC, Valentin L, Jurkovic D, Timmerman D. Ultrasound methods to distinguish between malignant and benign adnexal masses in the hands of examiners with different levels of experience. Ultrasound Obstet Gynecol 2009; 34: 454–461.
33. Alcazar JL, Castillo G, Jurado M, García GL. Is expectant management of sonographically benign adnexal cysts an option in selected asymptomatic premenopausal women? Hum Reprod 2005; 20: 3231–3234.