Radiomic machine learning: is it really a useful method for the characterization of prostate cancer?
We read with interest and greatly appreciated the article by Dr Bonekamp and colleagues (1) and the editorial by Dr Choyke (2) in the October 2018 issue of Radiology. Dr Bonekamp and colleagues (1) compared radiomic machine learning (RML) and mean apparent diffusion coefficient (ADC) against qualitative assessment based on the Prostate Imaging Reporting and Data System (PI-RADS) for the characterization of prostate cancer. In their study, mean ADC and RML outperformed qualitative assessment in classifying suspicious prostate lesions as clinically significant prostate cancer.
We believe that these findings, though interesting and promising, should be interpreted with care and must be further evaluated in future studies. Two points deserve consideration. First, Dr Bonekamp and colleagues state that the PI-RADS assessment was the result of clinical consensus among a panel of eight radiologists; however, they do not provide details on how that consensus was reached (eg, majority voting, weighted voting, or another method). Second, it is surprising that for all three measures (PI-RADS, mean ADC, and RML) the sensitivity was higher in the test set than in the training set. This is not what one would reasonably expect and might result from improper study design and/or data analysis. The authors conjecture that this outcome could be a consequence of the “radiologists’ learning curve since the introduction of the PI-RADS version 2 system” and of “a larger number of patients with small solitary lesions in the training cohort.” Both explanations, however, would themselves seem to introduce major sources of bias into the whole study.