Breast cancer is the most common cancer in women worldwide. Early detection and treatment can lower mortality rates, but clinicians still miss breast cancer about 20 percent of the time (false-negative results). Clinicians also identify cancer when none is present (false-positive results). Studies suggest that 7 to 12 percent of women will receive a false-positive result after one mammogram, and that after 10 years of annual screening more than half of women will receive at least one false-positive recall.
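To see how a 7 to 12 percent per-mammogram rate can add up to "more than half" over a decade, here is a rough back-of-the-envelope sketch. It assumes each annual screen is an independent event with the same false-positive rate, which is a simplification (a woman's results from year to year are likely correlated), and the function name is made up for illustration.

```python
def prob_at_least_one_false_positive(per_screen_rate: float, num_screens: int) -> float:
    """Chance of one or more false positives across num_screens screens.

    Assumption: every screen is independent and has the same rate.
    """
    if not 0.0 <= per_screen_rate <= 1.0:
        raise ValueError("per_screen_rate must be between 0 and 1")
    if num_screens < 0:
        raise ValueError("num_screens must be non-negative")
    # Probability that every single screen avoids a false positive.
    prob_all_clear = (1.0 - per_screen_rate) ** num_screens
    return 1.0 - prob_all_clear


# The two ends of the 7 to 12 percent range, over 10 annual screens.
for rate in (0.07, 0.12):
    cumulative = prob_at_least_one_false_positive(rate, 10)
    print(f"per-screen rate {rate:.0%}: cumulative risk {cumulative:.1%}")
```

Under this independence assumption, even the low end of the range comes out just above 50 percent after 10 screens, which is consistent with the "more than half" figure.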

False-negative results provide a false sense of security and could ultimately hinder treatment effectiveness. False-positive results can cause anxiety and lead to unnecessary tests and procedures. Another hurdle in identifying breast cancer is a shortage of radiologists needed to read mammograms.

Researchers have developed an AI system that surpasses human experts in breast cancer identification. Their study results were recently published in the journal Nature.

“We show an absolute reduction of 5.7% and 1.2% (USA and UK) in false positives and 9.4% and 2.7% in false negatives. We provide evidence of the ability of the system to generalize from the UK to the USA. In an independent study of six radiologists, the AI system outperformed all of the human readers…We ran a simulation in which the AI system participated in the double-reading process that is used in the UK, and found that the AI system maintained non-inferior performance and reduced the workload of the second reader by 88%.”

The study results are promising. The AI system outperformed all six radiologists in the independent reader study and, on the U.S. sample, lowered missed cancer diagnoses by about 9 percentage points and mistaken readings of breast cancer by about 6 percentage points. It also generalized across populations, from the UK to the USA, something many AI systems have yet to demonstrate. The researchers didn’t go so far as to suggest their AI system would replace humans.

“The optimal use of the AI system within clinical workflows remains to be determined. The specificity advantage exhibited by the system suggests that it could help to reduce recall rates and unnecessary biopsies. The improvement in sensitivity exhibited in the US data shows that the AI system may be capable of detecting cancers earlier than the standard of care. An analysis of the localization performance of the AI system suggests it holds early promise for flagging suspicious regions for review by experts.

“Beyond improving reader performance, the technology described here may have a number of other clinical applications. Through simulation, we suggest how the system could obviate the need for double reading in 88% of UK screening cases, while maintaining a similar level of accuracy to the standard protocol. We also explore how high-confidence operating points can be used to triage high-risk cases and dismiss low-risk cases. These analyses highlight the potential of this technology to deliver screening results in a sustainable manner despite workforce shortages in countries such as the UK.”
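For readers wondering how a simulation like this could produce a workload figure, here is a minimal sketch of one plausible setup. It assumes, and this is an assumption rather than something spelled out in the passage above, that the second human read is skipped whenever the AI agrees with the first reader. The function name and the decisions below are invented purely for illustration and are not the study's data.

```python
from typing import Sequence


def second_reader_workload_saved(
    first_reader: Sequence[bool], ai_system: Sequence[bool]
) -> float:
    """Fraction of cases in which the second human read could be skipped.

    Assumption: a case skips the second read when the AI's recall
    decision matches the first reader's decision.
    """
    if len(first_reader) != len(ai_system):
        raise ValueError("both readers must have a decision for every case")
    if not first_reader:
        raise ValueError("at least one case is required")
    agreements = sum(1 for human, ai in zip(first_reader, ai_system) if human == ai)
    return agreements / len(first_reader)


# Made-up recall decisions for ten cases (True = recall the patient).
first_reader = [False, False, True, False, False, False, True, False, False, False]
ai_system = [False, False, True, False, True, False, False, False, False, False]

saved = second_reader_workload_saved(first_reader, ai_system)
print(f"second reads avoided in this toy example: {saved:.0%}")
```

In this toy example the two readers disagree on two of ten cases, so a second human read would still be needed for those two. A real evaluation would also have to check that accuracy holds up, which is the "non-inferior performance" part of the researchers' claim.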

At the same time, it becomes more difficult to make the case for approaches that are exclusively human. It is hard to imagine that patients, insurance companies, and others won’t demand that AI systems augment what humans are doing. This is especially true in healthcare, but it will likely become increasingly true in other domains as well. What tasks would you want humans to do alone if you knew that you could get better results (greater accuracy, greater speed, etc.) when human capability is augmented with AI systems?

Humans will need to learn how to incorporate these types of AI systems into their workflows. The next big step for AI seems to be “operationalizing AI.” This is likely a decade in the works, but slowly you will see individuals figuring out how best to work within environments that are being redefined by AI systems.