IBM Watson recommended ‘unsafe’ cancer treatments, according to internal documents

The Watson AI isn’t ready to replace physicians just yet.


Christina Bonnington


IBM’s Watson AI is a spectacular feat of modern technology. The AI is taking over a number of human jobs in areas such as insurance and law where a computer can pore through documents much faster than a human employee can. A version of Watson was also trained to be an interactive resource and companion on the International Space Station.

Watson has also been trained in medical fields such as oncology, but a recent report suggests that Watson recommended unsafe cancer treatments, raising questions about the readiness and accuracy of the AI.

The documents, slide decks shared by IBM Watson Health’s deputy chief health officer last summer and obtained by Stat News, say that Watson gave “unsafe and incorrect treatment recommendations” for cancer. The criticism concerns IBM’s Watson for Oncology product specifically, which customers said was “often inaccurate.”

According to the internal slide deck, the problem stems from this Watson product being trained on a small number of “synthetic,” hypothetical cancer patients rather than real-world cases. An AI system’s accuracy generally depends on the size and quality of its training data: the larger and more accurate the dataset, the better. Watson’s recommendations were based on expert advice from specialists on each cancer type, not on masses of actual cancer treatment cases. The result has been recommendations that are not on par with national treatment guidelines, according to Stat News.

Watson for Oncology is designed to save physicians the time they would otherwise spend poring through literature and to “provide clinicians with evidence-based treatment options based on expert training by Memorial Sloan Kettering physicians.” It’s used by 230 hospitals across the globe to help treat 13 types of cancers. In a study published in January 2018, Watson for Oncology was found to be in accordance with one hospital’s multidisciplinary tumor board in more than 90 percent of breast cancer treatment decisions.

Memorial Sloan Kettering, which began working with IBM to train Watson in this area in 2012, told Stat News that it believes the “unsafe” cancer recommendation in the presentation was part of IBM’s system testing and was not given to an actual patient. Memorial Sloan Kettering also believes that training IBM’s system on synthetic cases is better than using historical cases. “The speed at which standards of care have changed require a more dynamic approach than historical data can provide because historical cases do not necessarily reflect the newest standards of care,” Memorial Sloan Kettering said in a statement. (However, both the slide decks and IBM’s website say that Watson for Oncology is being trained with real patient data.)

IBM is continuing to hone the product, making it more robust and more accurate, including embarking on a pilot program with Cota Healthcare. This should give Watson greater access to real-world outcomes in patient cancer treatments and allow the system to identify and rank various treatment options, which should be a useful tool for clinicians—as long as it’s accurate.

Stat News’ investigation into Watson for Oncology highlights the difficulties we face as we try to offload human knowledge and research into an AI.

Update 3:03pm, June 26: A spokesperson for IBM provided the following statement:

“We have learned and improved Watson Health based on continuous feedback from clients, new scientific evidence and new cancers and treatment alternatives. This includes 11 software releases for even better functionality during the past year, including national guidelines for cancers ranging from colon to liver cancer. We remain absolutely committed to Watson Health to give providers and professionals the technology and expertise to help transform health for people everywhere.”


H/T Stat News

The Daily Dot