If the hype is anything to go by, we’re entering a bright future where “Artificial Intelligence” will revolutionize medicine. This post provides some counterpoints to that hype, focusing on three issues:

  1. patient privacy
  2. data sovereignty
  3. professional knowledge and its exploitation

Image: "creepy humanoid AI doctor looking at medical files", generated with Stable Diffusion v2.0.

Some preliminary notes

  • This post avoids the terms “AI” and “Artificial Intelligence”, as they are inaccurate descriptions of current machine learning models.
  • In my opinion, the field of machine learning should reserve the term “Intelligence” for systems that display a capacity to grasp concepts and exhibit some degree of agency. These characteristics are the cornerstones of “strong AI”, which no system to date has demonstrated.
  • Therefore, the term “machine learning” better describes the models currently in use.

The current state-of-play

Medical data is valuable, both intrinsically and as a commodity. Many technology and medical companies are interested in collecting, collating and using this data to train Large Language Models (LLMs). The aim of this collection is to create domain-specific LLMs and then sell them back to the medical industry. It is important to understand what the current models are capable of.

Firstly, the underlying networks are deterministic (the same inputs produce the same output distribution), although the sampling step used during text generation typically introduces some randomness. Most models relevant to medical uses are based on deep neural networks and are trained with self-supervised learning. Essentially, the model ingests sequences of words and builds up a statistical model (via the weights in the layers of a neural network) of how likely a word (or, strictly, a chunk of characters called a “token”) is to follow the preceding sequence. Due to the very large scale of these models (usually billions of parameters, trained on trillions of words), they can produce output which closely resembles human writing. The interface with the model is usually a “prompt”, which the model then continues. These very large models can also appear to answer questions when prompted (usually with a hidden prompt placed before the user’s prompt) or produce other types of output (summary medical records, for instance). For a good description of how LLMs work, see this Ars Technica article.
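To make the “likelihood of the next token” idea concrete, here is a minimal, purely illustrative Python sketch: it builds a toy count-based bigram model from a few sentences and uses it to continue a prompt. Real LLMs learn these statistics in the weights of a deep neural network over subword tokens, so this is an analogy for the mechanism rather than a description of any production model.

```python
# Toy illustration of next-token statistics (an analogy only: real LLMs use
# deep neural networks over subword tokens, not a word-level count table).
import random
from collections import Counter, defaultdict

corpus = (
    "the patient reports chest pain . "
    "the patient reports shortness of breath . "
    "the patient denies chest pain ."
).split()

# Count how often each word follows each preceding word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def continue_prompt(prompt: str, n_words: int = 5) -> str:
    """Continue a prompt by repeatedly sampling a likely next word."""
    words = prompt.split()
    for _ in range(n_words):
        counts = following.get(words[-1])
        if not counts:
            break
        candidates, weights = zip(*counts.items())
        words.append(random.choices(candidates, weights=weights)[0])
    return " ".join(words)

print(continue_prompt("the patient"))
# Output is sampled, e.g. "the patient reports chest pain ."
```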

The main innovation over the last two years has been an increase in the accuracy and flexibility of these models. This is due to the immense size of the training sets and the availability of sufficient computing power to train the models and to generate responses to prompts (a process called “inference”). Most of the available models are proprietary and closed source, with some notable open-source alternatives (see https://en.wikipedia.org/wiki/Large_language_model and https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).

Some of these models can be run on consumer hardware (for inference), usually Linux systems with sufficient GPU memory, but most of the higher-performing models require significant resources to run. For instance, the author has experimented with the LLaMA 30B-parameter model on an NVIDIA 3090 (24GB VRAM); this model produces results similar in quality to GPT-3. Training models requires vastly greater computing power, typically thousands of servers with multiple GPUs each. The trained models are therefore highly valuable, often costing tens of millions of dollars to train from scratch.
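As a rough illustration of how a model of that size can fit on a single 24GB GPU, the sketch below loads quantised weights using the Hugging Face transformers library with bitsandbytes 4-bit quantisation. The model identifier is a placeholder and the setup is an assumption for illustration, not the exact configuration the author used.

```python
# Hedged sketch: loading a large open model for local inference with 4-bit
# quantisation so the weights fit in roughly 24GB of VRAM. Requires the
# transformers, accelerate and bitsandbytes packages; "some-org/llama-30b"
# is a placeholder, not a real repository name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "some-org/llama-30b"  # placeholder model identifier
quant = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant,
    device_map="auto",  # place layers on the available GPU(s)
)

prompt = "Summarise the following discharge note:\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```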

Interface of LLMs with the health sector

There are a number of companies using machine learning to augment medical records. Their offerings currently fall into three main types of service:

  1. summaries from electronic medical records
  2. speech-to-text and subsequent summary of doctor/patient interactions (see the illustrative sketch after this list)
  3. machine learning augmented search of health service records
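
As a purely illustrative sketch of the second type of service, the snippet below transcribes a recorded consultation with the open-source Whisper model and condenses the transcript with an off-the-shelf summarisation model. This is not how the commercial products discussed below work; the file name and model choices are assumptions for the example.

```python
# Purely illustrative: speech-to-text plus summary using open-source models
# (openai-whisper and Hugging Face transformers). Not a description of any
# commercial product; "consultation.wav" is a hypothetical recording.
import whisper
from transformers import pipeline

# Speech-to-text: transcribe the recorded consultation.
stt_model = whisper.load_model("base")
transcript = stt_model.transcribe("consultation.wav")["text"]

# Summarisation: condense the transcript into a short note. A real transcript
# would need to be split into chunks that fit the model's input limit.
summariser = pipeline("summarization", model="facebook/bart-large-cnn")
summary = summariser(transcript, max_length=120, min_length=30)[0]["summary_text"]

print(summary)
```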

Administrators in health services are often lured by the promise that these models can improve the efficiency of their services by reducing the time taken to collect data, write notes and analyse trends. Some companies claim the use of machine learning in patient journeys can also reduce adverse events and shorten admissions.

These stated benefits have led to a number of trials and rollouts of machine learning assisted services in Australia and New Zealand, with little to no consideration of patient privacy or safety issues.

Examples include:

  • Mackay Base Hospital trial of Dragon Medical One (Nuance, owned by Microsoft) - sources: PDF and promotional material at nuance.com
  • The Orchestral platform from Orion Health, used in Clinical Workstation software
    • used by services including Cancer Institute NSW, ACT Health, the Tasmanian Department of Health, Northeast Health Wangaratta in Victoria and, in NSW, the Justice Health & Forensic Mental Health Network (for its DCR “JHeHS”) and eHealth NSW’s HealtheNet system
    • machine learning focused on generating patient summaries, using Pieces Technologies (based in the US)

A history of privacy breaches

There has been a history of inappropriate use of patient data. In a particularly illustrative case study, the Royal Free London NHS Foundation Trust passed 1.6 million health records to the Google subsidiary DeepMind (DeepMind faces legal action over NHS data use, BBC News, Oct 2021). Subsequent inquiries found that patients had not given consent to their records being shared with Google.

Another example was the leak of personally identifiable data by Cense AI: 2.5 million records, apparently sourced from insurance companies, were left unsecured online, exposing personal information including medical records and notes.

In both these cases, it is clear that patients never consented to their medical records being ingested and processed by US-based companies. This raises questions about how consent is sought, as well as the importance of retaining medical records within their home jurisdiction (data sovereignty).

A critical factor in these breaches was health services pursuing agreements with foreign companies without sufficient oversight, de-identification or informed consent from participants. This pattern of privacy breaches is often masked by companies arguing that the technologies are a data service rather than a medical service, a position that is increasingly untenable when machine learning is being used to process medical notes.
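
To illustrate why “de-identification” deserves scrutiny, here is a deliberately naive, rule-based redaction sketch (my own illustration, not any vendor’s method). Simple pattern matching of this kind misses names in free text, rare diagnoses and contextual details, which is why notes labelled “de-identified” can often still be re-identified.

```python
# Deliberately naive rule-based de-identification (author's illustration only).
# Real de-identification needs far more than regular expressions: names inside
# free text, rare conditions and contextual details can all re-identify a patient.
import re

PATTERNS = {
    "MRN": re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b(?:\+?61|0)\d{9}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact(note: str) -> str:
    """Replace obviously structured identifiers with placeholder tags."""
    for label, pattern in PATTERNS.items():
        note = pattern.sub(f"[{label}]", note)
    return note

note = "MRN: 123456. Seen 03/07/2023. Contact 0412345678, jane@example.com."
print(redact(note))
# "[MRN]. Seen [DATE]. Contact [PHONE], [EMAIL]."
```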

The future of medical LLMs: providing domain-specific knowledge

While health services are usually motivated to provide medical records to machine learning companies by the promise of process improvement, the motivation of the machine learning providers is clearly broader. The most valuable resource for training LLMs is the input data set; in this case, the data are medical notes and records of medical interactions. For an LLM to produce useful medical summaries and diagnostic information, millions of records need to be ingested, and it is important that these records are consistent, accurate and in a standardised format. The corpus of patient records held by health services therefore presents a rich mine of training data.
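
To illustrate why such a corpus is attractive to model builders, here is a hypothetical sketch of how de-identified records might be flattened into prompt/completion pairs for fine-tuning. The record structure and field names are assumptions, not any vendor’s actual pipeline.

```python
# Hypothetical sketch: turning structured (de-identified) records into
# prompt/completion pairs of the kind used to fine-tune an LLM.
# Field names and format are assumptions, not any vendor's real pipeline.
import json

records = [
    {
        "presenting_complaint": "chest pain radiating to the left arm",
        "history": "hypertension, ex-smoker",
        "discharge_summary": "Admitted for ACS workup; troponins negative; "
                             "discharged with outpatient stress test.",
    },
]

def to_training_example(record: dict) -> dict:
    """Flatten one record into a prompt/completion pair."""
    prompt = (
        f"Presenting complaint: {record['presenting_complaint']}\n"
        f"History: {record['history']}\n"
        "Write a discharge summary:"
    )
    return {"prompt": prompt, "completion": record["discharge_summary"]}

with open("training_data.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(to_training_example(record)) + "\n")
```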

Agreements between health services and machine learning providers usually fail to consider the value of these records, or the fact that they are fundamentally the property of patients, held in trust by the health service. Patients should therefore have a say in whether their data is used to build proprietary models for the profit of private companies, often with dubious privacy standards.

What can clinicians do?

Given the current state of agreements between health providers and machine learning companies, I believe it is important that patients are given the opportunity to opt in to providing their medical records for machine learning purposes. Clinicians should be aware of the potential for abuse of aggregated data and avoid the use of any technologies which record, transcribe and aggregate patient data, especially if such services move data across jurisdictional boundaries.

Trials of machine learning technologies in health care should always be accompanied by a full human ethics review, and health services need to be aware of the data ownership, use and re-use provisions for the data they hold in trust.

It is certainly possible that machine learning will contribute positive benefits to patient care, but I would argue that those benefits should be shared with patients: the models and data sets should be open source and managed in the public interest, with the ability to audit the data sets for bias and accuracy. Currently the models are treated as proprietary data, in stark contrast to the expectations which patients rightly have regarding their medical records.

References