The Foresight AI model uses data from hospital and GP records across the UK
Hannah McKay/Reuters/Bloomberg via Getty Images
An AI model trained on the medical records of 57 million people who have used the UK’s National Health Service (NHS) could one day help doctors predict disease and forecast hospitalization rates, its developers claim. But other researchers say significant privacy and data protection concerns remain around using health data at this scale, and even the model’s developers acknowledge they cannot guarantee that sensitive patient information will not be inadvertently revealed.
The model, known as “Foresight,” was first developed in 2023. Its initial version used OpenAI’s GPT-3, the large language model (LLM) behind the original ChatGPT, and was trained on 1.5 million real patient records from two London hospitals.
Now, Chris Tomlinson at University College London and his team have broadened their ambitions, claiming to have developed the world’s first “national generative AI model for health data”, covering a far larger and more diverse population.
Foresight is based on Meta’s open-source LLM, Llama 2, and draws on eight datasets of medical information routinely collected by the NHS between November 2018 and December 2023, including outpatient appointments, hospital visits, vaccination records and other data.
Tomlinson says his team has not yet released any performance metrics for Foresight, as the model is still being evaluated. But he believes it could eventually support a range of applications, from personalized diagnoses to forecasting broader health trends such as hospital admissions and heart conditions. “The true promise of Foresight lies in its capacity to facilitate timely interventions and predict complications, paving the way for large-scale preventive healthcare,” he said at a press conference on May 6.
While these potential benefits remain unproven, the ethics of using medical records to train AI at this scale continue to raise concerns. The researchers say all medical records were de-identified before being used for training, but the risk of re-identifying individuals from patterns in the data is well established, particularly in large datasets.
“Creating a robust generative AI model that respects patient privacy presents ongoing scientific challenges,” stated Luc Rocher at Oxford University. “The immense detail of data advantageous for AI complicates the anonymization process. Such models must operate under stringent NHS governance to ensure secure usage.”
“The data fed into the model is de-identified, with direct identifiers removed,” said Michael Chapman, who oversees the data behind Foresight at NHS Digital. However, he acknowledged that the risk of re-identification can never be entirely eliminated.
To reduce that risk, Chapman said, the AI operates within a purpose-built “secure” NHS data environment, so that the information remains protected and is accessible only to approved researchers. Amazon Web Services and Databricks provide the “computational infrastructure”, but they do not have access to the data itself, according to Tomlinson.
One way to assess the risk of exposing sensitive information, says Yves-Alexandre de Montjoye at Imperial College London, is to test whether a model can memorize, and later reproduce, the data it was trained on. When asked by New Scientist whether Foresight has been tested in this way, Tomlinson said it has not, but that the team is considering such assessments in future.
Using such an extensive dataset without consulting the public about how their data is used risks eroding trust, cautions Caroline Green at Oxford University. “Even anonymized data raises ethical concerns, because people generally want to control their data and know where it is going.”
Nevertheless, existing rules give individuals little scope to opt out of their data being used by Foresight. All the information in the model comes from NHS datasets collected at a national scale and has been de-identified. An NHS England spokesperson said that existing opt-out mechanisms do not apply, although data from people who have opted out of sharing their family doctor records will not feed into the model.
Under the General Data Protection Regulation (GDPR), people have the right to withdraw consent for the use of their personal data. But the way LLMs such as Foresight are trained makes it effectively impossible to remove a single person’s record from a trained AI tool. An NHS England spokesperson said: “The GDPR does not apply because the data used to train the model is anonymized, and so it is not personal data.”
While how GDPR applies to the training of LLMs raises novel legal questions, the UK Information Commissioner’s Office cautions that “de-identified” data should not be treated as equivalent to anonymous data. “This is because the term has no clear definition in UK data protection law, and its use can lead to confusion,” the office says.
The legal picture is further complicated by the fact that Foresight is currently being used only for Covid-19-related research, Tomlinson explains. That means exemptions to data protection laws introduced during the pandemic still apply, points out Sam Smith of Medconfidential, a UK data privacy advocacy group. “This Covid-specific AI likely contains patient data, and that information cannot be extracted from the research environment,” he says. “Patients should retain control over how their data is used.”
Ultimately, the competing rights and responsibilities around using medical data for AI development of this kind remain unresolved. “In AI development, ethical considerations are too often an afterthought rather than the starting point,” says Green. “Human ethics must come first, with the technology built on that foundation.”
This article was updated on May 7, 2025, to correct the comments made by the NHS England spokesperson.
Source: www.newscientist.com