Monitoring AI Systems under the AI Act within Medical Devices

First, a brief note on the AI Act itself, which has now passed its final approval vote in the European Parliament. A new version of the text has officially been made publicly available. The remaining steps are a final legal review with final adjustments (e.g. restructuring the articles and checking the correctness of references), translation, and publication in the Official Journal. For comparison: the MDR was voted through positively by the European Parliament on 5 April 2017 and was officially published by the end of May 2017. On a similar timeline, and depending on the speed of these final steps, we may see the AI Act coming into force at the end of April or in May.

Artificial Intelligence is subject to bias, bias that is introduced when the algorithms are trained and developed. It may go undetected for a variety of reasons, for example a lack of data, a lack of understanding, or inappropriate testing. Within the medical field there are various examples of such bias. In diabetic retinopathy, a severe imbalance in the data used to train a model has been shown to produce a strong gap in diagnostic accuracy between light-skinned and dark-skinned subjects. Similarly, models for pathology classification in chest radiography show a higher rate of underdiagnosis in underserved sub-populations, e.g. Black patients. Clearly, such biases may send patients home without receiving the care they need (Ricci Lara et al., 2022). Ideally, these biases are addressed by having diverse, heterogeneous datasets available, with sufficient representation of each sub-group, and by careful testing of sub-groups. Nevertheless, biases may persist, and this raises the need for Post-Market Monitoring for bias, to detect it at an early stage and determine mitigation strategies (e.g. retraining).
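To make "careful testing of sub-groups" more tangible, the minimal sketch below (in Python) computes sensitivity (the true-positive rate) per sub-group from labelled cases. The group names, labels and figures are entirely hypothetical, and a real implementation would follow the provider's own validated metrics and data governance process.

```python
from collections import defaultdict

def subgroup_sensitivity(records):
    """Compute sensitivity (true-positive rate) per sub-group.

    `records` is an iterable of (subgroup, y_true, y_pred) tuples with
    binary labels, e.g. 1 = disease present, 0 = disease absent.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0})
    for group, y_true, y_pred in records:
        if y_true == 1:
            if y_pred == 1:
                counts[group]["tp"] += 1
            else:
                counts[group]["fn"] += 1
    metrics = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        metrics[group] = c["tp"] / positives if positives else None
    return metrics

# Hypothetical example: a gap in sensitivity between sub-groups may signal
# underdiagnosis of one group and warrants further investigation.
results = [
    ("light_skin", 1, 1), ("light_skin", 1, 1), ("light_skin", 1, 0),
    ("dark_skin", 1, 1), ("dark_skin", 1, 0), ("dark_skin", 1, 0),
]
print(subgroup_sensitivity(results))  # approx. {'light_skin': 0.67, 'dark_skin': 0.33}
```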

Recently, we have seen Google take its Gemini image creator offline for not meeting its intended purpose: it appeared to be plagued by a bias most likely introduced while trying to correct for another bias, namely the lack of diverse data. Reviewing the news around the Gemini image creator, it seems that Google attempted to address bias by hard-coding requirements into the system to ensure that the created images were diverse. Consequently, the image creator started generating "diverse" images, such as historical depictions of George Washington visualised as an African American (refer to the image in the referenced article from Mypost.com). Clearly, this is an incorrect representation of history. While the intention may be understandable, the outcome is worrisome, as demonstrated in that article. The feedback was (clearly) received at Google and the system was taken offline. Note that this AI would not necessarily be considered a High-Risk AI system, nor be subject to the same requirements that would apply to e.g. medical devices.

Monitoring of High-Risk AI Systems

High-Risk AI Systems, i.e. those that could significantly (adversely) affect health, safety and/or fundamental rights, need to be controlled prior to entering the market. As explained in earlier blog posts, such controls include the execution of Risk Management, Quality Management and Technical Documentation.

In addition to these premarket controls, the AI Act lays down requirements for providers, and to some extent deployers, of High-Risk AI systems during the post-market stages.

What needs monitoring?

Title VIII of the AI Act is dedicated to Post-Market Monitoring, information sharing and market surveillance. Chapter 1 of Title VIII addresses Post-Market Monitoring, Chapter 2 Sharing of information on serious incidents, and Chapter 3 Enforcement. 

So far, Chapter 1 regarding Post-Market Monitoring is oddly lean in comparison to the other sections of the AI Act, and as a reader you are left with many questions regarding the expectations. Perhaps the biggest question of all relates to Article 61.3, where the contents of the Post-Market Monitoring plan are 'explained'. That explanation may also answer why there is so little further detail: it 'outsources' the determination of the contents of the Post-Market Monitoring plan to the European Commission, to be published no later than 6 months prior to the 'entry into application' of the AI Act. Article 17.1(h), which covers the Post-Market Monitoring requirements within the Quality Management System, provides little further information.

In continuation of the above, this period is rather late, since for most High-Risk AI systems (those in Annex III) the transition period from 'entry into force' to 'entry into application' is only 2 years. This leaves organizations (specifically those covered by Annex III, point 1) that, in the absence of harmonized standards, may be required to obtain CE Marking with the involvement of a Notified Body in a potentially difficult situation, since it is unclear what should be covered by the time they submit their application.

For devices (such as medical devices and in-vitro diagnostic devices) covered by Annex II, Section A, the Post-Market Monitoring plan may be integrated into the existing plans required by the respective legislation, provided an equivalent level of protection is ensured.

AI System performance

Article 61.2 does provide some more detail regarding the monitoring activities. One of them is the need to collect information allowing the provider to assess the performance of the high-risk AI system, and its compliance with the requirements set out in Title III, Chapter 2, throughout its lifetime. Another is the need to monitor the interaction of the AI system with other AI systems (where relevant), for example when multiple AI systems are used within the same care pathway of a patient.

Information may be collected by the deployer on behalf of the provider. 
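As an illustration of what assessing performance "throughout its lifetime" could look like in practice, the minimal sketch below groups hypothetical post-market cases by month and computes the agreement between the AI output and the later-confirmed outcome. The record structure, the dates and the agreement metric are assumptions chosen for illustration; the AI Act does not prescribe any particular metric.

```python
from datetime import date

# Hypothetical post-market records: (date of use, AI output, confirmed outcome)
records = [
    (date(2024, 5, 1), 1, 1),
    (date(2024, 5, 2), 0, 0),
    (date(2024, 5, 3), 1, 0),
    (date(2024, 6, 1), 1, 1),
    (date(2024, 6, 2), 0, 1),
]

def monthly_agreement(records):
    """Group records by month and compute the agreement between the AI output
    and the later-confirmed outcome, as one possible lifetime performance metric."""
    buckets = {}
    for day, ai_output, confirmed in records:
        key = (day.year, day.month)
        hits, total = buckets.get(key, (0, 0))
        buckets[key] = (hits + (ai_output == confirmed), total + 1)
    return {key: hits / total for key, (hits, total) in buckets.items()}

print(monthly_agreement(records))  # approx. {(2024, 5): 0.67, (2024, 6): 0.5}
```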

So, that’s it? Wait for the European Commission?

Clearly, waiting for the European Commission is not the answer on your road to evidencing compliance with the AI Act. Several requirements relevant to Post-Market Monitoring of high-risk AI systems are scattered throughout the AI Act.

1. Article 9 Risk Management (Article 9(c) & 9.8)

Risk management is not limited to the pre-market stages of any product. Once a product (AI or non-AI) is placed on the market, risks previously unknown to the organisation may arise. These risks may relate to security, safety and/or fundamental rights. Organisations should ensure a Post-Market Monitoring system that systematically evaluates information from the market, both reactively (e.g. systematically documenting feedback such as complaints and analysing trends) and pro-actively (e.g. obtaining feedback from deployers and end-users in the field), to identify risks and evaluate the need for further risk mitigation.
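As a minimal sketch of the reactive side of such monitoring, the example below computes a complaint rate per 1,000 uses per month and flags periods exceeding a pre-defined action limit. The figures and the limit are placeholders that would, in practice, be derived from the organisation's own risk management plan.

```python
# Hypothetical reactive trend check: complaints per 1,000 uses, per month,
# flagged when they exceed a pre-defined action limit from the risk management plan.
complaints_per_month = {"2024-04": 3, "2024-05": 4, "2024-06": 11}
uses_per_month = {"2024-04": 2000, "2024-05": 2100, "2024-06": 2050}
ACTION_LIMIT = 3.0  # complaints per 1,000 uses; placeholder value

for month, complaints in complaints_per_month.items():
    rate = 1000 * complaints / uses_per_month[month]
    status = "INVESTIGATE" if rate > ACTION_LIMIT else "ok"
    print(f"{month}: {rate:.1f} complaints per 1,000 uses -> {status}")
```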

Particular attention should be paid to the actual risks for groups that may be adversely affected, such as persons under the age of 18 or other vulnerable groups of people (as applicable).

2. Bias detection (under Article 10.5)

Although not explicitly framed as Post-Market Monitoring in the AI Act, section 5 of Article 10 notes that providers may process special categories of data for the purposes of detecting and correcting bias. Providers of AI systems may not be capable of detecting all forms of bias during the pre-market stages due to a lack of data. Consequently, bias detection may need to continue during the post-market monitoring period (e.g. also to address data drift risks).

The AI Act will allow organisations to collect such data without the explicit consent of data subjects when the collection is done solely to detect and correct potential biases, under the condition that the requirements of Article 10.5 are met.
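Because data drift is one of the reasons bias detection may need to continue after market entry, the sketch below shows one commonly used drift indicator, the Population Stability Index (PSI), computed over binned distributions of an input feature. The bins, the values and the often-cited investigation threshold of roughly 0.2 are illustrative assumptions, not requirements of the AI Act.

```python
import math

def population_stability_index(expected, observed, eps=1e-6):
    """Population Stability Index between two binned distributions
    (lists of proportions that each sum to 1). Higher values indicate
    stronger drift; roughly 0.2 is a commonly cited investigation threshold."""
    psi = 0.0
    for e, o in zip(expected, observed):
        e, o = max(e, eps), max(o, eps)
        psi += (o - e) * math.log(o / e)
    return psi

# Hypothetical example: distribution of an input feature (e.g. patient age bins)
# at release time vs. in post-market data collected from deployers.
reference = [0.10, 0.25, 0.30, 0.25, 0.10]
current = [0.05, 0.15, 0.30, 0.30, 0.20]
print(f"PSI = {population_stability_index(reference, current):.3f}")
```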

3. Record-Keeping (Article 12)

Article 12 demands that providers equip the AI system with logging capabilities, so that it can automatically record events (a minimal logging sketch follows the list below):

  • events that identify situations where the AI system presents a risk within the meaning of Article 65(1) (i.e. risks to health and safety) or a substantial modification of the AI system;

  • events that facilitate the Post-Market Monitoring;

  • events that support monitoring of the operation of the AI system in line with Article 29(4) (i.e. risks to health and safety);

  • the period of each use of the system (start and end date and time of each use);

  • the reference database against which input data has been checked by the AI system;

  • the input data for which the search led to a match;

  • the identification of the persons involved in the verification of the results.
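As an illustration only, the sketch below appends structured, timestamped event records to an append-only log file. The field names mirror the items in the list above (period of use, reference database, matched input, verifying person), but the log format, file name and values are hypothetical and not prescribed by Article 12.

```python
import json
from datetime import datetime, timezone

def log_event(log_file, event_type, **details):
    """Append a structured, timestamped event record to a log file.
    Field names are illustrative, not prescribed by the AI Act."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        **details,
    }
    with open(log_file, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")

# Example: recording one use of the system, the reference database consulted,
# the input that produced a match, and who verified the result.
log_event(
    "ai_system_events.jsonl",
    "use_of_system",
    use_start="2024-05-01T09:00:00Z",
    use_end="2024-05-01T09:04:30Z",
    reference_database="ref-db-v3",
    matched_input_id="case-12345",
    verified_by="clinician-042",
)
```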

4. Human Oversight (Article 14)

Although not explicitly called out in Article 14, human oversight measures need to be determined, likely explained in the Post-Market Monitoring plan, and communicated to the user through the Instructions for Use. Clearly, the Post-Market Monitoring plan should specify what level of human oversight is required, and which information is to be collected, either automatically by the system (through logging) and reviewed by a human, or collected and reviewed directly by the human user of the AI system.

Perhaps we can expect that the risks determined pre-market that may require human oversight are to be called out in the Post-Market Monitoring plan, including the relevant risk controls.

As an example, a tool that risks discriminating against men, due to underrepresentation in the training dataset and a potential inherent bias in the testing dataset, may require a human to monitor the AI system's performance as specified in the Post-Market Monitoring plan (e.g. specific collection and analysis of post-market data on male subjects).

In addition, per Article 14.4(d) & (e), it should be specified what is monitored to detect the AI system derailing, and what human actions need to be undertaken if it does.
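Continuing the example above, the sketch below shows how a monitored metric could be tied to a pre-defined human action in the spirit of Article 14.4(d) & (e). The metric, the acceptance threshold and the escalation step are assumptions that would, in a real system, come from the provider's own risk analysis and Post-Market Monitoring plan.

```python
# Hypothetical check tying a monitored metric to a pre-defined human action.
MIN_SENSITIVITY = 0.85  # placeholder acceptance threshold from the risk analysis

def oversight_check(subgroup, sensitivity):
    """Flag sub-groups whose monitored sensitivity falls below the threshold
    and describe the (illustrative) escalation step for the human overseer."""
    if sensitivity < MIN_SENSITIVITY:
        return (f"ALERT: sensitivity for '{subgroup}' is {sensitivity:.2f}; "
                "suspend automated use for this subgroup and escalate per the "
                "Post-Market Monitoring plan.")
    return f"OK: sensitivity for '{subgroup}' is {sensitivity:.2f}."

print(oversight_check("male_patients", 0.78))
print(oversight_check("female_patients", 0.91))
```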

5. Role of deployers (article 29)

Per Article 29, there is an additional role for the deployer of the AI system, who is tasked with monitoring the AI system on the basis of the instructions for use provided by the provider. This requires the provider to clearly consider the role of the deployer when constructing the instructions for use, and to consider what needs to be monitored in terms of post-market monitoring. Clearly, this should include considerations regarding the detection of risks to the safety or health of the data subject, as well as potential implications for fundamental rights; in those cases, the deployer has an information duty towards the provider, the distributor and the relevant market surveillance authorities.

The tasks to be performed by the deployer, in the view of the provider, will need to be explained in the Post-Market Monitoring plan, including the communication between the parties and potentially other Economic Operators (e.g. authorised representatives and distributors).

6. Notification of serious incidents

Serious incidents need to be reported to the market surveillance authorities (yet to be designated by each member state) where the incident occurred. Under the AI Act, a serious incident is considered:

“any incident or malfunctioning of an AI system that directly or indirectly leads to any of the following: 

(a) the death of a person or serious damage to a person’s health; 

(b) a serious and irreversible disruption of the management and operation of critical infrastructure; 

(c) breach of obligations under Union law intended to protect fundamental rights; 

(d) serious damage to property or the environment.”

Such an incident shall be reported within 15 days after establishing a (reasonable) causal link between the incident and the AI system. In the case of a widespread infringement or a serious incident regarding (b) above, the incident is to be reported immediately, and not later than 2 days after becoming aware of it. If a person has died and there is a suspected causal relation with the AI system, it shall be reported within 10 days after becoming aware of the incident.
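Purely as an illustration of the timelines above (and certainly not as legal advice), the sketch below maps an incident's characteristics to the latest reporting date described in the text; member-state specifics and future implementing guidance may of course differ.

```python
from datetime import date, timedelta

def reporting_deadline(awareness_date, causal_link_date,
                       death=False, critical_infrastructure=False,
                       widespread_infringement=False):
    """Illustrative mapping of the timelines described above to a latest
    reporting date; an assumption-based sketch, not a legal determination."""
    if critical_infrastructure or widespread_infringement:
        return awareness_date + timedelta(days=2)   # report immediately, at most 2 days
    if death:
        return awareness_date + timedelta(days=10)  # 10 days after becoming aware
    return causal_link_date + timedelta(days=15)    # 15 days after causal link established

print(reporting_deadline(date(2024, 6, 1), date(2024, 6, 5)))              # 2024-06-20
print(reporting_deadline(date(2024, 6, 1), date(2024, 6, 5), death=True))  # 2024-06-11
```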

For devices covered under the MDR or IVDR, only the incidents per (c) need to be notified, to the competent authority designated by each member state. For some member states this may be the competent authority within the healthcare sector, for others a dedicated AI competent authority, or potentially another body. Interestingly, incidents under (c) may involve risks to health and safety, security and fundamental rights all at the same time, which may make it complex to understand whom to report to and within what timelines. For example, bias may be associated with an unanticipated lower accuracy for a specific sub-group, which would subject that sub-group to indirect harm, while at the same time affecting that sub-group's access to healthcare (fundamental rights). Finally, this may be the result of a data breach (which could become apparent at a later stage) caused by a security breach in which an attacker has altered the AI model and consequently its performance (a data breach and a security incident). Manufacturers of AI-enabled medical devices will need to carefully review the requirements of each piece of legislation and define clear reporting procedures.

Although there are unknowns regarding the European Commission's expectations for Post-Market Monitoring, the AI Act already clarifies items that may need to be included in such plans. It is recommended to consider these aspects and to clearly explain in the Post-Market Monitoring plan what information will be collected, how that information will be analyzed, and who is responsible for the analysis.

Another important aspect will be to clarify the Post-Market Monitoring actions relevant to the risks identified in the risk assessment. Specifically, the manufacturer should monitor for potential unknown biases during the Post-Market stages.

About the Author
Leon Doorn
Independent Consultant