Safety-Critical Systems: Software Safety in Healthcare (Level M)

Abstract:

This assignment examines the challenges faced by healthcare systems and clinical software, which carry many risks. In the absence of a standardized method of risk management, it explores the effects of hazards and risks in healthcare systems and software and the methods available to overcome them. The Therac-25 accidents are taken as a case study, and the risks are studied using Fault Tree Analysis (FTA). The effects of risks and hazards in software and devices are studied, and the results are summarized to emphasise risk management as an inherent process in clinical systems development.

Contents


Introduction


Healthcare Systems


Risks in healthcare systems


Method


Results


Conclusion


References


Introduction

The assignment explores safety aspects of the use of software systems in healthcare settings. Mosby’s Medical Dictionary describes health care as a complete network that encompasses the facilities, agencies and providers of health care in a location. Nursing services are built into all patterns of health care, and nurses form the largest group of providers in a healthcare system (Mosby 2009). A healthcare system refers to an organized plan to provide health services to patients. Healthcare systems make use of computers and IT to help provide these services. In general, the components of a healthcare system include personal healthcare services available from hospitals, clinics, health centres and so on. Health care is also provided in the form of public health services that maintain a healthy environment, and through research on the prevention, detection and treatment of disease. Health care can also refer to the insurance that covers health system services (Miller-Keane 2003). Healthcare systems measure critical parameters of the human body for diagnosis or for monitoring a person’s health. These systems are made of complex hardware, sensors and software that support clinicians in diagnosis and in subsequent therapy or procedures. Hence, healthcare systems must be error free and must not pose a risk to the patient while in operation.

Healthcare Systems

The use of IT in health care is intended to reduce medical errors and to improve patient health outcomes. Healthcare information systems are developed by many vendors for use in health centres and hospitals, with the aim of reducing prescribing errors, maintaining patient diagnostic information and improving patients’ health. They aid in the management of health-related data, for example electronic scanning and storage, and are intended to maximize administrative productivity, support insurance claim processing and centralize electronic records for reference and management. Medicine is growing in complexity and has moved beyond what a person can retain: thousands of drugs, diagnoses and surgical procedures are available today. Healthcare information systems help medical professionals by quickly providing information related to patient care and service (Finnegan 2015).
DelVecchio (2014) explains that a Clinical Decision Support System (CDSS) is an application that analyses data and helps healthcare providers take clinical decisions quickly. A CDSS is an extension of the decision support systems (DSS) used for management decision making in business. It works from a knowledge base. The medical professional applies rules to the knowledge base, and an inference engine extracts results from it according to those rules and presents them to the user. In spite of these advantages, healthcare systems have disadvantages: IT personnel face issues in building them, and medical professionals face drawbacks in relying on the information they provide. Inadequately designed software might offer inaccurate diagnostic advice and delay interventions by medical professionals and clinicians. Clinicians bear the major responsibility if incorrect advice from software is followed. In this assignment, the safety of healthcare systems, devices and applications is assessed in terms of their risks, and methods for overcoming the risk of incorrect results are explored. For this purpose the Therac-25 case is analysed, and its risks and ways of overcoming them are discussed.
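
The knowledge base, rules and inference engine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Rule` type, the observation names and the thresholds are hypothetical placeholders and are not taken from DelVecchio (2014) or from any real CDSS, and the thresholds are not clinical guidance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical patient record: observation name -> measured value.
Patient = Dict[str, float]


@dataclass(frozen=True)
class Rule:
    """One knowledge-base entry: a condition and the advice it triggers."""
    name: str
    condition: Callable[[Patient], bool]
    advice: str


# Illustrative knowledge base. Thresholds are placeholders, not clinical guidance.
KNOWLEDGE_BASE: List[Rule] = [
    Rule("fever", lambda p: p.get("temperature_c", 0.0) >= 38.0, "Check for infection"),
    Rule("low_spo2", lambda p: p.get("spo2_percent", 100.0) < 92.0, "Assess oxygenation"),
]


def infer(patient: Patient, rules: List[Rule]) -> List[str]:
    """Inference engine: collect the advice of every rule whose condition holds."""
    return [rule.advice for rule in rules if rule.condition(patient)]
```

Note that in this sketch a missing observation silently falls back to a "normal" value, so an incomplete record produces no advice at all. A design choice of this kind is exactly what a risk analysis of clinical software should question.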

Risks in healthcare systems

Cammack et al. (1999) provided four steps for identifying risks in healthcare systems. They are:
Identifying hazard: In this step the adverse health conditions that an agent such as a chemical can cause are determined, for example thyroid dysfunction, cancer or birth defects. Daniels et al. (2005) explained that hazards are identified by epidemiological investigation of their effect on the human population.

Assessment of dose response: This step determines the amount of an agent that will not cause an adverse effect. The dose has an upper or allowable limit (Gad and McCord 2008) that depends on the patient’s condition, the duration of contact and the toxicity of the agent (Cammack et al. 1999).

Exposure assessment: This step estimates the quantifiable dose of an agent to which a person is exposed. The duration, frequency and intensity of exposure are included in the estimate. The assessment includes the subject’s exposure to commercial products such as pesticides, cosmetics or medications.

Risk characterisation: In this final step, the results of the earlier three steps are evaluated together. The step measures the total risk to a person from exposure to the agent. To determine the agent’s safety and to check for adverse health effects, the allowable limit is compared with the estimated exposure (Gad and McCord 2008).

In estimating the risk of healthcare systems, risk is assessed as directly proportional to both the level of exposure and the level of hazard (Dumbrique 2010).
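
The comparison made in the risk characterisation step can be illustrated with a short sketch. It assumes a deliberately simple exposure model (dose per contact multiplied by contacts per day); the class name, field names and units are hypothetical and are not taken from Cammack et al. (1999) or Gad and McCord (2008).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ExposureEstimate:
    """Hypothetical output of the exposure-assessment step."""
    dose_per_contact_mg: float  # intensity of a single exposure
    contacts_per_day: float     # frequency of exposure


def estimated_daily_dose_mg(exposure: ExposureEstimate) -> float:
    """Assumed model: daily dose = intensity x frequency."""
    return exposure.dose_per_contact_mg * exposure.contacts_per_day


def characterise(exposure: ExposureEstimate, allowable_daily_limit_mg: float) -> Tuple[float, bool]:
    """Compare the estimated exposure with the allowable limit.

    Returns (ratio, acceptable), where ratio = estimated dose / allowable limit
    and acceptable is True when the estimate does not exceed the limit.
    """
    if allowable_daily_limit_mg <= 0:
        raise ValueError("allowable limit must be positive")
    ratio = estimated_daily_dose_mg(exposure) / allowable_daily_limit_mg
    return ratio, ratio <= 1.0
```

A ratio well below 1 leaves a margin between the estimated and the allowable dose, while a ratio above 1 flags the agent for further assessment.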
The next step in risk analysis is to consider software failures that cause hazards. Risk analysis is done at the systems analysis stage, that is, when the role of the software is defined. At that stage many of the issues or drawbacks of the software cannot be determined, and they remain unknown until implementation. In some cases rework is possible. Causes of software risk include lack of cohesion, excessive coupling and the inability to carry out identified mitigations. The team designing the software must look into possible risks at the design stage. In some instances a software failure can be prevented by a hardware wrapper (Schmidt 2007).
The risks in healthcare systems can be understood through a case example. The Therac-25 is a computer-controlled radiation therapy machine, a medical linear accelerator. It generates high-energy beams that can destroy tumours in humans without damaging the surrounding tissue. The Therac-25 delivers either accelerated electrons or X-ray photons, and eleven machines were installed in the USA and Canada. The commercial version became available in 1982 and was used in hospitals for cancer treatment. Between 1985 and 1987 there were six known accidents involving the Therac-25, in which massive overdoses killed or severely injured patients (Miller 1987). Engineers investigated the cause, and the accidents were traced to conditions in the unit that reads operator input. The safety of the software used in the system was also investigated thoroughly. In addition to testing the software for errors and operational risks, the investigation considered errors caused by malfunctioning hardware, which resulted in incorrect execution, and random errors due to noise.
Leveson (2005) explains that the software is made up of four main elements, namely stored data, a scheduler, critical and non-critical tasks, and interrupt services. The software allowed concurrent access to shared memory, and there was no real synchronization for data stored in shared variables. Investigations revealed that the safety analysis of the system did not include the software (Jones et al. 2001). Assumptions were made, for example that software could simply replace the hardware interlocks, which are meant to ensure, for instance, that the electron beam is operable only when the collimator is in the correct position. Leveson and Turner (1993) explained that the system had inadequate monitoring: saturation of the ion chambers was not indicated, and in that condition the radiation was not measured. The system was quite complicated, and there were also problems of operator or human error (Hyman 2002). These were the issues faced by the Therac-25 healthcare system in operation. The case shows the errors that result from not conducting a comprehensive risk analysis of both software and hardware. Standards such as ISO 14971 address the consideration of risks that was lacking in this case, including risks in software. Healthcare systems involve complex hardware and software, and hence risk analysis is critical prior to implementation (Schell 2004). In the following sections the drawbacks and issues of healthcare systems are investigated using Fault Tree Analysis.
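
The hazard of unsynchronized shared variables can be illustrated with a small sketch. This is not the Therac-25 software; the class, the mode names and the energy values are hypothetical. It only shows the general technique of guarding two values that must always change together with a single lock, so that a concurrent reader can never observe a new mode paired with an old energy.

```python
import threading
from typing import Tuple


class TreatmentSettings:
    """Shared state in which mode and energy must always change together."""

    # Hypothetical mode -> energy pairing in arbitrary units, for illustration only.
    ENERGY_FOR_MODE = {"electron": 1, "xray": 25}

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._mode = "electron"
        self._energy = self.ENERGY_FOR_MODE["electron"]

    def set_mode(self, mode: str) -> None:
        # Both fields are updated while the lock is held, so the pair is atomic.
        with self._lock:
            self._mode = mode
            self._energy = self.ENERGY_FOR_MODE[mode]

    def snapshot(self) -> Tuple[str, int]:
        # Readers take the same lock and therefore always see a consistent pair.
        with self._lock:
            return self._mode, self._energy


def count_inconsistent_reads(iterations: int = 2000) -> int:
    """Read the settings repeatedly while another thread keeps editing them."""
    settings = TreatmentSettings()
    stop = threading.Event()

    def operator_edits() -> None:
        while not stop.is_set():
            settings.set_mode("xray")
            settings.set_mode("electron")

    writer = threading.Thread(target=operator_edits)
    writer.start()
    inconsistent = 0
    try:
        for _ in range(iterations):
            mode, energy = settings.snapshot()
            if TreatmentSettings.ENERGY_FOR_MODE[mode] != energy:
                inconsistent += 1
    finally:
        stop.set()
        writer.join()
    return inconsistent
```

With the lock in place the count is always zero. If the two `with self._lock:` blocks were removed, a reader could be scheduled between the two assignments in `set_mode` and see a mismatched pair, which is the general class of fault that missing synchronization allows.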

Method

The risk assessment tool chosen in this assignment is Fault Tree Analysis (FTA). FTA is very useful for analysing potential mishaps, which range from adverse events to sentinel events and never events. In healthcare systems not all risks can be resolved during the design stages, and hence FTA is a useful tool for patient safety. FTA is used in many application areas to analyse diagnostic, administrative and management systems. It serves as a tool for root cause analysis (Andersen and Fagerhaug 2006) to identify errors in intervention and medication, to prevent wrong-site surgery, to analyse human factors and to prevent accidents with medical devices. A mishap is neither frequent nor routine. It results from multiple failures of components or elements, which FTA can cross-verify, and this makes FTA valuable in preventing errors. By knowing the event sequences that lead to a mishap, technical experts can take the necessary steps to make the assessment process effective. The first step in FTA is the identification of the mishap. An expert can select a mishap that is quite likely to happen, or analyse a mishap that has already occurred and whose causes are unknown.

A fault tree is constructed top down. The event at the top is the mishap being analysed, for example ‘hypoventilation’ in a patient on a ventilator. The events at the next level are connected to it by relationship symbols known as the AND Gate and the OR Gate. FTA uses many symbols: besides the gate symbols, a rectangle is used for the top event and a circle for component-level faults. A diamond indicates that the development of an event has been terminated, and a triangle shows that the branch of the tree continues in another place (Raheja and Escano 2009). For example, in healthcare systems it is important to identify events such as administering the wrong medication or the failure of a component in the system. These events are shown as circles. Several modifications of FTA and additional symbols exist for use in specific circumstances.

The causes considered in a healthcare fault tree include:
  • Malfunction or failure of equipment. For example, an MRI scanner may not provide accurate results
  • Faults in material, such as unsanitised surgical trays
  • Errors by humans, such as an inadequate procedure, a misdiagnosis or an inadequate process for preventing infections
  • Risks due to environmental effects, such as toxic chemicals or bacteria in the air
  • Risks due to inadequate policies and accountability, that is, management deficiency
  • Inaccuracies in health records, communication errors, etc.
  • Errors in measurement, such as inaccurate scales used to weigh a drug

In addition to the above, FTA also covers aspects such as the source of error, communication failures, the patient’s mental condition, the patient’s tendency to attempt suicide, inadequate information at discharge and mistakes made by indirect personnel (Raheja and Allocco 2006).

An illustration of FTA is shown in figure 1. The figure shows that the top event A occurs when any one of the next-level events connected to it through the OR Gate occurs. It must be noted that not all lower-level events need to fail at the same time. Similarly, event D in the second level occurs only when all of the events connected to it through the AND Gate occur together.

[Figure 1: Illustration of FTA]
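
The gate logic described for figure 1 can be sketched as a small data structure. Since the figure itself is not reproduced here, the tree below is an assumed example: only the names A and D come from the text, while events B, C, E and F and the overall shape are hypothetical.

```python
from dataclasses import dataclass
from typing import Set, Tuple, Union


@dataclass(frozen=True)
class BasicEvent:
    """A component-level fault (drawn as a circle in a fault tree)."""
    name: str


@dataclass(frozen=True)
class Gate:
    """An intermediate or top event whose occurrence depends on its inputs."""
    name: str
    kind: str                    # "AND" or "OR"
    inputs: Tuple["Node", ...]


Node = Union[BasicEvent, Gate]


def occurs(node: Node, failed: Set[str]) -> bool:
    """Return True if the event occurs, given the set of failed basic events."""
    if isinstance(node, BasicEvent):
        return node.name in failed
    results = [occurs(child, failed) for child in node.inputs]
    if node.kind == "AND":
        return all(results)   # every input must occur
    if node.kind == "OR":
        return any(results)   # a single input is enough
    raise ValueError(f"unknown gate kind: {node.kind}")


# Assumed example tree: A = OR(B, C, D) and D = AND(E, F).
D = Gate("D", "AND", (BasicEvent("E"), BasicEvent("F")))
A = Gate("A", "OR", (BasicEvent("B"), BasicEvent("C"), D))
```

In this example `occurs(A, {"B"})` is true because a single input satisfies the OR Gate, whereas `occurs(A, {"E"})` is false because the AND Gate under D needs both E and F.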

In the Therac-25 case, errors due to software and device malfunctions cost lives. The safety analysis of the Therac-25 failure can be explained using an FTA diagram. For instance, imagine at the top level a patient undergoing radiation treatment on the Therac-25 system who receives the wrong energy beam, with adverse effects. Figure 2 shows the top level of the tree at the operator level and illustrates failure by the user. The rectangles in the figure point to further trees that are not shown as part of figure 2. This level shows the Therac-25 system providing the wrong energy output, due to which the patient suffers respiratory discomfort. The failure may be due to algorithm failure, time-based failure, software failure, or wrong input from the user that results in wrong energy levels. In this case the alarm fails to trigger because, based on the input, the system regards the operating conditions as ideal (Hyman and Johnson 2008). Figure 2 shows two AND Gates, each referring to multiple events that must go wrong simultaneously before a mishap initiated by the user occurs. For example, the therapist may choose the wrong controls for the amount of radiation and, at the same time, the system fails to warn. Here the systematic failure is not detected, so adequate risk avoidance measures are not provided.

[Figure 2: Top level of the fault tree for the Therac-25 case]

In the Therac-25 case described by Leveson and Turner (1993), FTA was applied in the investigation, as shown in figure 3.

The fault tree gives the probability of the computer choosing the wrong energy as 10^-11 and the probability of the computer choosing the wrong mode as 4 × 10^-9. Software failure was not identified in the Therac-25 analysis, and these probability values were obtained from the historical performance of the Therac-25 system (Sakai et al. 2013).
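
Quantification of a fault tree combines such basic-event probabilities through the gates. The sketch below shows the standard formulas under the assumption that the input events are statistically independent. Placing the two values quoted above under a single gate is an assumption made purely for illustration; the structure of the actual AECL tree is not reproduced here.

```python
import math
from typing import Iterable


def and_gate(probabilities: Iterable[float]) -> float:
    """P(all inputs occur) for independent events: the product of the inputs."""
    return math.prod(probabilities)


def or_gate(probabilities: Iterable[float]) -> float:
    """P(at least one input occurs) for independent events: 1 - prod(1 - p).

    Computed with log1p/expm1 so that very small probabilities are not lost
    to floating-point rounding.
    """
    return -math.expm1(sum(math.log1p(-p) for p in probabilities))


# Values quoted in the text for the two computer-related events.
P_WRONG_ENERGY = 1e-11
P_WRONG_MODE = 4e-9

# Assumed for illustration: the two events feed one gate.
p_either = or_gate([P_WRONG_ENERGY, P_WRONG_MODE])   # close to 4.01e-9
p_both = and_gate([P_WRONG_ENERGY, P_WRONG_MODE])    # close to 4e-20
```

For rare events the OR Gate result is almost the sum of the inputs and the AND Gate result is their product, so the output of the tree is only as trustworthy as the basic-event values fed into it.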

Results

The safety analysis for the Therac-25 identified multiple failures, and AECL (Atomic Energy of Canada Limited) quantified the results using FTA. For example, the top events in the Therac-25 trees include overdose per pulse and illegal gantry motion. The immediate causes of each event are generated using AND/OR gates and are determined from a basic understanding of the system’s operation. In risk analysis, a basic event is defined as a quantified event. The probability value of 10^-4 used by AECL is a generic failure rate for software events. The final report by AECL explains that many of the fault trees contain a computer malfunction that produces the effect of wrong energy output. The outcome of quantification depends on the failure rate chosen for the software, and it is difficult to justify the same rate for every type of error in software or system behaviour. The Therac-25 case also implies that, even though the quantification of software failures is uncertain, FTA provides valuable information by showing single and multiple paths of failure (Bowman et al. 1991).

In healthcare systems, software with errors or a malfunctioning component can have severe adverse effects on the patient. Vellojin (2011) suggests that quantification is quite challenging because potential defects in software are difficult to identify. In some systems, devices share common software components, which adds to the complexity of quantifying risks. The US FDA (1996) provided quality system regulations for manufacturers of healthcare systems to incorporate risk management in their design. These regulations give device manufacturers direction in establishing processes to avoid risks in their devices. Device manufacturers must consider the following safety-critical aspects of software components (Wood 1999):

  • Software must be fail safe, because defective software can compromise safety requirements
  • Software must be tested thoroughly, because in some healthcare systems it directly accesses critical data
  • Software must support the safety of critical component calls
  • Safety must be ensured when data is displayed by the program
  • Data from databases, algorithms or calculations leads to information displays, so software computations must be accurate
  • Data must be verified to detect the occurrence of a potential hazard
  • Patient demographic data must be maintained and organized without errors
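
Two of the aspects listed above, fail-safe behaviour and verification of data before use, can be sketched as follows. The request fields, the tolerance and the notion of a "safe state" are hypothetical choices made for this example and are not taken from Wood (1999) or from any regulation.

```python
import math
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DELIVER = "deliver"
    SAFE_STATE = "safe_state"   # refuse to act and require human review


@dataclass(frozen=True)
class DoseRequest:
    """Hypothetical request passed from the user interface to the controller."""
    patient_id: str
    prescribed_dose: float
    requested_dose: float


def decide(request: DoseRequest, max_allowed_dose: float, tolerance: float = 0.05) -> Action:
    """Verify the data first; on any doubt or error, fall back to the safe state."""
    try:
        doses = (request.prescribed_dose, request.requested_dose, max_allowed_dose)
        if not request.patient_id:
            return Action.SAFE_STATE        # missing demographic data
        if not all(math.isfinite(d) and d > 0 for d in doses):
            return Action.SAFE_STATE        # corrupt or nonsensical values
        if request.requested_dose > max_allowed_dose:
            return Action.SAFE_STATE        # above the hard upper limit
        deviation = abs(request.requested_dose - request.prescribed_dose)
        if deviation > tolerance * request.prescribed_dose:
            return Action.SAFE_STATE        # does not match the prescription
        return Action.DELIVER
    except Exception:
        # Fail safe: an unexpected error must never result in delivery.
        return Action.SAFE_STATE
```

The design choice is that delivery is the only outcome that has to be positively justified; every other path, including an unexpected exception, ends in the safe state.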

These safety aspects must be tested in software used in healthcare systems. It is important to note that safety-critical software might require more examination, such as extensive testing, to ensure patient safety and accurate clinical diagnosis. In healthcare systems, risk management is a cradle-to-grave activity that requires the involvement of experts from different fields to ensure safety. It is important for device manufacturers to make risk management a critical task in their product development process. The post-production processes described by ISO 14971 make clear that even after the device is deployed, risk monitoring and management must continue as a corporate mindset (Rakitin 2006). Healthcare systems manufacturers can consider the following points to overcome the challenge of risks in their products:

  • The focus must be on severity and not on probability of occurrence; focusing on probability is a common problem. Risks in software components must be identified together with risks in other components.
  • The risk identification and risk management process must start very early, during the design phase.
  • It is useful to create a hazards list. ISO 14971 defines a hazard as a potential source of harm, where harm includes injury, damage to health and damage to the environment. Device manufacturers must foresee these hazards and quantify each of them to measure its possible consequences.
  • Understanding the clinical environment is highly important. Along with identifying potential hazards, manufacturers must understand the clinical environment in which the device will be used.
  • Creating a multidisciplinary team may be useful in many circumstances. Risk management needs an experienced team with diverse disciplines and skills. Such a team adds value to the risk management and mitigation process.
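
A hazards list that puts severity ahead of probability, as recommended in the points above, can be sketched as follows. The five-point severity scale, the field names and the example entries are hypothetical and are not taken from ISO 14971.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hazard:
    """One entry of a hazards list (all values here are illustrative)."""
    description: str
    source: str         # e.g. "software", "hardware", "use"
    severity: int       # assumed scale: 1 = negligible ... 5 = catastrophic
    probability: float  # rough estimate, often very uncertain for software


def prioritise(hazards: List[Hazard]) -> List[Hazard]:
    """Order hazards by severity first; probability only breaks ties."""
    return sorted(hazards, key=lambda h: (-h.severity, -h.probability))


HAZARDS = [
    Hazard("Display lag", "software", 2, 0.1),
    Hazard("Overdose", "software", 5, 1e-4),
    Hazard("Unintended gantry motion", "hardware", 5, 1e-3),
]
```

Sorting by the product of severity and probability would push the overdose hazard towards the bottom of this list because its estimated probability is small. Ordering by severity first keeps it near the top, and it also keeps software and hardware hazards in one combined list.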

Hence, risk management and safety analysis in healthcare systems is a broad area and must be an integral part of healthcare systems development and use.

Conclusion

In this assignment the safety of healthcare systems in clinical processes is analysed and discussed. Safety in healthcare systems is extremely critical, and it has become harder to ensure because embedded software has made medical systems more complex. Further, since software engineering is an inherently human process, it is not always possible to produce software with zero errors. The assignment explains the benefits of healthcare systems in diagnosis, patient care and health monitoring. The risks in healthcare systems are also explained from data collected over the years. The Therac-25 case is examined to understand the hazards and adverse effects of hardware and software failure. The method section explains the FTA tool and its use in identifying and analysing risks after an event has occurred, and examines how FTA was applied in the Therac-25 case. Finally, the results section emphasises the risk management mindset that healthcare device manufacturers should develop and the different aspects to consider in overcoming risks. The challenges for healthcare systems manufacturers are to overcome risks, establish risk analysis procedures and make risk management an inherent process in their corporate planning. In healthcare systems development, risk management should ideally run from early development through to the end of a product’s life.

References

1. Andersen, Bjorn and Tom Fagerhaug, (2006). Root Cause Analysis: Simplified Tools and Techniques. 2nd ed. ASQ Quality Press.

2. Bowman, W.C., et al. (1991). An Application of Fault Tree Analysis to Safety-Critical Software at Ontario Hydro. Conference on Probabilistic Safety Assessment and Management.

3. Cammack, J.N., R.J. Eyre, R.D. White and D.M. Wilson, (1999). Medical Device Risk Assessment Paradigm: Use and Case History. International Journal of Toxicology. 18 (4).

4. Daniels, S.R., W.D. Flanders and R.S. Greenberg, (2005). Medical Epidemiology. 4th ed. McGraw Hill Companies Inc.

5. DelVecchio, Alex, (2014). Clinical Decision Support Systems (CDSS). [ONLINE] Available at: http://searchhealthit.techtarget.com/definition/clinical-decision-support-system-CDSS. [Last Accessed 03-Mar-2015].

6. Dumbrique, Rachelo, (2010). Implementation of Risk Management in the Medical Device Industry. Master's Theses. San Jose State University, SJSU Scholar Works.

7. Finnegan (2015). Healthcare IT. [ONLINE] Available at: http://www.finnegan.com/HealthcareITIndustry/. [Last Accessed 03-Mar-2015].

8. Gad, S.C. and M.G. McCord, (2008). Safety Evaluation in the Development of Medical Devices and Combination Products. 3rd ed. New York: Informa Healthcare USA Inc.

9. Hyman, William A. (2002). A generic fault tree for medical device error. Journal of Clinical Engineering. 27, pp.134-140

10. Hyman, William A. and Erin Johnson, (2008). Fault Tree Analysis of Clinical Alarms. Journal of Clinical Engineering. pp.85-95

11. Jones, P.L., et al., (2001). Risk Management in the Design of Medical Device Software Systems. Biomedical Instrumentation & Technology. pp.237-266

12. Leveson, Nancy G. and Clark S.Turner, (1993). An Investigation of the Therac-25 Accidents. Computer. 26 (7), pp.18-41

13. Leveson, Nancy, (2005). Medical Devices: The Therac-25. Report by the University of Washington.

14. Miller, ED, (1987). The Therac-25 experience. In Conference of State Radiation Control Program Directors.

15. Miller-Keane, (2003). Encyclopaedia and Dictionary of Medicine, Nursing and Allied Health. [ONLINE] Available at: http://medical-dictionary.thefreedictionary.com/health+care+system. [Last Accessed 03-Mar-2015].

16. Mosby, (2009). Mosby's Medical Dictionary. [ONLINE] Available at: http://medical-dictionary.thefreedictionary.com/health+care+system. [Last Accessed 03-Mar-2015].

17. Raheja, Dev and Maria C. Escano, (2009). Reducing Patient Healthcare Safety Risks through FTA. Journal of System Safety. pp.13-17

18. Raheja, Dev and Michael Allocco, (2006). Assurance Technologies Principles and Practice. Wiley.

19. Rakitin, Steven R., (2006). Coping with Defective Software in Medical Devices. Computer, Published by IEEE Computer Society. pp.40-46

20. Sakai, Yoshio, Seiko Shirasaka and Yasuharu Nishi, (2013). An Extended Notation of FTA for Risk Assessment of Software Intensive Medical Devices. The 24th IEEE International Symposium on Software Reliability Engineering.

21. Schmidt, Brian, (2007). Software Fault Tree Analysis. Technical Report.

22. Schell, Catherine, (2004). An Investigation of Therac-25 Accidents. Presentation CSC508.

23. US FDA Centre for Devices and Radiological Health, (1996). Medical Devices, Current Good Manufacturing Practice (CGMP). Final Rule; Quality System Regulation.

24. Wood, B.J., (1999). Software Risk Management for Medical Devices. Medical Device and Diagnostic Industry. pp.139-156

25. Vellojin, Laila Nadime Cure, (2011). Analytical Methods to Support Risk Identification and Analysis in Healthcare Systems. University of South Florida, Scholar Commons.