
Radiology Data and Natural Language Processing


Harnessing Unstructured Data in Radiology

Harnessing unstructured data is vital to moving the field of radiology forward. Several methods are used to mine unstructured data, and one of the most common is Natural Language Processing (NLP). However, NLP is difficult to apply in radiology, because it lacks the capacity to reliably analyze free-text radiology reports and images. Too much uncertainty remains for NLP to address on its own, but there may be ways in which it can be useful.

In order to make that determination, this paper examines the current use of NLP and other methods, such as RadLex and Annotation and Image Markup (AIM), for unstructured data mining in radiology, as well as the uses the field hopes to achieve. Both clinical decision support and research analysis could benefit from unstructured data mining in radiology, but only if the data can be mined correctly and value can be extracted from it.

With that in mind, various forms and methods used for the mining of unstructured data in radiology reports must be carefully considered and compared to one another, in order to find the method or combination of methods that works best and provides the most success for translation of unstructured data into valuable information for clinical decision support and research analysis.

Outline

Introduction
Historical and Theoretical Background
  Natural Language Processing
  RadLex
  Annotation and Image Markup
Use and Intended Impact
  NLP
  RadLex
  Annotation and Image Markup
Interaction with Other Topics and Themes
Comparison and Contrast
  A Comparison of Unstructured Data Mining Methods
  The Contrasting Values of Unstructured Data Mining Methods
Strengths and Weaknesses
Organizational and Technical Risks
Conclusion
References
Bibliography

Harnessing Unstructured Data in Radiology

Introduction

Radiology is the use of imaging to look into the human body and see disease processes taking place (Chapman, et al., 2011).

Both diagnosis and treatment can be improved when radiology is used. There are a number of techniques used by radiologists, including CT scans, X-rays, ultrasounds, MRIs, and PET scans, among others (Hong, et al., 2013). There are also interventional radiology techniques that are generally minimally invasive but that work well in diagnosing and treating specific ailments (Chapman, et al., 2011).

However, there is one area in which radiology is severely lacking, and that is in the mining of unstructured data in order to present a clearer picture of the patients' issues and provide more information about what those patients may be facing. There is a great deal of data provided within radiology reports, but without collecting this data and processing it, it can be of no real use to the patients or the doctors.

However, the collection and processing of the unstructured data found in those reports can be difficult and is not without its own pitfalls (Chapman, et al., 2011). The mining of that data has to be done, and there are several different types of programs that can be used to do that successfully. Natural Language Processing (NLP) is one of the most commonly used options for collecting data, but it does not always work well on unstructured data.

Many errors occur when it is used that way, so it has not been found to be completely reliable. With that in mind, this paper explores NLP, RadLex, and Annotation and Image Markup (AIM), all of which can be used for unstructured data mining in radiology reports. The aim is to show which of these methods works best, or how they can be used in conjunction with one another to be more effective overall.

Historical and Theoretical Background

Natural Language Processing

Natural Language Processing (NLP) is used to mine unstructured data (Gerstmair, et al., 2012; Hong, et al., 2013). This particular concept is based on human-computer interaction, and provides a way for computers to learn natural (human) language in order to process information that is provided by humans. The more computers understand about language, the more they can process information without barriers (Johnson, et al., 1997).

That can be highly beneficial in medicine, because it provides doctors, nurses, radiologists, and other medical professionals with more information than they would previously have been able to collect without NLP. However, NLP is not without its downsides, which also have to be addressed in order to understand whether NLP should be used in radiology and what adjustments would make it more viable.

Early systems generally relied on complex sets of hand-written rules to allow machines to translate, but in the 1980s programmers began to write complex algorithms that allowed machines to learn and process language (Demner-Fushman, Chapman, & McDonald, 2009; Torres, et al., 2012). This was a major breakthrough, and interest in machine translation was renewed. The original algorithms were relatively primitive and not much better than the hand-written rules, but they did show that algorithms were possible, and that they did work for translation purposes (Chapman, et al., 2011; Weiss & Langlotz, 2008).

As computing power became stronger, more success was seen with translation and the mining of data, allowing computers to actually "learn" language in a way that was not possible in the past. Algorithms today can be semi-supervised, in that they can learn some information from other information they are supplied (Chapman, et al., 2011). The theory surrounding NLP is that computers can be "taught" to translate language in the same way a person can (Reiner, 2009; Torres, et al., 2012).

Once machines are able to do this, computers will be able to handle a number of tasks that are currently performed only by humans. That can free up human beings for other tasks, and can result in much faster translations, because computers carry out calculations far more rapidly than humans can. However, there are issues with this particular theory that have to be considered. The main concern is that the goals of NLP are not completely realistic (Gerstmair, et al., 2012; Weiss & Langlotz, 2008).

Computers are not people, and because they do not "think" in the same way human beings do, they can only follow sets of rules and use those rules to process information (Demner-Fushman, Chapman, & McDonald, 2009).

RadLex

RadLex is another way of addressing unstructured data mining. There are several different methods currently used, and the main problem with them is that they are all different.

In other words, when healthcare organizations use different methods for extracting and categorizing data and for keeping records, it becomes confusing when information needs to be transferred from one organization to another (Gerstmair, et al., 2012). RadLex is designed to stop all of that, through the creation of a single lexicon that can be used by all healthcare organizations and agencies (Gerstmair, et al., 2012). There are more than 68,000 terms contained within it, so it can be applied to the entire field of radiology successfully.
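
The single-lexicon idea described above can be illustrated with a minimal sketch. It assumes a tiny, invented lexicon: the term IDs and synonym lists below are hypothetical placeholders and are not real RadLex identifiers or entries. The point is only to show how differently worded phrases from different organizations could resolve to one shared term.

```python
# Hypothetical mini-lexicon: the term IDs and synonyms below are invented
# for illustration and are NOT real RadLex identifiers.
LEXICON = {
    "TERM_001": {
        "preferred": "pulmonary nodule",
        "synonyms": ["lung nodule", "nodule of lung"],
    },
    "TERM_002": {
        "preferred": "pleural effusion",
        "synonyms": ["fluid in pleural space"],
    },
}


def build_index(lexicon):
    """Map every lower-cased name (preferred or synonym) to its term ID."""
    index = {}
    for term_id, entry in lexicon.items():
        for name in [entry["preferred"], *entry["synonyms"]]:
            index[name.lower()] = term_id
    return index


def normalize(phrase, index):
    """Return the term ID for a phrase, or None when the phrase is unknown."""
    return index.get(phrase.strip().lower())


INDEX = build_index(LEXICON)
```

Under these assumptions, `normalize("Lung nodule", INDEX)` and `normalize("pulmonary nodule", INDEX)` resolve to the same ID, which is what would let two organizations exchange records without losing meaning. A real lexicon with tens of thousands of terms would also need to handle spelling variants and ambiguity, which this sketch ignores.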

DICOM and SNOMED-CT are two of the current standards and lexicons in use, but RadLex is able to work with both of them to unify the experience (Weiss & Langlotz, 2008). The idea behind RadLex came from a group of committees that were created to find a better way to mine data from radiology reports. The Radiological Society of North America (RSNA) formed these committees in 2005, and they were composed of individuals from more than 30 organizations that focused on radiology and standards (Chapman, et al., 2011).

In 2007, another six committees were formed to help the continued development of RadLex, and to ensure that as many terms as possible were included in it (Chapman, et al., 2011). Without that level of information, RadLex would not be any better than the other lexicons it was hoping to outshine, and it would not have been able to take those other data mining software options and tie them all together in one convenient package that can be used by radiologists everywhere.

The theory behind it was to create an option that would allow all other programs to be merged, and it appears that RadLex will succeed with its goal of making that possible.

Annotation and Image Markup

Another choice for handling the mining of unstructured data in radiology reports is Annotation and Image Markup. This is, as the name would suggest, focused on the images found in reports.

However, there is much more to the issue than just identifying pictures, as there are captions and tags that can be attached to these images and that will supply readers of the report with a great deal of data that might otherwise be lost (Chapman, et al., 2011). Annotation and Image Markup, or AIM, is not new. The history of it goes back a number of years when it comes to the planning stages and what it can offer.

However, it is also not focused on the same types of unstructured data as other systems such as NLP. The focus of AIM remains on the images themselves, as not providing useable, translatable data with these images can cause them to be overlooked (Chapman, et al., 2011). The theory behind this type of software for the mining of radiology reports is that a great deal of information is lost in the pictures and images themselves (Chapman, et al., 2011).
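
To make the idea of attaching captions and tags to images more concrete, the following is a simplified sketch. The `ImageAnnotation` record and its fields are assumptions made for illustration; they are not the actual AIM information model, which is considerably richer.

```python
from dataclasses import dataclass, field


@dataclass
class ImageAnnotation:
    """Hypothetical, simplified annotation record (not the real AIM schema)."""
    image_id: str
    region: tuple  # (x, y, width, height) in pixels
    finding: str
    tags: list = field(default_factory=list)


def find_by_tag(annotations, tag):
    """Return the annotations carrying the given tag, ignoring case."""
    wanted = tag.lower()
    return [a for a in annotations if wanted in (t.lower() for t in a.tags)]


# Illustrative data only; the image IDs and findings are invented.
annotations = [
    ImageAnnotation("img-001", (120, 80, 40, 40), "nodule", ["lung", "follow-up"]),
    ImageAnnotation("img-002", (30, 200, 60, 25), "fracture", ["wrist"]),
]
```

With records like these, a call such as `find_by_tag(annotations, "lung")` retrieves the relevant images directly, which is the kind of quick access to pictures the text describes.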

When a report is "read" through the use of a computer that is mining data from it, the software program reads the language in the report itself. The program is able to make sense of the words, terms, sentences, and other information that is written in the report. However, there must also be information in the images themselves, as that is the area on which radiology is based.

Being able to quickly access these pictures is very important, because they can provide extra insight into the disease or condition of the patient, which can make a significant difference in the speed and accuracy of the diagnosis and treatment (Chapman, et al., 2011).

Use and Intended Impact

Unstructured data mining has varied uses, but radiology reports are a very popular area in which it is applied. Considered here is the use of NLP, RadLex, and AIM in the radiology field, based on the mining of unstructured data.

There is a great deal of unstructured data in radiology reports, and much of it can provide a benefit to the radiologist and to the doctors who look over the report in order to make a diagnosis and determine what method of treatment would be the best choice (Chapman, et al., 2011; Johnson, et al., 1997). The goal would be to use software to mine all of that unstructured data and provide it so it could be used by anyone who read the radiology report (Chapman, et al., 2011).

That would allow doctors to have information at a glance that they might not otherwise notice, and would also allow them to provide more information in patient records that could be viewed by other doctors (Weiss & Langlotz, 2008). Making clinical decisions requires all the information possible, and the use of unstructured data mining could provide a higher level of information that could lead to better diagnostic success and a higher chance of the right treatments for every patient who is seen by radiology.

There is a risk with this type of data collection, however, because of uncertainty issues that currently exist in its ability to translate correctly and efficiently all the time (Chapman, et al., 2011; Demner-Fushman, Chapman, & McDonald, 2009; Do, et al., 2013). For the use of software for unstructured data mining to be acceptable, that issue would have to be completely corrected and thoroughly tested out so that patients' lives and well-being were not being put at risk from incorrect translation.
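
One common way to test how trustworthy such a system is would be to compare its output against labels assigned by a radiologist. The sketch below assumes that both are available as sets of finding names; the function name and the sample data are invented for illustration and do not come from the cited studies.

```python
def precision_recall(extracted, reference):
    """Compare system-extracted findings with manually assigned ones.

    Precision: share of extracted findings that are correct.
    Recall: share of reference findings that the system found.
    """
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Illustrative data only.
system_output = {"nodule", "effusion", "fracture"}
radiologist_labels = {"nodule", "effusion", "pneumothorax", "cardiomegaly"}
precision, recall = precision_recall(system_output, radiologist_labels)
```

In this invented example the system is right about two of its three findings and finds two of the four reference findings, so precision is 2/3 and recall is 0.5. Figures like these give doctors a concrete basis for deciding whether mined data can be trusted.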

Doctors must be able to trust what they read on a chart or diagnostic report, regardless of whether it is provided by another medical professional or translated by a computer (Chapman, et al., 2011). With the mining of unstructured data, there is an excellent opportunity to collect more information that can help provide patients with the best care possible (Demner-Fushman, Chapman, & McDonald, 2009). As long as the unstructured data is collected and translated properly, there will be great benefits seen (Chapman, et al., 2011).

NLP

Using NLP will greatly benefit radiology, provided the translation of any mined unstructured data is correct (Weiss & Langlotz, 2008). The major impact will be on the patients themselves, because they are the ones who will really benefit when more data about their diagnosis and treatment is provided to their doctors and other medical providers in a structured way. Unstructured data is unorganized data, and does not lend itself to helping to diagnose or treat a patient, no matter his or her illness.

The structured data in radiology reports is what matters, and if NLP can mine unstructured data and turn it -- accurately -- into structured data, there will be a significant impact on the value to both doctors and patients (Chapman, et al., 2011; Torres, et al., 2012). This impact is very important for the field, since it can save lives along with helping doctors diagnose and treat even mild conditions that are causing difficulty for a patient (Chapman, et al., 2011; Demner-Fushman, Chapman, & McDonald, 2009).
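
As a rough illustration of turning free text into structured data, the sketch below applies a few hand-written rules to a short report. The finding list and the negation cues are assumptions chosen for the example; this is a deliberately simple rule-based approach and not the method used in any of the cited work.

```python
import re

# Illustrative vocabulary and negation cues; a real system would need far more.
FINDINGS = ["nodule", "effusion", "fracture"]
NEGATION_CUES = re.compile(r"\b(no|without|negative for)\b", re.IGNORECASE)


def extract_findings(report):
    """Return one structured record per finding mentioned in the report."""
    records = []
    # Split the report into sentences at ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", report.strip()):
        lowered = sentence.lower()
        for finding in FINDINGS:
            position = lowered.find(finding)
            if position == -1:
                continue
            # A negation cue anywhere before the finding marks it as absent.
            negated = bool(NEGATION_CUES.search(lowered[:position]))
            records.append(
                {"finding": finding, "present": not negated, "sentence": sentence}
            )
    return records


report = "There is a 6 mm nodule in the right upper lobe. No pleural effusion."
records = extract_findings(report)
```

Here the nodule is recorded as present and the effusion as absent. The same rules would fail on hedged wording such as "cannot exclude a fracture", where a negation-style cue does not mean the finding is absent. That kind of error is what makes accuracy the central concern when unstructured text is converted.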

However, if NLP is not used correctly it could have a very negative impact on radiology and other areas of healthcare because of inaccurate information.

RadLex

All of the best features from the existing systems for terminology in radiology are incorporated into RadLex, but the software also fills in gaps that were critical to the unstructured data mining of radiology reports but that were missing in the other methods that were used.

This is vitally important, as the goal is to reach a type or style of software that can be used for data mining and that can handle unstructured data as well as structured data and pictures. While most of the options for data mining are helpful, none of them fully address all of the issues faced by those who are attempting to collect all of the data a radiology report has to offer (Chapman, et al., 2011).

RadLex is not perfect, but because it fills in the most critical gaps in the ability to mine data, and because it provides a link between all of the previous options used for creating radiology reports and mining their information through software, it is an excellent choice for unstructured data mining.

Annotation and Image Markup

AIM is valuable for collecting unstructured data as it relates to the pictures included in a radiology report.

It is necessary to know the value of those pictures, and to ensure that they are providing the proper information (Chapman, et al., 2011). Without pictures of the patient to help identify the disease or condition with which he or she is dealing, the radiology report does not provide as much value to the doctor. That is where AIM comes in, and where it can be seen to have the most importance for the medical field.

Being able to collect information from the pictures in the report and have them be part of the record that can be read at any other medical institution where the patient may need assistance is vital to the quality of care the patient needs (Chapman, et al., 2011). None of the methods used to mine unstructured data from radiology reports are perfect, but there are many ways in which various methods can work together in order to provide a highly successful outcome.

Interaction with Other Topics and Themes

The use of data mining relates to a number of other topics and themes in the field of health informatics. Health informatics arises where healthcare meets information systems (Chapman, et al., 2011). This area requires a lot of study, because there are many different subareas within it. In order to see where data mining falls in the context of health informatics, it is important to look at some of the other issues involved.

The computers and programs that are used to handle data mining are part of informatics (Chapman, et al., 2011). These computers are required, or it would not be possible for the mining of data to take place at the current level. In order to translate the information from radiology reports and mine unstructured data properly, computers are vital to the process (Gerstmair, et al., 2012; Hong, et al., 2013). Computing power is increasing, and healthcare options are increasing as well.

One of the ways patients can benefit from this is in the way they interact with their doctors and other healthcare providers. When information from a radiology report can be translated through data mining it can be made available electronically to doctors and medical personnel across the city, country, or world.

When data mining is used effectively, it can change the entire face of health informatics because it can use computers to provide doctors, radiologists, nurses, patients, and others with access to information they may have had to hunt for in the past (Weiss & Langlotz, 2008). That can affect how a person is treated and can also have an effect on the diagnosis the person receives. Getting the right diagnosis -- and getting it faster -- then affects the way the person receives treatment (Gerstmair, et al., 2012).

Because quick treatment can be the difference between saving a life or failing in that regard, data mining has an important place in health informatics. Additionally, even less severe health problems can be mitigated, treated faster, and handled better when everything from the radiology report is available to everyone who needs the information (Chapman, et al., 2011; Do, et al., 2013). People who are interested in the field of health informatics see the value of data mining and how much computers can do for people in the medical field.

Many doctors are using computers to store patient records, and to send those records through HIPAA-compliant channels to other medical professionals who need them in order to make sure patients receive proper treatment (Chapman, et al., 2011). With all the information systems that exist in healthcare, the one area that is lagging is the mining of unstructured data in various reports -- including radiology (Torres, et al., 2012; Weiss & Langlotz, 2008).

The goal with any data mining system is to get the translation so good that there will generally not be any errors when the unstructured data is mined from radiology reports.

Comparison and Contrast

A Comparison of Unstructured Data Mining Methods

Throughout the literature, different ways of using data mining have been addressed. Most of the information collected has addressed the way in which NLP, RadLex, and AIM work with radiology reports, but there are different aspects of radiology and different types of reports to consider (Gerstmair, et al., 2012).

For example, bone fractures are one of the areas in which radiology can be very useful (Do, et al., 2013). With AIM, the pictures of those bone fractures can be properly translated into something that can be more easily used by anyone in the medical profession who has reason to view the pictures. That can be compared to other examples of the use of radiology, including reporting templates that are designed to be used across the board for all types of issues and all types of radiology reports (Chapman, et al., 2011).

This allows a more plug-and-play approach to collecting radiology data, and can help software options like RadLex streamline the process of data mining. That is due to the fact that doctors and other healthcare practitioners have specific ways in which they can search for something and locate it so they are able to make a diagnosis or request treatment (Chapman, et al., 2011; Do, et al., 2013).

While not every piece of information from radiology is going to be easily located, and miscoding or mislabeling is possible, taking both text and pictures and using them in a template format is going to make them easier to locate overall, and will provide opportunities for healthcare practitioners to locate things more easily (Do, et al., 2013).

This is highly valuable for those practitioners, and also provides help to patients who may be struggling with a particular disease or condition with which doctors want to compare radiology information in order to create or confirm a diagnosis and treatment plan (Do, et al., 2013).

The Contrasting Values of Unstructured Data Mining Methods

The literature on the subject of data mining contains contrasting ways of using the technology, as well.

It is not just popular in radiology, and can be used in many other types of healthcare practice (Chapman, et al., 2011). This contrast is important, because some medical uses are easier for data mining to address than others, and some medical uses are more complex. That does not mean NLP, RadLex, or AIM should not be used for radiology and other medical diagnostic options, however.

The issue is only that these methods are still being perfected in some areas of the medical field, whereas they are more likely to be accurate in others (Gerstmair, et al., 2012; Hong, et al., 2013). For example, when it comes to coding, categorizing, and retrieving pictures from radiology reports, NLP works very well (Chapman, et al., 2011; Demner-Fushman, Chapman, & McDonald, 2009). AIM, however, is a better choice for radiology pictures and allows doctors and others to retrieve pictures quickly and efficiently (Chapman, et al., 2011).

That provides better patient outcomes, and can save and improve lives when patients have diagnostic tests. To ensure everything works on every system in different healthcare settings, RadLex is the best choice. However, it is important to contrast the value and the quality of the job NLP, RadLex, and AIM are able to provide when it comes to the mining of the unstructured parts of radiology reports.

That is the data doctors and others really want to start using, because it can provide a lot more insight into various issues that are faced by patients. This unstructured data may provide information about patients, their conditions, and the treatments that would otherwise be overlooked because it is buried in data that may or may not seem significant (Demner-Fushman, Chapman, & McDonald, 2009; Weiss & Langlotz, 2008).

By mining that data and pulling out information that can be placed into a more structured environment, those in the healthcare field can use the data instead of having it remain basically useless. There is little point to collecting data if it is not going to be used, and that has been the case with unstructured data in radiology and other medical specialties for some time.

Strengths and Weaknesses

NLP, AIM, and RadLex each have strengths and weaknesses when they are used for radiology reports. The largest strength is how well they work with the categorization of pictures and structured data (Demner-Fushman, Chapman, & McDonald, 2009). Both of these areas of the report are very valuable, and it is important that doctors and other medical professionals collect as much information as possible in order to ensure the best outcome for the patient (Chapman, et al., 2011).

Because NLP works well in these areas of a radiology report, it has commonly been used to provide fast information to doctors (Chapman, et al., 2011). This is done through computer processing of the report itself, so the important data can be pulled from the report and provided to the requesting doctor right away (Hong, et al., 2013). With structured data and pictures, there are fewer chances that NLP would be incorrect, leading to mistakes in diagnoses and treatment plans (Chapman, et al., 2011; Demner-Fushman, Chapman, & McDonald, 2009).

Unfortunately for those who are interested in using NLP for unstructured data mining in radiology reports, there are some problems with moving ahead in that area. The more unstructured the data, the less accurate the translation generally appears to be (Chapman, et al., 2011). Naturally, that is a serious concern for doctors, radiologists, and patients.

While NLP is a very valuable tool in radiology, it has not yet advanced far enough to be able to support the translation of unstructured data in an error-free way all the time (Demner-Fushman, Chapman, & McDonald, 2009). That is where other options such as AIM and RadLex come in. Using these two options with unstructured data can help mine information more quickly and easily, and can also provide a more accurate interpretation across a number of healthcare organizations.

When these organizations do not use the same software for information, RadLex can help ensure that vital data is not lost in translation. As further advances take place in computing abilities, there will be a higher chance of satisfaction when it comes to the mining of unstructured data (Chapman, et al., 2011). For those who see the value of unstructured data in radiology reports, that day cannot come soon enough.

Organizational and Technical Risks

Some organizational risks are avoided when data mining is used.

For example, one of the biggest risks in a healthcare organization is incorrect patient diagnosis and care (Chapman, et al., 2011).

Cite This Paper
"Radiology Data And Natural Language Processing" (2014, March 22) Retrieved April 21, 2026, from
https://www.paperdue.com/essay/radiology-data-and-natural-language-processing-185707
