Radiology Data and Natural Language Processing Term Paper


Excerpt from Term Paper:

Harnessing Unstructured Data in Radiology

Harnessing unstructured data is vital to moving the field of radiology forward. Several methods exist for mining unstructured data, and one of the most common is Natural Language Processing (NLP). However, NLP is difficult to apply in radiology, because it lacks the capacity to analyze free-text radiology reports and images. Those reports contain more uncertainty than NLP can address, but there may still be ways in which it can be useful. To make that determination, this paper examines the current use of NLP and of other methods, such as RadLex and Annotation and Image Markup, for unstructured data mining in radiology, as well as the uses that are desired and sought for it. Both clinical decision support and research analysis could benefit from unstructured data mining in radiology, but only if the data can be mined correctly and value can be extracted from it. The methods used to mine unstructured data in radiology reports must therefore be carefully compared with one another in order to find the method, or combination of methods, that most successfully translates unstructured data into valuable information for clinical decision support and research analysis.



Historical and Theoretical Background

Natural Language Processing


Annotation and Image Markup

Use and Intended Impact



Annotation and Image Markup

Interaction with Other Topics and Themes

Comparison and Contrast

A Comparison of Unstructured Data Mining Methods

The Contrasting Values of Unstructured Data Mining Methods

Strengths and Weaknesses

Organizational and Technical Risks




Harnessing Unstructured Data in Radiology


Radiology is the use of imaging to look into the human body and see disease processes taking place (Chapman, et al., 2011). Both diagnosis and treatment can be improved when radiology is used. Radiologists use a number of techniques, including CT scans, X-rays, ultrasounds, MRIs, and PET scans, among others (Hong, et al., 2013). There are also interventional radiology techniques that are generally minimally invasive but that work well in diagnosing and treating specific ailments (Chapman, et al., 2011). However, radiology is severely lacking in one area: the mining of unstructured data to present a clearer picture of patients' conditions and to provide more information about what those patients may be facing. Radiology reports contain a great deal of data, but unless this data is collected and processed, it is of no real use to patients or doctors.

However, collecting and processing the unstructured data found in those reports can be difficult and is not without its own pitfalls (Chapman, et al., 2011). The data has to be mined, and several different types of programs can be used to do that successfully. Natural Language Processing (NLP) is one of the most commonly used options for collecting data, but it does not always work well on unstructured data. It produces many errors when used that way, so it has not been found to be completely reliable. With that in mind, this paper explores NLP, RadLex, and Annotation and Image Markup (AIM) as methods for unstructured data mining in radiology reports. The aim is to determine which of these methods is best, or how they can be used in conjunction with one another to be more effective overall.

Historical and Theoretical Background

Natural Language Processing

Natural Language Processing (NLP) is used to mine unstructured data (Gerstmair, et al., 2012; Hong, et al., 2013). The concept is based on human-computer interaction, and provides a way for computers to learn natural (human) language in order to process information that is provided by humans. The more computers understand about language, the more they can process information without barriers (Johnson, et al., 1997). That can be highly beneficial in medicine, because it provides doctors, nurses, radiologists, and other medical professionals with more information than they would previously have been able to collect without NLP. However, NLP is not without its downsides, which also have to be addressed in order to understand fully whether NLP should be used in radiology and what adjustments would make it more viable.

Originally, complex sets of hand-written rules were used to allow machines to translate, but in the 1980s programmers began to write complex algorithms that allowed machines to learn and process language (Demner-Fushman, Chapman, & McDonald, 2009; Torres, et al., 2012). This was a major breakthrough, and interest in machine translation was renewed. The original algorithms were relatively primitive and not much better than the hand-written rules, but they did show that algorithms were possible, and that they worked for translation purposes (Chapman, et al., 2011; Weiss & Langlotz, 2008). As computing power grew, translation and data mining became more successful, allowing computers to actually "learn" language in a way that was not possible in the past. Algorithms today can be semi-supervised, in that they can learn some information from other information they are supplied (Chapman, et al., 2011).

The theory surrounding NLP is that computers can be "taught" to translate language in the same way a person can (Reiner, 2009; Torres, et al., 2012). Once machines are able to do this, computers will be able to handle a number of tasks that are currently performed only by humans. That can free up human beings for other tasks, and can result in much faster translations, because computers can calculate far more rapidly than humans can. However, there are issues with this theory that have to be considered. The main concern is that the goals of NLP are not completely realistic (Gerstmair, et al., 2012; Weiss & Langlotz, 2008). Computers are not people, and because they do not "think" in the same way human beings do, they can only follow sets of rules and use those rules to process information (Demner-Fushman, Chapman, & McDonald, 2009).
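To make the idea of a computer that "can only follow sets of rules" concrete, the following is a minimal sketch of a hand-written, rule-based extractor for free-text report sentences. It is an illustration only, not the method of any system cited in this paper; the finding list, the negation cues, and the function name `extract_findings` are all assumptions made for the example.

```python
import re

# Hypothetical vocabulary and negation cues, chosen only for illustration.
FINDINGS = ["pneumothorax", "pleural effusion", "nodule"]
NEGATION_CUES = ["no", "without", "negative for"]


def extract_findings(report):
    """Return {finding: "present" | "absent"} for each finding mentioned."""
    results = {}
    # Split the report into sentences on terminal punctuation.
    for sentence in re.split(r"[.!?]\s*", report.lower()):
        for finding in FINDINGS:
            position = sentence.find(finding)
            if position == -1:
                continue
            # Rule: a negation cue appearing as a whole word anywhere before
            # the finding, in the same sentence, marks the finding as absent.
            preceding = sentence[:position]
            negated = any(
                re.search(r"\b" + re.escape(cue) + r"\b", preceding)
                for cue in NEGATION_CUES
            )
            results[finding] = "absent" if negated else "present"
    return results


print(extract_findings("No pneumothorax. There is a small pleural effusion."))
print(extract_findings("Cannot exclude a nodule."))
```

The second call shows the weakness of fixed rules: the hedged phrase "cannot exclude" matches no negation cue, so the sketch reports the nodule as simply "present" and the radiologist's uncertainty is lost.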


RadLex

RadLex is another way of addressing unstructured data mining. Several different methods are currently in use, and the main problem is that they are not compatible with one another. In other words, when healthcare organizations use different methods for extracting and categorizing data and for keeping records, it becomes confusing when information needs to be transferred from one organization to another (Gerstmair, et al., 2012). RadLex is designed to end that confusion by creating a single lexicon that can be used by all healthcare organizations and agencies (Gerstmair, et al., 2012). It contains more than 68,000 terms, so it can be applied to the entire field of radiology successfully. DICOM and SNOMED-CT are two of the current standards and lexicons in use, but RadLex is able to address and work with both of them to unify the experience (Weiss & Langlotz, 2008).
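The value of a single shared lexicon can be sketched in a few lines: different phrasings used by different organizations are mapped to one identifier and one preferred term. The example below is a simplified illustration under stated assumptions. The identifiers, entries, and function names are invented for this sketch and are not real RadLex content or a real RadLex interface.

```python
# Hypothetical mini-lexicon. The identifiers and entries are invented for
# illustration and are NOT real RadLex content.
LEXICON = {
    "EX001": {
        "preferred": "pleural effusion",
        "synonyms": ["fluid in the pleural space"],
    },
    "EX002": {
        "preferred": "pneumothorax",
        "synonyms": ["collapsed lung"],
    },
}


def build_index(lexicon):
    """Map every preferred term and synonym (lowercased) to its identifier."""
    index = {}
    for term_id, entry in lexicon.items():
        for name in [entry["preferred"], *entry["synonyms"]]:
            index[name.lower()] = term_id
    return index


def normalize(phrase, index, lexicon):
    """Return (identifier, preferred term) for a phrase, or None if unknown."""
    term_id = index.get(phrase.strip().lower())
    if term_id is None:
        return None
    return term_id, lexicon[term_id]["preferred"]


INDEX = build_index(LEXICON)
print(normalize("Collapsed lung", INDEX, LEXICON))
print(normalize("pneumothorax", INDEX, LEXICON))
```

Both calls resolve to the same identifier, which is the property that lets records from organizations with different vocabularies be compared. A phrase that is not in the lexicon returns None, which mirrors the point made below that a lexicon is only as useful as the number of terms it contains.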

The idea behind RadLex came from a group of committees that were created to find a better way to mine data from radiology reports. The RSNA formed these committees in 2005, and they were composed of individuals from more than 30 organizations that focused on radiology and standards (Chapman, et al., 2011). In 2007, another six committees were formed to continue the development of RadLex and to ensure that as many terms as possible were included in it (Chapman, et al., 2011). Without that breadth of terms, RadLex would be no better than the other lexicons it was hoping to outshine, and it could not take the other data mining software options and tie them together in one convenient package for radiologists everywhere. The theory behind it was to create an option that would allow all other programs to be merged, and it appears that RadLex will succeed in its goal of making that possible.

Annotation and Image Markup

Another choice for handling the mining of unstructured data in radiology reports is Annotation and Image Markup. As the name suggests, it focuses on the images found in reports. However, there is much more to the issue than just identifying pictures, because captions and tags can be attached to these images and supply readers of the report with a great deal of data that might otherwise be lost (Chapman, et al., 2011). Annotation and Image Markup, or AIM, is not new. Its planning stages, and the consideration of what it can offer, go back a number of years. However, it is not focused on the same types of unstructured data as other systems such as NLP. The focus of AIM remains on the images…

Cite This Term Paper:

"Radiology Data And Natural Language Processing" (2014, March 22) Retrieved December 6, 2016, from

