Endoscopy Practice and Safety
Peter B. Cotton ed.
5. Digital documentation in endoscopy
Gastrointestinal endoscopy is a visual clinical discipline. All examinations, findings, descriptions, and recommendations
are based on the images that are created during the endoscopy. In interventional work, the images are our sole guiding material
for correct procedures.
Fiberoptic imaging was introduced into endoscopes in the 1960s. The mere view into the intestine was a revolution. However,
the revolution was a very private one, conveyed through the eyepiece of the endoscope, without dissemination, sharing, or
storing options. Endoscopic teaching and clinical practice were somewhat anecdotal and inconsistent, simply because endoscopists
had little or no means of communicating what they saw, apart from the written endoscopy report, which is already an interpretation of the images.
Teaching attachments and photography
Twin eyepieces and mountable cameras (still and video) were steps in the right direction, allowing exchange and discussion
of image information, but these were cumbersome gadgets with limited dissemination, and archiving solutions were mostly non-existent.
The introduction of video-based imaging systems created a host of new opportunities. The eyepiece was replaced with the greatly
enhanced viewing experience of a large monitor screen. The endoscopic examination became a shared experience with colleagues
and assistants, and, in some cases, even with the patients themselves. Important findings could be recorded in print form.
The video signals that are received and processed in the endoscopy rack can be utilized further: they can be stored electronically, as captured electronic images or as digital video. In combination with other existing technologies, this enables access and
utilization of our endoscopic images far beyond what was previously feasible (Fig. 1).
The increasing availability of electronic image-capturing systems opens up new ways for documenting our procedures. Where
we were previously confined to the endoscopist's concept of a 'large ulcer', 'profuse bleeding', or 'moderate inflammation' in a text report, the addition of images allows the reader of the endoscopy report to get a better understanding of what
was actually found, and sometimes even to take part in the interpretation. This is a development completely parallel to what our radiologists
have been doing for a long time, i.e. relating their diagnostic considerations directly to demonstrations of their image material.
There is no compelling reason why the endoscopist should not now be doing the same thing.
Standardized image terminology
This enhanced information flux has a very interesting side-effect: we are beginning to understand what our colleagues are talking about. The exposure of
how we label our findings with medical terms has brought to attention the need for language standardization; the same words
should have the same meaning. The content of a written report will be of value only if the 'image-to-word' coding algorithm is the same. The task of establishing a common language for gastrointestinal endoscopy has been taken on
by the OMED, and later by the European and US Endoscopy Societies.
Once the lexicon is agreed upon, the collected information needs also to be structured. The endoscopy report should be composed
in a standardized way, similar to what we have come to expect for other encounters, e.g. the medical history and physical
findings of a patient on admission. The introduction of computerized reporting systems for endoscopy likewise calls for this
type of structuring. The use of these systems for cumulative reporting and statistics requires rigorous coding. Even more
standardization is required if our endoscopy reports and images are to be implemented in a complete electronic medical record.
The opportunities and challenges of the digital revolution
The digital revolution in endoscopy has the potential to change the way we work and communicate, offering great improvement
in the service we can give our patients and referring doctors. However, this pay-off requires a significant investment of
money, time, and thought on the part of the endoscopist. This paper deals with some of these issues.
Imaging the gastrointestinal tract using a videoendoscope requires several steps:
- Illumination by fiberoptic light transmission
- Surface reflectance
- CCD conversion of the reflected light to an electrical signal
- Reconstruction of the signals to an image
- Projection onto a monitor
PCs with image capture cards and network capabilities permit these images to be captured, stored, printed, and transmitted.
The physical quantities of the colors that represent an image are defined chromatically by wavelength, and the luminance is defined by the amount of light. The colors detected by a videoendoscope are continuous values. In the digital domain,
color must be converted from this continuous or analog value to a discrete digital value.
The representation of color can be based on one of three color models:
Most of the visible color spectrum can be represented by mixing the three primary colors, Red, Green and Blue, known as the
RGB color model. This model is the one used by most computer monitors, TV screens, graphics cards, and lighting effects. Color
mixing is analogous to illumination of an area with red, green, and blue bulbs of different intensity. Mixing different amounts
of the red, green, or blue creates different colors, and each can be measured on a scale ranging from 0 to 255. If red, green,
and blue are all set to 0, the color is black; if all are set to 255, the color is white (Fig. 2).
The CMYK color model is based on printing ink being absorbed into paper. It gives the greatest number of printable colors
from the fewest number of inks. By using varying amounts of cyan, magenta, yellow, and black, a great number of colors can
be printed. Most full-color printed materials, including magazines, posters, and packaging, are printed using just the four
CMYK inks. Here the level of ink is measured from 0% to 100%. As an example, orange would be represented by 0% cyan, 50% magenta,
100% yellow, and 0% black.
With the HSB model all colors are described in terms of three fundamental characteristics, hue, saturation, and brightness.
This is a useful model for image processing, because calculations need be applied only to one HSB axis as opposed to three
RGB axes. Therefore, it is often used in imaging software in computers.
- Hue is the wavelength of light reflected or transmitted from an object, although more commonly, hue is known as the actual color,
such as red, yellow, or blue. Hue is measured as a position on the standard color wheel, and is described as an angle in degrees,
between 0 and 360.
- Saturation is the amount or strength of the color (or hue). It is measured as a percentage. At 0% the color would contain no hue, and
would be gray; at 100%, the color is fully saturated.
- Brightness is the lightness or darkness of the color, again measured as a percentage. If any hue has a brightness of 0%, it becomes
black; with 100% it becomes fully light (Fig. 3).
Each of these three models has advantages and shortcomings, but there is good reason to know they exist, in particular to
understand the pitfalls in converting computer screen images to printed images. To accurately match a color print with what
you see on screen, special expertise from a print-shop is usually recommended. Practical experience and trial-and-error exercises
make a good alternative approach.
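The relationship between the three models can be illustrated with a minimal Python sketch. It uses the standard-library colorsys module (whose HSV model corresponds to HSB here) and a deliberately naive CMYK formula that ignores ink and paper characteristics. The function names are illustrative assumptions, not part of any endoscopy system, and the simplified formula is one reason why screen and print colors rarely match without calibration.

```python
import colorsys


def rgb_to_hsb(r, g, b):
    """Convert 8-bit RGB (0-255 per channel) to HSB as (degrees, percent, percent)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 360), round(s * 100), round(v * 100)


def cmyk_to_rgb(c, m, y, k):
    """Naive CMYK (percent ink) to 8-bit RGB; ignores ink and paper profiles."""
    def channel(ink):
        return round(255 * (1 - ink / 100) * (1 - k / 100))
    return channel(c), channel(m), channel(y)


black = rgb_to_hsb(0, 0, 0)                # all RGB values 0: brightness 0%
white = rgb_to_hsb(255, 255, 255)          # all RGB values 255: brightness 100%
orange_rgb = cmyk_to_rgb(0, 50, 100, 0)    # the orange example from the text
orange_hsb = rgb_to_hsb(*orange_rgb)

print(black, white, orange_rgb, orange_hsb)
```

Under these assumptions, the orange defined by 0% cyan, 50% magenta, 100% yellow, and 0% black maps to a fully saturated, fully bright color with a hue of about 30 degrees.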
Digitization of color
The number of unique colors that can be represented by the coordinate system depends on the length of each axis. Because the
digital world is binary, the number of possible values is a power of 2, i.e. 2^x for x bits. If a color is represented in RGB space by 8 binary digits (bits), then there are only 2^8 = 256 colors to choose from. Increasing the number of digits representing a color increases the color range, i.e. 16 or 24 bits
define 2^16 = 65 536 and 2^24 = 16 777 216 colors, respectively. Computer screens are typically able to display 2^24 colors ('millions of colors'), but the color range still has an impact on file size.
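The relationship between bit depth and the number of representable colors can be checked with a few lines of Python. The function name is an illustrative assumption.

```python
def color_count(bits):
    """Number of distinct values that can be encoded with the given number of bits."""
    return 2 ** bits


for bits in (8, 16, 24):
    print(f"{bits:2d} bits -> {color_count(bits):,} colors")
```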
An image is presented as a continuous signal, which is converted or transduced by an analog-to-digital device. To create a
digital image, a specific device in the computer called a frame grabber or capture board converts the video signal into a
digital form. The resulting digital values are mapped to specific locations and stored as a two-dimensional array of numbers.
The frame grabber performs two functions: sampling and quantification.
- Sampling captures evenly spaced data points that represent the image.
- Quantification assigns each data point a binary value.
The evenly spaced data points for an image represent specific two-dimensional locations
called picture elements or pixels. The pixel is the basic unit of a digital image and each pixel stores the value produced
by the quantification described above.
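The two functions of the frame grabber can be sketched for a single scan line as follows. This is a simplified illustration, not the behavior of any particular capture board: the brightness function is a hypothetical stand-in for the analog video signal, and all names are assumptions.

```python
import math


def sample(signal, n_points):
    """Sampling: evaluate a continuous signal at n evenly spaced points in [0, 1)."""
    return [signal(i / n_points) for i in range(n_points)]


def quantize(value, bits):
    """Quantification: map a value in [0.0, 1.0] to an integer from 0 to 2**bits - 1."""
    levels = 2 ** bits
    clamped = min(max(value, 0.0), 1.0)
    return min(int(clamped * levels), levels - 1)


def brightness(x):
    """Hypothetical analog brightness along one scan line (illustrative only)."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * x)


# One row of 8 pixels, each stored as an 8-bit value.
scan_line = [quantize(v, 8) for v in sample(brightness, 8)]
print(scan_line)
```

Repeating this for every scan line yields the two-dimensional array of pixel values described above.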
The number of discrete colors available to present an image is the color depth or color resolution. A grayscale image digitized by an 8-bit image capture card is represented by assigning values to each pixel making black
= 0 and white = 255, because 2^8 = 256 gray levels (0 to 255) are available. Color is more complex. The range of colors depends on the number of bits that can be stored at the pixel location. Thus,
an 8-bit frame grabber can capture 8 bits/pixel or 256 colors/pixel. Most frame grabbers today capture 24 bits per pixel (i.e. 8 bits for each of the three colors red, green, and blue).
This allows a total of 2^24 combinations, 'millions of colors'. It is important to recognize that the actual color range (number of discrete colors) of an endoscopic image is small. This
is the reason why the appreciable difference between 16 and 24 bits/pixel images is minimal. The limited range of colors present in an endoscopic image also affects how such an image can be
Pixel density (sampling density) is the number of pixels into which an image is divided by the frame grabber. The greater
the number of pixels/unit area the higher the resolution of the image (Fig. 4). For an image of a given size, sampling density can be defined by the dimension of the image in pixels. For example, 640 × 480 represents an image that is 640 pixels wide and 480 pixels high (VGA resolution). If this same image is sampled at 1024 × 768 (XGA resolution) then the number of pixels/unit area is higher and the resolution is greater (Fig. 5). Sampling becomes important when images are enlarged because there is a discrete separation between adjacent points in the
image. Thus, zooming an image which has been sampled at a low density quickly reveals the individual pixels, a phenomenon
called pixelation. On the other hand, pixel resolution beyond that of your viewing mechanism (e.g. an 800 × 600 computer screen (SVGA)) requires extra storage space without any utility.
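Pixel counts and the origin of pixelation can be illustrated with a short Python sketch. The zoom function uses simple pixel repetition (nearest-neighbor enlargement), which is an assumption chosen for clarity. Real viewers may interpolate, but they cannot recover detail that was never sampled.

```python
def pixel_count(width, height):
    """Total number of pixels in an image of the given dimensions."""
    return width * height


def zoom_nearest(image, factor):
    """Enlarge a 2D list of pixel values by repeating each pixel in both directions."""
    zoomed = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        zoomed.extend(list(wide_row) for _ in range(factor))
    return zoomed


vga = pixel_count(640, 480)
xga = pixel_count(1024, 768)
print(vga, xga, xga / vga)

# A tiny 2 x 2 image enlarged twofold: each pixel becomes a visible 2 x 2 block.
tiny = [[0, 255],
        [255, 0]]
blocky = zoom_nearest(tiny, 2)
for row in blocky:
    print(row)
```

For the same field of view, XGA sampling stores 2.56 times as many pixels as VGA sampling.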
The final size of an uncompressed image is calculated simply by the formula width (in pixels) by height by color depth. A
VGA resolution 24-bit image (typical for an endoscopic image) would be 640 × 480 × 8 × 3 = 7 372 800 bits, approximately 900 kilobytes (1 byte = 8 bits).
File size affects storage requirements, display delays, and transfer times, and so becomes important in the everyday use of
the images. Transferring a 900 kbyte image with a 28.8 kbit/s modem requires 4.3 min, and a 1 gigabyte disk drive would be filled with approximately 1100 such images.
Thus, all the factors determining the file size should be considered to optimize the composition of endoscopic images.
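The arithmetic above can be reproduced in a few lines of Python. The sketch assumes that the modem speed is 28.8 kilobits per second (the reading under which the quoted 4.3 min works out) and that 1 gigabyte means 10^9 bytes. Both are assumptions, as are the variable names.

```python
def uncompressed_bits(width, height, bits_per_pixel):
    """Size of an uncompressed image: width x height x color depth."""
    return width * height * bits_per_pixel


bits = uncompressed_bits(640, 480, 24)     # VGA resolution, 24-bit color
size_bytes = bits // 8                     # 1 byte = 8 bits
size_kib = size_bytes / 1024               # kilobytes of 1024 bytes

MODEM_BITS_PER_SECOND = 28_800             # assumed: 28.8 kbit/s
transfer_minutes = bits / MODEM_BITS_PER_SECOND / 60

DISK_BYTES = 10 ** 9                       # assumed: 1 gigabyte = 10^9 bytes
images_per_disk = DISK_BYTES // size_bytes

print(bits, size_kib, round(transfer_minutes, 1), images_per_disk)
```

With these assumptions the disk holds 1085 images, in line with the rounded figure of about 1100.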
What detail is needed?
In some clinical situations resolution is not important, e.g. a large mass or a pedunculated polyp may be easily identified
as such even at low resolution. On the other hand, subtle findings such as the granularity of the mucosa or disruption of
the vascular pattern may require a higher pixel ratio. It is also of interest how the image will be utilized. To show the
image on a computer screen, the resolution of the screen determines the optimal resolution (e.g. SVGA), but for printing with
a high-quality printer (e.g. glossy prints for a journal manuscript), a higher resolution is needed, typically 2–3 times the screen requirements.
At the present time, there is definitely an upper limit to the resolution that is feasible for endoscopic images. The CCD
chip in the tip of the endoscope has a pixel resolution in the SVGA range. Thus, even if we had capture cards with higher
resolution, the image quality would be but marginally better. However, high-resolution endoscopes are being developed that
may change this situation.
For practical purposes, uncompressed images are almost theoretical relics of the past. With the increasing utility of network-based
and internet-based computer applications, the need for smaller files is indisputable.
File compression is a computational processing technique that effectively reduces the size of a file by removing redundancies
in large binary data sets. Full motion video requires a display rate of 30 frames/s. If each frame is 0.5 megabytes then one second of digital video contains 15 megabytes of data. Disk storage would be rapidly
exceeded and image transmission even on high-speed networks would be slow. Compression is measured as a ratio of the size
of the original data divided by the compressed data.
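The video data rate and the definition of the compression ratio can be expressed as a short Python sketch. The 20 : 1 ratio applied at the end is a hypothetical figure for illustration, and a megabyte is taken as 10^6 bytes.

```python
def compression_ratio(original_bytes, compressed_bytes):
    """Ratio of original size to compressed size; 20.0 means 20 : 1."""
    return original_bytes / compressed_bytes


def video_bytes_per_second(frame_bytes, frames_per_second):
    """Uncompressed data rate of a video stream."""
    return frame_bytes * frames_per_second


MB = 1_000_000                                         # assumed: 1 megabyte = 10^6 bytes
raw_rate = video_bytes_per_second(MB // 2, 30)         # 0.5 MB frames at 30 frames/s
compressed_rate = raw_rate / 20                        # hypothetical 20 : 1 compression

print(raw_rate / MB, "MB/s raw;", compressed_rate / MB, "MB/s at 20 : 1")
```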
There are two general categories of compression techniques: lossless and lossy.
- Lossless compression techniques preserve all the information in the compression/decompression process. This may be vital for compressing documents or computer program files, but these techniques can only
achieve moderate compression ratios, which may not be sufficient for medical images, especially for radiological grayscale
images. However, when images are used as a means of primary diagnosis, they require lossless compression, storage, and transmission.
Most PACS systems utilize lossless compression, but require high-end hardware and dedicated high-speed networks.
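The defining property of lossless compression, namely that decompression restores every bit, can be demonstrated with Python's standard zlib module. The pixel data below is synthetic and purely illustrative. The ratio it yields says nothing about real endoscopic or radiological images, which contain far less redundancy.

```python
import zlib

# Synthetic 8-bit pixel data (illustrative only): 100 uniform dark rows,
# such as a black border, followed by 100 rows containing a repeating gradient.
row_dark = bytes([0] * 640)
row_gradient = bytes(i % 256 for i in range(640))
raw = row_dark * 100 + row_gradient * 100

packed = zlib.compress(raw, 9)        # lossless DEFLATE compression, maximum effort
restored = zlib.decompress(packed)

ratio = len(raw) / len(packed)
print(f"{len(raw)} bytes -> {len(packed)} bytes, ratio {ratio:.0f} : 1")
print("identical after round trip:", restored == raw)
```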
For the purpose of practical archival storage and transmission of medical images, compression ratios of 20 : 1 or higher are required. In order to achieve this amount of file size reduction, lossy compression techniques need to be
employed. Lossy compression implies that some information is lost in the compression/decompression process, but algorithms can be designed to minimize the effect of data loss on the diagnostic features of the
Image file formats
JPEG (Joint Photographic Experts Group) compression is one of the three file formats used for graphical images on the World
Wide Web (the others being GIF (Graphics Interchange Format) and PNG (Portable Network Graphics)). JPEG files have the advantage
of retaining 24-bit true color files during compression, while GIF files are limited to 8-bit color (256 colors). The PNG
file format shows promise as a lossless compression method for the Web, but has not yet gained acceptance. The issue of standard
Web formats is an important one, because an increasing number of relevant software solutions rely on browser technology for
screen display (Figs 6, 7).
Color and black and white compression
While color images using JPEG can typically achieve 10 : 1 to 20 : 1 compression ratios without visible loss and can compress to 30 : 1 to 50 : 1 with small to moderate defects, black and white (grayscale) images do not compress by such large factors. Because
the human eye is much more sensitive to brightness variations than to hue variations, JPEG can compress hue (color) data more
heavily than brightness (grayscale) data. A grayscale JPEG file is generally only about 10–25% smaller than a full-color JPEG file of similar visual quality. But the uncompressed grayscale data is only 8 bits/pixel, or 1/3 the size of the color data, so the calculated compression ratio is much lower. The threshold of visible loss is often around
5 : 1 compression for grayscale images, substantially different from color images.
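The reasoning above can be worked through numerically. The sketch below assumes a color JPEG compressed at 15 : 1 (a value chosen from the 10 : 1 to 20 : 1 range quoted above) and derives the corresponding grayscale ratio. The function name is illustrative.

```python
def grayscale_ratio(color_ratio, size_saving):
    """Compression ratio of a grayscale JPEG, given the ratio of a comparable
    color JPEG and the fraction by which the grayscale file is smaller."""
    raw_gray_fraction = 1 / 3               # 8 of 24 bits/pixel before compression
    jpeg_gray_fraction = 1 - size_saving    # grayscale JPEG size relative to color JPEG
    return color_ratio * raw_gray_fraction / jpeg_gray_fraction


low = grayscale_ratio(15, 0.10)     # grayscale file 10% smaller
high = grayscale_ratio(15, 0.25)    # grayscale file 25% smaller
print(f"{low:.1f} : 1 to {high:.1f} : 1")
```

Under this assumption the grayscale ratio comes out between about 5.6 : 1 and 6.7 : 1, the same order of magnitude as the 5 : 1 threshold mentioned above.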
JPEG 2000 and beyond
The importance of image handling and compression for Internet applications creates a huge momentum for development. The JPEG
working group has developed a new standard which is only just becoming available (accepted as an ISO standard December 2000).
This standard is called JPEG 2000, with the file extension .jp2. This standard offers a host of advantages over the existing
JPEG standard, the most significant being lack of pixelation at high compression rates, and significantly more effective compression
Although the file sizes of individual endoscopic images are not a major issue at this point, we should keep in mind that when
the display and transfer of large numbers of images and videos becomes a significant part of our daily workflow, even minute
delays for every picture will have an impact. Further developments for more efficient file compression will be of major significance
for medical imaging. PACS development currently suffers from the heavy cost of high-end workstations and networks to handle
huge image data sets.
DICOM (Digital Imaging and Communications in Medicine) is a standard for imaging that contains very specific information about
the images, as well as the images themselves. DICOM relies on explicit and detailed models of how the features (patients,
images, reports, etc.) of an imaging operation are described, how they are related, and what should be done with them. This
model is used to create Information Object Definitions (IODs) for all of the imaging modalities covered by DICOM.
An Information Object is a combination of Information Entities and each Entity consists of specific modules. A Service Class
defines the service that can take place on an Information Object, e.g. print, store, retrieve. In DICOM a Service is combined
with an Information Object to form a Service/Object Pair, or SOP. For example, storing a CT scan or printing an ultrasound is an SOP. A device that conforms to the DICOM
standard can perform this function. Thus, in a DICOM-conforming network the devices must be capable of executing one or more
of the operations the SOP definition prescribes. Each imaging modality has an IOD. The result is that different imaging modalities
such as CT, MRI, digital angiography, ultrasound, endoscopy, pathology; imaging workstations; picture archiving systems; and
printing devices can be networked and execute a high level of cooperation. In addition, these imaging networks can be connected
to other networks found in a hospital or facility.
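The pairing of a Service with an Information Object can be sketched as a simple data structure. This is only a conceptual illustration: the service and object labels are plain descriptive strings and the devices are hypothetical, whereas the DICOM standard identifies SOP classes by registered unique identifiers that are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SOP:
    """A Service/Object Pair: one service applied to one type of information object."""
    service: str               # e.g. "store", "print", "retrieve"
    information_object: str    # e.g. "CT image", "ultrasound image"


# Hypothetical devices and the SOPs they claim to support (illustrative only).
archive_supports = {
    SOP("store", "CT image"),
    SOP("retrieve", "CT image"),
    SOP("store", "endoscopy image"),
}
printer_supports = {SOP("print", "ultrasound image")}


def can_perform(supported, service, information_object):
    """Check whether a device's supported set includes the requested SOP."""
    return SOP(service, information_object) in supported


print(can_perform(archive_supports, "store", "endoscopy image"))    # supported
print(can_perform(printer_supports, "store", "endoscopy image"))    # not supported
```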
The modules that comprise an Information Entity (IE) are precisely defined and may be common to multiple entities. The Patient
Entity is common to all IODs. However, the Image Entity must be capable of supporting different imaging modalities. An IOD
that supports endoscopy will of necessity include modules unique to endoscopy and be distinct from a CT IOD. The Patient IE
defines the characteristics of a Patient who is the imaging subject of one or more procedures that produce images. The Patient
IE is modality independent, i.e. it is common to all imaging modalities. The Patient IE consists of only one module, which
is illustrated in Fig. 9. Each module is a table consisting of four attribute elements:
- The attribute name and description define the attribute precisely.
- The attribute tag uniquely identifies that attribute among all of the many other attributes present. The tag (0010,0010), for example, always identifies the patient name.
- The attribute type specifies whether this attribute is mandatory or optional. For example, it is not necessary for an image to be transmitted with the patient's name.
In fact, DICOM requires only a few mandatory attributes that give the study a unique identifier,
define the modality, e.g. CT, MRI, ultrasound, and provide information about the image, e.g. pixel data, number of rows and
columns. DICOM also provides a dictionary that specifies the form in which the value of each attribute must be presented.
Patient name attributes
The patient name attribute (0010,0010) uses Person Name (PN) as its value representation. PN contains five components in the
following order: family name, given name, middle name, name prefix, and name suffix. Thus, any system that complies with DICOM
knows that (0010,0010) is a person name and that the format of the information transmitted is defined by the DICOM standard.
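How a receiving system might interpret a Person Name value can be sketched as follows. In DICOM the five components are separated by a caret (^), and trailing components may be omitted. The sketch handles only a single component group and is a simplified illustration with a made-up name, not a full implementation of the value representation.

```python
from typing import NamedTuple

PATIENT_NAME_TAG = (0x0010, 0x0010)   # the attribute tag for the patient name


class PersonName(NamedTuple):
    family: str
    given: str
    middle: str
    prefix: str
    suffix: str


def parse_person_name(value):
    """Split a caret-delimited PN value into its five ordered components."""
    parts = value.split("^")
    if len(parts) > 5:
        raise ValueError("a person name has at most five components")
    parts += [""] * (5 - len(parts))   # omitted trailing components become empty
    return PersonName(*parts)


# Made-up example values.
full = parse_person_name("Doe^Jane^Q^Dr^")
short = parse_person_name("Doe^Jane")
print(full)
print(short)
```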
It is not sufficient simply to define a standard. It is also necessary to develop a mechanism to enable vendors and purchasers
to understand whether a particular system conforms to the standard. DICOM defines a conformance statement that must be associated
with a specific implementation of the DICOM standard. It specifies the Service Classes, Information Objects, Communication
Protocols, and Media Storage supported by the implementation.
DICOM in endoscopy
The American Society for Gastrointestinal Endoscopy (ASGE), in collaboration with other medical and surgical societies such
as the European Society for Gastrointestinal Endoscopy (ESGE), American College of Radiology, the College of American Pathologists,
the American Academy of Ophthalmology, and the American Dental Association, has defined a new Supplement to the DICOM standard.
This specifies a DICOM Image Information Object Definition (IOD) for Visible Light (VL) images. This standard enables specialists
to exchange color images between different imaging systems using direct network connections, telecommunications, and portable
media such as CD-ROM and magneto-optical disk.
The DICOM standard for endoscopy is part of a larger standard for color images in medicine which has been provisionally approved
by the DICOM Committee. The current version will go through a process of public comment and testing. This period ensures that
any interested party may review the document and suggest changes to a committee that is responsible for creating the final
version. This process is time-consuming but it ensures that the standard is comprehensive and meets the needs of a broad group
Expanding the scope of DICOM
The endoscopy community, through the ASGE and ESGE, has also suggested that the DICOM standard be expanded to incorporate
other information associated with the imaging study. These expanded standards would include image labels and overlays, sound,
and waveform. The goal of a true multimedia report will be achieved only when these standards have been thoroughly tested
and implemented as part of the daily clinical activities of endoscopists throughout the world. The cooperation of endoscopists,
professional societies, and industry is absolutely necessary for improved endoscopic information systems and will result in
improved patient care.
How much compression is clinically acceptable?
Because of the specific nature of endoscopic images, the amount of compression that can be employed without compromising important
information must be determined by the endoscopist. The acceptable compression rate when we are looking at a polyp would likely
differ substantially from that for a case of mild gastritis. These issues have major impact on the utility of digital images.
We have to be involved in deciding what imaging is required to be useful for clinical purposes.
Studies of compression acceptability
The topic was excellently reviewed by Kim, but very few relevant studies have been published.
Vakil and Bourgeois
Vakil and Bourgeois conducted a trial to determine the amount of color information required for a diagnosis from an endoscopy image. The least
amount of color information in an endoscopic image that carries sufficient diagnostic information was unknown. Ten
upper gastrointestinal lesions were presented in an 8-bit format, 16-bit format, and a 24-bit format blindly side-by-side
on a Macintosh II system with a 19 inch monitor that could display 24-bit color. Eleven observers (6 nurses and 5 endoscopists)
were asked to rank each format for each lesion (i.e. which of the two was the higher quality one). There were a total of 330
observations, and for each format and total the results were similar: the observers could not tell a difference on 41% of
the images; identified the best image correctly in 22%; and identified incorrectly in 37% of the images. All the lesions were
correctly diagnosed from both images. This study suggests that, for endoscopic images, color resolution does not appear to affect
an endoscopist's ability to make a diagnosis.
Kim (personal communication)
Kim presented a set of six images to 10 expert gastroenterologists using software that allowed them to determine their personal
cut-off level of acceptable compression for each of the images. Different types of lesions were studied. The acceptable compression
ratio varied markedly, as expected, but in general, a compression ratio of between 40 : 1 and 80 : 1 was deemed acceptable (Fig. 10).
This type of study gives us important information concerning the order of magnitude of acceptable compression. However, the
clinical context is of interest as well: the arterial bleed in the above study was probably identified easily as such at a high rate of compression, but a therapeutic
endoscopist would likely need additional details as to the exact location, structures next to the vessel, etc.
Developments in compression
Compression schemes are evolving quickly and, at the same time, the requirements for minute files are becoming less crucial.
Storage space is rapidly becoming cheaper, and networks faster. The 28.8 kbit/s modem is no longer a reasonable yardstick for download time. The virtue of compressing our images remains, but there
is no reason to compromise the quality of our images to achieve the tiny file sizes that yesterday's technology urged us to
aim at. The endoscope manufacturers have been struggling hard to offer us high-resolution endoscopes, structure enhancement,
and magnification, and it would be counterproductive to take that advantage away for a few kilobytes of file size reduction.
As for clinical utility, we will need to establish a general standard for compression and formats that will work across diagnoses.
This will have to aim at a quality sufficient for our most difficult diagnoses, e.g. subtle, diffuse lesions like mild gastritis
or tiny erosions, or delineation of the vascular pattern in colitis.
Still pictures or live video?
Digital video is increasingly becoming an option for endoscopic documentation. Many capture cards have the capability of storing video
as well as still images, and in certain situations, video may definitely offer an advantage. This is particularly true for
teaching purposes, but even clinical documentation can be enhanced by live footage in certain situations. Obvious examples
are documentation of distensibility or propagating waves of the stomach, spasticity of the colon, or imaging in difficult
areas (the cardia).
However, video clips come at a cost in terms of processing, storing, and even presentation. While still images can be vividly
reproduced in our printed endoscopy report together with our recommendations, a video clip is forever tied to the computer
or network. Down the road, when electronic medical records become mainstream and wide area networks (WANs) become a tool for medical purposes, these concerns may vanish, but for now a paper-based report is a prerequisite in most endoscopy labs.
Then there is the issue of storage and transfer. Studio quality video shows at 25 or 30 frames per second (fps). Although
we can get reasonable quality video at 10–15 fps, this still produces enormous files quickly, and we need to determine if and when the value of digital video justifies
Video storage developments
Again, fortunately, things are moving rapidly in the right direction. Compression algorithms allow significant compression
of digital video file size with acceptable results. Best known to date are probably the QuickTime and MPEG-1 formats, but
this is a field of continuous development, MPEG-4 being one of the most promising options at the moment.
Most of the compression algorithms utilize similar techniques to those discussed above for still images. For example: if a
segment of the movie image is unchanged for a period of time (the sky, or the black portion to the left of the endoscopic
picture), all the information that needs to be stored is the boundaries of the area, the color value, and the start and stop
With this type of compression, a video, for example, of a news reader can be reduced to a still picture with a small moving
segment representing the mouth. This technique, in addition to a multitude of others, allows for increasing compression of
video clips, offering efficient storage, as well as network-based distribution, with no or minimal depreciation of the diagnostic
What images should be recorded in practice?
In parallel to the technological developments in digital imaging and video, there are important decisions that need to be
made by the endoscopic community. A crucial one is: What pictures are needed?
If we want to report a polyp in the sigmoid colon, a single picture might be sufficient if it is a good one, showing the size and shape, stalk, amount of luminal obstruction, surface texture, etc. But what about a distal rectal lesion?
Maybe an extra picture of the relation to the anal verge would be important, not least if a surgeon is to remove it. A retroflexed
view, as well as a standard forward-viewing depiction, would be reasonable for that. For diffuse pathology, typically more
than one image might be preferable, and maybe high resolution becomes an issue for minimal changes.
Recording negative examinations
More complex still are the issues raised by 'negative' examinations. Which images are needed to rule out a lesion, e.g. to document a normal colonoscopy? We obviously cannot picture every single fold, let alone behind them, but
there may still be reasons to document normality, e.g. to show what kind of view, cleansing, and distension was available
to the endoscopist and to confirm that the examination was complete (e.g. by digital images of the ileocecal valve).
The virtue of this becomes even more obvious in the context of referrals and second-opinion cases. When we are asked to evaluate
a patient who has had a procedure at another hospital, too often we distrust the results because the images that we receive
are inadequate for independent assessment, or even lacking. Standardization of documentation will reduce the need to repeat
many procedures. In addition, the availability of relevant images from a prior examination will make follow-up studies much
more meaningful (e.g. in the assessment of the activity of colitis or esophagitis).
Structured image documentation
The ESGE has made an attempt to establish guidelines for image recording, proposing fixed sets of images for various procedures.
Figure 11 illustrates the standard set of images for upper endoscopy, and a similar set has been prepared for colonoscopy. The requirements
are similar for ERCP and EUS, but may be slightly more complex to describe. This is obviously a process that will continue,
but the importance of initiating this work in parallel with the implementation of digital documentation systems in the endoscopy
lab cannot be over-emphasized.
Costs of image documentation
Previously, the cost of color reproduction of a large number of pictures for every procedure was a concern. However, with
electronic storage and display, this concern is diminishing, and picture documentation should be the rule now rather than
the exception. Having these images in a readily searchable management system is essential.
The impact of video endoscopes has been substantial, but the images produced are still just natural light images showing the
gastrointestinal mucosa in a life-like manner. Novel technologies are now emerging, offering modification of the original
images that may increase the diagnostic output of the endoscopic procedure. These technologies do not relate to the digital
imaging as such, but they all rely on such imaging as the core technology for endoscopy.
Color manipulation methods deal primarily with the color characteristics of the pixels representing the image. This is a simple
way of enhancing the contrast features of the image, but sometimes at the cost of resolution. These methods are so far available
only for manipulation of still images, and a live version of the technology would be needed to make this applicable clinically.
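As a rough illustration of pixel-level color manipulation, the sketch below applies a linear contrast stretch to each color channel of a still image. The technique is chosen here only as a simple example and is an assumption, not the specific method used by any endoscopy system. The function names are hypothetical.

```python
def stretch_channel(values):
    """Linearly rescale one color channel (integers 0-255) to the full range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # flat channel: nothing to stretch
    return [round((v - lo) * 255 / (hi - lo)) for v in values]


def stretch_image(pixels):
    """Stretch each channel of a list of (r, g, b) tuples independently."""
    channels = [stretch_channel(channel) for channel in zip(*pixels)]
    return list(zip(*channels))


if __name__ == "__main__":
    still = [(100, 10, 0), (120, 10, 20), (200, 10, 100)]
    print(stretch_image(still))
```

Stretching the channels independently increases contrast but also shifts the color balance, so the result no longer shows the mucosa in a life-like manner.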
Narrow band imaging and spectroscopy
Narrow band imaging and spectroscopy are just two examples of a host of other technologies that will enhance our diagnostic
yield. In these technologies, parallel 'imaging' is utilized to extract information about the imaged tissue, and the regular digital images are used primarily to guide the
process of advanced tissue characterization.
Endoscopic findings are conveyed with words, although the findings themselves are images. Thus, the coupling between what
we see and how it is described becomes crucial, and standardization of our endoscopic language is an integral part of this concept.
Endoscopic teaching includes descriptions of what is found, but the definitions of terms used have been weak or non-existent.
If the conclusion of the endoscopy report is the only item of value, then the specifics of the findings are of less importance.
However, if the findings themselves are important, then the descriptive language becomes interesting too. For research purposes,
in particular collaborative research, the utility of this is obvious, but even for general clinical purposes the objective
description of lesions may be of interest, e.g. in the situation of a second-opinion referral of a case where the referral
center needs to decide whether a repeat endoscopy is needed. Likewise, follow-up endoscopy in a patient with a known lesion
will profit from an unequivocal initial description of what was seen, at least when no image documentation is available.
OMED standardized terminology
The World Organization of Digestive Endoscopy (OMED) initiated the efforts to standardize our language based on the pioneering
work of Professor Zdenek Maratka, who developed the first 'Terminology, definitions and diagnostic criteria in digestive endoscopy.' This terminology is a codified list of terms with explicit definitions that allow endoscopic findings to be fitted into a
hierarchical nomenclature and assigned a code, thus enabling international collaboration. This terminology has since been
supplemented with images to exemplify the various terms. Despite deficiencies, this remains the de facto standard for describing the various findings of digestive endoscopy.
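A codified, hierarchical term list of this kind might be represented as follows. This is a minimal sketch: the codes, labels, and definitions are hypothetical placeholders invented for illustration and are not taken from the OMED terminology.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    code: str
    label: str
    definition: str
    parent: Optional[str] = None  # code of the parent term, if any


# Hypothetical codes and definitions, for illustration only.
TERMS = {
    term.code: term
    for term in [
        Term("S", "Stomach", "Organ-level heading."),
        Term("S.1", "Excavated lesion", "Placeholder definition.", parent="S"),
        Term("S.1.2", "Ulcer", "Placeholder definition.", parent="S.1"),
    ]
}


def path(code):
    """Follow parent links and return the labels from root to the given term."""
    labels = []
    current = code
    while current is not None:
        term = TERMS[current]
        labels.append(term.label)
        current = term.parent
    return list(reversed(labels))


if __name__ == "__main__":
    print(" > ".join(path("S.1.2")))
```

Because each finding resolves to a single code, reports written in different languages or centers can be compared on the code rather than on the wording.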
Minimal standard terminology (MST)
The OMED terminology, while defining the framework for the terminology efforts within digestive endoscopy, proved too complex
for practical utilization in everyday endoscopy. A simplification was needed, and the European Society of Gastrointestinal Endoscopy
(ESGE) teamed up with its US counterpart (ASGE) to develop a minimal standard terminology (MST) for endoscopy. This terminology is completely based on the OMED terminology, but the term lists are limited, aimed at covering 95% of the
terms needed for typical endoscopic practice, and omitting the definitions, which are available when needed in the OMED terminology
book. The MST is meant to be a standardizing prerequisite for software companies developing reporting software for digestive
endoscopy, ensuring that a joint language is used in the various available software solutions. The MST work has been endorsed
and supported by all the major vendors of such systems (Figs 13, 14).
The initial version of the MST was thoroughly tested within the GASTER project, and this experience led to a number of adjustments as to the selection and definition of terms. Version 2.0 of the MST has
been released, and is presently undergoing a similar clinical evaluation.
In addition, term definitions are now being included, and an image library is being developed through a joint European effort,
to help illustrate the various terms of the MST by high-quality sample pictures.
Problems with MST
The principles of the MST work have been endorsed almost universally, and the utility of a joint standardized language of
endoscopy is readily acknowledged. Still, knowledge, dissemination, and implementation of the MST are at present
insufficient, even disappointing. Why is this?
One issue is that the MST term lists are still not perfect. They are designed to be 'minimal lists', and this means you may not find the precise term that you need. This is partly a software issue, because the lists were
never meant to be all-inclusive, and individual additions will be needed in most centers. Still, incomplete choice lists are
difficult to accept.
More fundamental, though, is the whole concept of structuring the language of the endoscopist. We are used to formulating
our findings and recommendations in natural language, and any superimposed structure may take extra time, be considered cumbersome
and limiting, and even be seen as yielding less informative reports.
The solution to this has not yet been found, and the MST is at present still only an excellent initiative. The utility
of standardized terms is indisputable, but the challenge is to embed them in software that allows their use to be sufficiently
transparent. Also, it is probably unnecessary that the endoscopy report be produced exclusively by 'point-and-click'. Certain segments should probably remain free text blocks with natural language.
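One way to picture a report entry that combines a standardized term with a free-text block is sketched below. The term lists, field names, and the function `make_finding` are hypothetical and only illustrate the principle of a minimal list extended by local additions.

```python
# Hypothetical term lists, for illustration only.
STANDARD_TERMS = {"ulcer", "polyp", "erythema"}  # stands in for a minimal list
LOCAL_TERMS = {"anastomotic stricture"}          # center-specific addition


def make_finding(term, free_text=""):
    """Build one report entry: a coded term plus an optional free-text comment."""
    term_key = term.strip().lower()
    if term_key in STANDARD_TERMS:
        source = "standard"
    elif term_key in LOCAL_TERMS:
        source = "local"
    else:
        raise ValueError(f"unknown term: {term!r}")
    return {"term": term_key, "source": source, "comment": free_text.strip()}


if __name__ == "__main__":
    print(make_finding("Ulcer", "5 mm, clean base, lesser curvature"))
```

The coded term stays searchable and comparable across centers, while the comment keeps the natural-language detail that a choice list cannot capture.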
Outstanding issues and future trends
Endoscopic recording has come a long way since Rudolph Schindler employed an artist to paint watercolor pictures of the images
he saw with his semiflexible gastroscope. We are now on the threshold of easy and comprehensive digital (still and video)
documentation of all of our procedures. This should provide enormous enhancement of the clinical value of our examinations,
and of our ability to both teach and communicate with colleagues. Tele-endoscopy (distance diagnosis) has been tested, and
could have substantial clinical and educational benefits.
Image manipulation and automated analysis eventually will add another dimension to our practice. Image data, when collected,
stored, annotated, and validated, will provide a memory bank far greater than the human brain can handle, or perhaps even
contemplate. Our endoscopes will soon start to recognize pathology ('optical biopsy'), and will provide instant differential diagnoses, along with access to examples of similar conditions (and a comprehensive
knowledge base about them).
It will be some time before the brain of the endoscopic processor replaces the brain of the endoscopist, but the potential
for development is enormous. Neural networks and artificial intelligence will facilitate and optimize our effectiveness. Initially
these developments will have greatest impact in diagnostic endoscopy. Already the video capsule maximizes the digital information
whilst minimizing endoscopic expertise. Eventually these concepts will be applied to therapeutic procedures; endoscopes may
recognize lesions (and even their depth), and apply therapy automatically. The future is as exciting as it is unpredictable.
I would like to thank Doctors Louis Korman and Chris Kim for valuable input to specific segments of this manuscript, and for
their efforts in the field in general.
1 Kim, CY. Compression of color medical images in gastrointestinal endoscopy: a review. Medinfo 1998; 9 Part 2: 1046–50.
2 Korman, LY & Bidgood, WD Jr. Representation of the Gastrointestinal Endoscopy Minimal Standard Terminology in the SNOMED DICOM microglossary. Proceedings of the AMIA Annu Fall Symp, 1997: 434–8.
3 Vakil, N & Bourgeois, K. A prospective, controlled trial of eight-bit, 16-bit, and 24-bit digital color images in electronic endoscopy. Endoscopy 1995; 27 (8): 589–92.
4 Rey, JF & Lambert, R. ESGE recommendations for quality control in gastrointestinal endoscopy: guidelines for image documentation in upper and lower GI endoscopy. Endoscopy 2001; 33 (10): 901–3.
5 Delvaux, M, Korman, LY, Armengol-Miro, JR, Crespi, M, Cass, O & Hagenmuller, F et al. The minimal standard terminology for digestive endoscopy: introduction to structured reporting. Int J Med Inf 1998; 48 (1–3): 217–25.
6 Delvaux, M, Crespi, M, Armengol-Miro, JR, Hagenmuller, F, Teuffel, W & Spencer, KB et al. Minimal standard terminology for digestive endoscopy: results of prospective testing and validation in the GASTER project. Endoscopy 2000; 32 (4): 345–55.
Copyright © Blackwell Publishing, 2004