Almost all image interpretation and analysis in today's technologically advanced world involves some form of digital processing, as the majority of remote sensing data are collected in digital format. Digital image processing can include a variety of steps, such as data formatting and correction, digital enhancement to improve visual interpretation, and computer-based target and feature classification. Digital processing of remote sensing imagery requires that the data be captured and made accessible in a format that can be stored on a computer disk or tape. A computer system with the necessary hardware and software to handle the data, also known as an image analysis system, is the other obvious prerequisite for digital image processing. A number of commercially available software packages have been created especially for processing and analyzing remote sensing imagery. For the purposes of discussion, the standard image processing functions found in image analysis systems can be divided into the following four groups:
Preprocessing
Image Enhancement
Image Transformation
Image Classification and Analysis
A. Preprocessing functions, typically classified as radiometric or geometric corrections, are those operations that are normally required before the main data analysis and information extraction. Radiometric corrections involve transforming the data and correcting for unwanted sensor or atmospheric noise so that the data properly represent the reflected or emitted radiation measured by the sensor. Geometric corrections include correcting for geometric distortions caused by variations in sensor-Earth geometry and converting the data to real-world coordinates (such as latitude and longitude) on the Earth's surface.
B. Image enhancement functions form the second group. Their sole objective is to improve the appearance of the imagery to assist in visual interpretation and analysis. Examples of enhancement functions include contrast stretching to increase the tonal distinction between various features in a scene, and spatial filtering to enhance (or suppress) specific spatial patterns in an image.
C. Image transformations are operations similar in concept to those for image enhancement.
However, unlike image enhancement operations which are normally applied only to a single channel of data at a time, image transformations usually involve combined processing of data from multiple spectral bands. Arithmetic operations (i.e. subtraction, addition, multiplication, division) are performed to combine and transform the original bands into “new” images which better display or highlight certain features in the scene. We will look at some of these operations including various methods of spectral or band ratioing, and a procedure called principal components analysis which is used to more efficiently represent the information in multichannel imagery.
D. Image classification and analysis operations are used to digitally identify and classify pixels in the data. Classification is usually performed on multi-channel data sets, and this process assigns each pixel in an image to a particular class or theme based on the statistical characteristics of the pixel brightness values. There are a variety of approaches to performing digital classification. We will briefly describe the two generic approaches which are used most often, namely supervised and unsupervised classification.
A. Pre-processing
Pre-processing operations, sometimes referred to as image restoration and rectification, are intended to correct for sensor- and platform-specific radiometric and geometric distortions of the data. Radiometric corrections may be necessary due to variations in scene illumination and viewing geometry, atmospheric conditions, and sensor noise and response. Each of these will vary depending on the specific sensor and platform used to acquire the data and the conditions during data acquisition. Also, it may be desirable to convert and/or calibrate the data to known (absolute) radiation or reflectance units to facilitate comparison between data sets.
B. Radiometric correction
The radiance measured by a sensor for a specific object on the Earth is influenced by factors such as changes in scene illumination, atmospheric conditions, viewing geometry, and instrument response characteristics. Through the application of radiometric correction techniques we are able to reduce and calibrate the influence of these factors. In this section we explain the different steps of the radiometric correction process, which depend on the characteristics of the sensor used to acquire the image data. Cloud cover is often a problem in optical remote sensing. This problem can be overcome by taking a sequence of images (say, on five consecutive days) and cutting and pasting together an image that represents a cloud-free composite.
However, for this application it is necessary to correct for differences in sun elevation and earth-sun distance. The sun elevation correction accounts for the seasonal position of the sun relative to the earth. The earth-sun distance correction is applied to normalize for seasonal changes in the distance between the earth and the sun. The parameters for these corrections are normally part of the ancillary data supplied with the image and depend on date and time of image acquisition.
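As a hedged illustration of these two corrections, the Python sketch below converts at-sensor radiance to top-of-atmosphere reflectance using the standard sun-elevation and earth-sun-distance terms. The function name and the sample numbers (ESUN value, sun elevation, distance) are illustrative assumptions; in practice the constants come from the image's ancillary data.

```python
import numpy as np

def toa_reflectance(radiance, esun, sun_elevation_deg, earth_sun_dist_au):
    """Convert at-sensor radiance to top-of-atmosphere reflectance.

    Applies the standard sun-elevation and earth-sun-distance corrections:
        rho = pi * L * d^2 / (ESUN * cos(theta_s))
    where theta_s is the solar zenith angle (90 deg minus sun elevation).
    Band-specific constants (ESUN) and the acquisition geometry are normally
    taken from the ancillary (metadata) file supplied with the image.
    """
    theta_s = np.deg2rad(90.0 - sun_elevation_deg)   # solar zenith angle
    return (np.pi * radiance * earth_sun_dist_au**2) / (esun * np.cos(theta_s))

# Example with made-up numbers: a small radiance array (W m-2 sr-1 um-1),
# a nominal ESUN value, 42 deg sun elevation, and d = 1.014 AU.
band = np.array([[45.0, 52.3], [61.1, 48.7]])
print(toa_reflectance(band, esun=1536.0, sun_elevation_deg=42.0,
                      earth_sun_dist_au=1.014))
```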
1. Image noise removal (cosmetic corrections)
A special kind of error in remote sensing images, related to sensor characteristics, is image noise. Image noise is any unwanted disturbance in image data caused by limitations in the sensing, signal digitization, or data recording process. It can result from periodic drift or malfunction of a detector, electronic interference between sensor components, or intermittent losses in the data transmission and recording sequence of an image.
Noise can either degrade or totally mask the true radiometric information content of a digital image. In most cases these kinds of errors can already be detected by a visual check of the raw DN values. Specialized procedures are available to remove or restore image noise features. When it is known that certain types of image noise occur for a sensor, this information is often supplied by the data provider, or the noise is removed before delivery of the image. Well-known types of image noise in remote sensing are striping or banding, line drop, and bit errors.
Striping or banding is a systematic noise type related to sensors that sweep multiple scan lines simultaneously. It stems from variations in the response of the individual detectors used within each band. For example, the radiometric response of one of the six detectors of the early Landsat MSS sensor tended to drift over time, which resulted in relatively higher or lower values along every sixth line in the image data. A common way to destripe an image is the histogram method.
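The full histogram method matches the histogram of each detector's lines to a reference histogram. As a simplified, hedged sketch of the same idea, the code below rescales the lines of each of six hypothetical detectors so that their mean and standard deviation match those of the whole image; the function name and the linear (mean and standard deviation) adjustment are assumptions for illustration.

```python
import numpy as np

def destripe_six_detector(image):
    """Simple destriping for a six-detector sensor such as Landsat MSS:
    every sixth line belongs to the same detector, and each detector's lines
    are rescaled so their mean and standard deviation match the whole image
    (a linear stand-in for full histogram matching)."""
    img = image.astype(float)
    target_mean, target_std = img.mean(), img.std()
    out = img.copy()
    for d in range(6):                 # lines d, d+6, d+12, ... share a detector
        lines = img[d::6]
        gain = target_std / lines.std()
        out[d::6] = (lines - lines.mean()) * gain + target_mean
    return out
```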
Another line-oriented noise problem is line drop, where a number of adjacent pixels along a line (or an entire line) may contain erroneous DNs. This problem is solved by replacing the defective DNs with the average of the values for the pixels occurring in the lines just above and below.
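A minimal sketch of this line-drop repair, assuming the defective line numbers are already known (for example from a visual check), might look as follows; the function name is hypothetical.

```python
import numpy as np

def repair_line_drop(image, bad_rows):
    """Replace dropped scan lines by the average of the lines above and below.

    At the image edges only the single existing neighbouring line is used.
    """
    fixed = image.astype(float).copy()
    last = image.shape[0] - 1
    for r in bad_rows:
        above = fixed[r - 1] if r > 0 else fixed[r + 1]
        below = fixed[r + 1] if r < last else fixed[r - 1]
        fixed[r] = (above + below) / 2.0
    return fixed
```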
Bit errors are a good example of random noise within an image. Such noise causes images to have a “salt and pepper” or “snowy” appearance. This kind of noise can be removed by using moving neighborhood windows, where all pixels are compared to their neighbors. If the difference between a given pixel and its surroundings exceeds a certain threshold, the pixel is assumed to contain noise and is replaced by an average value of the surrounding pixels.
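The following sketch illustrates this kind of threshold-based noise removal; the 3x3 neighbourhood and the threshold value are illustrative assumptions that would be tuned per sensor and scene.

```python
import numpy as np
from scipy.ndimage import convolve

def remove_bit_errors(image, threshold=30):
    """Replace 'salt and pepper' pixels with the mean of their 8 neighbours.

    A pixel is flagged as noise when it differs from the average of its
    surrounding pixels by more than `threshold` DN.
    """
    img = image.astype(float)
    kernel = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]], dtype=float) / 8.0   # neighbours only
    neighbour_mean = convolve(img, kernel, mode='nearest')
    noisy = np.abs(img - neighbour_mean) > threshold
    out = img.copy()
    out[noisy] = neighbour_mean[noisy]
    return out
```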
2. Atmospheric correction
The composition of the atmosphere has an important effect on the measurement of radiance in remote sensing. The atmosphere consists mainly of molecular nitrogen and oxygen (clean dry air). In addition, it contains water vapour and particles (aerosols) such as dust, soot, water droplets, and ice crystals. For certain applications of remote sensing, information on atmospheric conditions is required, for example to determine ozone and NO2 concentrations as indicators of smog or for weather forecasting. For most land applications, however, the adverse effects of the atmosphere need to be removed before remotely sensed data can be properly analyzed. The atmosphere affects the radiance measured at any pixel in an image in two different ways. On the one hand, it reduces the energy illuminating the Earth's surface, for example through absorption of light; this affects the directly reflected light. On the other hand, the atmosphere acts as a reflector itself; the resulting diffuse radiation is caused by scattering. This means that the most important step in atmospheric correction is to distinguish the "real" radiance reflected by the Earth's surface from the disturbing path radiance originating from atmospheric scattering.
When meteorological field data on the composition of the atmosphere during the image acquisition are available, it is possible to reduce these effects using atmospheric correction models.
Several atmospheric correction models are available, varying greatly in complexity. In principle they correct for two main effects: scattering and absorption. Scattering can be described as a disturbance of the electromagnetic field by the constituents of the atmosphere, resulting in a change of the direction and the spectral distribution of the energy in the beam. Absorption takes place due to the presence of molecules in the atmosphere; their influence on the attenuation of radiation varies strongly with wavelength.
A simple method to correct for atmospheric effects such as haze in an image is the so-called darkest pixel method. In this method, objects are identified that are known to have very low reflectance values (and thus a dark appearance in the image). For example, the reflectance of deep clear water is essentially zero in the near-infrared region of the spectrum. Therefore, any signal measured over this kind of water represents signal originating from the atmosphere only (path radiance).
To correct for the atmospheric haze, this measured signal value is subtracted from all image pixels in that band. More complex atmospheric correction methods, for example as applied to Landsat TM imagery, also account for the aerosol composition of the atmosphere and for so-called adjacency effects. The final result of an atmospheric correction procedure is a set of surface reflectance values for all image pixels, which can be used for further image processing, e.g., classification and variable estimation (such as leaf area index, LAI).
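A minimal sketch of the darkest pixel idea (also known as dark object subtraction) is given below; using a low percentile instead of the absolute minimum, and the function name itself, are assumptions made for robustness and illustration.

```python
import numpy as np

def dark_object_subtraction(band, percentile=0.1):
    """Simple haze correction: subtract the 'darkest pixel' value from a band.

    The dark value is estimated as a low percentile of the band rather than
    the absolute minimum, which makes the estimate less sensitive to residual
    bit errors. The percentile choice is arbitrary here.
    """
    dark_value = np.percentile(band, percentile)
    corrected = band.astype(float) - dark_value
    return np.clip(corrected, 0, None)   # surface signal cannot be negative
```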
3. Geometric corrections
Geometric correction of remote sensing data is normally implemented as a two-step procedure. First, distortions that are systematic are considered; second, distortions that are random, or unpredictable, are corrected. Systematic errors are predictable in nature and can be corrected using data on the orbit of the platform and knowledge of internal sensor distortion. Common types of systematic distortion are scan skew, mirror-scan velocity variations, panoramic distortion, platform velocity, earth rotation, and perspective (Jensen, 1996). Most commercially available remote sensing data (e.g., Landsat, SPOT) already have much of the systematic error removed.
Unsystematic errors are corrected based on geometric registration of the remote sensing imagery to a known ground coordinate system (e.g., a topographic map). The geometric registration process involves identifying the image coordinates (i.e., row, column) of several clearly discernible points (e.g., road crossings), called ground control points (GCPs), in the distorted image, and matching them to their true positions in ground coordinates (e.g., latitude, longitude). The true ground coordinates are typically measured from a map, either in paper or digital format; this is image-to-map registration. Once several well-distributed GCP pairs have been identified, the coordinate information is processed by the computer to determine the proper transformation equations to apply to the original (row and column) image coordinates to map them into their new ground coordinates. Geometric registration may also be performed by registering one (or more) images to another image, instead of to geographic coordinates; this is called image-to-image registration.
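As a hedged sketch of how such transformation equations can be derived, the code below fits a first-order (affine) transformation from image (row, column) coordinates to ground coordinates by least squares. The GCP values are made up, and real work would typically use more GCPs, possibly higher-order polynomials, and a check of the residual (RMS) error.

```python
import numpy as np

def fit_affine(gcp_image_rc, gcp_ground_xy):
    """Fit an affine transformation from image (row, col) to ground (x, y).

    gcp_image_rc : (N, 2) GCP positions in the distorted image
    gcp_ground_xy: (N, 2) corresponding map coordinates
    Returns a 3x2 coefficient matrix so that [row, col, 1] @ coeffs = [x, y].
    At least three well-distributed GCPs are required.
    """
    rc = np.asarray(gcp_image_rc, dtype=float)
    design = np.column_stack([rc, np.ones(len(rc))])          # [row, col, 1]
    coeffs, *_ = np.linalg.lstsq(design,
                                 np.asarray(gcp_ground_xy, dtype=float),
                                 rcond=None)
    return coeffs

# Hypothetical GCPs: four image points matched to (invented) map coordinates.
img_pts = [(120, 64), (130, 980), (900, 85), (910, 1010)]
map_pts = [(435200, 5681400), (462700, 5681050),
           (434900, 5657300), (462400, 5656900)]
A = fit_affine(img_pts, map_pts)
print(np.array([120, 64, 1]) @ A)   # predicted ground coordinates of first GCP
```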
In order to actually geometrically correct the original distorted image, a procedure called resampling is used to determine the digital values to place in the new pixel locations of the corrected output image. The resampling process calculates the new pixel values from the original digital pixel values in the uncorrected image. There are three common methods for resampling: nearest neighbor, bilinear interpolation, and cubic convolution.
Nearest neighbor resampling uses the digital value from the pixel in the original image which is nearest to the new pixel location in the corrected image. This is the simplest method and does not alter the original values, but may result in some pixel values being duplicated while others are lost. This method also tends to result in a disjointed or blocky image appearance.
Bilinear interpolation resampling takes a weighted average of four pixels in the original image nearest to the new pixel location. The averaging process alters the original pixel values and creates entirely new digital values in the output image. This may be undesirable if further processing and analysis, such as classification based on spectral response, is to be done. If this is the case, resampling may best be done after the classification process.
Cubic convolution resampling goes even further, calculating a distance-weighted average of a block of sixteen pixels from the original image which surround the new output pixel location. As with bilinear interpolation, this method results in completely new pixel values. However, these two methods both produce images which have a much sharper appearance and avoid the blocky appearance of the nearest neighbor method.
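The sketch below illustrates the three options using scipy's map_coordinates, where spline order 0 gives nearest neighbour, order 1 gives bilinear interpolation, and order 3 gives a cubic interpolation comparable (though not identical) to cubic convolution. The source positions are made-up examples of the kind produced by an inverse geometric transformation.

```python
import numpy as np
from scipy.ndimage import map_coordinates

# Resample a source image at arbitrary (row, col) positions in the
# uncorrected image, as produced by an inverse geometric transformation.
src = np.arange(25, dtype=float).reshape(5, 5)
rows = np.array([[0.2, 1.7], [3.4, 2.9]])    # hypothetical source row positions
cols = np.array([[0.8, 2.1], [1.5, 3.3]])    # hypothetical source column positions

nearest  = map_coordinates(src, [rows, cols], order=0, mode='nearest')
bilinear = map_coordinates(src, [rows, cols], order=1, mode='nearest')
cubic    = map_coordinates(src, [rows, cols], order=3, mode='nearest')
print(nearest, bilinear, cubic, sep='\n')
```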
B. Image Enhancement
Enhancements are used to make visual interpretation and understanding of imagery easier. The advantage of digital imagery is that it allows us to manipulate the digital pixel values in an image. Although radiometric corrections for illumination, atmospheric influences, and sensor characteristics may be done prior to distribution of data to the user, the image may still not be optimized for visual interpretation. Remote sensing devices, particularly those operated from satellite platforms, must be designed to cope with levels of target/background energy typical of all conditions likely to be encountered in routine use. With large variations in spectral response from a diverse range of targets (e.g., forests, deserts, snowfields, water, etc.), no generic radiometric correction could optimally account for and display the optimum brightness range and contrast for all targets. Thus, for each application and each image, a custom adjustment of the range and distribution of brightness values is usually necessary.
In raw imagery, the useful data often populates only a small portion of the available range of digital values (commonly 8 bits or 256 levels). Contrast enhancement involves changing the original values so that more of the available range is used, thereby increasing the contrast between targets and their backgrounds. The key to understanding contrast enhancements is to understand the concept of an image histogram. A histogram is a graphical representation of the brightness values that comprise an image. The brightness values (i.e. 0-255) are displayed along the x-axis of the graph. The frequency of occurrence of each of these values in the image is shown on the y-axis.
By manipulating the range of digital values in an image, graphically represented by its histogram, we can apply various enhancements to the data. There are many different techniques and methods of enhancing contrast and detail in an image; we will cover only a few common ones here. The simplest type of enhancement is a linear contrast stretch. This involves identifying lower and upper bounds from the histogram (usually the minimum and maximum brightness values in the image) and applying a transformation to stretch this range to fill the full range. In our example, the minimum value (occupied by actual data) in the histogram is 84 and the maximum value is 153. These 70 levels occupy less than one-third of the full 256 levels available. A linear stretch uniformly expands this small range to cover the full range of values from 0 to 255. This enhances the contrast in the image, with light-toned areas appearing lighter and dark areas appearing darker, making visual interpretation much easier.
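A minimal sketch of a linear contrast stretch, reusing the 84 and 153 values from the example above, could look like this (the function name is assumed for illustration):

```python
import numpy as np

def linear_stretch(image, out_min=0, out_max=255):
    """Linear contrast stretch of a band to the full display range."""
    img = image.astype(float)
    in_min, in_max = img.min(), img.max()            # e.g. 84 and 153 in the text
    stretched = (img - in_min) / (in_max - in_min) * (out_max - out_min) + out_min
    return np.clip(np.round(stretched), out_min, out_max).astype(np.uint8)

# With the values from the text, DN 84 maps to 0, DN 153 maps to 255,
# and an intermediate DN such as 120 maps to about 133.
print(linear_stretch(np.array([84, 120, 153])))
```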
A uniform distribution of the input range of values across the full range may not always be an appropriate enhancement, particularly if the input range is not uniformly distributed. In this case, a histogram-equalized stretch may be better. This stretch assigns more display values (range) to the frequently occurring portions of the histogram. In this way, the detail in these areas will be better enhanced relative to those areas of the original histogram where values occur less frequently. In other cases, it may be desirable to enhance the contrast in only a specific portion of the histogram. For example, suppose we have an image of the mouth of a river, and the water portions of the image occupy the digital values from 40 to 76 out of the entire image histogram. If we wished to enhance the detail in the water, perhaps to see variations in sediment load, we could stretch only that small portion of the histogram represented by the water (40 to 76) to the full grey level range (0 to 255). All pixels below or above these values would be assigned to 0 and 255, respectively, and the detail in these areas would be lost. However, the detail in the water would be greatly enhanced.
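The sketch below shows both ideas for an assumed 8-bit integer band: a histogram-equalized stretch built from the cumulative histogram, and a stretch of only the 40 to 76 range from the river example. The function names are illustrative.

```python
import numpy as np

def equalization_stretch(image, levels=256):
    """Histogram-equalized stretch of an 8-bit integer band: display levels
    are allocated in proportion to how frequently DN values occur."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    cdf = hist.cumsum() / hist.sum()                 # cumulative distribution
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)
    return lut[image]                                # apply as a look-up table

def stretch_range(image, low=40, high=76):
    """Stretch only the 40-76 DN range (the water pixels in the text's example)
    to 0-255; pixels below or above saturate to 0 or 255 and lose detail."""
    out = (image.astype(float) - low) / (high - low) * 255.0
    return np.clip(np.round(out), 0, 255).astype(np.uint8)
```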
Spatial filtering encompasses another set of digital processing functions which are used to enhance the appearance of an image. Spatial filters are designed to highlight or suppress specific features in an image based on their spatial frequency. Spatial frequency is related to the concept of image texture, which we discussed in section 4.2. It refers to the frequency of the variations in tone that appear in an image. "Rough" textured areas of an image, where the changes in tone are abrupt over a small area, have high spatial frequencies, while "smooth" areas with little variation in tone over several pixels have low spatial frequencies. A common filtering procedure involves moving a 'window' of a few pixels in dimension (e.g. 3×3, 5×5, etc.) over each pixel in the image, applying a mathematical calculation using the pixel values under that window, and replacing the central pixel with the new value. The window is moved along in both the row and column dimensions one pixel at a time and the calculation is repeated until the entire image has been filtered and a "new" image has been generated. By varying the calculation performed and the weightings of the individual pixels in the filter window, filters can be designed to enhance or suppress different types of features.
A low-pass filter is designed to highlight larger, homogeneous areas of similar tone and reduce the smaller detail in an image. Thus, low-pass filters generally serve to smooth the appearance of an image. These filters are used mainly to remove noise from an image. Noise in an image can be classified into two types according to its structure: random noise and systematic (repetitive) noise. Systematic noise is usually due to problems in the sensor. Random noise might occur, for example, on photographic film when individual points on the film itself are damaged.
Such filters can also be used to lower the variability between pixels that belong to the same category, and thus might help in performing a better classification.
High-pass filters do the opposite and serve to sharpen the appearance of fine detail in an image. One implementation of a high-pass filter first applies a low-pass filter to an image and then subtracts the result from the original, leaving behind only the high spatial frequency information. High-pass filters examine the differences between adjacent pixels. As these differences are usually small, a contrast stretch should be performed on the output filtered image in order to see the details better. These filters can be used in algorithms for pattern recognition, where the frequency of "edges" per unit area can indicate certain objects. The output filtered image can also be used as an additional band to help in the classification procedure. Directional, or edge-detection, filters are designed to highlight linear features, such as roads or field boundaries. These filters can also be designed to enhance features which are oriented in specific directions. They are useful in applications such as geology, for the detection of linear geologic structures.
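The following sketch applies a 3x3 mean (low-pass) filter, derives a high-pass result by subtracting it from the original, and applies a simple directional kernel that emphasises vertically oriented features. The kernels shown are common textbook choices, not the only possible designs.

```python
import numpy as np
from scipy.ndimage import convolve

# A random test image stands in for a real band here.
img = np.random.default_rng(0).integers(0, 256, (100, 100)).astype(float)

# Low-pass: 3x3 mean filter, smooths the image and suppresses small detail.
low_pass_kernel = np.ones((3, 3)) / 9.0
smooth = convolve(img, low_pass_kernel, mode='nearest')

# High-pass: subtract the low-pass result from the original, leaving only
# the high spatial frequency (fine detail, edge) information.
detail = img - smooth

# Directional (edge-detection) filter: a horizontal-gradient kernel that
# emphasises vertically oriented linear features.
directional_kernel = np.array([[-1, 0, 1],
                               [-1, 0, 1],
                               [-1, 0, 1]], dtype=float)
edges = convolve(img, directional_kernel, mode='nearest')
```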
C. Image Transformations
Image transformations typically involve the manipulation of multiple bands of data, whether from a single multispectral image or from two or more images of the same area acquired at different times (i.e., multitemporal image data). In either case, image transformations generate "new" images from two or more sources which highlight particular features or properties of interest better than the original input images.
Basic image transformations apply simple arithmetic operations to the image data. Image subtraction is often used to identify changes that have occurred between images collected on different dates. Typically, two images which have been geometrically registered (see section 4.4) are used, with the pixel (brightness) values in one image being subtracted from the pixel values in the other, and the resultant difference image scaled to the display range. In such an image, areas where there has been little or no change between the original images will have resultant brightness values close to zero (mid-grey after scaling), while areas where significant change has occurred will have higher or lower values, i.e., brighter or darker depending on the 'direction' of the change in reflectance between the two images. This type of image transform can be useful for mapping changes in urban development around cities and for identifying areas where deforestation is occurring, as examples.
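A hedged sketch of such a difference-and-scale operation for two co-registered 8-bit bands is shown below; the scaling that places "no change" at mid-grey is one common convention, not the only one.

```python
import numpy as np

def change_image(date1, date2, scale_to=255):
    """Difference two co-registered 8-bit bands and rescale for display.

    Unchanged areas end up near mid-grey; positive and negative changes in
    brightness appear lighter or darker respectively.
    """
    diff = date2.astype(float) - date1.astype(float)   # range -255 .. +255
    mid = scale_to / 2.0
    scaled = mid + diff * (mid / 255.0)                # zero difference -> mid-grey
    return np.clip(np.round(scaled), 0, scale_to).astype(np.uint8)
```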
Image division or spectral ratioing is one of the most common transforms applied to image data. Image ratioing serves to highlight subtle variations in the spectral responses of various surface covers. By ratioing the data from two different spectral bands, the resultant image enhances variations in the slopes of the spectral reflectance curves between the two spectral ranges that may otherwise be masked by the pixel brightness variations in each of the bands. The following example illustrates the concept of spectral ratioing. Healthy vegetation reflects strongly in the near-infrared portion of the spectrum while absorbing strongly in the visible red. Other surface types, such as soil and water, show near-equal reflectances in both the near-infrared and red portions. Thus, a ratio image of Landsat MSS Band 7 (near-infrared, 0.8 to 1.1 μm) divided by Band 5 (red, 0.6 to 0.7 μm) would result in ratios much greater than 1.0 for vegetation, and ratios around 1.0 for soil and water. Thus the discrimination of vegetation from other surface cover types is significantly enhanced. We may also be better able to identify areas of unhealthy or stressed vegetation, which show low near-infrared reflectance, as their ratios would be lower than for healthy green vegetation.
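A minimal sketch of this near-infrared/red ratio is given below; the DN values are invented purely to show vegetation pixels producing ratios well above 1.0.

```python
import numpy as np

def band_ratio(nir, red):
    """Simple spectral ratio (e.g. Landsat MSS band 7 divided by band 5).

    Vegetation gives ratios well above 1, soil and water close to 1.
    A small epsilon avoids division by zero over very dark pixels.
    """
    eps = 1e-6
    return nir.astype(float) / (red.astype(float) + eps)

nir = np.array([[80.0, 15.0], [60.0, 12.0]])   # invented DN values
red = np.array([[20.0, 14.0], [18.0, 11.0]])
print(band_ratio(nir, red))   # vegetation pixels stand out with high ratios
```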
D. Image Classification and Analysis
A human analyst attempting to classify features in an image uses the elements of visual interpretation to identify homogeneous groups of pixels which represent various features or land cover classes of interest. Digital image classification uses the spectral information represented by the digital numbers in one or more spectral bands, and attempts to classify each individual pixel based on this spectral information. This type of classification is termed spectral pattern recognition. In either case, the objective is to assign all pixels in the image to particular classes or themes (e.g. water, coniferous forest, deciduous forest, corn, wheat, etc.). The resulting classified image is composed of a mosaic of pixels, each of which belongs to a particular theme, and is essentially a thematic "map" of the original image.
Information classes are those categories of interest that the analyst is actually trying to identify in the imagery, such as different kinds of crops, different forest types or tree species, different geologic units or rock types, etc.
Spectral classes are groups of pixels that are uniform (or near-similar) with respect to their brightness values in the different spectral channels of the data. The objective of image classification is to match the spectral classes in the data to the information classes of interest. Rarely is there a simple one-to-one match between these two types of classes. Rather, unique spectral classes may appear which do not necessarily correspond to any information class of particular use or interest to the analyst. Alternatively, a broad information class (e.g. forest) may contain a number of spectral sub-classes with unique spectral variations. Using the forest example, spectral sub-classes may be due to variations in age, species, and density, or perhaps as a result of shadowing or variations in scene illumination. It is the analyst's job to decide on the utility of the different spectral classes and their correspondence to useful information classes.
Common classification procedures can be broken down into two broad subdivisions based on the method used: supervised classification and unsupervised classification.
In a supervised classification, the analyst identifies in the imagery homogeneous representative samples of the different surface cover types (information classes) of interest. These samples are referred to as training areas. The selection of appropriate training areas is based on the analyst's familiarity with the geographical area and their knowledge of the actual surface cover types present in the image. Thus, the analyst is "supervising" the categorization of a set of specific classes. The numerical information in all spectral bands for the pixels comprising these areas is used to "train" the computer to recognize spectrally similar areas for each class. The computer uses a special program or algorithm (of which there are several variations) to determine the numerical "signatures" for each training class. Once the computer has determined the signatures for each class, each pixel in the image is compared to these signatures and labeled as the class it most closely "resembles" digitally. Thus, in a supervised classification we are first identifying the information classes, which are then used to determine the spectral classes which represent them.
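There are several such algorithms (e.g., minimum distance, parallelepiped, maximum likelihood). As a hedged sketch, the code below implements one of the simplest, a minimum-distance-to-means classifier, with mean-vector signatures computed from analyst-supplied training masks; the function names and the (rows, columns, bands) data layout are assumptions for illustration.

```python
import numpy as np

def train_signatures(image, training_masks):
    """Compute a mean-vector 'signature' per class from training areas.

    image          : (rows, cols, bands) array
    training_masks : dict mapping class name -> boolean mask of training pixels
    """
    return {name: image[mask].mean(axis=0) for name, mask in training_masks.items()}

def minimum_distance_classify(image, signatures):
    """Assign each pixel to the class whose mean signature is closest in
    spectral (Euclidean) space, one of the simplest supervised classifiers."""
    names = list(signatures)
    means = np.stack([signatures[n] for n in names])            # (classes, bands)
    pixels = image.reshape(-1, image.shape[-1]).astype(float)   # (pixels, bands)
    dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    labels = dists.argmin(axis=1).reshape(image.shape[:2])
    return labels, names
```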
Unsupervised classification in essence reverses the supervised classification process. Spectral classes are grouped first, based solely on the numerical information in the data, and are then matched by the analyst to information classes (if possible). Programs, called clustering algorithms, are used to determine the natural (statistical) groupings or structures in the data. Usually, the analyst specifies how many groups or clusters are to be looked for in the data. In addition to specifying the desired number of classes, the analyst may also specify parameters related to the separation distance among the clusters and the variation within each cluster. This iterative clustering process may produce some clusters that the analyst will want to subsequently combine, or clusters that should be broken down further, each of these requiring a further application of the clustering algorithm. Thus, unsupervised classification is not completely without human intervention. However, it does not start with a pre-determined set of classes as in a supervised classification.
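A common clustering algorithm used for this purpose is k-means; the sketch below is a minimal, self-contained version in which the analyst chooses the number of clusters, and the resulting spectral classes still have to be matched to information classes afterwards.

```python
import numpy as np

def kmeans_classify(image, n_clusters=5, n_iter=20, seed=0):
    """Minimal k-means clustering of a multi-band image (rows, cols, bands).

    Pixels are grouped purely on their spectral values; the returned labels
    are spectral classes, not information classes.
    """
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, image.shape[-1]).astype(float)
    # Initialise cluster centres with randomly chosen pixels.
    centres = pixels[rng.choice(len(pixels), n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = pixels[labels == k]
            if len(members):                 # keep the old centre if a cluster empties
                centres[k] = members.mean(axis=0)
    return labels.reshape(image.shape[:2])
```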