IPA - Image Processing Algorithms
IPA stands for Image Processing Algorithms, a library of image processing operators and functions. IPA is based on the Prima toolkit, which in turn is a Perl-based graphic library. IPA is designed for solving image analysis and object recognition tasks in Perl.
IPA works mostly with grayscale images, which can be loaded or created by means of the Prima toolkit. See Prima::Image for information about the Prima::Image class functionality. IPA methods are grouped in several modules that contain the specific functions. The functions usually accept one or more images and an optional parameter hash. Each function has its own set of parameters. If an error occurs, the functions call die, so it is advisable to use eval blocks around the calls.
The following code produces a binary thresholded image from an 8-bit grayscale image:
use Prima;
use IPA;
use IPA::Point;
my $i = Prima::Image-> load('8-bit-grayscale.gif');
die "Cannot load: $@\n" unless $i;
my $binary = IPA::Point::threshold( $i, minvalue => 128);
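Because the IPA functions report errors by calling die, a call can be guarded with an eval block. The sketch below is a minimal illustration that runs without Prima or IPA installed: fake_threshold is a hypothetical stand-in for a real IPA function, and the helper name try_call is likewise an assumption, not part of IPA.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an IPA function: it dies on bad input,
# the way the IPA functions are documented to do.
sub fake_threshold {
    my ($image, %opt) = @_;
    die "minvalue is required\n" unless defined $opt{minvalue};
    return [ map { $_ >= $opt{minvalue} ? 255 : 0 } @$image ];
}

# Run a code reference inside eval and return (result, error message).
sub try_call {
    my ($code, @args) = @_;
    my $result;
    my $ok = eval { $result = $code->(@args); 1 };
    return $ok ? ($result, undef) : (undef, $@ || 'unknown error');
}

# A call that succeeds.
my ($pixels, $err) = try_call(\&fake_threshold, [ 10, 200 ], minvalue => 128);
print "result: @$pixels\n" unless defined $err;

# A call that dies; the script keeps running and reports the message.
my ($none, $why) = try_call(\&fake_threshold, [ 10, 200 ]);
print "failed: $why" if defined $why;
```

With IPA installed, the same helper could wrap a real call, for example try_call(\&IPA::Point::threshold, $i, minvalue => 128).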
Pixel types are abbreviated by names derived from the im::XXX image type constants, as follows:
im::Byte - 8-bit unsigned integer
im::Short - 16-bit signed integer
im::Long - 32-bit signed integer
im::Float - float
im::Double - double
im::Complex - complex float
im::DComplex - complex double
Each function returns a newly created image object with the result of the operation, unless stated otherwise in the API.
The IPA::Point module contains functions that perform single point transformations and image arithmetic.
Single-point processing is a simple method of image enhancement. This technique determines a pixel value in the enhanced image dependent only on the value of the corresponding pixel in the input image. The process can be described with the mapping function
s = M(r)
where r and s are the pixel values in the input and output images, respectively.
- combine [ images, conversionType = conversionScale, combineType = combineSum, rawOutput = 0]
Combines a set of images of the same dimensions and bit depth into one and returns the resulting image.
Supported types: Byte, Short, Long.
- images ARRAY
Array of image objects.
- conversionType INTEGER
An integer constant, one of the following, that indicates how the resulting image is adjusted according to the minimal and maximal values of the result:
conversionTruncAbs
conversionTrunc
conversionScale
conversionScaleAbs
The Trunc constants cut off the output values to the bit maximum; for example, a result vector [-5,0,100,300] in an 8-bit image would be transformed to [0,0,100,255]. The Scale constants scale the whole image without the cutoff; the previous example vector would be transformed into [0,4,88,255]. The Abs suffix shows whether the range calculation uses the whole domain, including the negative values, or the absolute values only.
Default is conversionScale.
- combineType INTEGER
An integer constant that indicates the type of action performed between pixels with the same [x,y] coordinates:
combineMaxAbs - store the maximal absolute pixel value
combineSignedMaxAbs - compute the maximal absolute value, but store its original ( before abs() ) value
combineSumAbs - store the sum of absolute pixel values
combineSum - store the sum of pixel values
combineSqrt - store the square root of the sum of the squares of the pixel values
Default is combineSum.
- rawOutput BOOLEAN
If set to a true value, the conversionType parameter is ignored and the conversion step is omitted. Default is 0.
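The difference between the Trunc and Scale conversions can be sketched in plain Perl, using the example vector [-5,0,100,300] from the conversionType description. The helper names, the fixed 0..255 target range and the rounding to the nearest integer are assumptions made for illustration; they are not taken from the IPA source.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Truncation: clip every value to the 8-bit range 0..255.
sub convert_trunc {
    my @v = @_;
    return map { $_ < 0 ? 0 : $_ > 255 ? 255 : $_ } @v;
}

# Scaling: map the [min, max] range of the result linearly onto 0..255.
sub convert_scale {
    my @v = @_;
    my ($lo, $hi) = (min(@v), max(@v));
    return (0) x @v if $hi == $lo;    # flat input: avoid division by zero
    return map { int( ($_ - $lo) * 255 / ($hi - $lo) + 0.5 ) } @v;
}

my @result = (-5, 0, 100, 300);
print join(',', convert_trunc(@result)), "\n";    # 0,0,100,255
print join(',', convert_scale(@result)), "\n";    # 0,4,88,255
```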
- threshold IMAGE [ minvalue, maxvalue = 255]
Performs binary thresholding, governed by minvalue and maxvalue. Pixels that are below minvalue or above maxvalue are mapped to 0; all other pixels are mapped to 255.
Supported types: Byte
- gamma IMAGE [ origGamma = 1, destGamma = 1]
Performs gamma correction of IMAGE by a product of origGamma and destGamma.
Supported types: Byte
- remap IMAGE [ lookup ]
Performs image mapping by a passed lookup array of 256 integer values. Example:
IPA::Point::remap( $i, lookup => [ (0) x 128, (255) x 128]);
is equivalent to
IPA::Point::threshold( $i, minvalue => 128);
Supported types: 8-bit
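The equivalence between a lookup table and a threshold can be checked in plain Perl. The sketch below works on flat lists of 8-bit values instead of Prima images; remap_pixels and threshold_pixels are hypothetical helpers that only follow the descriptions of remap and threshold given above.

```perl
use strict;
use warnings;

# Apply a 256-entry lookup table to a list of 8-bit pixel values.
sub remap_pixels {
    my ($pixels, $lookup) = @_;
    die "lookup must hold 256 entries\n" unless @$lookup == 256;
    return [ map { $lookup->[$_] } @$pixels ];
}

# Binary threshold as described for threshold: values below minvalue
# or above maxvalue become 0, all other values become 255.
sub threshold_pixels {
    my ($pixels, %opt) = @_;
    my $min = $opt{minvalue};
    my $max = exists $opt{maxvalue} ? $opt{maxvalue} : 255;
    return [ map { ($_ < $min || $_ > $max) ? 0 : 255 } @$pixels ];
}

my @pixels = (0, 127, 128, 200, 255);
my $by_lut = remap_pixels(\@pixels, [ (0) x 128, (255) x 128 ]);
my $by_thr = threshold_pixels(\@pixels, minvalue => 128);
print "@$by_lut\n";    # 0 0 255 255 255
print "@$by_thr\n";    # 0 0 255 255 255
```

The table needs one entry per possible input value, which is why 128 zeros are followed by 128 entries of 255.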
- subtract IMAGE1, IMAGE2, [ conversionType = conversionScale, rawOutput = 0]
Subtracts IMAGE2 from IMAGE1. The images must be of the same dimensions. For a description of conversionType and rawOutput, see combine.
Supported types: Byte
- mask IMAGE [ test, match, mismatch ]
Tests whether every pixel of IMAGE equals test, and assigns the resulting pixel either the match or the mismatch value. All test, match, and mismatch scalars can be either integers ( in which case the mask operator is similar to threshold ), or image objects. If image objects are passed, they must be of the same dimensions and bit depth as IMAGE.
Supported types: Byte, Short, Long.
- average LIST
Combines images of the same dimensions and bit depth, passed as an anonymous array in LIST, and returns the average image.
Supported types: Byte, Short, Long, 64-bit integer.
- equalize IMAGE
Returns a histogram-equalized image.
Supported types: Byte
The IPA::Local module contains functions that operate in the vicinity of a pixel and produce an image where every pixel depends on the values of the source pixel and of its neighbors. The process can be described with the mapping function
s = M( r(i,j), r(i+1,j), ..., r(i,j+1), ... )
where r and s are the pixel values in the input and output images, respectively.
- crispening IMAGE
Applies the crispening algorithm to IMAGE and returns the result.
Supported types: Byte
- sobel IMAGE [ jobMask = sobelNWSE|sobelNESW, conversionType = conversionScaleAbs, combineType = combineMaxAbs, divisor = 1]
Applies Sobel edge detector to IMAGE.
Supported types: Byte
- jobMask INTEGER
Combination of integer constants that mask the pixels in the Sobel 3x3 kernel. If the kernel is drawn as
| (-1,1) (0,1) (1,1) |
| (-1,0) (0,0) (1,0) |
| (-1,-1) (0,-1) (1,-1) |
then the constants mask the following points:
sobelRow - (-1,0),(1,0)
sobelColumn - (0,1),(0,-1)
sobelNESW - (1,1),(-1,-1)
sobelNWSE - (-1,1),(1,-1)
The (0,0) point is always masked.
- divisor INTEGER
The resulting pixel value is divided by divisor after the kernel convolution is applied.
The conversionType and combineType parameters are described in combine.
- GEF IMAGE [ a0 = 1.3, s = 0.7]
Applies GEF algorithm ( first derivative operator for symmetric exponential filter) to IMAGE.
Supported types: Byte
- SDEF IMAGE [ a0 = 1.3, s = 0.7]
Applies SDEF algorithm ( second derivative operator for symmetric exponential filter) to IMAGE.
Supported types: Byte
- deriche IMAGE [ alpha ]
Applies Deriche edge detector.
Supported types: Byte
- filter3x3 IMAGE [ matrix, expandEdges = 0, edgecolor = 0, conversionType = conversionScaleAbs, rawOutput = 0, divisor = 1 ]
Applies convolution with a custom 3x3 kernel, passed in matrix.
Supported types: Byte
- matrix ARRAY
Array of 9 integers, a 3x3 kernel, to be convolved with IMAGE. Indexes are:
|0 1 2|
|3 4 5|
|6 7 8|
- expandEdges BOOLEAN
If false, the edge pixels ( borders ) are not used in the convolution as center pixels. If true, the edge pixels are used, and in this case the edgecolor value is used to substitute the pixels outside the image.
- edgecolor INTEGER
Integer value, used for substitution of pixel values outside IMAGE when the expandEdges parameter is set to 1.
- divisor INTEGER
The resulting pixel value is divided by divisor after the kernel convolution is applied.
- conversionType
See combine
- rawOutput
See combine
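The interplay of matrix, expandEdges, edgecolor and divisor in filter3x3 can be sketched in plain Perl on a 2-D array of pixel values. This is an illustration under stated assumptions, not the IPA implementation: the kernel is applied without flipping, border pixels that are skipped are copied unchanged, the division truncates toward zero, and the conversionType step is left out. The function name filter3x3_sketch is hypothetical.

```perl
use strict;
use warnings;

# Apply a 3x3 kernel to a 2-D array of pixel values (array of rows).
# Assumptions: no kernel flipping, skipped borders are copied as-is,
# no conversion step is performed on the result.
sub filter3x3_sketch {
    my ($img, %opt) = @_;
    my @k       = @{ $opt{matrix} };
    my $expand  = $opt{expandEdges} // 0;
    my $edge    = $opt{edgecolor}   // 0;
    my $divisor = $opt{divisor}     // 1;
    die "matrix must hold 9 values\n" unless @k == 9;

    my $h   = @$img;
    my $w   = @{ $img->[0] };
    my @out = map { [ @$_ ] } @$img;    # start from a copy of the input

    for my $y (0 .. $h - 1) {
        for my $x (0 .. $w - 1) {
            my $border = $y == 0 || $x == 0 || $y == $h - 1 || $x == $w - 1;
            next if $border && !$expand;
            my $sum = 0;
            for my $dy (-1 .. 1) {
                for my $dx (-1 .. 1) {
                    my ($yy, $xx) = ($y + $dy, $x + $dx);
                    my $inside = $yy >= 0 && $yy < $h && $xx >= 0 && $xx < $w;
                    my $v = $inside ? $img->[$yy][$xx] : $edge;
                    # kernel index layout: |0 1 2| |3 4 5| |6 7 8|
                    $sum += $v * $k[ ($dy + 1) * 3 + ($dx + 1) ];
                }
            }
            $out[$y][$x] = int($sum / $divisor);
        }
    }
    return \@out;
}

# A single bright pixel smoothed by a box kernel with divisor 9.
my $img = [ [ 0, 0, 0 ], [ 0, 9, 0 ], [ 0, 0, 0 ] ];
my $box = filter3x3_sketch($img, matrix => [ (1) x 9 ], divisor => 9,
                           expandEdges => 1, edgecolor => 0);
print join(' ', @$_), "\n" for @$box;
```

With expandEdges left at 0, only the center pixel of this 3x3 example would be recomputed.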
- median IMAGE [ w = 0, h = 0 ]
Performs adaptive thresholding with a median filter with window dimensions w and h.
- unionFind IMAGE [ method, threshold ]
Applies a union find algorithm selected by method. The only implemented method is average-based region grow ( 'ave' string constant ). Its only parameter is threshold, an integer value of the balance merger function.
Supported types: Byte
The IPA::Global module contains methods that produce images where every pixel is a function of all pixels in the source image. The process can be described with the mapping function
s = M(R)
where s is the pixel value in the output image, and R is the source image.
- close_edges IMAGE [ gradient, maxlen, minedgelen, mingradient ]
Closes edges of shapes on IMAGE, according to the specified gradient image. Unclosed shapes are converted to closed ones if the gradient spot between the suspected dents falls under the maxlen maximal length increment and the mingradient minimal gradient value, and the edge is longer than minedgelen.
Supported types: Byte
- fill_holes IMAGE [ inPlace = 0, edgeSize = 1, backColor = 0, foreColor = 255, neighborhood = 4]
Fills closed shapes to eliminate the contours with holes in IMAGE.
Supported types: Byte
- inPlace BOOLEAN
If true, the original image is changed
- edgeSize INTEGER
The edge breadth that is not touched by the algorithm
- backColor INTEGER
The pixel value used for determination whether a pixel belongs to the background.
- foreColor INTEGER
The pixel value used for hole filling.
- neighborhood INTEGER
Must be either 4 or 8. Selects whether the algorithm must assume 4- or 8- pixel connection.
- area_filter IMAGE [ minArea = 0, maxArea = INT_MAX, inPlace = 0, edgeSize = 1, backColor = 0, foreColor = 255, neighborhood = 4]
Identifies the objects on IMAGE and filters out those whose area is less than minArea or more than maxArea. The other parameters are identical to those passed to fill_holes.
- identify_contours IMAGE [ edgeSize = 1, backColor = 0, foreColor = 255, neighborhood = 4]
Identifies the objects on IMAGE and returns the contours as an array of anonymous arrays of 4- or 8- connected pixel coordinates.
The parameters are identical to those passed to fill_holes.
Supported types: Byte
See also IPA::Region.
- fft IMAGE [ inverse = 0 ]
Performs direct and inverse ( governed by the inverse boolean flag ) fast Fourier transform. IMAGE must have dimensions that are a power of 2. The resulting image is always of DComplex type.
Supported types: all
- fourier IMAGE [ inverse = 0 ]
Performs direct and inverse ( governed by the inverse boolean flag ) fast Fourier transform. If the IMAGE dimensions are not a power of 2, IMAGE is scaled up to the closest power of 2, and the result is scaled back to the original dimensions. The resulting image is always of DComplex type.
Supported types: all
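Since fft requires power-of-2 dimensions while fourier rescales the image when they are not, it can be handy to test a dimension before choosing between the two. The helpers below are a small plain-Perl sketch; their names are hypothetical, and reading "closest power of 2" as the next larger power is an assumption.

```perl
use strict;
use warnings;

# True if $n is a positive power of 2.
sub is_power_of_2 {
    my ($n) = @_;
    return $n > 0 && ($n & ($n - 1)) == 0;
}

# Smallest power of 2 that is not less than $n.
sub next_power_of_2 {
    my ($n) = @_;
    my $p = 1;
    $p *= 2 while $p < $n;
    return $p;
}

for my $dim (256, 300) {
    printf "%d: %s, scaled size %d\n", $dim,
        is_power_of_2($dim) ? 'power of 2' : 'not a power of 2',
        next_power_of_2($dim);
}
```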
- band_filter IMAGE [ low = 0, spatial = 1, homomorph = 0, power = 2.0, cutoff = 20.0, boost = 0.7 ]
Performs band filtering of IMAGE in frequency domain. IMAGE must have dimensions of power of 2. The resulted image is always of DComplex type.
Supported types: all
- low BOOLEAN
Boolean flag, indicates whether the low-pass or the high-pass is to be performed.
- spatial BOOLEAN
Boolean flag, indicates if IMAGE must be treated as if it is in the spatial domain, and therefore conversion to the frequency domain must be performed first.
- homomorph BOOLEAN
Boolean flag, indicates if the homomorph ( exponential ) equalization must be performed. Cannot be set to true if the image is in frequency domain ( if the spatial parameter is set to false ).
- power FLOAT
Power operator applied to the input frequency.
- cutoff FLOAT
Threshold value of the filter.
- boost FLOAT
Multiplication factor used in homomorph equalization.
- butterworth IMAGE [ low = 0, spatial = 1, homomorph = 0, power = 2.0, cutoff = 20.0, boost = 0.7 ]
Performs band filtering of IMAGE in frequency domain. If the IMAGE dimensions are not a power of 2, IMAGE is scaled up to the closest power of 2, and the result is scaled back to the original dimensions.
The resulting image is always of DComplex type.
Supported types: all
The parameters are same as those passed to band_filter.
Quote from
Morphological operators often take a binary image and a structuring element as input and combine them using a set operator (intersection, union, inclusion, complement). They process objects in the input image based on characteristics of its shape, which are encoded in the structuring element.
Usually, the structuring element is sized 3x3 and has its origin at the center pixel. It is shifted over the image and at each pixel of the image its elements are compared with the set of the underlying pixels. If the two sets of elements match the condition defined by the set operator (e.g. if the set of pixels in the structuring element is a subset of the underlying image pixels), the pixel underneath the origin of the structuring element is set to a pre-defined value (0 or 1 for binary images). A morphological operator is therefore defined by its structuring element and the applied set operator.
Morphological operators can also be applied to gray-level images, e.g. to reduce noise or to brighten the image.
- BWTransform IMAGE [ lookup ]
Applies the 512-byte lookup LUT string ( look-up table ) to image and returns the convolution result ( hit-and-miss transform ). Each offset into lookup is a set of bits, each corresponding to a 3x3 kernel index:
|4 3 2|
|5 0 1|
|6 7 8|
Thus, for example, the X-shape would be represented by offset 2**0 + 2**2 + 2**4 + 2**6 + 2**8 = 341. The byte value corresponding to the offset in the lookup string is stored in the output image.
IPA::Morphology defines several basic LUT transforms, which can be invoked by the following code:
IPA::Morphology::bw_METHOD( $image);
or its alternative
IPA::Morphology::BWTransform( $image, lookup => $IPA::Morphology::transform_luts{METHOD}->());
Where METHOD is one of the following string constants:
- dilate
Morphological dilation
- erode
Morphological erosion
- isolatedremove
Remove isolated pixels
- togray
Convert binary image to grayscale by applying the mean filter
- invert
Inversion operator
- prune
Removes 1-connected end points
- break_node
Removes node points that connect 3 or more lines
Supported types: Byte
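The offset arithmetic described for BWTransform can be reproduced in plain Perl. The sketch below computes the LUT offset of a 3x3 binary neighborhood from the kernel index layout |4 3 2| |5 0 1| |6 7 8| and builds a sample 512-byte table. The function name lut_offset and the sample table, which copies the center pixel using the byte values 0 and 255, are assumptions made for illustration and are not taken from IPA::Morphology.

```perl
use strict;
use warnings;

# Bit number assigned to each cell of the 3x3 neighborhood,
# following the layout |4 3 2| |5 0 1| |6 7 8|.
my @BIT = ( [ 4, 3, 2 ],
            [ 5, 0, 1 ],
            [ 6, 7, 8 ] );

# Compute the LUT offset (0..511) for a 3x3 binary neighborhood,
# given as three rows of 0/1 values.
sub lut_offset {
    my ($nb) = @_;
    my $offset = 0;
    for my $row (0 .. 2) {
        for my $col (0 .. 2) {
            $offset |= 1 << $BIT[$row][$col] if $nb->[$row][$col];
        }
    }
    return $offset;
}

my $x_shape = [ [ 1, 0, 1 ],
                [ 0, 1, 0 ],
                [ 1, 0, 1 ] ];
print lut_offset($x_shape), "\n";    # 341

# Hypothetical sample table: output 255 when the center bit (bit 0)
# is set and 0 otherwise, which leaves a binary image unchanged.
my $identity_lut = join '', map { chr($_ & 1 ? 255 : 0) } 0 .. 511;
print ord(substr($identity_lut, lut_offset($x_shape), 1)), "\n";
```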
- dilate IMAGE [ neighborhood = 8 ]
Performs morphological dilation operation on IMAGE and returns the result. neighborhood determines whether the algorithm assumes 4- or 8- pixel connectivity.
Supported types: Byte, Short, Long, Float, Double
- erode IMAGE [ neighborhood = 8 ]
Performs morphological erosion operation on IMAGE and returns the result. neighborhood determines whether the algorithm assumes 4- or 8- pixel connectivity.
Supported types: Byte, Short, Long, Float, Double
- opening IMAGE [ neighborhood = 8 ]
Performs morphological opening operation on IMAGE and returns the result. neighborhood determines whether the algorithm assumes 4- or 8- pixel connectivity.
Supported types: Byte, Short, Long, Float, Double
- closing IMAGE [ neighborhood = 8 ]
Performs morphological closing operation on IMAGE and returns the result. neighborhood determines whether the algorithm assumes 4- or 8- pixel connectivity.
Supported types: Byte, Short, Long, Float, Double
- gradient IMAGE [ neighborhood = 8 ]
Returns the result of the morphological gradient operator on IMAGE. neighborhood determines whether the algorithm assumes 4- or 8- pixel connectivity.
Supported types: Byte, Short, Long, Float, Double
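The effect of the 4- and 8-pixel neighborhood on dilation, erosion and the gradient can be sketched in plain Perl on a 2-D array of gray values. The sketch uses the textbook definitions (dilation as the neighborhood maximum, erosion as the minimum, gradient as their difference) and ignores pixels outside the image; whether IPA::Morphology handles borders or defines the gradient in exactly this way is an assumption. All function names are hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Offsets (dx, dy) of the pixels taken into account, center included.
my %OFFSETS = (
    4 => [ [0,0], [-1,0], [1,0], [0,-1], [0,1] ],
    8 => [ [0,0], [-1,0], [1,0], [0,-1], [0,1],
           [-1,-1], [1,-1], [-1,1], [1,1] ],
);

# Replace every pixel by $pick (max or min) over its neighborhood.
# Assumption: pixels outside the image are simply ignored.
sub morph {
    my ($img, $pick, $neighborhood) = @_;
    my ($h, $w) = (scalar @$img, scalar @{ $img->[0] });
    my @out;
    for my $y (0 .. $h - 1) {
        for my $x (0 .. $w - 1) {
            my @vals;
            for my $o (@{ $OFFSETS{$neighborhood} }) {
                my ($xx, $yy) = ($x + $o->[0], $y + $o->[1]);
                next if $xx < 0 || $yy < 0 || $xx >= $w || $yy >= $h;
                push @vals, $img->[$yy][$xx];
            }
            $out[$y][$x] = $pick->(@vals);
        }
    }
    return \@out;
}

sub dilate_sketch { morph($_[0], \&max, $_[1] // 8) }
sub erode_sketch  { morph($_[0], \&min, $_[1] // 8) }

# Morphological gradient: dilation minus erosion, pixel by pixel.
sub gradient_sketch {
    my ($img, $nb) = @_;
    my $d = dilate_sketch($img, $nb);
    my $e = erode_sketch($img, $nb);
    my @out;
    for my $y (0 .. $#$img) {
        for my $x (0 .. $#{ $img->[$y] }) {
            $out[$y][$x] = $d->[$y][$x] - $e->[$y][$x];
        }
    }
    return \@out;
}

# A single bright pixel grows into a cross with the 4-pixel neighborhood.
my $spot = [ [0,0,0], [0,9,0], [0,0,0] ];
print join(' ', @$_), "\n" for @{ dilate_sketch($spot, 4) };
```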
- algebraic_difference IMAGE1, IMAGE2 [ inPlace = 0 ]
Performs the algebraic difference between IMAGE1 and IMAGE2. Although this is not a morphological operator, it is often used in conjunction with them. If the boolean inPlace flag is set, IMAGE1 contains the result.
Supported types: Byte, Short, Long, Float, Double
- watershed IMAGE [ neighborhood = 4 ]
Applies the watershed segmentation to IMAGE with the given neighborhood.
Supported types: Byte
- reconstruct IMAGE1, IMAGE2 [ neighborhood = 8, inPlace = 0 ]
Performs morphological reconstruction of IMAGE1 under the mask IMAGE2. The images can be two intensity images or two binary images of the same size. The returned image is an intensity or binary image, respectively.
If the boolean inPlace flag is set, IMAGE2 contains the result. neighborhood determines whether the algorithm assumes 4- or 8- pixel connectivity.
Supported types: Byte, Short, Long, Float, Double
- thinning IMAGE
Applies the skeletonization algorithm, returning an image in which the points of maximal Euclidean distance of each binary object are set.
Supported types: Byte
The IPA::Geometry module contains functions for mapping pixels from one location to another.
- mirror IMAGE [ type ]
Mirrors IMAGE vertically or horizontally, depending on the integer type, which can be one of the following constants:
IPA::Geometry::vertical
IPA::Geometry::horizontal
Supported types: all
- shift_rotate IMAGE [ where, size ]
Shifts IMAGE by the offset specified by the integer size, in direction where, which is one of the following constants:
IPA::Geometry::vertical
IPA::Geometry::horizontal
Supported types: all, except that the horizontal transformation does not support 1- and 4- bit images.
The IPA::Misc module contains miscellaneous helper routines.
- split_channels IMAGE, MODE = 'rgb'
Splits IMAGE into channels, with the selected MODE, which currently is 'rgb' only. Returns the channels as an anonymous array of image objects.
Supported types: RGB
- histogram IMAGE
Returns anonymous array of 256 integers, each representing number of pixels with the corresponding value for IMAGE.
Supported types: 8-bit
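The shape of the value returned by histogram can be illustrated in plain Perl on a flat list of 8-bit pixel values. The helper name histogram_sketch is hypothetical; it only mirrors the description above and does not touch Prima images.

```perl
use strict;
use warnings;

# Count how many pixels take each of the 256 possible 8-bit values.
sub histogram_sketch {
    my ($pixels) = @_;
    my @hist = (0) x 256;
    $hist[$_]++ for @$pixels;
    return \@hist;
}

my $h = histogram_sketch([ 0, 0, 17, 255, 17, 0 ]);

# Print only the values that actually occur.
printf "value %3d: %d pixels\n", $_, $h->[$_] for grep { $h->[$_] } 0 .. 255;
```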
M.D. Levine. Vision in Man and Machine. McGraw-Hill, 1985.
R. Deriche. Using Canny's criteria to derive a recursively implemented optimal edge detector. International Journal on Computer Vision, pages 167-187, 1987.
R. Boyle and R. Thomas Computer Vision. A First Course, Blackwell Scientific Publications, 1988, pp 32 - 34.
Image Processing Learning Resources.
William K. Pratt. Digital Image Processing. John Wiley, New York, 2nd edition, 1991
John C. Russ. The Image Processing Handbook. CRC Press Inc., 2nd Edition, 1995
L. Vincent & P. Soille. Watersheds in digital spaces: an efficient algorithm based on immersion simulations. IEEE Trans. Patt. Anal. and Mach. Intell., vol. 13, no. 6, pp. 583-598, 1991
L. Vincent. Morphological Grayscale Reconstruction in Image Analysis: Applications and Efficient Algorithms. IEEE Transactions on Image Processing, vol. 2, no. 2, April 1993, pp. 176-201.
Anton Berezin, Vadim Belman, Dmitry Karasik
The Prima toolkit
iterm - interactive tool for the IPA library.