Image Processing Technology Based On Machine Learning
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/MCE.2022.3150659, IEEE Consumer Electronics Magazine
Abstract—Machine learning is a relatively new field. As research in this field deepens, the applications of machine learning are becoming increasingly extensive. At the same time, with the advancement of science and technology, images have become an indispensable medium of information transmission, and image processing technology is also booming. However, traditional image processing technology has a number of shortcomings. This paper therefore introduces machine learning into image processing and studies image processing technology based on machine learning. It summarizes the currently popular image processing techniques, compares them in detail, and explains the limitations of each method. In addition, on the basis of image processing, this paper introduces machine learning algorithms, applies a convolutional neural network to feature extraction in image processing, and carries out simulation tests. In the tests, we select the VOC2007 dataset for image segmentation, the ImageNet dataset for target detection, and the CIFAR-100 dataset for image classification, and use the ROC curve for performance evaluation. The results show that the algorithm based on deep learning achieves high accuracy in image segmentation, classification, and target detection: the accuracy of image segmentation is 0.984, the accuracy of image classification is 0.987, and the accuracy of target detection is 0.986. Thus, image processing based on machine learning has great advantages.

Index Terms—Machine learning, Image Processing, Convolution Neural Network, Feature Extraction

I. INTRODUCTION
Extracting useful information from images has become vital, and so has image processing technology. Image processing technology has been widely used in various fields, including video surveillance, autonomous driving, industrial defect detection, agriculture, transportation, medicine, and the military [1]. With the development of science and technology, machine learning techniques have rapidly returned to the forefront of people's minds. Machine learning technology provides convenience for many aspects of modern society.
Digital image processing is now widely used. Owing to its significance, image processing technology has advanced greatly. Z. Zhu et al. [2] proposed a new multimodal image fusion approach based on image decomposition and sparse representation, which can effectively fuse images. L. Mauryaa et al. [3-4] proposed a social spider optimized image fusion method, which can increase contrast while maintaining brightness when fusing images. K. Li et al. [5] used multi-peak feature fusion for map labeling, which can effectively improve labeling accuracy. Montesinos et al. [6] used a Bayesian network to classify images, with a classification accuracy of more than 90%. Singh et al. [7] used a genetic algorithm to train a genetic function network and then used the trained model for satellite video categorization, which solved the problem of inaccurate satellite image categorization. Maa et al. [8] analyzed an object-based supervised land cover image classification algorithm. Liu et al. [9] used multiscale deep features to classify scenes from high-resolution satellite images, and the simulation accuracy was significantly improved. D. Meraa et al. [10] used feature selection methods to detect image targets, achieving real-time detection and a high detection rate.
This paper summarizes image processing technology, compares the various techniques in detail, and explains the limitations of each method. In addition, it introduces machine learning algorithms into image processing and applies a convolutional neural network to feature extraction, so as to effectively improve the accuracy of image segmentation, image classification, and target detection, which demonstrates the superiority of image processing technology based on machine learning.

II. MACHINE LEARNING AND IMAGE PROCESSING
A. Image processing
(1) Image Enhancement
Image enhancement technology adjusts various attributes of the image, such as brightness, contrast, saturation, and hue, to make the image clearer and to reduce noise. Image enhancement adds information to the original image or transforms its data by certain means, selectively highlighting the features of interest or suppressing unwanted features, so that the image matches the visual response characteristics. Sometimes the acquired image is dark, low in contrast, and noisy; in such cases the image needs to be enhanced. Image enhancement can be divided into two categories: frequency domain methods and spatial domain methods. The former regard the image as a two-dimensional signal and perform signal enhancement based on the two-dimensional Fourier transform. Representative algorithms of the latter include the local averaging method and the median filtering method. Histogram equalization is an important image enhancement technique that can be applied to the entire image or to extracted parts of it. Both the genetic algorithm and the particle swarm algorithm have the limitation of falling into local minima. The image can also be transformed directly to achieve enhancement, for example by a Laplacian transformation using the Laplacian operator, a log transformation, or a gamma transformation. Table 1
Authorized licensed use limited to: Politecnico di Milano. Downloaded on June 27,2023 at 09:19:31 UTC from IEEE Xplore. Restrictions apply.
2162-2248 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
selects several image enhancement algorithms for comparison. Digital image processing has developed rapidly into an independent and vigorous discipline in little more than 40 years. Image enhancement technology has gradually reached all aspects of human life and social production, and it plays a role in aerospace, biomedicine, industrial production, and public security.

TABLE 1
SEVERAL IMAGE ENHANCEMENT ALGORITHMS
Author                    | Work                                | Algorithm
Y.C. Chang [11]           | Contrast and brightness enhancement | Histogram transplantation
S. Suresh and S. Lal [12] | Contrast and brightness enhancement | Modified differential evolution

In recent years, the combination of various optimization techniques has also attracted great interest. One method combines cuckoo search (CS) with a particle swarm optimization algorithm. Cuckoo search is a population-based global search algorithm. Combined with the particle swarm algorithm and the genetic algorithm, it approaches the best solution better than the particle swarm algorithm or the genetic algorithm alone [13].
Next, we select several of the most common and simple image enhancement algorithms for detailed description.
1) Image enhancement based on histogram equalization
The main principle of the image enhancement algorithm based on histogram equalization is to redistribute the pixel values of the image. Its general application scenario is to increase the local contrast of the image. The image to which the algorithm is applied needs to have similar contrast among the local regions of interest. For example, histogram equalization can be used to make the contrast of overexposed and underexposed images more prominent, as well as that of images with an obvious difference between foreground and background. The algorithm still has certain defects. For images whose histograms have high peaks, the contrast looks unnaturally enhanced after processing; in addition, the number of gray levels in the transformed image is reduced, so some details are lost.
The calculation process of the histogram equalization algorithm is as follows:
The first step is the equalization process: histogram equalization ensures that the original ordering of pixel values remains unchanged in the pixel mapping, that is, brighter areas remain brighter and darker areas remain darker while the contrast is increased, and the brightness cannot be reversed. The value range of the pixel mapping function is between 0 and 255. The cumulative distribution function is monotonically increasing, and its range is 0 to 1.
The second step is to compute the cumulative distribution function. Comparing the probability distribution function with the cumulative distribution function, the plot of the former is uneven, whereas the latter is monotonically increasing. In histogram equalization, the mapping is calculated as follows:
$$ s_k = \sum_{j=0}^{k} \frac{n_j}{n}, \quad k = 1, 2, 3, \dots, L-1 \quad (1) $$
where n_j is the number of pixels at gray level j, n is the total number of pixels, and L is the number of gray levels.
2) Image enhancement based on Laplacian operator
The image enhancement algorithm based on the Laplace operator uses the Laplace operator to sharpen the image. The main idea is to sharpen the image by using its second derivative. In image processing, differentiation sharpens and integration blurs. Sharpening by second-order differentiation uses neighboring pixels to improve contrast. OpenCV also provides a Laplacian function; applied to a grayscale image, it is equivalent to extracting more of the edge information of the image. The Laplacian of a digital image is generally computed by convolving the image with a 3×3 convolution kernel, after which an enhanced image can be obtained. The main media by which humans transmit information are language and images. According to statistics, visual information accounts for 80% of the information received by humans, so images are a very important medium and means of information transmission. The convolution kernel is defined according to experimental needs. The convolution kernel used in this article is shown in Figure 1.

 0  -1   0
-1   5  -1
 0  -1   0
Figure 1. Laplacian convolution kernel

3) Image enhancement based on gamma transformation
The gamma transform mainly corrects images whose gray values are too high or too low, strengthening the contrast and achieving the effect of image correction. The calculation formula is as follows:
$$ s = C r^{\gamma}, \quad r \in [0, 1] \quad (2) $$
where C is a constant, γ is the gamma coefficient, r is the normalized input pixel value, and s is the pixel value after transformation. Choosing different values of γ gives different gamma curves, as shown in Figure 2.

Figure 2. Gamma curve

From Figure 2, we can see some rules. γ = 1 is the dividing line. When γ < 1, the small gray values of the image are strongly expanded, and the smaller the value of γ, the stronger the effect. When γ > 1, the large gray values of the image are expanded, and the larger the value of γ, the stronger the effect. In this way, we can change the value of gamma to enhance detail in the low or high gray range.
(2) Feature Extraction
Image features contain the basic information of the
process of image analysis and processing. According to previous research experience, extracting more image features is not always better. On the contrary, more image features increase the time needed for feature extraction during recognition and detection, thus reducing the efficiency of detection. Many factors affect the clarity and quality of an image. Uneven outdoor illumination causes the gray values of the image to be too concentrated; the image obtained by the camera undergoes digital/analog conversion, and noise is introduced during line transmission. The image quality inevitably decreases. In the milder case, the image is accompanied by noise and it is difficult to see the details; in the more severe case, the image is blurred, and even the outline of the object is difficult to see clearly. The features of images should have the following properties:
1) Scale invariance;
2) Rotation invariance;
3) Strong resistance to noise and stable robustness to illumination;
4) A low feature dimension.
Table 2 compares some feature extraction algorithms.

TABLE 2
FEATURE EXTRACTION ALGORITHM
Algorithm                                        | Feature extraction                                  | Limitation
Multi-image saliency analysis                    | Extract ROI                                         | Unwanted background information appears
Unified capability feature extraction            | Rotation and scale-invariant local features         | Different matching results for different data
Digital surface model                            | Pixel and feature level extraction of urban scenes  | Multi-feature classification is difficult
Reversible jump Markov chain Monte Carlo sampler | Extract features such as rivers, channels and roads | Too sensitive to experimental settings

These are hand-designed features. With the popularity of deep learning, the most popular feature extraction method is feature extraction based on CNN. Now we explain the feature extraction of HOG and LBP.
1) HOG features
The HOG feature is a local feature obtained by computing and counting the gradient histogram of the image. The HOG feature is invariant to geometric and optical changes of the image because it operates on local regions of the image. The procedure for HOG feature extraction is as follows:
In the first step, the image is converted into a gray image, and the conversion formula is as follows:
$$ Gray = 0.3R + 0.59G + 0.11B \quad (3) $$
In the second step, gamma correction is used to normalize the color information of the image and eliminate color interference. The calculation formula is as follows:
$$ I_2(x, y) = I_1(x, y)^{Gamma} \quad (4) $$
where I_1(x, y) is the value before correction, I_2(x, y) is the value after correction, and Gamma is the correction coefficient.
Then use the results of formulas (5) and (6) to calculate the amplitude and phase. The calculation formula is as follows:
where H(x, y) is the value at (x, y), and G(x, y) and α(x, y) are the amplitude and phase respectively.
In the fourth step, the image is divided into image blocks, and the image blocks are then divided into cell units. The shape of the cell unit can be chosen freely. For example, the image can be divided into 16 × 16 image blocks, and each image block is then divided into four 8 × 8 cell units.
The fifth step is to count the gradient histogram of each cell to form the feature descriptor of that cell. Suppose that the histogram is divided into 9 bins from 0 to 360 degrees; counting the gradient information of the 8 × 8 pixels then forms a 9-dimensional feature vector.
In the sixth step, four cells are combined into an image block, and the HOG feature descriptors of all cells in the image block are concatenated to obtain the HOG feature descriptor of the image block.
2) LBP features
The LBP feature is a local feature of an image. Its core idea is to use the value of the center pixel of an image block as the threshold and to compare the surrounding pixel values with it. If a surrounding value is greater than or equal to the threshold, it is recorded as 1; otherwise it is recorded as 0. These 1s and 0s are concatenated into a binary number that represents the texture information of the image. If the image block size is 3×3, an eight-bit binary number is generated. Taking a 3×3 image block as an example, the LBP extraction process is shown in Figure 3. It is calculated as follows:
$$ LBP = \sum_{p=0}^{7} 2^{p} f(g_p - g_c) \quad (6) $$
where g_c and g_p are the pixel value of the central pixel and the pixel value of the p-th neighborhood pixel respectively. f(x) is a step function, and its expression is as follows:
$$ f(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \quad (7) $$

 78  65  26        1  0  0
100  75  98   ->   1     1
 63  18  83        0  0  1
(10011001)_2 = (153)_10
Figure 3. LBP feature extraction

3) SIFT features
SIFT is a scale-invariant feature detection method. To achieve scale invariance, SIFT constructs an image scale space, which is invariant to scaling, rotation, and affine transformation of the image.
First, the image scale space is generated, that is, the original image is sampled at different frequencies to obtain multiple scaled images. Then the local extremum points in the scale space are detected. These may include edge response points and some points with low contrast, which need to be excluded to leave the local extremum points which can reflect the image features
more accurately. The histogram of gradient directions of the region centered on the extreme point is counted, and the maximum direction is taken as the main direction to generate the feature descriptor. The feature descriptor is calculated by taking a 16×16 window around the feature point and, with Gaussian weighting, building a histogram of gradient directions in 8 directions on each 4×4 image block; the accumulated value of each gradient direction forms a seed point. The gradient histogram of each seed point contains 8 values, so the 4 × 4 = 16 seed points give a total of 128 values, which are combined into a 128-dimensional SIFT feature vector.
Feature points are matched by the nearest neighbor algorithm, that is, the Euclidean distance between feature description vectors is calculated, and the point with the smallest distance is selected as the match.
(3) Image segmentation
Segmentation is the classification of image pixels into different categories, so that the parts within each category are related. This is very important when trying to identify important areas in an image, such as forest cover, pedestrians, and vehicles. With the introduction of post-European, researchers extend the combination of these algorithms to the segmentation field.
Table 3 shows a comparative analysis of some image segmentation algorithms.

TABLE 3
COMPARISON OF IMAGE SEGMENTATION ALGORITHMS
Image segmentation                | Advantage
Cuckoo search, McCulloch method   | High computational efficiency and good convergence
Markov Random Field Algorithm     | Use fewer features to achieve higher accuracy
Deep Convolutional Neural Network | Efficient boundary extraction improves classification
Zhenghang Firefly Algorithm       | Multi-level subdivision and less calculation time

The methods used include quantification, clustering, and the possibility of finding the minimum number of clusters. In the DCNN algorithm, boundary detection is first added to a segmentation encoder to create a new model. It is a combination of a collective network and an encoder, but the main disadvantage of this model is that it is very large, which merits attention from researchers using it. In addition, the output boundary is very vague.
(4) Image target detection
Target detection can be divided into two parts: target localization and target identification, that is, precisely determining the position and the category of the target in a given image. Under normal circumstances, the target in the image is uncertain: its length, width, height, angle, and so on are random, or the targets are not uniform but belong to multiple categories, which brings a certain degree of complexity to target recognition and localization.
(5) Image filtering
Filtering is a common method for eliminating interference in image preprocessing. It should not only suppress the image noise but also ensure that the edge information of the target in the image is not destroyed. There are many methods for image filtering, including linear filters such as Gaussian filtering and mean filtering, as well as nonlinear filters such as median filtering and bilateral filtering.
1) Gaussian filtering
Gaussian filtering is mainly used to eliminate Gaussian noise. It computes a weighted average of the gray values of the image, that is, a two-dimensional Gaussian kernel with a given scale factor is convolved with the pixels of the image to remove the noise.
2) Median filtering
The median filter is a classical nonlinear filtering method that is very effective at eliminating salt-and-pepper noise and plays a special role in the phase analysis of optical measurement fringe images. It is implemented as follows: the first step is to take an odd number of sampling points from a given sampling window in the image; the second step is to sort the sampled values and assign the middle value to the current pixel.
3) Edge filtering
This method is a compromise that combines geometric spatial proximity and pixel value difference. Compared with Gaussian filtering, bilateral filtering has an additional Gaussian variance. Therefore, pixels far away from the edge will not have much impact on the edge pixels.

B. Machine Learning
(1) Machine learning
What is machine learning? Machine learning is a multidisciplinary field covering probability theory, statistics, approximation theory, and the theory of complex algorithms. It uses the computer as a tool, aims to simulate human learning methods realistically and in real time, and organizes existing content into knowledge structures to effectively improve learning efficiency. Human beings accumulate a variety of experiences in the course of growth and life, and we regularly summarize this past experience and draw certain "rules" from it. Sometimes we label the results of these rules, for example as successful, failed, correct, or wrong. In this way, when we face new things that need to be judged and predicted, we naturally search for and use the "rules" we have summarized to guide our future life and work. Machine learning simulates this "learning" mode of human beings. It establishes a set of "models" by training on existing data, so as to make predictions for new input data. Common machine learning algorithms include the decision tree, naive Bayes, support vector machine, random forest, and artificial neural network algorithms. Machine learning has a wide range of applications. Whether in military or civilian fields, there are opportunities for machine learning algorithms, including data analysis and mining, pattern recognition, and bioinformatics.
(2) Convolution neural network
Research on convolutional neural networks can be traced back to the neocognitron model proposed by the Japanese scholar Kunihiko Fukushima. The convolutional neural network is a widely used machine learning model, mainly used for classification and prediction. Its structure includes an input layer, hidden layers, and an output layer. Because of its multi-layer structure, it can be used to approximate relatively complex functions. Traditional image processing technology is rather seriously affected by the environment. The convolutional neural network has strong robustness, which greatly improves the recognition
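As a rough illustration of how a convolutional layer turns an image into a feature map, the following NumPy sketch runs one convolution, a ReLU activation, and a 2×2 max pooling over a small synthetic image. It is a minimal sketch and not the network used in this paper: the function names (`conv2d_valid`, `relu`, `max_pool2x2`), the vertical-edge kernel, and the 6×6 test image are all assumptions made for the example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding and sum the products.

    As in most CNN libraries, this is cross-correlation (the kernel is not flipped).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Keep positive responses, set negative responses to zero."""
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    """Take the maximum over non-overlapping 2x2 blocks (odd borders are cropped)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Synthetic 6x6 image with a vertical edge: left half dark, right half bright.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hypothetical kernel that responds to dark-to-bright vertical edges.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = max_pool2x2(relu(conv2d_valid(image, kernel)))
print(feature_map)
```

In a trained network the kernel values are learned from data rather than fixed by hand; the fixed kernel here only makes the response to the edge easy to follow.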
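The median filtering procedure from the image filtering subsection (take an odd number of samples from a window, sort them, and assign the middle value to the current pixel) can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: a square window and edge-replicating padding at the image border, neither of which is specified in the text. The function name `median_filter` and the 5×5 test image are hypothetical.

```python
import numpy as np

def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighborhood."""
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    pad = size // 2
    # Assumption: replicate border pixels so every window has size*size samples.
    padded = np.pad(image, pad, mode="edge")
    out = np.empty_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + size, j:j + size]
            ordered = np.sort(window, axis=None)   # step 1: sort the samples
            out[i, j] = ordered[ordered.size // 2]  # step 2: take the middle value
    return out

# Flat gray image corrupted by one "salt" and one "pepper" pixel.
noisy = np.full((5, 5), 100, dtype=np.uint8)
noisy[2, 2] = 255
noisy[0, 4] = 0

clean = median_filter(noisy, size=3)
print(clean)
```

Because each 3×3 window here contains at most two outliers among nine samples, the middle value is always the background gray level, which is why the median filter removes isolated salt-and-pepper pixels without blurring them into their neighbors.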
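The LBP computation from the feature extraction subsection can be checked against the 3×3 block of Figure 3. The sketch below is a minimal NumPy version; the function name `lbp_code` and the reading order (clockwise, starting at the top-left neighbor, first neighbor as the most significant bit) are assumptions, since the text does not state the neighbor order. They were chosen so that the bit string agrees with the figure.

```python
import numpy as np

# Neighbor positions read clockwise from the top-left corner (assumed order).
CLOCKWISE = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def lbp_code(block):
    """Return the 8 threshold bits and the LBP code of a 3x3 block."""
    center = block[1, 1]
    # Step function f(x): 1 if neighbor - center >= 0, else 0.
    bits = [1 if block[r, c] >= center else 0 for r, c in CLOCKWISE]
    code = 0
    for bit in bits:
        code = (code << 1) | bit  # first neighbor becomes the most significant bit
    return bits, code

# The 3x3 block shown in Figure 3 (center value 75).
block = np.array([[78, 65, 26],
                  [100, 75, 98],
                  [63, 18, 83]])

bits, code = lbp_code(block)
print(bits, code)  # [1, 0, 0, 1, 1, 0, 0, 1] 153
```

The result matches the figure: (10011001)_2 = (153)_10. A different starting neighbor or direction would in general give a different code, so the order must be kept fixed across an image.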
negatives, taking false negatives into account.

B. Performance Evaluation Analysis
The ROC curve, also known as the receiver operating characteristic curve, is also a performance evaluation standard. The curve shows the classification performance under all possible classification thresholds: it plots the true positive (TP) rate against the false positive (FP) rate as the classification threshold is varied. Lowering the classification threshold results in more samples being classified as positive, thereby increasing the numbers of both false positives and true positives.
The ROC curve is shown in Figure 6. Figure 6(a) shows the curve under ideal conditions.
[Figure 6. ROC curve; vertical axis: true positive rate]

The results show that image processing technology based on machine learning can effectively improve the accuracy of image segmentation, classification, and target detection. Artificial intelligence will continue to be a hot topic in the coming decades, and machine learning and evolutionary algorithms, as the most important research fields of artificial intelligence, will also flourish. We believe that the application of machine learning in image processing technology will become more and more extensive, and that the research results of this paper will provide some reference value for it.

REFERENCES
[1] Zhang Hui, Wang Kunfeng, Wang Feiyue. Application and progress of deep learning in target detection. Acta Automatica Sinica, 2017, 43(8):
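The threshold sweep behind the ROC curve in the performance evaluation subsection can be sketched as follows. This minimal NumPy example uses made-up labels and scores purely for illustration; they are not results from this paper, and the function name `roc_points` is hypothetical. Each distinct score is used as a threshold, from strictest to most permissive, and the area under the curve is computed with the trapezoidal rule.

```python
import numpy as np

def roc_points(labels, scores):
    """Return false positive rates and true positive rates over all thresholds."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    # Start above every score (nothing predicted positive), then lower the threshold.
    thresholds = np.concatenate(([np.inf], np.sort(np.unique(scores))[::-1]))
    n_pos = labels.sum()
    n_neg = (~labels).sum()
    fpr, tpr = [], []
    for t in thresholds:
        predicted = scores >= t
        tp = np.sum(predicted & labels)    # true positives at this threshold
        fp = np.sum(predicted & ~labels)   # false positives at this threshold
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return np.array(fpr), np.array(tpr)

# Toy data (assumed): 1 = positive sample, 0 = negative sample.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

fpr, tpr = roc_points(labels, scores)
# Trapezoidal area under the ROC curve.
auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)
print(fpr, tpr, auc)  # auc is 8/9 for this toy data
```

As the threshold is lowered, both rates can only stay the same or grow, which is the behavior described in the text; an ideal classifier would reach a true positive rate of 1 while the false positive rate is still 0.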