Deep Learning For Smart Fish Farming: Applications, Opportunities and Challenges
Xinting Yang1,2,3, Song Zhang1,2,3,5, Jintao Liu1,2,3,6, Qinfeng Gao4, Shuanglin Dong4, Chao Zhou1,2,3*
1. Beijing Research Center for Information Technology in Agriculture, Beijing 100097, China
2. National Engineering Research Center for Information Technology in Agriculture, Beijing 100097, China
3. National Engineering Laboratory for Agri-product Quality Traceability, Beijing, 100097, China
4. Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, Shandong Province, 266100, China
5. Tianjin University of Science and Technology, Tianjin 300222, China
6. Department of Computer Science, University of Almeria, Almeria, 04120, Spain
DOI: https://doi.org/10.1111/raq.12464
Abstract
The rapid emergence of deep learning (DL) technology has resulted in its successful use in various
fields, including aquaculture. DL creates both new opportunities and a series of challenges for
information and data processing in smart fish farming. This paper focuses on applications of DL in
aquaculture, including live fish identification, species classification, behavioral analysis, feeding
decisions, size or biomass estimation, and water quality prediction. The technical details of DL
methods applied to smart fish farming are also analyzed, including data, algorithms, and performance.
The review results show that the most significant contribution of DL is its ability to automatically
extract features. However, challenges still exist; DL is still in a weak artificial intelligence stage and
requires large amounts of labeled data for training, which has become a bottleneck that restricts further
DL applications in aquaculture. Nevertheless, DL still offers breakthroughs for addressing complex
data in aquaculture. In brief, our purpose is to provide researchers and practitioners with a better
understanding of the current state of the art of DL in aquaculture, which can provide strong support
for implementing smart fish farming applications.
1. Introduction

In 2016, the global fishery output reached a record high of 171 million tons. Of this output, 88% was
consumed directly by human beings and is essential for achieving the Food and Agriculture
Organization of the United Nations (FAO)'s goal of building a world free from hunger and malnutrition
(FAO, 2018). However, as the population continues to grow, the pressure on the world’s fisheries will
continue to increase (Merino et al., 2012 ; Clavelle et al., 2019).
Smart fish farming refers to a new scientific field whose objective is to optimize the efficient use
of resources and promote sustainable development in aquaculture through the deep integration of the
Internet of Things (IoT), big data, cloud computing, artificial intelligence and other modern
information technologies. Through this integration, real-time data collection, quantitative decision-making,
intelligent control, precise input management and personalized services can be achieved, ultimately
forming a new mode of fishery production (Figure 1).
Figure 1. The role of deep learning and big data in smart fish farming
In smart fish farming, data and information are the core elements. The aggregation and advanced
analytics of all or part of the data will lead to the ability to make scientifically based decisions.
However, the massive amount of data in smart fish farming imposes a variety of challenges, such as
multiple sources, multiple formats and complex data. Multiple sources include information regarding
the equipment, the fish, the environment, the breeding process and people. The multiple formats
include text, image and audio. The data complexities stem from different cultured species, modes and
stages. Addressing the above high-dimensional, nonlinear and massive data is an extremely
challenging task.
More attention is being paid to data and intelligence in current fish farming than ever before. As
shown in Figure 1, data-driven intelligence methods, including artificial intelligence and big data, have
begun to transform these data into operable information for smart fish farming (Olyaie et al., 2017 ;
Shahriar & McCulluch, 2014). Artificial intelligence, especially machine learning and computer vision
applications, is the next frontier technology of fishery data systems (Bradley et al., 2019). Traditional
machine learning methods, such as the support vector machine (SVM) (Cortes & Vapnik, 1995),
artificial neural networks (ANN) (Hassoun, 1996), decision trees (Quinlan, 1986), and principal
component analysis (Jolliffe, 1987), have achieved satisfactory performances in a variety of
applications (Wang et al., 2018). However, the conventional machine learning algorithms rely heavily
on features manually designed by human engineers (Goodfellow, 2016), and it is still difficult to
determine which features are most suitable for a given task (Min et al., 2017).
As a breakthrough in artificial intelligence (AI), deep learning (DL) has overcome previous
limitations. DL methods have demonstrated outstanding performances in many fields, such as
agriculture (Yang et al., 2018 ; Gouiaa & Meunier, 2017), natural language processing (Li, 2018),
medicine (Gulshan et al., 2016), meteorology (Mao et al., 2019), bioinformatics (Min et al., 2017),
and security monitoring (Dhiman & Vishwakarma, 2019). DL belongs to the field of machine learning
but improves data processing by extracting highly nonlinear and complex features via sequences of
multiple layers automatically rather than requiring handcrafted optimal feature representations for a
particular type of data based on domain knowledge (LeCun et al., 2015 ; Goodfellow, 2016). With
its automatic feature learning and high-volume modeling capabilities, DL provides advanced analytical
tools for revealing, quantifying and understanding the enormous amounts of information in big data to
support smart fish farming (Liu et al., 2019). DL techniques can be used to solve the problems of
limited intelligence and poor performance in the analysis of massive, multisource and heterogeneous
big data in aquaculture. By combining the IoT, cloud computing and other technologies, it is possible
to achieve intelligent data processing and analysis, intelligent optimization and decision-making
control functions in smart fish farming.
This paper provides a comprehensive review of DL and its applications in smart fish farming.
First, the various DL applications related to aquaculture are outlined to highlight the latest advances in
relevant areas, and the technical details are briefly introduced. Then, the challenges and future trends
of DL in smart fish farming are discussed. The remainder of this paper is organized as follows: After
the Introduction, Section 2 introduces basic background knowledge such as DL terminology,
definitions, and the most popular learning models and algorithms. Section 3 describes the main
applications of DL in aquaculture, and Section 4 provides technical details. Section 5 discusses the
advantages, disadvantages and future trends of DL in smart fish farming, and Section 6 concludes the
paper.
Figure. Hierarchical feature extraction in a deep network: the visible layer (pixels) feeds successive hidden layers that learn edges, corners and contours, and object parts, and the output layer assigns class probabilities (e.g., fish: 0.98; shrimp: 0.01; weeds: 0.01).
Compared with the shallow structure of traditional machine learning, the deep hierarchical
structure used in DL makes it easier to model nonlinear relationships through combinations of
functions (Liakos et al., 2018 ; Wang et al., 2018). The advantages of DL are especially obvious
when the amount of data to be processed is large. More specifically, the hierarchical learning and
extraction of different levels of complex data abstractions in DL provides a certain degree of
simplification for big data analytics tasks, especially when analyzing massive volumes of data,
performing data tagging, information retrieval, or conducting discriminative tasks such as
classification and prediction (Najafabadi et al., 2015). Hierarchical architecture learning systems have
achieved superior performances in several engineering applications (Poggio & Smale, 2003 ;
Mhaskar & Poggio, 2016).
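As a toy illustration of why composing layers matters, the following sketch (hand-set weights, not a trained model) stacks two ReLU layers to compute the XOR function, a nonlinear relationship that no single linear layer can represent:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Layer 1: two simple linear features followed by a ReLU nonlinearity.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])

# Layer 2: linearly combines the layer-1 features into the final output.
W2 = np.array([1.0, -2.0])

def two_layer_net(x):
    h = relu(x @ W1.T + b1)   # hidden features: relu(x1+x2) and relu(x1+x2-1)
    return h @ W2             # their combination yields XOR

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print([two_layer_net(x) for x in X])  # XOR: 0, 1, 1, 0
```

The same principle, iterated over many layers, lets DL models build the pixel-to-edge-to-object-part hierarchy described above.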
The overall structure, process and principles of applying deep learning to fishery management are
depicted in Figure 4. After the data are collected and transmitted, deep learning performs inductive
analysis, learns the experience or knowledge from the samples, and finally formulates rules to guide
management decisions.
Figure 4. Deep-learning-enabled advanced analytics for smart fish farming
However, when applying deep learning, the most serious issue is that of hallucination. Another
failure mode of neural networks is overlearning or overfitting. In addition, neural networks can be
tricked into producing completely different outputs after imperceptible perturbations are applied to
their inputs (Belthangady & Royer, 2019 ; Moosavi-Dezfooli et al., 2016).
A DL model can better distinguish differences in characteristics, categories, and the environment,
which can be used to extract the features of target fish from an image collected in an unconstrained
underwater environment. Fish species can be classified by identifying several basic morphological features
(i.e., the head region, body shape, and scales) (Rauf et al., 2019). Most of the DL models show better
results compared with the traditional approaches, reaching classification accuracies above 90% on the
LifeCLEF 14 and LifeCLEF 15 benchmark fish datasets (Ahmad et al., 2016). To avoid the need for
large amounts of annotated data, general deep structures must be fine-tuned to improve the
effectiveness with which they can identify the pertinent information in the feature space of interest.
Accordingly, various DL models for identifying fish species have been developed using a pretrained
approach called transfer learning (Siddiqui et al., 2017 ; Lu et al., 2019 ; Allken et al., 2019). By
fine-tuning pretrained models to perform fish classification using small-scale datasets, these
approaches enable the network to learn the features of a target dataset accurately and comprehensively
(Qiu et al., 2018), and achieve sufficiently high accuracies to serve as economical and effective
alternatives to manual classification.
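The idea of reusing pretrained features on a small target dataset can be sketched as follows. Here a fixed random ReLU projection stands in for a frozen pretrained convolutional backbone (an assumption for illustration only, not any cited author's network), and only a small logistic-regression head is trained on the new task:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -60.0, 60.0)))

# Stand-in for a frozen, pretrained backbone: its weights are never updated.
W_frozen = rng.normal(size=(2, 8))
def features(x):
    return np.maximum(x @ W_frozen, 0.0)

# Small "target" dataset: two classes as well-separated 2-D blobs.
X = np.vstack([rng.normal(2.0, 0.5, (50, 2)), rng.normal(-2.0, 0.5, (50, 2))])
y = np.array([1] * 50 + [0] * 50)

# Fine-tune only the new classification head (logistic regression).
F = features(X)
w, b = np.zeros(8), 0.0
for _ in range(300):
    p = sigmoid(F @ w + b)
    w -= 0.5 * (F.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

acc = np.mean((sigmoid(F @ w + b) > 0.5) == y)
print(acc >= 0.9)  # the trained head alone separates the classes
```

In practice the frozen backbone would be a network pretrained on a large dataset such as ImageNet, and fine-tuning may also unfreeze its upper layers; the point of the sketch is that only a small fraction of the parameters need target-domain labels.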
In addition to visual characteristics, different species of grouper produce different sound
frequencies that can be used to distinguish these species. For example, CNN and LSTM models were
used to classify sounds produced by four species of grouper; their resulting classification accuracy was
significantly better than the previous weighted mel-frequency cepstral coefficients (WMFCCs) method
(Ibrahim et al., 2018).
Nevertheless, due to the influence of various interferences and the small sets of available samples,
the accuracy of same-species classification still has considerable room to improve. Most current fish
classification methods are designed to distinguish fish with significant differences in body size or shape;
thus, the classification of similar fish and fish of the same species is still challenging (dos Santos &
Gonçalves, 2019).
Table 2 Species classification

| # | Reference | Model | Framework | Data | Preprocessing/augmentation | Transfer learning | Evaluation index | Results | Comparison with other methods |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Siddiqui et al. (2017) | CNN | MatConvNet | Videos collected from several baited remote underwater video sampling programs during 2011–2013 | Resized | Y | Accuracy | 94.3% | SRC: 65.42%; CNN: 87.46% |
| 2 | Ahmad et al. (2016) | CNN | NA | LifeCLEF14 and LifeCLEF15 datasets | Resized and converted to grayscale | N | Precision and recall | AC > 90%; each fish image takes approximately 1 ms to classify | SVM, KNN, SRC, PCA-SVM, PCA-KNN, CNN-SVM, CNN-KNN |
| 3 | Ibrahim et al. (2018) | LSTM and CNN | NA | 60,000 audio files, each 20 s long, sampled at 10 kHz | NA | N | Accuracy | 90% | WMFCC < 90% |
| 4 | Qiu et al. (2018) | CNN | NA | ImageNet dataset, F4K dataset, and a small-scale fine-grained dataset (i.e., Croatian or QUT fish dataset) | Super-resolution, flipping and rotation | Y | Accuracy | 83.92% | B-CNNs: 83.52%; B-CNNs + SE blocks: 83.78% |
| 5 | Allken et al. (2019) | CNN | TensorFlow | ImageNet classification dataset and images collected by the Deep Vision system; a total of 1,216,914 stereo image pairs from 63 h 19 min of data collection | Resized; rotation, translation, shearing, flipping and zooming | Y | Accuracy | 94% | NA |
| 6 | Rauf et al. (2019) | CNN | NA | Fish-Pak | Resized; image background made transparent | Y | Accuracy, precision, recall, F1-score | The proposed method achieves state-of-the-art performance and outperforms existing methods | VGG-16; one-, two- and three-block VGG; LeNet-5; AlexNet; GoogleNet; ResNet-50 |
| 7 | Lu et al. (2019) | CNN | NA | A total of 16,517 fish catching images provided by the Fishery Agency, Council of Agriculture (Taiwan) | Resized; horizontal and vertical flipping, width and height shifting, rotation, shearing, zoom-in and zoom-out | Y | Accuracy | > 96.24% | NA |
| 8 | Jalal et al. (2020) | YOLO, CNN | TensorFlow | LCF15 and UWA datasets | NA | N | Accuracy | LCF15: 91.64%; UWA: 79.8% | NA |
3.3 Behavioral analysis
Fish are sensitive to environmental changes, and they respond to changes in environmental
factors through behavioral changes (Saberioon et al., 2017; Mahesh et al., 2008). In
addition, behavior serves as an effective reference indicator for fish welfare and harvesting (Zion,
2012). Relevant behavior monitoring, especially for unusual behaviors, can provide a nondestructive
understanding and an early warning of fish status (Rillahan et al., 2011). Real-time monitoring of fish
behavior is essential for understanding their status and facilitating capture and feeding decisions
(Papadakis et al., 2012).
Fish display behavior through a series of actions that have a certain continuity and temporal
correlation. Methods that identify an action from a single image lose its relevance to the images
acquired before and after the action. Therefore, it is desirable to use the time-series information in
the prior and subsequent frames of a video to capture action relevance. DL methods have shown a strong
ability to recognize visual patterns (Wang et al., 2017). Table 3 shows the details of the behavioral
analysis using DL. In particular, due to their powerful modeling capabilities for sequential data, RNNs
have the potential to address the above problem effectively (Schmidhuber, 2015). Zhao et al. (2018a)
proposed a novel method based on a modified motion influence map and an RNN to systematically
detect, localize and recognize unusual local behaviors of a fish school in intensive aquaculture.
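A deliberately minimal recurrent unit (hand-set weights, not the cited authors' model) illustrates why sequence models help with behavior: the hidden state accumulates frame-to-frame change, so a burst of activity and a calm period produce clearly different scores even when individual frames look similar:

```python
import numpy as np

def frame_activity(prev, cur):
    # mean absolute pixel change between consecutive frames
    return np.mean(np.abs(cur - prev))

def recurrent_score(frames, w_x=1.0, w_h=1.0):
    # h_t = w_h*h_{t-1} + w_x*x_t : a linear recurrent unit accumulating motion
    h = 0.0
    for prev, cur in zip(frames, frames[1:]):
        h = w_h * h + w_x * frame_activity(prev, cur)
    return h / (len(frames) - 1)

rng = np.random.default_rng(0)
calm = [np.zeros((8, 8)) + rng.normal(0, 0.01, (8, 8)) for _ in range(10)]
burst = [rng.uniform(0, 1, (8, 8)) for _ in range(10)]  # strong frame-to-frame change
print(recurrent_score(calm) < recurrent_score(burst))  # True
```

A trained RNN replaces the fixed weights with learned ones and the scalar state with a vector, but the mechanism of carrying temporal context across frames is the same.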
Tracking individuals in a fish school is a challenging task that involves complex nonrigid
deformations, similar appearances, and frequent occlusions. Fish heads have relatively fixed shapes
and colors that can be used to track individual fish (Butail & Paley, 2011 ; Wang et al., 2012). Thus,
data associations can be achieved across frames, and as a result, behavior trajectory tracking can be
implemented without being affected by frequent occlusions (Wang et al., 2017). In addition, data
enhancement and iterative training methods can be used to optimize the accuracy of classification tasks
for identifying behaviors that cannot be distinguished by the human eye (Xu & Cheng, 2017). Finally,
idTracker and further developments in identification algorithms for unmarked animals have been
successful for small groups of 2–15 individuals (Pérez-Escudero et al., 2014). An improved algorithm,
called idtracker.ai, has also been proposed. Using two different CNNs, idtracker.ai can track all the
individuals in both small and large groups (up to 100 individuals) with a recognition accuracy that
typically exceeds 99.9% (Romero-Ferrero et al., 2019).
When deep learning is used to classify fish behavior, the crossing, overlapping and occlusion caused
by free-swimming fish (Zhao et al., 2018a; Romero-Ferrero et al., 2019) and low-quality images of the
underwater environment (Zhou et al., 2019) remain the main challenges to behavior analysis; these
problems still need to be solved.
Table 3 Behavior analysis

| # | Reference | Model | Framework | Data | Preprocessing/augmentation | Transfer learning | Evaluation index | Results | Comparison with other methods |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Xu and Cheng (2017) | CNN | MatConvNet | Head feature maps stored in each trajectory segment, together with the trajectory ID, form the initial training dataset | Shifting; horizontal and vertical rotation | N | Precision, recall, F1-measure, MT, ML, Fragments, ID Switch | The proposed method performs significantly well on all metrics | NA |
| 2 | Zhao et al. (2018a) | RNN | TensorFlow | Behavior dataset made manually following All Occurrences Sampling | NA | N | Accuracy | Detection, localization and recognition: 98.91%, 91.67% and 89.89% | Accuracy of OMIM less than 82.45% |
| 3 | Wang et al. (2017) | CNN | MatConvNet | 300 frames randomly selected from each of 5 datasets, with the head point manually annotated in each frame | Rotated | N | IR, miss ratio, error ratio, precision, recall, MT, ML, Frag, IDS | The proposed method outperforms two state-of-the-art fish tracking methods in terms of 7 performance metrics | idTracker |
| 4 | Romero-Ferrero et al. (2019) | CNN | NA | 184 juvenile zebrafish; the dataset comprises 3,312,000 uncompressed, grayscale, labeled images | 'Blobs' extracted and then oriented | Y | Accuracy | 99.95% | NA |
| 5 | Li et al. (2020) | CNN | TensorFlow | Images collected from a glass aquarium | Cutting and synthesis | N | Accuracy, precision and recall | Accuracy: 99.93%; precision: 100%; recall: 99.86% | NA |
3.4 Size or biomass estimation
It is essential to continuously observe fish parameters such as abundance, quantity, size, and
weight when managing a fish farm (França Albuquerque et al., 2019). Quantitative estimation of fish
biomass forms the basis of scientific fishery management and conservation strategies for sustainable
fish production (Zion, 2012 ; Li et al., 2019 ; Saberioon & Císař, 2018 ; Lorenzen et al., 2016 ;
Melnychuk et al., 2017). However, it is difficult to estimate fish biomass without human intervention
because fish are sensitive and move freely within an environment where visibility, lighting and stability
are typically uncontrollable (Li et al., 2019).
Recent applications of DL to fishery science offer promising opportunities for massive sampling
in smart fish farming. Machine vision combined with DL can enable more accurate estimation of fish
morphological characteristics such as length, width, weight, and area. Most reported applications have
been either semisupervised or supervised (Marini et al., 2018 ; Díaz-Gil et al., 2017). For example,
the Mask R-CNN architecture was used to estimate the size of saithe (Pollachius virens), blue whiting
(Micromesistius poutassou), redfish (Sebastes spp.), Atlantic mackerel (Scomber scombrus), velvet
belly lanternshark (Etmopterus spinax), Norway pout (Trisopterus esmarkii), Atlantic herring (Clupea
harengus) (Garcia et al., 2019) and European hake (Álvarez-Ellacuría et al., 2019). Another method
for indirectly estimating fish size is to first detect the head and tail of fish with a DL model and then
calculate the length of fish on that basis. Although this approach increases the workload, it is suitable
for more complex images (Tseng et al., 2020). The structural characteristics and computational
capabilities of DL models can be fully exploited (Hu et al., 2014) to achieve superior performances
compared with other models. In addition, DL-based methods can eliminate the influence of fish
overlap during length estimation.
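Once a detector returns head and tail keypoints, the length computation itself is simple geometry, as the following sketch shows; the coordinates and the pixel-per-centimetre calibration below are hypothetical values for illustration:

```python
import math

def fish_length_cm(head_xy, tail_xy, px_per_cm):
    """Convert detected head/tail keypoints (in pixels) to a body length (cm).
    Assumes a roughly planar view and a known pixel-per-cm calibration."""
    dx = head_xy[0] - tail_xy[0]
    dy = head_xy[1] - tail_xy[1]
    return math.hypot(dx, dy) / px_per_cm

# Hypothetical detections from a keypoint model, calibrated at 12.5 px/cm:
print(round(fish_length_cm((640, 210), (340, 170), 12.5), 1))  # 24.2
```

In a real deployment the calibration would come from a stereo rig or a reference object in the scene, and the keypoints from the trained detector.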
The number of fish shoals can also provide valuable input for the development of intelligent
systems. DL has shown comprehensive advantages in animal computing. To achieve automatic
counting of fish groups under high density and frequent occlusion characteristics, a fish distribution
map can be constructed using DL; then, the fish distribution, density and quantity can be obtained.
These values can indirectly reflect fish conditions such as starvation, abnormalities and other states,
thereby providing an important reference for feeding or harvest decisions (Zhang et al., 2020).
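The density-map idea behind such counting can be sketched as follows: each annotated fish contributes a unit-mass Gaussian, so integrating the map recovers the count even when individuals overlap (the point coordinates here are illustrative):

```python
import numpy as np

def density_map(points, shape, sigma=2.0):
    """Place a unit-mass Gaussian at each annotated fish centre; the map's
    integral then equals the fish count, which is what a counting network
    is trained to regress."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    dm = np.zeros(shape)
    for (y, x) in points:
        g = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma ** 2))
        dm += g / g.sum()  # normalise so each fish contributes exactly 1
    return dm

points = [(10, 12), (30, 40), (25, 5), (40, 44)]  # illustrative annotations
dm = density_map(points, (48, 48))
print(round(dm.sum()))  # 4 fish
```

A CNN trained to predict such maps from raw images then yields the distribution, density and count in one forward pass.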
The age structure of a fish school is another important input to fishery assessment models. The
current method for determining fish school age structure relies on manual assessments of otolith age,
which is a labor-intensive and expertise-dependent process. Using a DL approach, target recognition
can instead be performed by using a pretrained CNN to estimate fish ages from otolith images. The
accuracy is equivalent to that of human experts, and the process is considerably faster (Moen et al., 2018).
Optical imaging and sonar are often used to monitor fish biomass. A DL algorithm can be applied
to automatically learn the conversion relationship between sonar images and optical images, thus
allowing a "daytime" image to be generated from a sonar image and a corresponding night vision
camera image. This approach can be effectively used to count fish, among other applications
(Terayama et al., 2019).
Table 4 Size or biomass estimation

| # | Reference | Model | Framework | Data | Preprocessing/augmentation | Transfer learning | Evaluation index | Results | Comparison with other methods |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Levy et al. (2018) | CNN | Keras | ILSVRC12 (ImageNet) dataset | NA | Y | Accuracy | The method is robust, can handle different types of data, and copes well with the unique challenges of marine images | YOLO network topology |
| 2 | Terayama et al. (2019) | GAN | NA | 1,334 camera and sonar image pairs from 10 min of data acquired at 3 fps | Resized; normalized; flipped | N | NA | The proposed model successfully generates realistic daytime images from sonar and night-camera images | NA |
| 3 | Moen et al. (2018) | CNN | TensorFlow | 4,109 images of otolith pairs and 657 images of single otoliths, totaling 8,875 otoliths | Rotation and normalization | N | MSE, MCV | Mean CV: 8.89%; lowest MSE value: 2.65 | Accuracy comparable to human experts (mean CV of 8.89%) |
| 4 | Álvarez-Ellacuría et al. (2019) | R-CNN | NA | COCO dataset; photos obtained with a single webcam at a resolution of 1,280×760 | NA | Y | Root-mean-square deviation | 1.9 cm | NA |
| 5 | Zhang et al. (2020) | CNN | Keras | Data collected from the "Deep Blue No. 1" net cage at a resolution of 1,920×1,080 and a frame rate of 60 fps | Resized and enhanced; Gaussian and salt-and-pepper noise added | N | Accuracy | Accuracy: 95.06% | CNN: 89.61%; MCNN: 91.18% |
| 6 | Tseng et al. (2020) | CNN | Keras | 9,000 fish images provided by the Fisheries Agency, Council of Agriculture (Taiwan); another dataset of 154 fish images acquired at Nan-Fang-Ao fishing harbor (Yilan, Taiwan) | Resized; rotation, horizontal and vertical shifting, horizontal and vertical flipping, and scaling | N | Accuracy | Accuracy: 98.78% | NA |
| 7 | Fernandes et al. (2020) | CNN | NA | 1,653 fish images acquired with a Sony DSC-WX220 digital camera | NA | NA | R² | R²: BW: 0.96; CW: 0.95 | NA |
3.5 Feeding decision-making
In intensive aquaculture, the feeding level of fish directly determines the production efficiency
and breeding cost (Chen et al., 2019). In actual production, the feed cost for some varieties of fish
accounts for more than 60% of the total cost (de Verdal et al., 2017 ; Føre et al., 2016 ; Wu et al.,
2015). Unreasonable feeding reduces production efficiency: insufficient feeding restricts fish growth,
while excessive feeding reduces the feed conversion efficiency and leaves residual bait that pollutes
the environment (Zhou et al., 2018a). Therefore, large economic benefits can be gained by
optimizing the feeding process (Zhou et al., 2018c). However, many factors affect fish feeding,
including physiological, nutritional, environmental and husbandry factors; consequently it is difficult
to detect the real needs of fish (Sun et al., 2016).
Traditionally, feeding decisions depend primarily on experience and simple timing controls (Liu
et al., 2014b). At present, research on making feeding decisions using DL has focused mostly on
image analysis. By using machine vision, an improved feeding strategy can be developed in
accordance with fish behavior. Such a system can terminate the feeding process at more appropriate
times, thereby reducing unnecessary labor and improving fish welfare (Zhou et al., 2018a). The feeding
intensity of fish can also be roughly graded and used to guide feeding. A combination of CNN and
machine vision has proved to be an effective way to assess fish feeding intensity characteristics (Zhou
et al., 2019); the trained model accuracy was superior to that of two manually extracted feature
indicators: flocking index of fish feeding behavior (FIFFB) and snatch intensity of fish feeding
behavior (SIFFB) (Zhou et al., 2017b ; Chen et al., 2017). This method can be used to detect and
evaluate fish appetite to guide production practices. Due to recent advances in CNNs, it would be
interesting to consider the use of newer neural network frameworks for both spatial and motion feature
extraction. When combined with time-series information, such models may enable better feeding
decisions. Based on this idea, Måløy et al. (2019) considered both temporal and spatial flow by
combining a three-dimensional CNN (3D-CNN) and an RNN to form a new dual deep neural network.
The 3D-CNN and RNN were used to capture spatial and temporal sequence information, respectively,
thereby achieving recognition of both feeding and nonfeeding behaviors. A comparison showed that
the recognition results achieved with this dual-flow structure were better than those of either individual
CNN or RNN models.
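The dual-stream idea can be sketched shape-wise. The simple per-frame mean and frame-difference statistics below merely stand in for the 3D-CNN (appearance) and RNN (motion) streams, an assumption for illustration, to show how the two feature vectors are fused before a single classifier head:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-ins for the two streams (assumed shapes, not the authors' exact nets):
def spatial_stream(clip):
    # e.g. a 3D-CNN summarising appearance across the clip
    return clip.mean(axis=0).ravel()           # (H*W,) appearance feature

def temporal_stream(clip):
    # e.g. an RNN summarising frame-to-frame motion
    diffs = np.abs(np.diff(clip, axis=0))
    return diffs.mean(axis=0).ravel()          # (H*W,) motion feature

clip = rng.uniform(0, 1, (16, 4, 4))           # 16 frames of 4x4 "video"
fused = np.concatenate([spatial_stream(clip), temporal_stream(clip)])
print(fused.shape)  # (32,) - both streams feed one classifier head
```

Concatenation is the simplest fusion strategy; weighted or gated fusion layers are common refinements.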
The studies discussed above focused primarily on images. However, many factors affect fish
feeding (Sun et al., 2016); consequently, considering only images is insufficient. In the future,
additional data, such as environmental measurements and fish physiological data, will need to be
incorporated to achieve more reasonable feeding decisions.
Table 5 Feeding decisions

| # | Reference | Model | Framework | Data | Preprocessing/augmentation | Transfer learning | Evaluation index | Results | Comparison with other methods |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Måløy et al. (2019) | RNN | TensorFlow | 76 videos taken at a resolution of 224×224 pixels with RGB color channels at 24 f/s | NA | N | Accuracy | 80% | NA |
| 2 | Zhou et al. (2019) | CNN | NA | Images collected from a laboratory setup at 1 f/s | RST | N | Accuracy | 90% | SVM: 73.75%; BPNN: 81.25%; FIFFB: 86.25%; SIFFB: 83.75% |
4. Technical details

The data and algorithms used are the two main elements of AI (Thrall et al., 2018); both are
necessary conditions for AI to succeed.
4.1 Data
In DL, an annotated dataset is critical to ensure a model’s performance (Zhuang et al., 2019).
However, in practice, dataset construction is often affected by issues related to both quantity and
quality. Before any images or specific features can be used as the input to a DL model, some effort is
usually necessary to prepare the images through preprocessing and/or augmentation. The most
common preprocessing procedure is to adjust the image size to meet the requirements of the DL model
being applied (Sun et al., 2018 ; Siddiqui et al., 2017). In addition, the learning process can be
facilitated by highlighting the regions of interest (Wang et al., 2017 ; Zhao et al., 2018b), or by
performing background subtraction, foreground pixel extraction, image denoising enhancement (Qin
et al., 2016 ; Zhao et al., 2018b ; Siddiqui et al., 2017) and other steps to simplify image annotation.
Additionally, some related studies have applied data augmentation techniques to artificially
increase the number of training samples. Data augmentation can be used to generate new labeled data
from existing labeled data through rotation, translation, transposition, and other methods (Meng et al.,
2018 ; Xu & Cheng, 2017). These additional data can help to improve the overall learning process;
and such data augmentation is particularly important for training DL models on datasets that contain
only small numbers of images (Kamilaris & Prenafeta-Boldú, 2018).
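A minimal sketch of such label-preserving augmentation follows; the label "tilapia" is just a placeholder, and real pipelines typically add random crops, color jitter and noise as well:

```python
import numpy as np

def augment(image, label):
    """Generate extra labeled samples from one image via rotations,
    flips and transposition, keeping the label unchanged."""
    variants = [image]
    variants += [np.rot90(image, k) for k in (1, 2, 3)]   # rotation
    variants += [np.fliplr(image), np.flipud(image)]      # flips
    variants.append(image.T)                               # transposition
    return [(v, label) for v in variants]

img = np.arange(16).reshape(4, 4)   # stand-in for a fish image
samples = augment(img, "tilapia")
print(len(samples))  # 7 labeled samples from 1 original
```

Each variant inherits the original label, which is what makes augmentation essentially free extra training data for small datasets.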
In addition, to avoid being constrained by the limited availability of annotation data, some
scholars have directly used pretrained DL models to conduct fish classification, thus avoiding the need
to acquire a large volume of annotated data (Ahmad et al., 2016). However, this approach has many
limitations, such as negative transfer (Pan & Yang, 2010), learning or not learning from holistic images
(Sun et al., 2019), and is consequently difficult to implement satisfactorily for specific applications;
hence, it is typically suitable only for theoretical algorithm research.
4.2 Algorithms
(1) Models. From a technical point of view, various CNN models are still the most popular (29
papers, 71%). However, 2 of the papers reviewed here use a GAN, 3 use an RNN, 2 use an LSTM, 2
use both an LSTM and a CNN, and 2 papers use a DBN and YOLO, respectively. Some CNN models
are combined with output-layer classifiers, such as SVM (Qin et al., 2016; Sun et al.,
2018) or Softmax (Zhao et al., 2018b; Naddaf-Sh et al., 2018) classifiers.
(2) Frameworks. Caffe and TensorFlow are the most popular frameworks. One possible reason
for the widespread use of Caffe is that it includes a pretrained model that is easy to fine-tune using
transfer learning (Bahrampour et al., 2015). Whether used for specific commercial applications or
experimental research, the combination of DL and transfer learning helps to reduce the need for a large
amount of data while saving significant training time (Erickson et al., 2017). In addition, a variety of
other DL frameworks and datasets exist that users can adopt easily. In particular, because of its strong
support for graphics processing units (GPUs), the PyTorch framework has been used extensively in
the relatively recent literature (Ketkar, 2017; Liu et al., 2019).
In fact, much of the research reviewed here (9/41) uses transfer learning (Siddiqui et al., 2017 ;
Levy et al., 2018 ; Sun et al., 2018), which involves using existing knowledge from related tasks or
fields to improve model learning efficiency. The most common transfer learning technique is to use
pretrained DL models that have been trained on related datasets with different categories. These models
are then adapted to the specific challenges and datasets (Lu et al., 2015). Figure 7 shows a typical
example of transfer learning. First, the network is trained on the source task with the labeled dataset.
Then, the trained parameters of the model are transferred to the target tasks (Sun et al., 2018 ; Oquab
et al., 2014).
Figure 7. Typical example of transfer learning
(3) Model inputs. Although some studies use fish audio and water quality data, most of the model
inputs are images (34 studies, 83%). This situation reflects the significant advantage offered by DL in data
processing, especially image processing. The inputs include public datasets such as the ImageNet
dataset, the Fish4Knowledge (F4K) dataset, the Croatian fish dataset, and the Queensland University of
Technology (QUT) fish dataset. Other datasets consist of data collected and produced in the field or
obtained through Internet search engines such as Google (Meng et al., 2018; Naddaf-Sh et al., 2018).
Combining optical sensors and machine vision with DL systems makes it possible to develop
faster, cheaper and noninvasive methods for in situ monitoring and post-harvest quality monitoring
in aquaculture (Saberioon et al., 2017). Whether these datasets consist of text, audio, or
image/video data, they typically hold large volumes of data. Such large amounts of data are particularly
important when the problem to be solved is complex or when the differences between adjacent classes
are small.
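A minimal sketch of preparing such image inputs for a DL model is given below, using synthetic arrays in place of a real dataset such as F4K; the image dimensions, the 4-class labels, and the 80/20 split are assumptions chosen purely for illustration.

```python
import numpy as np

# Synthetic stand-in for a labeled image dataset: 100 fake 64x64 RGB
# images with labels from 4 hypothetical fish classes.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 64, 64, 3), dtype=np.uint8)
labels = rng.integers(0, 4, size=100)

# Scale pixel values to [0, 1], the usual input range for CNNs.
x = images.astype(np.float32) / 255.0

# Shuffle and split 80/20 into training and validation sets.
idx = rng.permutation(len(x))
split = int(0.8 * len(x))
x_train, x_val = x[idx[:split]], x[idx[split:]]
y_train, y_val = labels[idx[:split]], labels[idx[split:]]
print(x_train.shape, x_val.shape)  # (80, 64, 64, 3) (20, 64, 64, 3)
```

Real pipelines add steps such as resizing to the network's expected input size and augmentation (flips, crops, color jitter), which partially mitigates the large labeled-data requirement noted above.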
(4) Model outputs. Among the models used for classification, the outputs range from 4 to 16
classes. For example, one study considers images of 16 fish species, and another considers 4 types
of fish sound files. Among the other papers, 13 targeted live fish recognition, where the outputs were
fish and nonfish; 7 addressed size or biomass estimation; 2 quantified fish feeding intensity;
6 predicted water quality; and 5 analyzed behavior. From a technical point of view, however,
the boundaries among identification, classification, and biomass estimation based on these
classification models are quite vague. In these papers, the output classes of each model match its input
classes: each output consists of a set of probabilities that the input belongs to each class, and the
model selects the class with the highest output probability as the predicted class of that input.
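This final step, common to all such classification models, can be sketched with a softmax over raw network scores followed by an argmax; the logit values below are made up for illustration (e.g. one input scored against 4 hypothetical fish-sound classes).

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax: map raw scores to class probabilities.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical raw scores for one input over 4 classes.
logits = np.array([2.0, 0.5, -1.0, 0.1])
probs = softmax(logits)

# The predicted class is the one with the highest probability.
predicted_class = int(np.argmax(probs))
print(predicted_class)  # 0
```

Because identification, classification, and class-based biomass estimation all reduce to this same probability-then-argmax step, the distinction among them lies mainly in how the classes are defined, not in the model mechanics.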
In addition, GAN-based models typically achieve better overall performance in fish recognition tasks.
5. Discussion
6. Conclusion
This paper conducted a deep and comprehensive investigation of the current applications of deep
learning (DL) for smart fish farming. Based on a review of the recent literature, the current applications
can be divided into six categories: live fish identification, species classification, behavioral analysis,
feeding decisions, size or biomass estimation, and water quality prediction. The technical details of the
reported methods were comprehensively analyzed in accordance with the key elements of artificial
intelligence (AI): data and algorithms. Performance comparisons with traditional methods based on
manually extracted features indicate that the greatest contribution of DL is its ability to automatically
extract features. Moreover, DL can also output high-precision processing results. However, at present,
DL technology is still in a weak AI stage and requires a large amount of labeled data for training. This
requirement has become a bottleneck restricting further applications of DL in smart fish farming.
Nevertheless, DL still offers breakthroughs for processing text, images, video, sound and other data,
all of which can provide strong support for the implementation of smart fish farming. In the future, DL
is also expected to expand into new application areas, such as fish disease diagnosis; data will become
increasingly important; and composite models and models that consider spatiotemporal sequences will
represent the main research direction. In brief, our purpose in writing this review was to provide
researchers and practitioners with a better understanding of the current applications of DL in smart fish
farming and to facilitate the application of DL technology to solve practical problems in aquaculture.
Acknowledgments
The research was supported by the National Key Technology R&D Program of China
(2019YFD0901004), the Youth Research Fund of Beijing Academy of Agricultural and Forestry
Sciences (QNJJ202014), and the Beijing Excellent Talents Development Project
(2017000057592G125).
References
Ahmad S, Ahsan J, Faisal S, Ajmal M, Mark S, James S, Euan H (2016) Fish species classification in unconstrained
underwater environments based on deep learning. Limnol. Oceanogr. Methods, 14, 570-585.
Alcaraz C, Gholami Z, Esmaeili HR, García-Berthou E (2015) Herbivory and seasonal changes in diet of a highly
endemic cyprinodontid fish (Aphanius farsicus). Environ. Biol. Fishes, 98, 1541-1554.
Allken V, Handegard NO, Rosen S, Schreyeck T, Mahiout T, Malde K (2019) Fish species identification using a
convolutional neural network trained on synthetic data. ICES J. Mar. Sci., 76, 342-349.
Álvarez-Ellacuría A, Palmer M, Catalán IA, Lisani J-L (2019) Image-based, unsupervised estimation of fish size from
commercial landings using deep learning. ICES J. Mar. Sci.
Bahrampour S, Ramakrishnan N, Schott L, Shah M (2015) Comparative study of deep learning software
frameworks. arXiv preprint arXiv:1511.06435.
Banan A, Nasiri A, Taheri-Garavand A (2020) Deep learning-based appearance features extraction for automated
carp species identification. Aquacult. Eng., 89, 102053.
Belthangady C, Royer LA (2019) Applications, promises, and pitfalls of deep learning for fluorescence image
reconstruction. Nat. Methods, 16, 1215-1225.
Bhagat PK, Choudhary P (2018) Image annotation: Then and now. Image Vision Comput., 80, 1-23.
Boom BJ, Huang X, He J, Fisher RB (2012) Supporting Ground-Truth annotation of image datasets using clustering.
In: Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012) . IEEE, Tsukuba,
Japan, pp. 1542-1545.
Bradley D, Merrifield M, Miller KM, Lomonico S, Wilson JR, Gleason MG (2019) Opportunities to improve fisheries
management through innovative technology and advanced data systems. Fish Fish., 20, 564-583.
Bronstein MM, Bruna J, LeCun Y, Szlam A, Vandergheynst P (2017) Geometric deep learning: going beyond
Euclidean data. IEEE Signal Process. Mag., 34, 18-42.
Butail S, Paley DA (2011) Three-dimensional reconstruction of the fast-start swimming kinematics of densely
schooling fish. J R Soc Interface, 9, 77-88.
Cao S, Zhao D, Liu X, Sun Y (2020) Real-time robust detector for underwater live crabs based on deep learning.
Comput. Electron. Agric., 172, 105339.
Chen C, Du Y, Zhou C, Sun C (2017) Evaluation of feeding activity of fishes based on image texture. Transactions of
the Chinese Society of Agricultural Engineering, 33, 232-237.
Chen L, Yang X, Sun C, Wang Y, Xu D, Zhou C (2019) Feed intake prediction model for group fish using the MEA-
BP neural network in intensive aquaculture. Information Processing in Agriculture.
Choi K, Fazekas G, Sandler M, Cho K (2018) A comparison of audio signal preprocessing methods for deep neural
networks on music tagging. In: 2018 26th European Signal Processing Conference (EUSIPCO) . IEEE, Rome,
Italy, pp. 1870-1874.
Clavelle T, Lester SE, Gentry R, Froehlich HE (2019) Interactions and management for the future of marine
aquaculture and capture fisheries. Fish Fish., 20, 368-388.
Cortes C, Vapnik V (1995) Support-vector networks. Mach. Learn., 20, 273-297.
Li D, Bai J (2018) Research progress on key technologies of underwater operation robot for aquaculture.
Transactions of the Chinese Society of Agricultural Engineering, 36, 1-9.
de Verdal H, Komen H, Quillet E, Chatain B, Allal F, Benzie JAH, Vandeputte M (2017) Improving feed efficiency in
fish using selective breeding: a review. Reviews in Aquaculture, 10, 833-851.
Deng H, Peng L, Zhang J, Tang C, Fang H, Liu H (2019) An intelligent aerator algorithm inspired-by deep learning.
Mathematical Biosciences and Engineering, 16, 2990-3002.
Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) ImageNet: A large-scale hierarchical image database. In:
2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Miami, FL, USA, pp. 248-255.
Deng L, Yu D (2014) Deep learning: methods and applications. Foundations and Trends® in Signal Processing, 7,
197-387.
Dhiman C, Vishwakarma DK (2019) A review of state-of-the-art techniques for abnormal human activity recognition.
Eng. Appl. Artif. Intell., 77, 21-45.
Di Nucci E, McHugh C (2006) Content, Consciousness, and Perception: Essays in Contemporary Philosophy of Mind,
Cambridge Scholars Press.
Díaz-Gil C, Smee SL, Cotgrove L, Follana-Berná G, Hinz H, Marti-Puig P, Grau A, Palmer M, Catalán IA (2017) Using
stereoscopic video cameras to evaluate seagrass meadows nursery function in the Mediterranean. Mar.
Biol., 164, 137.
dos Santos AA, Gonçalves WN (2019) Improving Pantanal fish species recognition through taxonomic ranks in
convolutional neural networks. Ecol Inform, 53, 100977.
Erickson BJ, Korfiatis P, Akkus Z, Kline T, Philbrick K (2017) Toolkits and Libraries for Deep Learning. J. Digit. Imaging,
30, 400-405.
FAO (2018) The State of World Fisheries and Aquaculture 2018‐Meeting the sustainable development goals. FAO
Rome, Italy.
Fernandes AFA, Turra EM, de Alvarenga ÉR, Passafaro TL, Lopes FB, Alves GFO, Singh V, Rosa GJM (2020) Deep
Learning image segmentation for extraction of fish body measurements and prediction of body weight
and carcass traits in Nile tilapia. Comput. Electron. Agric., 170, 105274.
Føre M, Alver M, Alfredsen JA, Marafioti G, Senneset G, Birkevold J, Willumsen FV, Lange G, Espmark Å, Terjesen BF
(2016) Modelling growth performance and feeding behaviour of Atlantic salmon (Salmo salar L.) in
commercial-size aquaculture net pens: Model details and validation through full-scale experiments.
Aquaculture, 464, 268-278.
França Albuquerque PL, Garcia V, da Silva Oliveira A, Lewandowski T, Detweiler C, Gonçalves AB, Costa CS, Naka
MH, Pistori H (2019) Automatic live fingerlings counting using computer vision. Comput. Electron. Agric.,
167, 105015.
Garcia R, Prados R, Quintana J, Tempelaar A, Gracias N, Rosen S, Vågstøl H, Løvall K (2019) Automatic segmentation
of fish using deep learning with application to fish size measurement. ICES J. Mar. Sci.
Hinton GE, Sejnowski TJ (eds) (1999) Unsupervised Learning: Foundations of Neural Computation. MIT Press.
Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic
segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp.
580-587.
Goodfellow I, Bengio Y, Courville A (2016) Deep Learning, MIT Press.
Gouiaa R, Meunier J (2017) Learning cast shadow appearance for human posture recognition. Pattern Recog. Lett.,
97, 54-60.
Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, Venugopalan S, Widner K, Madams T, Cuadros
J, Kim R, Raman R, Nelson PC, Mega JL, Webster R (2016) Development and Validation of a Deep Learning
Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA-J. Am. Med. Assoc.,
316, 2402-2410.
Guo X, Zhao X, Liu Y, Li D (2019) Underwater sea cucumber identification via deep residual networks. Information
Processing in Agriculture, 6, 307-315.
Hanbury A (2008) A survey of methods for image annotation. J. Vis. Lang. Comput., 19, 617-627.
Hartill BW, Taylor SM, Keller K, Weltersbach MS (2020) Digital camera monitoring of recreational fishing effort:
Applications and challenges. Fish Fish., 21, 204-215.
Hassoun MH (1996) Fundamentals of Artificial Neural Networks. Proc. IEEE, 10, 906.
Hu H, Wen Y, Chua T, Li X (2014) Toward Scalable Systems for Big Data Analytics: A Technology Tutorial. IEEE Access,
2, 652-687.
Hu J, Li D, Duan Q, Han Y, Chen G, Si X (2012) Fish species classification by color, texture and multi-class support
vector machine using computer vision. Comput. Electron. Agric., 88, 133-140.
Hu J, Wang J, Zhang X, Fu Z (2015) Research status and development trends of information technologies in
aquacultures. Transactions of the Chinese Society for Agricultural Machinery, 46, 251-263.
Hu WC, Wu HT, Zhang YF, Zhang SH, Lo CH (2020) Shrimp recognition using ShrimpNet based on convolutional
neural network. J. Ambient Intell. Humaniz. Comput., 8.
Hu Z, Zhang Y, Zhao Y, Xie M, Zhong J, Tu Z, Liu J (2019) A water quality prediction method based on the deep
LSTM network considering correlation in smart mariculture. Sensors, 19.
Ibrahim AK, Zhuang HQ, Cherubin LM, Scharer-Umpierre MT, Erdol N (2018) Automatic classification of grouper
species by their sounds using deep neural networks. J. Acoust. Soc. Am., 144, EL196-EL202.
Jäger J, Simon M, Denzler J, Wolff V (2015) Croatian Fish Dataset: Fine-grained classification of fish species in their
natural habitat. In: Machine Vision of Animals and their Behaviour (MVAB 2015), pp. 6.1-6.7.
Jalal A, Salman A, Mian A, Shortis M, Shafait F (2020) Fish detection and species classification in underwater
environments using deep learning with temporal information. Ecol Inform, 57, 101088.
Jolliffe I (1987) Principal component analysis. Chemometrics Intellig. Lab. Syst., 2, 37-52.
Kamilaris A, Prenafeta-Boldú FX (2018) Deep learning in agriculture: A survey. Comput. Electron. Agric., 147, 70-
90.
Ketkar N (2017) Introduction to PyTorch. In: Deep Learning with Python: A Hands-on Introduction. Apress, Berkeley,
CA, pp. 195-208.
Kim H, Koo J, Kim D, Jung S, Shin J-U, Lee S, Myung H (2016) Image-Based Monitoring of Jellyfish Using Deep
Learning Architecture. IEEE Sens. J., 16, 2215-2216.
Labao AB, Naval PC (2019) Cascaded deep network systems with linked ensemble components for underwater fish
detection in the wild. Ecol Inform, 52, 103-121.
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature, 521, 436.
Levy D, Belfer Y, Osherov E, Bigal E, Scheinin AP, Nativ H, DanTchernov, Treibitz T (2018) Automated Analysis of
Marine Video With Limited Data. In: The IEEE Conference on Computer Vision and Pattern Recognition
(CVPR) pp. 1385-1393.
Li D, Hao Y, Duan Y (2019) Nonintrusive methods for biomass estimation in aquaculture with emphasis on fish: a
review.
Li H (2018) Deep learning for natural language processing: advantages and challenges. National Science Review, 5,
24-26.
Li J, Xu C, Jiang LX, Xiao Y, Deng LM, Han ZZ (2020) Detection and analysis of behavior trajectory for sea cucumbers
based on deep learning. IEEE Access, 8, 18832-18840.
Liakos KG, Busato P, Moshou D, Pearson S, Bochtis D (2018) Machine Learning in Agriculture: A Review. Sensors,
18, 2674.
Lin Q, Yang W, Zheng C, Lu KH, Zheng ZM, Wang JP, Zhu JY (2018) Deep-learning based approach for forecast of
water quality in intensive shrimp ponds. Indian J. Fish., 65, 75-80.
Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, van der Laak JAWM, van Ginneken B, Sánchez CI
(2017) A survey on deep learning in medical image analysis. Med. Image Anal., 42, 60-88.
Liu S, Xu L, Jiang Y, Li D, Chen Y, Li Z (2014a) A hybrid WA–CPSO-LSSVR model for dissolved oxygen content
prediction in crab culture. Eng. Appl. Artif. Intell., 29, 114-124.
Liu Y, Zhang Q, Song L, Chen Y (2019) Attention-based recurrent neural networks for accurate short-term and
long-term dissolved oxygen prediction. Comput. Electron. Agric., 165, 104964.
Liu Z, Li X, Fan L, Lu H, Liu L, Liu Y (2014b) Measuring feeding activity of fish in RAS using computer vision. Aquacult.
Eng., 60, 20-27.
Lorenzen K, Cowx IG, Entsua-Mensah R, Lester NP, Koehn J, Randall R, So N, Bonar SA, Bunnell DB, Venturelli P,
Bower SD, Cooke SJ (2016) Stock assessment in inland fisheries: a foundation for sustainable use and
conservation. Rev. Fish Biol. Fish., 26, 405-440.
Lu H, Li Y, Chen M, Kim H, Serikawa S (2018) Brain Intelligence: Go beyond Artificial Intelligence. Mobile Networks
Applications, 23, 368-375.
Lu J, Behbood V, Hao P, Zuo H, Xue S, Zhang GQ (2015) Transfer learning using computational intelligence: A
survey. Knowledge-Based Syst., 80, 14-23.
Lu Y, Tung C, Kuo Y (2019) Identifying the species of harvested tuna and billfish using deep convolutional neural
networks. ICES J. Mar. Sci.
Mahesh S, Manickavasagan A, Jayas DS, Paliwal J, White NDG (2008) Feasibility of near-infrared hyperspectral
imaging to differentiate Canadian wheat classes. Biosys. Eng., 101, 50-57.
Mahmood A, Bennamoun M, An S, Sohel F, Boussaid F, Hovey R, Kendrick G (2019) Automatic detection of Western
rock lobster using synthetic data. ICES J. Mar. Sci.
Måløy H, Aamodt A, Misimi E (2019) A spatio-temporal recurrent network for salmon feeding action recognition
from underwater videos in aquaculture. Comput. Electron. Agric., 105087.
Mao B, Han LG, Feng Q, Yin YC (2019) Subsurface velocity inversion from deep learning-based data assimilation.
J. Appl. Geophys., 167, 172-179.
Marini S, Fanelli E, Sbragaglia V, Azzurro E, Del Rio Fernandez J, Aguzzi J (2018) Tracking Fish Abundance by
Underwater Image Recognition. Sci. Rep., 8, 13748.
Melnychuk MC, Peterson E, Elliott M, Hilborn R (2017) Fisheries management impacts on target species status.
Proceedings of the National Academy of Sciences, 114, 178-183.
Meng L, Hirayama T, Oyanagi S (2018) Underwater-drone with panoramic camera for automatic fish recognition
based on deep learning. IEEE Access, 6, 17880-17886.
Merino G, Barange M, Blanchard JL, Harle J, Holmes R, Allen I, Allison EH, Badjeck MC, Dulvy NK, Holt J (2012) Can
marine fisheries and aquaculture meet fish demand from a growing human population in a changing
climate? Global Environ. Change, 22, 795-806.
Mhaskar HN, Poggio T (2016) Deep vs. shallow networks: An approximation theory perspective. Analysis and
Applications, 14, 829-848.
Min S, Lee B, Yoon S (2017) Deep learning in bioinformatics. Brief. Bioinform., 18, 851-869.
Moen E, Handegard NO, Allken V, Albert OT, Harbitz A, Malde K (2018) Automatic interpretation of otoliths using
deep learning. PLoS ONE, 13, 14.
Mohanty SP, Hughes DP, Salathé M (2016) Using Deep Learning for Image-Based Plant Disease Detection. Frontiers
in plant science, 7.
Moosavi-Dezfooli S-M, Fawzi A, Frossard P (2016) Deepfool: a simple and accurate method to fool deep neural
networks. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2574-2582.
Naddaf-Sh MM, Myler H, Zargarzadeh H (2018) Design and Implementation of an Assistive Real-Time Red Lionfish
Detection System for AUV/ROVs. Complexity, 10.
Najafabadi MM, Villanustre F, Khoshgoftaar TM, Seliya N, Wald R, Muharemagic E (2015) Deep learning applications
and challenges in big data analytics. Journal of Big Data, 2, 1.
Olyaie E, Abyaneh HZ, Mehr AD (2017) A comparative analysis among computational intelligence techniques for
dissolved oxygen prediction in Delaware River. Geoscience Frontiers, 8, 517-527.
Oosting T, Star B, Barrett JH, Wellenreuther M, Ritchie PA, Rawlence NJ (2019) Unlocking the potential of ancient
fish DNA in the genomic era. Evolutionary Applications, 12, 1513-1522.
Oquab M, Bottou L, Laptev I, Sivic J (2014) Learning and transferring mid-level image representations using
convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern
recognition, pp. 1717-1724.
Pan SJ, Yang Q (2010) A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering, 22,
1345-1359.
Papadakis VM, Papadakis IE, Lamprianidou F, Glaropoulos A, Kentouri M (2012) A computer-vision system and
methodology for the analysis of fish behavior. Aquacult. Eng., 46, 53-59.
Patrício DI, Rieder R (2018) Computer vision and artificial intelligence in precision agriculture for grain crops: A
systematic review. Comput. Electron. Agric., 153, 69-81.
Pérez-Escudero A, Vicente-Page J, Hinz RC, Arganda S, de Polavieja GG (2014) idTracker: tracking individuals in a
group by automatic identification of unmarked animals. Nat. Methods, 11, 743-748.
Poggio T, Smale S (2003) The mathematics of learning: Dealing with data. Notices Amer. Math. Soc., 50, 537-544.
Qin H, Li X, Liang J, Peng Y, Zhang C (2016) DeepFish: Accurate underwater live fish recognition with a deep
architecture. Neurocomputing, 187, 49-58.
Qiu C, Zhang S, Wang C, Yu Z, Zheng H, Zheng B (2018) Improving transfer learning and squeeze-and-excitation
networks for small-scale fine-grained fish image classification. IEEE Access, 6, 78503-78512.
Quinlan JR (1986) Induction of decision trees. Machine learning, 1, 81-106.
Rahman A, Dabrowski J, McCulloch J (2019) Dissolved oxygen prediction in prawn ponds from a group of one step
predictors. Information Processing in Agriculture.
Rauf HT, Lali MIU, Zahoor S, Shah SZH, Rehman AU, Bukhari SAC (2019) Visual features based automated
identification of fish species using deep convolutional neural networks. Comput. Electron. Agric., 105075.
Ravì D, Wong C, Deligianni F, Berthelot M, Andreu-Perez J, Lo B, Yang G-Z (2016) Deep learning for health
informatics. IEEE Journal of Biomedical and Health Informatics, 21, 4-21.
Ren Q, Wang X, Li W, Wei Y, An D (2020) Research of dissolved oxygen prediction in recirculating aquaculture
systems based on deep belief network. Aquacult. Eng., 90, 102085.
Rillahan C, Chambers MD, Howell WH, Watson WH (2011) The behavior of cod (Gadus morhua) in an offshore
aquaculture net pen. Aquaculture, 310, 361-368.
Romero-Ferrero F, Bergomi MG, Hinz RC, Heras FJH, de Polavieja GG (2019) idtracker.ai: tracking all individuals in
small or large collectives of unmarked animals. Nat. Methods, 16, 179-182.
Roux NL, Bengio Y (2008) Representational power of restricted Boltzmann machines and deep belief networks.
Neural Comput., 20, 1631-1649.
Saberioon M, Císař P (2018) Automated within tank fish mass estimation using infrared reflection system.
Comput. Electron. Agric., 150, 484-492.
Saberioon M, Gholizadeh A, Cisar P, Pautsina A, Urban J (2017) Application of machine vision systems in aquaculture
with emphasis on fish: state-of-the-art and key issues. Reviews in Aquaculture, 9, 369-387.
Salman A, Siddiqui SA, Shafait F, Mian A, Shortis MR, Khurshid K, Ulges A, Schwanecke U (2019) Automatic fish
detection in underwater videos by a deep neural network-based hybrid motion learning system. ICES J.
Mar. Sci.
Samuel AL (1959) Some Studies in Machine Learning Using the Game of Checkers. IBM J. Res. Dev., 3, 210-229.
Saufi SR, Ahmad ZAB, Leong MS, Lim MH (2019) Challenges and Opportunities of Deep Learning Models for
Machinery Fault Detection and Diagnosis: A Review. IEEE Access, 7, 122644-122662.
Schmidhuber J (2015) Deep learning in neural networks: An overview. Neural Networks, 61, 85-117.
Shahriar MS, McCulluch J (2014) A dynamic data-driven decision support for aquaculture farm closure. Procedia
Computer Science, 29, 1236-1245.
Shi S, Wang Q, Xu P, Chu X (2016) Benchmarking state-of-the-art deep learning software tools. In: 2016 7th
International Conference on Cloud Computing and Big Data (CCBD) . IEEE, pp. 99-104.
Siddiqui SA, Salman A, Malik MI, Shafait F, Mian A, Shortis MR, Harvey ES (2017) Automatic fish species classification
in underwater videos: exploiting pre-trained deep neural network models to compensate for limited
labelled data. ICES J. Mar. Sci., 75, 374-389.
Sun M, Hassan SG, Li D (2016) Models for estimating feed intake in aquaculture: A review. Comput. Electron. Agric.,
127, 425-438.
Sun R, Zhu X, Wu C, Huang C, Shi J, Ma L (2019) Not All Areas Are Equal: Transfer Learning for Semantic
Segmentation via Hierarchical Region Selection. In: 2019 IEEE/CVF Conference on Computer Vision and
Pattern Recognition (CVPR), pp. 4355-4364.
Sun X, Shi J, Liu L, Dong J, Plant C, Wang X, Zhou H (2018) Transferring deep knowledge for object recognition in
Low-quality underwater videos. Neurocomputing, 275, 897-908.
Ta X, Wei Y (2018) Research on a dissolved oxygen prediction method for recirculating aquaculture systems based
on a convolution neural network. Comput. Electron. Agric., 145, 302-310.
Terayama K, Shin K, Mizuno K, Tsuda K (2019) Integration of sonar and optical camera images using deep neural
network for fish monitoring. Aquacult. Eng., 86, 102000.
Thrall JH, Li X, Li Q, Cruz C, Do S, Dreyer K, Brink J (2018) Artificial intelligence and machine learning in radiology:
opportunities, challenges, pitfalls, and criteria for success. Journal of the American College of Radiology,
15, 504-508.
Tripathi MK, Maktedar DD (2019) A role of computer vision in fruits and vegetables among various horticulture
products of agriculture fields: A survey. Information Processing in Agriculture.
Tseng C-H, Hsieh C-L, Kuo Y-F (2020) Automatic measurement of the body length of harvested fish using
convolutional neural networks. Biosys. Eng., 189, 36-47.
Villon S, Mouillot D, Chaumont M, Darling ES, Subsol G, Claverie T, Villeger S (2018) A Deep learning method for
accurate and fast identification of coral reef fishes in underwater images. Ecol Inform, 48, 238-244.
Wang J, Ma Y, Zhang L, Gao RX, Wu D (2018) Deep learning for smart manufacturing: Methods and applications.
Journal of Manufacturing Systems, 48, 144-156.
Wang SH, Zhao JW, Chen YQ (2017) Robust tracking of fish schools using CNN for head identification. Multimedia
Tools and Applications, 76, 23679-23697.
Wang T, Wu DJ, Coates A, Ng AY (2012) End-to-end text recognition with convolutional neural networks. In:
Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012) . IEEE, pp. 3304-3308.
Wei G-Y, Brooks D (2019) Benchmarking tpu, gpu, and cpu platforms for deep learning. arXiv preprint
arXiv:1907.10701.
White DJ, Svellingen C, Strachan NJC (2006) Automated measurement of species and length of fish by computer
vision. Fisheries Research, 80, 203-210.
Wu T-H, Huang Y-I, Chen J-M (2015) Development of an adaptive neural-based fuzzy inference system for feeding
decision-making assessment in silver perch (Bidyanus bidyanus) culture. Aquacult. Eng., 66, 41-51.
Xu Z, Cheng XE (2017) Zebrafish tracking using convolutional neural networks. Sci. Rep., 7, 42815.
Yang Q, Xiao D, Lin S (2018) Feeding behavior recognition for group-housed pigs with the Faster R-CNN. Comput.
Electron. Agric., 155, 453-460.
Yao J, Odobez J (2007) Multi-Layer Background Subtraction Based on Color and Texture. In: 2007 IEEE Conference
on Computer Vision and Pattern Recognition, pp. 1-8.
Zeiler MD, Fergus R (2014) Visualizing and Understanding Convolutional Networks. Springer International
Publishing, Cham, pp. 818-833.
Zhang QS, Zhu SC (2018) Visual interpretability for deep learning: a survey. Front. Inform. Technol. Elect. Eng., 19,
27-39.
Zhang S, Yang X, Wang Y, Zhao Z, Liu J, Liu Y, Sun C, Zhou C (2020) Automatic fish population counting by machine
vision and a hybrid deep neural network model. Animals, 10, 364.
Zhao J, Bao W, Zhang F, Zhu S, Liu Y, Lu H, Shen M, Ye Z (2018a) Modified motion influence map and recurrent
neural network-based monitoring of the local unusual behaviors for fish school in intensive aquaculture.
Aquaculture, 493, 165-175.
Zhao J, Li Y, Zhang F, Zhu S, Liu Y, Lu H, Ye Z (2018b) Semi-Supervised Learning-Based Live Fish Identification in
Aquaculture Using Modified Deep Convolutional Generative Adversarial Networks. Transactions of the
ASABE, 61, 699-710.
Zheng L, Yang Y, Tian Q (2017) SIFT meets CNN: A decade survey of instance retrieval. IEEE Trans. Pattern Anal.
Mach. Intell., 40, 1224-1244.
Zhou C, Lin K, Xu D, Chen L, Guo Q, Sun C, Yang X (2018a) Near infrared computer vision and neuro-fuzzy model-
based feeding decision system for fish in aquaculture. Comput. Electron. Agric., 146, 114-124.
Zhou C, Sun C, Lin K, Xu D, Guo Q, Chen L, Yang X (2018b) Handling Water Reflections for Computer Vision in
Aquaculture. Transactions of the ASABE, 61, 469-479.
Zhou C, Xu D, Chen L, Zhang S, Sun C, Yang X, Wang Y (2019) Evaluation of fish feeding intensity in aquaculture
using a convolutional neural network and machine vision. Aquaculture, 507, 457-465.
Zhou C, Xu D, Lin K, Sun C, Yang X (2018c) Intelligent feeding control methods in aquaculture with an emphasis on
fish: a review. Reviews in Aquaculture, 10, 975-993.
Zhou C, Yang X, Zhang B, Lin K, Xu D, Guo Q, Sun C (2017a) An adaptive image enhancement method for a
recirculating aquaculture system. Sci. Rep., 7, 6243.
Zhou C, Zhang B, Lin K, Xu D, Chen C, Yang X, Sun C (2017b) Near-infrared imaging to quantify the feeding behavior
of fish in aquaculture. Comput. Electron. Agric., 135, 233-241.
Zhuang F, Qi Z, Duan K, Xi D, Zhu Y, Zhu H, Xiong H, He Q (2019) A Comprehensive Survey on Transfer Learning.
arXiv preprint arXiv:1911.02685.
Zion B (2012) The use of computer vision technologies in aquaculture – A review. Comput. Electron. Agric., 88, 125-
132.
Appendix A: Public datasets containing fish
1. Fish4Knowledge (http://groups.inf.ed.ac.uk/f4k/index.html): This underwater live fish dataset was acquired from live video captured in the open sea. It contains a total of 27,370 verified fish images in 23 clusters; each cluster represents a single species (Boom et al., 2012).
2. Croatian fish dataset (http://www.inf-cv.uni-jena.de/fine_grained_recognition.html#datasets): This dataset contains 794 images of 12 different fish species collected in the Adriatic Sea in Croatia. All the images show fish in real-world situations recorded by high-definition cameras (Jäger et al., 2015).
3. LifeCLEF14 and LifeCLEF15 datasets (http://www.imageclef.org/): The LifeCLEF 2014 (LCF-14) fish dataset contains approximately 1,000 videos, with labels provided for approximately 20,000 detected fish covering 10 different fish species. LifeCLEF 2015 (LCF-15) was taken from Fish4Knowledge; it consists of 93 underwater videos covering 15 species and provides 9,000 annotations with species labels (Ahmad et al., 2016).
4. Fish-Pak (https://doi.org/10.17632/n3ydw29sbz.3#folder-6b024354-bae3-460aa758-352685ba0e38): This dataset consists of images of 6 different fish species: Catla (Thala), Hypophthalmichthys molitrix (silver carp), Labeo rohita (rohu), Cirrhinus mrigala (mori), Cyprinus carpio (common carp) and Ctenopharyngodon idella (grass carp) (Rauf et al., 2019).
5. ImageNet (http://www.image-net.org/): ImageNet is an image database organized according to the WordNet hierarchy (currently only nouns), in which each node of the hierarchy is associated with hundreds or thousands of images. ImageNet currently has an average of over five hundred images per node (Deng et al., 2009).