Learning to Hash with Binary Deep Neural Network
Abstract. This work proposes deep network models and learning algo-
rithms for unsupervised and supervised binary hashing. Our novel net-
work design constrains one hidden layer to directly output the binary
codes. This addresses a challenging issue in some previous works: opti-
mizing non-smooth objective functions due to binarization. Moreover,
we incorporate independence and balance properties in the direct and
strict forms in the learning. Furthermore, we include similarity preserv-
ing property in our objective function. Our resulting optimization with
these binary, independence, and balance constraints is difficult to solve.
We propose to attack it with alternating optimization and careful relax-
ation. Experimental results on three benchmark datasets show that our
proposed methods compare favorably with the state of the art.
1 Introduction
We are interested in learning binary hash codes for large scale visual search.
Two main difficulties with large scale visual search are efficient storage and
fast searching. An attractive approach for handling these difficulties is binary
hashing, where each original high dimensional vector x ∈ R^D is mapped to a
very compact binary vector b ∈ {−1, 1}^L, where L ≪ D.
Hashing methods can be divided into two categories: data-independent and
data-dependent. Methods in data-independent category [1–4] rely on random
projections for constructing hash functions. Methods in data-dependent category
use the available training data to learn the hash functions in unsupervised [5–9]
or supervised manner [10–15]. The review of data-independent/data-dependent
hashing methods can be found in recent surveys [16–18].
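For concreteness, the following is a minimal sketch of a data-independent hash function built from random projections, in the spirit of LSH [1]; the function name and the Gaussian projection choice are our illustrative assumptions, not a cited implementation.

import numpy as np

def random_projection_hash(X, L, seed=0):
    # X: (num_samples, D) real-valued features; returns codes in {-1, +1} of length L.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], L))  # random projection directions
    return np.where(X @ W >= 0, 1, -1)        # one bit per projection, thresholded at zero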
One difficult problem in hashing is to deal with the binary constraint on
the codes. Specifically, the outputs of the hash functions have to be binary. In
general, this binary constraint leads to an NP-hard mixed-integer optimization
problem. To handle this difficulty, most aforementioned methods relax the con-
straint during the learning of hash functions. With this relaxation, the continuous
codes are learned first. Then, the codes are binarized (e.g., with thresholding).
This relaxation greatly simplifies the original binary constrained problem. How-
ever, the solution can be suboptimal, i.e., the binary codes resulting from thresh-
olded continuous codes could be inferior to those that are obtained by including
the binary constraint in the learning.
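To make the relax-then-binarize recipe concrete, here is a small sketch of the two-stage approach criticized above, with a PCA projection standing in for any learned continuous embedding (our illustration; the suboptimality comes from the post-hoc thresholding step).

import numpy as np

def relax_then_binarize(X, L):
    # Stage 1 (continuous relaxation): project onto the top-L PCA directions.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    continuous_codes = Xc @ Vt[:L].T
    # Stage 2 (post-hoc binarization): simple thresholding, which may be suboptimal.
    return np.where(continuous_codes >= 0, 1, -1)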
Furthermore, a good hashing method should produce binary codes with the following properties [5]: (i) similarity preserving, i.e., (dis)similar inputs should likely have (dis)similar binary codes; (ii) independence, i.e., different bits in the binary codes are independent of each other; (iii) balance, i.e., each bit has a 50% chance of being 1 or −1. The direct incorporation of the independence and balance properties can complicate the learning. Previous work has used some relaxation to work around the problem [6,19,20], but this can come with some performance degradation.
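The balance and independence properties can be checked empirically on a learned code matrix; the sketch below (ours, with illustrative names) reports the per-bit mean, which should be near zero for balanced bits, and the deviation of the bit correlation matrix from the identity, which should be small for independent bits.

import numpy as np

def code_statistics(codes):
    # codes: (num_samples, L) matrix with entries in {-1, +1}.
    num_samples, L = codes.shape
    balance = codes.mean(axis=0)                   # ~0 per bit if +1/-1 are equally likely
    correlation = codes.T @ codes / num_samples    # ~identity if bits are uncorrelated
    independence_gap = np.linalg.norm(correlation - np.eye(L))
    return balance, independence_gap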
1.2 Contribution

In this work, we first propose a novel deep network model and learning algorithm for unsupervised hashing. In order to achieve binary codes, instead of involving the sgn or step function as in [19,22], our proposed network design constrains one layer to directly output the binary codes (hence the network is called Binary Deep Neural Network).
Notation | Meaning
X | X = {x_i}_{i=1}^m ∈ R^{D×m}: set of m training samples; each column of X corresponds to one sample
B | B = {b_i}_{i=1}^m ∈ {−1, +1}^{L×m}: binary codes of X
L | Number of bits in the output binary code to encode a sample
n | Number of layers (including input and output layers)
s_l | Number of units in layer l
f^{(l)} | Activation function of layer l
W^{(l)} | W^{(l)} ∈ R^{s_{l+1}×s_l}: weight matrix connecting layer l + 1 and layer l
c^{(l)} | c^{(l)} ∈ R^{s_{l+1}}: bias vector for units in layer l + 1
H^{(l)} | H^{(l)} = f^{(l)}(W^{(l−1)} H^{(l−1)} + c^{(l−1)} 1_{1×m}): output values of layer l; convention: H^{(1)} = X
1_{a×b} | Matrix with a rows, b columns, and all elements equal to 1
Fig. 1. The illustration of our network (D = 4, L = 2). In our proposed network design,
the outputs of layer n − 1 are constrained to {−1, 1} and are used as the binary codes.
During training, these codes are used to reconstruct the input samples at the final
layer.
The values at layer n − 1 have the following desirable properties: (i) belonging to
{−1, 1}; (ii) similarity preserving; (iii) independence; and (iv) balance. Figure 1
illustrates our network for the case D = 4, L = 2.
Let us start with the first two properties of the codes, i.e., belonging to {−1, 1}
and similarity preserving. To achieve binary codes having these two properties,
we propose to optimize the following constrained objective function:
\min_{W,c} J = \frac{1}{2m} \left\| X - \left( W^{(n-1)} H^{(n-1)} + c^{(n-1)} 1_{1\times m} \right) \right\|^2 + \frac{\lambda_1}{2} \sum_{l=1}^{n-1} \left\| W^{(l)} \right\|^2    (1)

subject to the binary constraint on the outputs of layer n − 1. By introducing an auxiliary variable B for the binary codes of layer n − 1, the problem is reformulated as

\min_{W,c,B} J = \frac{1}{2m} \left\| X - W^{(n-1)} B - c^{(n-1)} 1_{1\times m} \right\|^2 + \frac{\lambda_1}{2} \sum_{l=1}^{n-1} \left\| W^{(l)} \right\|^2    (3)

Relaxing the equality between B and H^{(n−1)} into a penalty term, and adding the independence and balance terms, leads to the final objective, which is solved under the binary constraint on B:
\min_{W,c,B} J = \frac{1}{2m} \left\| X - W^{(n-1)} B - c^{(n-1)} 1_{1\times m} \right\|^2 + \frac{\lambda_1}{2} \sum_{l=1}^{n-1} \left\| W^{(l)} \right\|^2 + \frac{\lambda_2}{2m} \left\| H^{(n-1)} - B \right\|^2 + \frac{\lambda_3}{2} \left\| \frac{1}{m} H^{(n-1)} (H^{(n-1)})^T - I \right\|^2 + \frac{\lambda_4}{2m} \left\| H^{(n-1)} 1_{m\times 1} \right\|^2    (8)
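As a reading aid, the following numpy sketch evaluates the objective (8); it is our illustration, not the authors' implementation, and assumes H is the output of layer n − 1 (an L × m matrix), B the auxiliary binary codes, W a list of the weight matrices W^{(1)}, ..., W^{(n−1)}, and c_top the bias c^{(n−1)} as a column vector.

import numpy as np

def uh_bdnn_objective(X, H, B, W, c_top, lambdas):
    l1, l2, l3, l4 = lambdas
    m = X.shape[1]
    L = H.shape[0]
    recon = X - W[-1] @ B - c_top @ np.ones((1, m))           # reconstruction from binary codes
    J = np.sum(recon ** 2) / (2 * m)
    J += (l1 / 2) * sum(np.sum(Wl ** 2) for Wl in W)          # weight decay over all layers
    J += (l2 / (2 * m)) * np.sum((H - B) ** 2)                # binarization (quantization) term
    J += (l3 / 2) * np.sum((H @ H.T / m - np.eye(L)) ** 2)    # independence term
    J += (l4 / (2 * m)) * np.sum((H @ np.ones((m, 1))) ** 2)  # balance term
    return J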
Comparison to Deep Hashing (DH) [19]: DH's model does not have the reconstruction layer. They apply the sgn function to the outputs at the top layer of the network to obtain the binary codes. The first term aims to minimize the quantization loss when applying the sgn function to the outputs at the top layer. The balance and independence properties are contained in the second and third terms [19]. It is worth noting that minimizing DH's objective function is difficult due to the non-differentiability of the sgn function. The authors work around this difficulty by assuming that the sgn function is differentiable everywhere.
Contrary to DH, we propose a different model design. In particular, our model encourages similarity preserving by having the reconstruction layer in the network. For the balance property, DH maximizes tr(H^{(n)} (H^{(n)})^T). According to [20], maximizing this term is only an approximation to achieving the balance property. In our objective function, the balance property is directly enforced on the codes by the term ||H^{(n−1)} 1_{m×1}||^2. For the independence property, DH uses a relaxed orthogonality constraint ||W^{(l)} (W^{(l)})^T − I||^2, i.e., a constraint on the network weights W. On the contrary, we (once again) directly constrain the codes using ||(1/m) H^{(n−1)} (H^{(n−1)})^T − I||^2. Incorporating the strict constraints can lead to better performance.
Comparison to Binary Autoencoder (BA) [22]: The differences between our model
and BA are quite clear. BA, as described in [22], is a shallow linear autoencoder
network with one hidden layer. The BA's hash function is a linear transformation
of the input followed by the step function to obtain the binary codes. In BA, by
treating the encoder layer as binary classifiers, they use binary SVMs to learn the
weights of the linear transformation. On the contrary, our hash function is defined
by multiple, hierarchical layers of nonlinear and linear transformations. It is not
clear if the binary SVMs approach in BA can be used to learn the weights in our
deep architecture with multiple layers. Instead, we use alternating optimization
to derive a backpropagation algorithm to learn the weights in all layers. Another
difference is that our model ensures the independence and balance of the binary
codes while BA does not. Note that independence and balance properties may
not be easily incorporated in their framework, as these would complicate their
objective function and the optimization problem may become very difficult to
solve.
2.2 Optimization
In order to solve (8) under constraint (9), we propose to use alternating opti-
mization over (W, c) and B.
(W, c) step. When fixing B, the problem becomes an unconstrained optimization.
We use the L-BFGS [24] optimizer with backpropagation to solve it. The gradients of
the objective function J (8) w.r.t. the different parameters are computed as follows.
At l = n − 1, we have

\frac{\partial J}{\partial W^{(n-1)}} = \frac{-1}{m} \left( X - W^{(n-1)} B - c^{(n-1)} 1_{1\times m} \right) B^T + \lambda_1 W^{(n-1)}    (10)

\frac{\partial J}{\partial c^{(n-1)}} = \frac{-1}{m} \left( \left( X - W^{(n-1)} B \right) 1_{m\times 1} - m c^{(n-1)} \right)    (11)

For the other layers, let Z^{(l)} = W^{(l-1)} H^{(l-1)} + c^{(l-1)} 1_{1\times m} denote the pre-activation of layer l, and define

\Delta^{(n-1)} = \left( \frac{\lambda_2}{m} \left( H^{(n-1)} - B \right) + \frac{2\lambda_3}{m} \left( \frac{1}{m} H^{(n-1)} (H^{(n-1)})^T - I \right) H^{(n-1)} + \frac{\lambda_4}{m} H^{(n-1)} 1_{m\times m} \right) \odot f'^{(n-1)}(Z^{(n-1)})    (12)

\Delta^{(l)} = \left( (W^{(l)})^T \Delta^{(l+1)} \right) \odot f'^{(l)}(Z^{(l)}), \quad \forall l = n-2, \dots, 2    (13)
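A compact numpy sketch of the backpropagated errors (12)-(13) is given below; it is our illustration, under the assumption that W and Z are dicts keyed by layer index and fprime is the elementwise derivative of the activation function.

import numpy as np

def uh_bdnn_deltas(H_top, B, Z, W, fprime, lambda2, lambda3, lambda4, n):
    L, m = H_top.shape
    term = (lambda2 / m) * (H_top - B)                                     # binarization part
    term += (2 * lambda3 / m) * (H_top @ H_top.T / m - np.eye(L)) @ H_top  # independence part
    term += (lambda4 / m) * H_top @ np.ones((m, m))                        # balance part
    deltas = {n - 1: term * fprime(Z[n - 1])}                              # eq. (12)
    for l in range(n - 2, 1, -1):                                          # eq. (13): l = n-2, ..., 2
        deltas[l] = (W[l].T @ deltas[l + 1]) * fprime(Z[l])
    return deltas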
Fig. 2. mAP comparison (y-axis: mAP; x-axis: number of bits L) of UH-BDNN, BA, ITQ, SH, SPH, and KMH on (a) CIFAR10, (b) MNIST, and (c) SIFT1M.
We evaluate with two metrics that have been used in the state of the art [6,19,22] to measure the performance of hashing methods: (1) mean Average Precision (mAP); (2) precision of Hamming radius 2 (precision@2), which measures precision on retrieved images having Hamming distance ≤ 2 to the query (if no image satisfies this, we report zero precision). Note that, as computing mAP is slow on the large SIFT1M dataset, we consider the top 10,000 returned neighbors when computing mAP.
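For reference, a minimal sketch of the precision@2 metric described above (our code, with illustrative names; ground-truth relevance is assumed to be given as a boolean matrix):

import numpy as np

def hamming_distances(query_codes, db_codes):
    # codes: (num, L) arrays with entries in {-1, +1}
    L = query_codes.shape[1]
    return (L - query_codes @ db_codes.T) // 2      # inner product -> Hamming distance

def precision_at_radius(query_codes, db_codes, relevant, radius=2):
    # relevant: boolean (num_queries, num_db) ground-truth matrix
    dist = hamming_distances(query_codes, db_codes)
    precisions = []
    for q in range(query_codes.shape[0]):
        retrieved = dist[q] <= radius
        # report zero precision when no image falls within the radius
        precisions.append(relevant[q, retrieved].mean() if retrieved.any() else 0.0)
    return float(np.mean(precisions))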
Implementation note. In our deep model, we use n = 5 layers. The parameters λ1, λ2, λ3 and λ4 are empirically set by cross validation to 10^{-5}, 5 × 10^{-2}, 10^{-2} and 10^{-6}, respectively. The max iteration number T is empirically set to 10. The numbers of units in hidden layers 2, 3, 4 are empirically set to [90 → 20 → 8], [90 → 30 → 16], [100 → 40 → 24] and [120 → 50 → 32] for 8, 16, 24 and 32 bits, respectively.
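Spelled out as a configuration (a sketch; we assume the first and last layers of UH-BDNN have size D, since the final layer reconstructs the input), the settings above are:

# Hidden layers 2, 3, 4 for each code length L; input layer 1 and output layer 5 have size D.
ARCHITECTURES = {
    8:  [90, 20, 8],
    16: [90, 30, 16],
    24: [100, 40, 24],
    32: [120, 50, 32],
}
HYPERPARAMS = {"lambda1": 1e-5, "lambda2": 5e-2, "lambda3": 1e-2, "lambda4": 1e-6, "T": 10}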
Figure 2 and Table 2 show the comparative mAP and precision of Hamming radius 2 (precision@2), respectively. We find the following observations are consistent across all three datasets. In terms of mAP, the proposed UH-BDNN is comparable to or outperforms the other methods at all code lengths. The improvement is clearer at higher code lengths, i.e., L = 24, 32. The mAP of UH-BDNN consistently outperforms that of Binary Autoencoder (BA) [22], which is the current state-of-the-art unsupervised hashing method. In terms of precision@2, UH-BDNN is comparable to the other methods at low L, i.e., L = 8, 16. At L = 24, 32, UH-BDNN significantly outperforms the other methods.
Comparison with Deep Hashing (DH) [19]: As the implementation of DH is not available, we set up experiments on CIFAR10 and MNIST similar to [19] to make a fair comparison. For each dataset, we randomly sample 1,000 images, 100 per class, as the query set; the remaining images are used as the training/database set. Following [19], for CIFAR10, each image is represented by a 512-D GIST descriptor [30]. The ground truths of queries are based on their class labels. Similar to [19], we report comparative results in terms of mAP and the precision of Hamming radius r = 2. The comparative results are presented in Table 3, which clearly shows that the proposed UH-BDNN outperforms DH [19] at all code lengths, in both mAP and precision of Hamming radius.
Table 3. Comparison with Deep Hashing (DH) [19]. The results of DH are cited
from [19].
              CIFAR10                      MNIST
              mAP           precision@2   mAP           precision@2
L             16     32     16     32     16     32     16     32
DH [19]       16.17  16.62  23.33  15.77  43.14  44.97  66.10  73.29
UH-BDNN       17.83  18.52  24.97  18.85  45.38  47.21  69.13  75.26
4 Supervised Hashing with Binary Deep Neural Network (SH-BDNN)

The layer n − 1 in UH-BDNN becomes the last layer in SH-BDNN. All desirable properties, i.e., semantic similarity preserving, independence, and balance, in SH-BDNN are constrained on the outputs of its last layer.
To achieve the semantic similarity preserving property, we learn the binary codes such that the Hamming distance between learned binary codes highly correlates with the semantic similarity matrix S, i.e., we want to minimize the quantity ||(1/L) (H^{(n)})^T H^{(n)} − S||^2. In addition, to achieve the independence and balance properties of the codes, we want to minimize the quantities ||(1/m) H^{(n)} (H^{(n)})^T − I||^2 and ||H^{(n)} 1_{m×1}||^2.
Following the same reformulation and relaxation as in UH-BDNN (Sect. 2.1), we solve the following constrained optimization problem, which ensures the binary constraint and the semantic similarity preserving, independence, and balance properties of the codes:

\min_{W,c,B} J = \frac{1}{2m} \left\| \frac{1}{L} (H^{(n)})^T H^{(n)} - S \right\|^2 + \frac{\lambda_1}{2} \sum_{l=1}^{n-1} \left\| W^{(l)} \right\|^2 + \frac{\lambda_2}{2m} \left\| H^{(n)} - B \right\|^2 + \frac{\lambda_3}{2} \left\| \frac{1}{m} H^{(n)} (H^{(n)})^T - I \right\|^2 + \frac{\lambda_4}{2m} \left\| H^{(n)} 1_{m\times 1} \right\|^2    (20)
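For illustration only, one common way to build such a label-based similarity matrix S (an assumption on our part; the paper defines S earlier, outside the text reproduced here) sets S_ij = 1 when samples i and j share a class label and S_ij = −1 otherwise, which matches the range of (1/L)(H^{(n)})^T H^{(n)}:

import numpy as np

def build_similarity_matrix(labels):
    # labels: (m,) integer class labels; S[i, j] = 1 if same class, else -1.
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    return np.where(same, 1.0, -1.0)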
4.2 Optimization

In order to solve (20) under constraint (21), we alternately optimize over (W, c) and B.

(W, c) step. When fixing B, (20) becomes an unconstrained optimization. We use the L-BFGS [24] optimizer with backpropagation to solve it. The gradients of the objective function J (20) w.r.t. the different parameters are computed as follows.
Let us define

\Delta^{(n)} = \left( \frac{1}{mL} H^{(n)} \left( V + V^T \right) + \frac{\lambda_2}{m} \left( H^{(n)} - B \right) + \frac{2\lambda_3}{m} \left( \frac{1}{m} H^{(n)} (H^{(n)})^T - I \right) H^{(n)} + \frac{\lambda_4}{m} H^{(n)} 1_{m\times m} \right) \odot f'^{(n)}(Z^{(n)})    (22)

where V = \frac{1}{L} (H^{(n)})^T H^{(n)} - S, and

\Delta^{(l)} = \left( (W^{(l)})^T \Delta^{(l+1)} \right) \odot f'^{(l)}(Z^{(l)}), \quad \forall l = n-1, \dots, 2    (23)
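Analogously to the UH-BDNN case, a numpy sketch (ours, not the authors' code) of the top-layer error Δ^{(n)} in (22), where H is the last-layer output (L × m) and fprime the elementwise derivative of the activation:

import numpy as np

def sh_bdnn_top_delta(H, B, S, Z, fprime, lambda2, lambda3, lambda4):
    L, m = H.shape
    V = H.T @ H / L - S                                        # V = (1/L) H^T H - S
    term = (H @ (V + V.T)) / (m * L)                           # similarity-preserving part
    term += (lambda2 / m) * (H - B)                            # binarization part
    term += (2 * lambda3 / m) * (H @ H.T / m - np.eye(L)) @ H  # independence part
    term += (lambda4 / m) * H @ np.ones((m, m))                # balance part
    return term * fprime(Z)                                    # eq. (22)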
Fig. 3. mAP comparison (y-axis: mAP; x-axis: number of bits L) of SH-BDNN, SDH, ITQ-CCA, KSH, and BRE on (a) CIFAR10 and (b) MNIST.
Table 4. Precision of Hamming radius 2 (precision@2) on CIFAR10 and MNIST.

              CIFAR10                      MNIST
L             8      16     24     32     8      16     24     32
SH-BDNN       54.12  67.32  69.36  69.62  84.26  94.67  94.69  95.51
SDH [15]      31.60  62.23  67.65  67.63  36.49  93.00  93.98  94.43
ITQ-CCA [6]   49.14  65.68  67.47  67.19  54.35  79.99  84.12  84.57
KSH [11]      44.81  64.08  67.01  65.76  68.07  90.79  92.86  92.41
BRE [14]      23.84  41.11  47.98  44.89  37.67  69.80  83.24  84.61
We compare SH-BDNN against the supervised hashing methods SDH [15], ITQ-CCA [6], KSH [11], and Binary Reconstructive Embedding (BRE) [14]. For all compared methods, we use the implementation and the suggested parameters provided by the authors.
Table 5. Comparison with CNN-based hashing methods DSRH [32] and DRSCH [33] on CIFAR10.

              mAP                          precision@2
L             16     24     32     48      16     24     32     48
SH-BDNN       64.30  65.21  66.22  66.53   56.87  58.67  58.80  58.42
DRSCH [33]    61.46  62.19  62.87  63.05   52.34  53.07  52.31  52.03
DSRH [32]     60.84  61.08  61.74  61.77   50.36  52.45  50.37  49.38
On the CIFAR10 dataset, Fig. 3(a) and Table 4 clearly show that the proposed SH-BDNN outperforms all compared methods by a fair margin at all code lengths, in both mAP and precision@2.
On the MNIST dataset, Fig. 3(b) and Table 4 show that the proposed SH-BDNN significantly outperforms the current state-of-the-art SDH at low code length, i.e., L = 8. When L increases, SH-BDNN and SDH [15] achieve similar performance. In comparison to the remaining methods, i.e., KSH [11], ITQ-CCA [6], and BRE [14], SH-BDNN outperforms them by a large margin in both mAP and precision@2.
Comparison with CNN-based hashing methods [32,33]: We compare our proposed SH-BDNN to the recent CNN-based supervised hashing methods Deep Semantic Ranking Hashing (DSRH) [32] and Deep Regularized Similarity Comparison Hashing (DRSCH) [33]. Note that the focus of [32,33] is different from ours: in [32,33], the authors focus on a framework in which the image features and hash codes are jointly learned by combining CNN layers (image feature extraction) and a binary mapping layer into a single model. On the other hand, our work focuses only on the binary mapping layer, given some image features. In [32,33], the binary mapping layer only applies a simple operation, i.e., an approximation of the sgn function (logistic [32], tanh [33]), to the CNN features to obtain the approximate binary codes. Our SH-BDNN advances over [32,33] in the way the image features are mapped to the binary codes (which is our main focus). Given the image features (i.e., pre-trained CNN features), we apply multiple transformations to these features; we constrain one layer to directly output the binary codes, without involving the sgn function. Furthermore, our learned codes ensure good properties, i.e., independence and balance, while DRSCH [33] does not consider such properties and DSRH [32] only considers the balance of the codes.
We strictly follow the comparison setting in [32,33]. In [32,33], when comparing their CNN-based hashing to other non-CNN-based hashing methods, the authors use pre-trained CNN features (e.g., AlexNet [26], DeCAF [34]) as input for the other methods. Following that setting, we use AlexNet features [26] as input for SH-BDNN. We set up the experiments on CIFAR10 similar to [33], i.e., the query set contains 10K images (1K images per class) randomly sampled from the dataset; the remaining 50K images are used as the training set; in the testing step, each query image is searched within the query set itself by applying the leave-one-out procedure.
The comparative results between the proposed SH-BDNN and DSRH [32],
DRSCH [33], presented in Table 5, clearly show that at the same code length,
the proposed SH-BDNN outperforms [32,33] in both mAP and precision@2.
6 Conclusion
We propose UH-BDNN and SH-BDNN for unsupervised and supervised hashing.
Our network designs constrain one layer to directly produce the binary codes. Our
models ensure good properties for the codes: similarity preserving, independence, and
balance. Solid experimental results on three benchmark datasets show that the
proposed methods compare favorably with the state of the art.
References
1. Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hash-
ing. In: VLDB (1999)
2. Kulis, B., Grauman, K.: Kernelized locality-sensitive hashing for scalable image
search. In: ICCV (2009)
3. Raginsky, M., Lazebnik, S.: Locality-sensitive binary codes from shift-invariant
kernels. In: NIPS (2009)
4. Kulis, B., Jain, P., Grauman, K.: Fast similarity search for learned metrics. PAMI
31(12), 2143–2157 (2009)
5. Weiss, Y., Torralba, A., Fergus, R.: Spectral hashing. In: NIPS (2008)
6. Gong, Y., Lazebnik, S.: Iterative quantization: a procrustean approach to learning
binary codes. In: CVPR (2011)
7. He, K., Wen, F., Sun, J.: K-means hashing: an affinity-preserving quantization
method for learning binary compact codes. In: CVPR (2013)
8. Heo, J.P., Lee, Y., He, J., Chang, S.F., Yoon, S.E.: Spherical hashing. In: CVPR
(2012)
9. Kong, W., Li, W.J.: Isotropic hashing. In: NIPS (2012)
10. Strecha, C., Bronstein, A.M., Bronstein, M.M., Fua, P.: LDAHash: improved
matching with smaller descriptors. PAMI 34(1), 66–78 (2012)
11. Liu, W., Wang, J., Ji, R., Jiang, Y.G., Chang, S.F.: Supervised hashing with ker-
nels. In: CVPR (2012)
12. Norouzi, M., Fleet, D.J., Salakhutdinov, R.: Hamming distance metric learning.
In: NIPS (2012)
13. Lin, G., Shen, C., Shi, Q., van den Hengel, A., Suter, D.: Fast supervised hashing
with decision trees for high-dimensional data. In: CVPR (2014)
14. Kulis, B., Darrell, T.: Learning to hash with binary reconstructive embeddings. In:
NIPS (2009)
15. Shen, F., Shen, C., Liu, W., Tao Shen, H.: Supervised discrete hashing. In: CVPR
(2015)
16. Wang, J., Liu, W., Kumar, S., Chang, S.: Learning to hash for indexing big data
- a survey. CoRR (2015)
17. Wang, J., Shen, H.T., Song, J., Ji, J.: Hashing for similarity search: a survey. CoRR
(2014)
18. Grauman, K., Fergus, R.: Learning binary hash codes for large-scale image search.
In: Cipolla, R., Battiato, S., Farinella, G.M. (eds.) Machine Learning for Computer
Vision. SCI, vol. 411, pp. 55–93. Springer, Heidelberg (2013)
19. Erin Liong, V., Lu, J., Wang, G., Moulin, P., Zhou, J.: Deep hashing for compact
binary codes learning. In: CVPR (2015)
20. Wang, J., Kumar, S., Chang, S.: Semi-supervised hashing for large-scale search.
PAMI 34(12), 2393–2406 (2012)
21. Salakhutdinov, R., Hinton, G.E.: Semantic hashing. Int. J. Approximate Reasoning
50(7), 969–978 (2009)
22. Carreira-Perpinan, M.A., Raziperchikolaei, R.: Hashing with binary autoencoders.
In: CVPR (2015)
23. Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, New York (2006). Chap. 17
24. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale opti-
mization. Math. Program. 45, 503–528 (1989)
25. Krizhevsky, A.: Learning multiple layers of features from tiny images. Technical
report, University of Toronto (2009)
26. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadar-
rama, S., Darrell, T.: Caffe: Convolutional architecture for fast feature embedding
(2014). arXiv preprint: arXiv:1408.5093
27. Lecun, Y., Cortes, C.: The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/
28. Jégou, H., Douze, M., Schmid, C.: Product quantization for nearest neighbor
search. PAMI 33(1), 117–128 (2011)
29. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. IJCV 60(2),
91–110 (2004)
30. Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation
of the spatial envelope. IJCV 42(3), 145–175 (2001)
31. Nguyen, V.A., Lu, J., Do, M.N.: Supervised discriminative hashing for compact
binary codes. In: ACM MM (2014)
32. Zhao, F., Huang, Y., Wang, L., Tan, T.: Deep semantic ranking based hashing for
multi-label image retrieval. In: CVPR (2015)
33. Zhang, R., Lin, L., Zhang, R., Zuo, W., Zhang, L.: Bit-scalable deep hashing
with regularized similarity learning for image retrieval and person re-identification.
IEEE Trans. Image Process. 24(12), 4766–4779 (2015)
34. Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., Darrell, T.:
DeCAF: a deep convolutional activation feature for generic visual recognition. In:
ICML (2014)