Similarity Learning
Similarity learning is an area of supervised machine learning in artificial intelligence. It is closely related to
regression and classification, but the goal is to learn a similarity function that measures how similar or
related two objects are. It has applications in ranking, recommendation systems, visual identity tracking,
face verification, and speaker verification.
Learning setup
There are four common setups for similarity and metric distance learning.

Regression similarity learning. Pairs of objects $(x_i^1, x_i^2)$ are given together with a measure of their similarity $y_i \in \mathbb{R}$, and the goal is to learn a function $f$ such that $f(x_i^1, x_i^2) \approx y_i$.

Classification similarity learning. Pairs of objects are given labeled as similar or dissimilar, and the goal is to learn a classifier that decides whether a new pair of objects is similar.

Ranking similarity learning. Triplets of objects $(x_i, x_i^+, x_i^-)$ are given, where $x_i$ is known to be more similar to $x_i^+$ than to $x_i^-$, and the goal is to learn a function $f$ such that $f(x_i, x_i^+) > f(x_i, x_i^-)$ for any new triplet.[1]

Locality sensitive hashing (LSH).[2] Input items are hashed so that similar items map to the same "buckets" in memory with high probability. This is often applied in nearest neighbor search on large-scale, high-dimensional data.[3]

A common approach for learning similarity is to model the similarity function as a bilinear form. For
example, in the case of ranking similarity learning, one aims to learn a matrix $W$ that parametrizes the
similarity function $f_W(x, z) = x^\top W z$. When data is abundant, a common approach is to learn a Siamese
network, a deep network model with parameter sharing.
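As a concrete illustration, such a bilinear similarity can be evaluated directly in NumPy. In this minimal sketch the matrix W is random, standing in for parameters that a learning algorithm would actually fit:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5  # feature dimension

# Stand-in for a learned parameter matrix; a real method would
# fit W from supervised pairs or triplets rather than sample it.
W = rng.standard_normal((d, d))

def bilinear_similarity(x, z, W):
    """Evaluate the bilinear form f_W(x, z) = x^T W z."""
    return x @ W @ z

x = rng.standard_normal(d)
z = rng.standard_normal(d)
print(bilinear_similarity(x, z, W))
```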
Metric learning
Similarity learning is closely related to distance metric learning. Metric learning is the task of learning a
distance function over objects. A metric or distance function has to obey four axioms: non-negativity,
identity of indiscernibles, symmetry and subadditivity (or the triangle inequality). In practice, metric
learning algorithms ignore the condition of identity of indiscernibles and learn a pseudo-metric.
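Written out for a distance function $d$ over objects $x_1, x_2, x_3$, the four axioms read:

```latex
\begin{align*}
  &\text{non-negativity:}             && d(x_1, x_2) \ge 0 \\
  &\text{identity of indiscernibles:} && d(x_1, x_2) = 0 \iff x_1 = x_2 \\
  &\text{symmetry:}                   && d(x_1, x_2) = d(x_2, x_1) \\
  &\text{triangle inequality:}        && d(x_1, x_3) \le d(x_1, x_2) + d(x_2, x_3)
\end{align*}
```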
When the objects $x_i$ are vectors in $\mathbb{R}^d$, then any matrix $W$ in the symmetric positive semi-definite cone
$S_+^d$ defines a distance pseudo-metric of the space of $x$ through the form
$D_W(x_1, x_2)^2 = (x_1 - x_2)^\top W (x_1 - x_2)$. When $W$ is a symmetric positive definite matrix, $D_W$ is a
metric. Moreover, as any symmetric positive semi-definite matrix $W \in S_+^d$ can be decomposed as $W = L^\top L$
where $L \in \mathbb{R}^{e \times d}$ and $e \ge \mathrm{rank}(W)$, the distance function can be rewritten equivalently as
$D_W(x_1, x_2)^2 = (x_1 - x_2)^\top L^\top L (x_1 - x_2) = \| L (x_1 - x_2) \|_2^2$. The distance
$D_W(x_1, x_2)^2$ thus corresponds to the squared Euclidean distance between the transformed feature
vectors $x_1' = L x_1$ and $x_2' = L x_2$.
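The equivalence between the bilinear form and the transformed Euclidean distance can be checked numerically. In the sketch below L is random, standing in for a learned transform:

```python
import numpy as np

rng = np.random.default_rng(0)
d, e = 5, 3  # input dimension, embedding dimension

L = rng.standard_normal((e, d))  # stand-in for a learned transform
W = L.T @ L                      # symmetric positive semi-definite by construction

x1, x2 = rng.standard_normal(d), rng.standard_normal(d)

diff = x1 - x2
d_bilinear = diff @ W @ diff                  # (x1 - x2)^T W (x1 - x2)
d_euclidean = np.sum((L @ x1 - L @ x2) ** 2)  # ||L x1 - L x2||_2^2

assert np.isclose(d_bilinear, d_euclidean)
print(d_bilinear)
```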
Many formulations for metric learning have been proposed.[4][5] Some well-known approaches for metric
learning include learning from relative comparisons,[6] which is based on the triplet loss, large margin
nearest neighbor (LMNN),[7] and information-theoretic metric learning (ITML).[8]
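The triplet loss penalizes an anchor that is not closer to its positive than to its negative by at least a margin. A minimal NumPy version (the margin value here is an arbitrary illustration):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss max(0, ||a - p||^2 - ||a - n||^2 + margin)
    on squared Euclidean distances."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

rng = np.random.default_rng(0)
a, p, n = rng.standard_normal((3, 4))  # anchor, positive, negative
print(triplet_loss(a, p, n))
```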
In statistics, the covariance matrix of the data is sometimes used to define a distance metric called
Mahalanobis distance.
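For instance, SciPy exposes this distance directly, given the inverse covariance matrix of a (here synthetic) sample:

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))            # synthetic sample, rows are observations

VI = np.linalg.inv(np.cov(X, rowvar=False))  # inverse covariance matrix

print(mahalanobis(X[0], X[1], VI))           # sqrt((u - v)^T VI (u - v))
```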
Applications
Similarity learning is used in information retrieval for learning to rank, in face verification or face
identification,[9][10] and in recommendation systems. Also, many machine learning approaches rely on
some metric. This includes unsupervised learning such as clustering, which groups together close or similar
objects. It also includes supervised approaches like the k-nearest neighbor algorithm, which rely on the labels of
nearby objects to decide on the label of a new object. Metric learning has been proposed as a preprocessing
step for many of these approaches.[11]
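The preprocessing idea can be sketched as follows: map the features through a linear transform L and run ordinary k-nearest neighbors in the transformed space. Here L is random to keep the sketch self-contained; in practice it would come from a metric learning algorithm such as LMNN:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Stand-in for a transform produced by a metric learning algorithm;
# it is random here purely for illustration.
rng = np.random.default_rng(0)
L = rng.standard_normal((2, X.shape[1]))

# Euclidean k-NN on L-transformed features equals k-NN under D_W on raw features.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X @ L.T, y)
print(knn.score(X @ L.T, y))
```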
Scalability
Metric and similarity learning naively scale quadratically with the dimension of the input space, as one can
easily see when the learned metric has a bilinear form $f_W(x, z) = x^\top W z$. Scaling to higher dimensions
can be achieved by enforcing a sparseness structure over the matrix model, as done with HDSL[12] and
with COMET.[13]
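The parameter count makes the quadratic growth concrete; a quick illustration:

```python
# A full matrix W has d^2 parameters; a diagonal model, one extreme
# of a sparse structure, has only d.
for d in (100, 1_000, 10_000):
    print(f"d={d:>6}: full W: {d * d:>12,} parameters, diagonal: {d:>6,}")
```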
Software
metric-learn (https://github.com/scikit-learn-contrib/metric-learn) is a free software Python
library which offers efficient implementations of several supervised and weakly-supervised
similarity and metric learning algorithms. The API of metric-learn is compatible with scikit-
learn;[14] a usage sketch appears after this list.
OpenMetricLearning (https://github.com/OML-Team/open-metric-learning) is a Python
framework to train and validate models that produce high-quality embeddings.
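As an illustration of that scikit-learn-compatible API, a supervised algorithm such as NCA from metric-learn can be fit and used as a transformer (a minimal sketch; class and method names follow the metric-learn documentation):

```python
from metric_learn import NCA
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

nca = NCA()                    # neighborhood components analysis
nca.fit(X, y)                  # learn a linear transform from labeled data
X_embedded = nca.transform(X)  # features mapped into the learned metric space
print(X_embedded.shape)
```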
See also
Kernel method
Learning to rank
Latent semantic analysis
Further reading
For further information on this topic, see the surveys on metric and similarity learning by Bellet et al.[4] and
Kulis.[5]
References
1. Chechik, G.; Sharma, V.; Shalit, U.; Bengio, S. (2010). "Large Scale Online Learning of
Image Similarity Through Ranking" (http://www.jmlr.org/papers/volume11/chechik10a/chechi
k10a.pdf) (PDF). Journal of Machine Learning Research. 11: 1109–1135.
2. Gionis, A.; Indyk, P.; Motwani, R. (1999). "Similarity search in high dimensions via
hashing". VLDB. Vol. 99, No. 6.
3. Rajaraman, A.; Ullman, J. (2010). "Mining of Massive Datasets, Ch. 3" (http://infolab.stanford.
edu/~ullman/mmds.html).
4. Bellet, A.; Habrard, A.; Sebban, M. (2013). "A Survey on Metric Learning for Feature Vectors
and Structured Data". arXiv:1306.6709 (https://arxiv.org/abs/1306.6709) [cs.LG (https://arxiv.
org/archive/cs.LG)].
5. Kulis, B. (2012). "Metric Learning: A Survey" (https://www.nowpublishers.com/article/Details/
MAL-019). Foundations and Trends in Machine Learning. 5 (4): 287–364.
doi:10.1561/2200000019 (https://doi.org/10.1561%2F2200000019).
6. Schultz, M.; Joachims, T. (2004). "Learning a distance metric from relative comparisons" (htt
ps://papers.nips.cc/paper/2366-learning-a-distance-metric-from-relative-comparisons.pdf)
(PDF). Advances in Neural Information Processing Systems. 16: 41–48.
7. Weinberger, K. Q.; Blitzer, J. C.; Saul, L. K. (2006). "Distance Metric Learning for Large
Margin Nearest Neighbor Classification" (http://books.nips.cc/papers/files/nips18/NIPS2005
_0265.pdf) (PDF). Advances in Neural Information Processing Systems. 18: 1473–1480.
8. Davis, J. V.; Kulis, B.; Jain, P.; Sra, S.; Dhillon, I. S. (2007). "Information-theoretic metric
learning" (http://www.cs.utexas.edu/users/pjain/itml/). International Conference in Machine
Learning (ICML): 209–216.
9. Guillaumin, M.; Verbeek, J.; Schmid, C. (2009). "Is that you? Metric learning approaches for
face identification" (http://hal.inria.fr/docs/00/58/50/36/PDF/verbeek09iccv2.pdf) (PDF). IEEE
International Conference on Computer Vision (ICCV).
10. Mignon, A.; Jurie, F. (2012). "PCCA: A new approach for distance learning from sparse
pairwise constraints" (http://hal.archives-ouvertes.fr/docs/00/80/60/07/PDF/12_cvpr_ldca.pd
f) (PDF). IEEE Conference on Computer Vision and Pattern Recognition.
11. Xing, E. P.; Ng, A. Y.; Jordan, M. I.; Russell, S. (2002). "Distance Metric Learning, with
Application to Clustering with Side-information" (https://ai.stanford.edu/~ang/papers/nips02-
metric.pdf) (PDF). Advances in Neural Information Processing Systems. 15: 505–512.
12. Liu; Bellet; Sha (2015). "Similarity Learning for High-Dimensional Sparse Data" (http://jmlr.or
g/proceedings/papers/v38/liu15.pdf) (PDF). International Conference on Artificial
Intelligence and Statistics (AISTATS). arXiv:1411.2374 (https://arxiv.org/abs/1411.2374).
Bibcode:2014arXiv1411.2374L (https://ui.adsabs.harvard.edu/abs/2014arXiv1411.2374L).
13. Atzmon; Shalit; Chechik (2015). "Learning Sparse Metrics, One Feature at a Time" (http://jml
r.org/proceedings/papers/v44/atzmon2015.pdf) (PDF). J. Mach. Learn. Research (JMLR).
14. Vazelhes; Carey; Tang; Vauquier; Bellet (2020). "metric-learn: Metric Learning Algorithms in
Python" (https://www.jmlr.org/papers/volume21/19-678/19-678.pdf) (PDF). J. Mach. Learn.
Research (JMLR). arXiv:1908.04710 (https://arxiv.org/abs/1908.04710).