Self-supervised Representation Learning
ICASSP 2024, Seoul, Korea
Wednesday 15th May, 2024
Outline
1 Introduction
2 Proposed Method
3 Experiments and Results
4 Conclusion
5 References
Introduction
Background
[Figure: SSL framework. An input image is fed to a backbone network, which is trained on a pretext task (here, producing a colorization output); the learned backbone then serves as a feature extractor.]
Introduction
Existing Works
[Figure: existing contrastive SSL pipeline. An input X is augmented with two transformations t1, t2 ~ T into views V1 and V2; each view passes through the encoder E (features F_V1, F_V2) and projection head P (embeddings Z_V1, Z_V2), which are trained with the unsupervised contrastive loss L_UnSUP(Z_V1, Z_V2).]
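For reference, a minimal sketch of the unsupervised contrastive loss L_UnSUP in the NT-Xent form popularized by SimCLR [4]; the function name nt_xent and the PyTorch-style code are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def nt_xent(z_v1, z_v2, temperature=0.1):
    # NT-Xent loss between two batches of view embeddings.
    # z_v1, z_v2: (N, D) projection-head outputs for the two augmented views.
    z = F.normalize(torch.cat([z_v1, z_v2], dim=0), dim=1)   # (2N, D), unit-norm rows
    sim = z @ z.t() / temperature                            # pairwise cosine similarities
    n = z_v1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))               # exclude self-similarity
    # Positives: the i-th sample of the first half pairs with the i-th of the second half, and vice versa.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)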
Introduction
Limitations
Introduction
Motivation
[Figure: motivation. Sets of positive samples are pulled together by a supervised contrastive loss.]
Proposed Method
Proposed Architecture
[Figure: proposed IPCL architecture. An input X is augmented with t1, t2 ~ T into views V1 and V2, and a pseudo-labelled sample Y provides an additional view W1 (t1 ~ T). All views pass through the shared encoder E and projection head P, giving embeddings Z_W1, Z_V1, Z_V2, and the encoder is updated with the combined loss λ · L_SUP(Z_W1, Z_V1) + (1 − λ) · L_UnSUP(Z_V1, Z_V2). Training alternates between pseudo-label generation (SCAN) and pseudo-supervised training: the first training iteration uses λ = 0, and subsequent iterations use λ > 0.]
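A minimal sketch of the encoder update implied by this figure, weighting a pseudo-supervised contrastive term against the unsupervised one with λ; ipcl_loss, sup_con, and nt_xent are assumed helper names, not the authors' code.

def ipcl_loss(z_w1, z_v1, z_v2, pseudo_labels, lam, temperature=0.1):
    # Combined objective drawn in the architecture figure:
    #   lam * L_SUP(Z_W1, Z_V1) + (1 - lam) * L_UnSUP(Z_V1, Z_V2)
    # First iteration: lam = 0 (purely unsupervised); once SCAN-style
    # pseudo-labels exist, lam > 0 adds the supervised contrastive term.
    loss_unsup = nt_xent(z_v1, z_v2, temperature)                  # defined earlier
    if lam == 0:
        return loss_unsup
    loss_sup = sup_con(z_w1, z_v1, pseudo_labels, temperature)     # assumed SupCon-style helper [7]
    return lam * loss_sup + (1 - lam) * loss_unsup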
Proposed Method
Major Contributions
Proposed Method
– A(i) and B(i) denote the sets of view indices drawn from the same and from a different pseudo-class as the i-th view, respectively, in a multi-view batch of size 2N.
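The loss expression itself is not legible in the extracted slide; a hedged reconstruction consistent with these definitions and with supervised contrastive learning [7] would read (τ is the temperature, z_i the projected embedding of view i):

\mathcal{L}_{\mathrm{SUP}} \;=\; \sum_{i=1}^{2N} \frac{-1}{|A(i)|} \sum_{a \in A(i)} \log \frac{\exp\left(z_i \cdot z_a / \tau\right)}{\sum_{b \in A(i) \cup B(i)} \exp\left(z_i \cdot z_b / \tau\right)}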
Proposed Method
IPCL Algorithm.
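The algorithm listing did not survive text extraction; below is a minimal sketch of the iterative loop implied by the architecture slide (unsupervised warm-up with λ = 0, SCAN-based pseudo-label generation, then pseudo-supervised training with λ > 0). All helper names here (augment, sample_same_pseudo_class, scan_cluster, sgd_step) are illustrative assumptions.

def train_ipcl(model, loader, num_iterations, lam):
    pseudo_labels = None                                   # no pseudo-labels before the first SCAN pass
    for it in range(num_iterations):
        weight = 0.0 if it == 0 else lam                   # [lam = 0] in the first iteration, [lam > 0] afterwards
        for x, idx in loader:
            v1, v2 = augment(x), augment(x)                # two views, t1, t2 ~ T
            z_v1, z_v2 = model(v1), model(v2)
            if weight > 0:
                w1 = augment(sample_same_pseudo_class(idx, pseudo_labels))   # view from the same pseudo-class
                loss = ipcl_loss(model(w1), z_v1, z_v2, pseudo_labels[idx], weight)
            else:
                loss = nt_xent(z_v1, z_v2)                 # purely unsupervised warm-up
            loss.backward()
            sgd_step(model)                                # encoder update
        pseudo_labels = scan_cluster(model, loader)        # pseudo-label generation via SCAN [8]
    return model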
Experiments and Results
Experimental Settings
• Datasets:
– CIFAR-10 [9]: 50,000 training and 10,000 test 32x32 color images across 10 classes (6,000 images per class).
– STL-10 [10]: 100,000 unlabeled training images and 8,000 labeled 96x96 color test images across 10 classes.
• Architecture:
– ResNet18 base encoder as a feature extractor.
– A 2-layer MLP projection head yielding 128-dimensional feature vectors.
• Hyper-parameters (a configuration sketch follows this list):
– Temperature coefficient of 0.1.
– Stochastic gradient descent (SGD) optimizer (momentum = 0.9 and
learning rate = 0.4) and a cosine learning rate scheduler.
– Batch size of 512.
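A configuration sketch matching these settings, assuming a PyTorch/torchvision setup; the 512-unit hidden layer of the projection head and the number of training epochs are illustrative, since the slide does not state them.

import torch
import torch.nn as nn
from torchvision.models import resnet18

total_epochs = 100                                   # placeholder; not stated on the slide
backbone = resnet18()
backbone.fc = nn.Identity()                          # expose the 512-d pooled features
projector = nn.Sequential(                           # 2-layer MLP head -> 128-d embeddings
    nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 128))
model = nn.Sequential(backbone, projector)

optimizer = torch.optim.SGD(model.parameters(), lr=0.4, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_epochs)
temperature = 0.1                                    # contrastive temperature coefficient
batch_size = 512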
Experiments and Results
Table 1: Comparison of IPCL and SimCLR for Top-K NN precision over CIFAR-10
and STL-10 datasets
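For context, a minimal sketch of how a top-K NN precision metric over frozen embeddings is commonly computed; the exact evaluation protocol behind Table 1 is not shown on the slide, so treat this as an assumption rather than the paper's procedure.

import torch

def topk_nn_precision(train_z, train_y, test_z, test_y, k=5):
    # Fraction of the k nearest training embeddings (by cosine similarity)
    # that share the query's class label, averaged over the test set.
    train_z = torch.nn.functional.normalize(train_z, dim=1)
    test_z = torch.nn.functional.normalize(test_z, dim=1)
    sim = test_z @ train_z.t()                       # (n_test, n_train) similarities
    nn_idx = sim.topk(k, dim=1).indices              # indices of the k nearest neighbours
    nn_labels = train_y[nn_idx]                      # (n_test, k) neighbour labels
    correct = (nn_labels == test_y.unsqueeze(1)).float()
    return correct.mean().item()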
Experiments and Results
Downstream Results
Experiments and Results
Ablation Study
Conclusion
Conclusion
References
References
1 Longlong Jing and Yingli Tian, "Self-supervised visual feature learning with deep neural networks: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 11, pp. 4037–4058, 2020.
2 Yann LeCun and Ishan Misra, "Self-Supervised Learning: The Dark Matter of Intelligence," Meta AI Blog, ai.meta.com/blog/self-supervised-learning-the-dark-matter-of-intelligence/, accessed 8 April 2024.
3 E. Zheltonozhskii, C. Baskin, A. M. Bronstein, and A. Mendelson, "Self-supervised learning for large-scale unsupervised image clustering," arXiv preprint arXiv:2008.10312, 2020.
4 Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton, "A simple framework for contrastive learning of visual representations," in International Conference on Machine Learning. PMLR, 2020, pp. 1597–1607.
5 Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick, "Momentum contrast for unsupervised visual representation learning," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 9729–9738.
6 Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, and Armand Joulin, "Unsupervised learning of visual features by contrasting cluster assignments," Advances in Neural Information Processing Systems, vol. 33, pp. 9912–9924, 2020.
7 Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, and Dilip Krishnan, "Supervised contrastive learning," Advances in Neural Information Processing Systems, vol. 33, pp. 18661–18673, 2020.
8 Wouter Van Gansbeke, Simon Vandenhende, Stamatios Georgoulis, Marc Proesmans, and Luc Van Gool, "SCAN: Learning to classify images without labels," in European Conference on Computer Vision. Springer, 2020, pp. 268–285.
9 Alex Krizhevsky, Geoffrey Hinton, et al., "Learning multiple layers of features from tiny images," 2009.
10 Adam Coates, Andrew Ng, and Honglak Lee, "An analysis of single-layer networks in unsupervised feature learning," in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings, 2011, pp. 215–223.
References
Q&A
THANK YOU
GitHub Link