"I met Mohsen when he was a Master's student and we worked together on several projects related to anomaly detection and video understanding. It was a pleasure to collaborate with him and witness his progress. He was passionate, hardworking, and creative. He is now a well-respected researcher in the community, having published top papers in his field. Additionally, he is always willing to lend a listening ear and provide valuable insight and guidance. I have no doubt that he will continue to make significant contributions to the field and I am proud to have had the opportunity to work with him."
Mohsen Fayyaz
Berlin, Berlin, Germany
4,553 followers
500+ connections
Activity
-
Are you passionate about Large Vision models? How about efficiency? We are looking for reviewers for eLVM workshop workshop page:…
Liked by Mohsen Fayyaz
-
Hey, friends, I just joined Bluesky. If you’re there too, please follow me there so I can find you. https://lnkd.in/gn7_TW87 Cheers.
Liked by Mohsen Fayyaz
-
Grateful for the opportunity to spend time with esteemed colleagues from across the company, discussing our work in advancing human-agent…
Liked by Mohsen Fayyaz
Experience
Education
-
-
–
Supervisor: Dr. K. Kiani.
Thesis: Designing and Implementing a Cloud-based Accounting System
Ranked first, with the highest GPA among all of the university's computer software engineering students since 2010.
-
–
Activities and Societies: Computer Vision and Robotics Team
National Organization for Development of Exceptional Talent (NODET)
Licenses & Certifications
Volunteer Experience
-
Admin
Deeplearning.ir
– Present, 9 years 1 month
Education
Helping Iranian graduate students learn machine learning and deep learning by answering questions in our Q&A forum on the http://www.deeplearning.ir website.
Publications
-
Fast Weakly Supervised Action Segmentation Using Mutual Consistency
IEEE Transactions on Pattern Analysis and Machine Intelligence, TPAMI
Action segmentation is the task of predicting the actions in each frame of a video. Because of the high cost of preparing training videos with full supervision for action segmentation, weakly supervised approaches that can learn from transcripts alone are very appealing. In this paper, we propose a new approach for weakly supervised action segmentation based on a two-branch network. The two branches of our network predict two redundant but different representations for action segmentation. During training, we introduce a new mutual consistency loss (MuCon) that enforces that these two representations are consistent. Using MuCon and a transcript prediction loss, our network achieves state-of-the-art results for action segmentation and action alignment while being fully differentiable and faster to train, since it does not require a costly alignment step during training.
-
AVID: Adversarial Visual Irregularity Detection
ACCV 2018, Asian Conference on Computer Vision
Real-time detection of irregularities in visual data is invaluable in many prospective applications, including surveillance and patient monitoring systems. With the surge of deep learning methods in recent years, researchers have tried a wide spectrum of methods for different applications. However, for irregularity or anomaly detection in videos, training an end-to-end model is still an open challenge, since irregularity is often not well-defined and there are not enough irregular samples to use during training. In this paper, inspired by the success of generative adversarial networks (GANs) for training deep models in unsupervised or self-supervised settings, we propose an end-to-end deep network for detection and fine localization of irregularities in videos (and images). Our proposed architecture is composed of two networks, which are trained by competing with each other while collaborating to find the irregularity. One network works as a pixel-level irregularity Inpainter (I), and the other works as a patch-level Detector (D). After adversarial self-supervised training, in which I tries to fool D into accepting its inpainted output as regular (normal), the two networks collaborate to detect and fine-segment the irregularity in any given test video. Our results on three different datasets show that our method can outperform the state of the art and fine-segment the irregularity.
-
Spatio-Temporal Channel Correlation Networks for Action Classification
ECCV 2018, European Conference on Computer Vision
The work in this paper is driven by the question of whether spatio-temporal correlations are enough for 3D convolutional neural networks (CNNs). Most traditional 3D networks use local spatio-temporal features. We introduce a new block that models correlations between channels of a 3D CNN with respect to temporal and spatial features. This new block can be added as a residual unit to different parts of 3D CNNs. We name our novel block 'Spatio-Temporal Channel Correlation' (STC). By embedding this block into current state-of-the-art architectures such as ResNext and ResNet, we improved performance by 2-3% on the Kinetics dataset. Our experiments show that adding STC blocks to current state-of-the-art architectures outperforms the state-of-the-art methods on the HMDB51, UCF101 and Kinetics datasets. Another issue in training 3D CNNs is that they must be trained from scratch on a huge labeled dataset to reach reasonable performance, so the knowledge learned by 2D CNNs is completely ignored. Another contribution of this work is a simple and effective technique to transfer knowledge from a pre-trained 2D CNN to a randomly initialized 3D CNN for a stable weight initialization. This allows us to significantly reduce the number of training samples for 3D CNNs. Thus, by fine-tuning this network, we beat the performance of generic and recent 3D CNN methods that were trained on large video datasets, e.g. Sports-1M, and fine-tuned on the target datasets, e.g. HMDB51/UCF101.
-
Temporal 3D ConvNets by Temporal Transition Layer
CVPR 2018, IEEE Conference on Computer Vision and Pattern Recognition Workshops
The work in this paper is driven by the question of how to exploit the temporal cues available in videos for their accurate classification, and for human action recognition in particular. Thus far, the vision community has focused on spatio-temporal approaches with fixed temporal convolution kernel depths. We introduce a new temporal layer that models variable temporal convolution kernel depths. We embed this new temporal layer in our proposed 3D CNN. We extend the DenseNet architecture, which normally is 2D, with 3D filters and pooling kernels. We name our proposed video convolutional network 'Temporal 3D ConvNet' (T3D) and its new temporal layer 'Temporal Transition Layer' (TTL). Our experiments show that T3D outperforms the current state-of-the-art methods on the HMDB51, UCF101 and Kinetics datasets.
-
Deep-anomaly: Fully convolutional neural network for fast anomaly detection in crowded scenes
Computer Vision and Image Understanding
The detection of abnormal behavior in crowded scenes has to deal with many challenges. This paper presents an efficient method for detection and localization of anomalies in videos. Using fully convolutional neural networks (FCNs) and temporal data, a pre-trained supervised FCN is transferred into an unsupervised FCN ensuring the detection of (global) anomalies in scenes. High performance in terms of speed and accuracy is achieved by investigating the cascaded detection as a result of reducing computation complexities. This FCN-based architecture addresses two main tasks, feature representation and cascaded outlier detection. Experimental results on two benchmarks suggest that the proposed method outperforms existing methods in terms of accuracy regarding detection and localization.
-
Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet
arXiv preprint
Major winning Convolutional Neural Networks (CNNs), such as VGGNet, ResNet, DenseNet, etc., include tens to hundreds of millions of parameters, which impose considerable computation and memory overheads. This limits their practical usage in training and optimizing for real-world applications. In contrast, lightweight architectures, such as SqueezeNet, are being proposed to address this issue. However, they mainly suffer from low accuracy, as they compromise between processing power and efficiency. These inefficiencies mostly stem from following an ad-hoc design procedure. In this work, we discuss and propose several crucial design principles for efficient architecture design and elaborate on intuitions concerning different aspects of the design procedure. Furthermore, we introduce a new layer called SAF-pooling to improve the generalization power of the network while keeping it simple by choosing the best features. Based on these principles, we propose a simple architecture called SimpNet. We empirically show that SimpNet provides a good trade-off between computation/memory efficiency and accuracy based solely on these primitive but crucial principles. SimpNet outperforms deeper and more complex architectures such as VGGNet, ResNet, WideResidualNet, etc., on several well-known benchmarks, while having 2 to 25 times fewer parameters and operations. We obtain state-of-the-art results (in terms of a balance between the accuracy and the number of involved parameters) on standard datasets, such as CIFAR10, CIFAR100, MNIST and SVHN.
-
Deep-cascade: Cascading 3D Deep Neural Networks for Fast Anomaly Detection and Localization in Crowded Scenes
IEEE Transactions on Image Processing, TIP
This paper proposes a fast and reliable method for anomaly detection and localization in video data showing crowded scenes. Time-efficient anomaly localization is an ongoing challenge and the subject of this paper. We propose a cubic patch-based method, characterized by a cascade of classifiers, which makes use of an advanced feature learning approach. Our cascade of classifiers has two main stages. First, a light but deep 3D auto-encoder is used for early identification of “many” normal cubic patches. This deep network operates on small cubic patches as the first stage, before carefully resizing the remaining candidates of interest and evaluating those at the second stage using a more complex and deeper 3D convolutional neural network (CNN). We divide the deep autoencoder and the CNN into multiple sub-stages which operate as cascaded classifiers. Shallow layers of the cascaded deep networks (designed as Gaussian classifiers, acting as weak single-class classifiers) detect “simple” normal patches such as background patches, and more complex normal patches are detected at deeper layers. It is shown that the proposed novel technique (a cascade of two cascaded classifiers) performs comparably to current top-performing detection and localization methods on standard benchmarks, but outperforms those in general with respect to required computation time.
-
STFCN - Spatio-Temporal Fully Convolutional Neural Network for Semantic Segmentation of Street Scenes
ACCV 2016, Asian Conference on Computer Vision Workshops
This paper presents a novel method to involve both spatial and temporal features for semantic video segmentation. Current work on convolutional neural networks (CNNs) has shown that CNNs provide advanced spatial features supporting a very good performance of solutions for both image and video analysis, especially for the semantic segmentation task. We investigate how involving temporal features also has a good effect on segmenting video data. We propose a module based on a long short-term memory (LSTM) architecture of a recurrent neural network for interpreting the temporal characteristics of video frames over time. Our system takes as input frames of a video and produces a correspondingly-sized output; for segmenting the video our method combines the use of three components: first, the regional spatial features of frames are extracted using a CNN; then, using LSTM, the temporal features are added; finally, by deconvolving the spatio-temporal features we produce pixel-wise predictions. Our key insight is to build spatio-temporal convolutional networks (spatio-temporal CNNs) that have an end-to-end architecture for semantic video segmentation. We fully adapted some known convolutional network architectures (such as FCN-AlexNet and FCN-VGG16), as well as dilated convolution, into our spatio-temporal CNNs. Our spatio-temporal CNNs achieve state-of-the-art semantic segmentation, as demonstrated for the Camvid and NYUDv2 datasets.
-
A novel approach for Finger Vein verification based on self-taught learning
9th Iranian Conference on Machine Vision and Image Processing (MVIP)
In this paper, we propose a method for user Finger Vein Authentication (FVA) as a biometric system. Using discriminative features to classify these finger veins is one of the main factors that differentiate related works; thus, we propose to learn a set of representative features based on auto-encoders. We model the represented users' finger vein structure using a Gaussian distribution. Experimental results show that our method performs comparably to a state-of-the-art method on the SDUMLA-HMT benchmark.
-
Online Signature Verification Based on Feature Representation
International Symposium on Artificial Intelligence and Signal Processing
Signature verification techniques employ various specifications of a signature. Feature extraction and feature selection have an enormous effect on the accuracy of signature verification. Feature extraction is a difficult phase of signature verification systems due to the different shapes of signatures and different sampling conditions. This paper presents a method based on feature learning, in which a sparse autoencoder learns features of signatures. The learned features are then employed to represent users’ signatures. Finally, users’ signatures are classified using one-class classifiers. The proposed method is independent of signature shape thanks to learning features from users’ signatures with an autoencoder. The verification process of the proposed system is evaluated on the SVC2004 signature database, which contains genuine and skilled forgery signatures. The experimental results indicate error reduction and accuracy enhancement.
Courses
-
Digital Image Processing
-
-
Evolutionary Computing
-
-
Fuzzy Logic
-
-
Knowledge Engineering
-
-
Machine Learning
-
-
Natural Language Processing
-
-
Pattern Recognition
-
-
Social Networks
-
Honors & Awards
-
Awarded member of Iran National Elites Foundation (Society of prominent students of the country)
Iran National Elites Foundation
Iran's National Elites Foundation (INEF) (Persian: بنياد ملي نخبگان) is an Iranian governmental organization founded on 31 May 2005 by approval of the Supreme Cultural Revolution Council of Iran. The main purpose of the foundation is to recognize, organize and support Iran's elite national talents. Members of the foundation include all who show exceptionally high intellectual capacity, academic aptitude, creative ability and artistic talent, especially contributors to the promotion of global science and highly cited scientists and researchers. INEF is a statewide organization composed of members with significant scientific and executive backgrounds.
Languages
-
Persian
Native or bilingual proficiency
-
English
Full professional proficiency
Organizations
-
Microsoft
Research Intern
– Present
-
University of Bonn
Doctoral Researcher
– -
Bosch Center for Artificial Intelligence (BCAI)
Research Intern
– -
Sensifai
-
–
Recommendations received
2 people have recommended Mohsen Fayyaz
More activity by Mohsen Fayyaz
-
I am seeking several Research Assistants to work on Image/video anomaly detection. If you are interested and have sufficient experience in this area,…
Liked by Mohsen Fayyaz
-
We extended AMIE, our research conversational medical AI, with new capabilities, going beyond diagnosis to treatment and management over time. We…
Liked by Mohsen Fayyaz
-
I'm happy to share that we've released our first vision-language model: Instella-VL-1B, trained on AMD MI300 GPUs and built on top of our AMD-OLMo-1B…
Liked by Mohsen Fayyaz
-
So fun to be a judge at the Health Innovation Challenge at the University of Washington. So many talented students, amazing pitches, and great ideas…
Liked by Mohsen Fayyaz
-
418 pages of Probabilistic AI gold just dropped! Easily one of the best courses I've attended at ETH, taught by Andreas Krause. These notes cover…
Liked by Mohsen Fayyaz
-
Scaling up agents and simulating social media!!!! Our new work in our open source Camel agentic framework!!! Too many co-authors to tag them all!!!
Liked by Mohsen Fayyaz
-
Super proud of our team enabling these large models to run locally at speed on the NPU. Great to have a speedy outlet for AI innovation with ai…
Liked by Mohsen Fayyaz
-
🚀🚀🚀 Call for Papers – Efficient Large Vision Models (ELVM) Workshop @ CVPR 2025 Large Vision Models (LVMs) are transforming computer vision, from…
Liked by Mohsen Fayyaz
-
Great #CVPR workshop on Efficiency of Large Vision Models
Liked by Mohsen Fayyaz
-
Today we bring the latest #DeepSeek distilled models to #Copilot+ PC’s, where they really shine due to the devices’ efficiency and power. This…
Liked by Mohsen Fayyaz
-
Check out our work ✨ "Fréchet Wavelet Distance (FWD): A Domain-Agnostic Metric for Image Generation", accepted at #ICLR2025! 🎉 In collaboration…
Liked by Mohsen Fayyaz
-
Note we have some post docs advertized, please remember that I do not look at my inbox on linkedin generally, instructions are here…
Liked by Mohsen Fayyaz
Other similar profiles
Other members named Mohsen Fayyaz
-
Mohsen Fayyaz
CS PhD Student · UCLA
-
Mohsen Fayyaz Dastjerdi
-
Mohsen Fayyaz
Supervisor of optical fiber executive affairs
-
Mohsen Fayyaz
Civil Engineer at Salouk beton
There are 6 other people named Mohsen Fayyaz on LinkedIn.