Michael Canesche's paper, which describes the new kernel fusion algorithm used in the Cadence XNNC Tensor Compiler, has been accepted at the International Conference on Compiler Construction. Tensor compilers like XLA, TVM, and TensorRT operate on computational graphs, where vertices represent operations, and edges denote data flow between these operations. Operator fusion is an optimization technique that combines multiple operators into a single, more efficient operation. The paper "Fusion of Operators of Computational Graphs via Greedy Clustering: The XNNC Experience" introduces the operator fusion algorithm recently implemented in the Xtensa Neural Network Compiler (XNNC). XNNC is a toolchain designed for deploying machine learning models on Cadence's Tensilica processors. These edge-device processors are widely used in applications such as automotive systems, consumer electronics, communications, LiDAR, and radar technologies. First released in 2017 to complement Tensilica’s Vision 7 processors, XNNC has since evolved significantly. Now in version 3.0, its codebase spans hundreds of thousands of lines of C++ code. XNNC has been used to compile thousands of neural networks for a broad range of Xtensa architectures, and its design and implementation continue to advance, as this paper demonstrates. Read the paper: https://lnkd.in/dsQJtSkz #compilers #research #university #education #gradschool
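The core idea of greedy clustering for operator fusion can be illustrated with a toy sketch. This is not XNNC's actual algorithm (the paper describes the real one); it only shows the greedy idea on a simplified linear chain of operators, where cheap elementwise ops are absorbed into the cluster opened by the preceding heavy op. The `FUSIBLE` set and operator names are made up for illustration.

```python
# Toy sketch of greedy operator clustering, NOT XNNC's algorithm.
# Elementwise ops are cheap to fuse; heavy ops (conv, matmul)
# open a new cluster.
FUSIBLE = {"relu", "add", "mul", "bias"}

def greedy_fuse(ops):
    """Greedily group a linear chain of ops into fused clusters."""
    clusters = []
    for op in ops:
        # Absorb an elementwise op into the current cluster;
        # otherwise start a new cluster with this op.
        if clusters and op in FUSIBLE:
            clusters[-1].append(op)
        else:
            clusters.append([op])
    return clusters
```

A real fusion pass works on a DAG rather than a chain and must also check that merging two clusters does not create a cycle; the greedy flavor is the same.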
Compilers Lab’s Post
-
Hi all, I recently worked on a small but fun CV project to build a real-time drone detection system using the YOLO11n deep learning model. With a dataset from Roboflow (https://lnkd.in/eGHrCSfi), I trained the model on my NVIDIA GTX 1660 Super, leveraging CUDA and cuDNN for GPU acceleration.

Results:
- Achieved 90%+ mAP50 for drone detection accuracy.
- Processed live camera feeds at ~20 FPS.
- Successfully tested the model on videos and live streams with minimal false positives!

Challenges:
- Formatting the dataset for YOLO.
- Managing dependencies like PyTorch and CUDA.
- Balancing real-time performance with model accuracy.

Key Takeaways:
- Proper dataset structure and GPU acceleration were game-changers.
- YOLO11n’s lightweight architecture made real-time inference possible.

This project showcased the potential for surveillance, monitoring, and even edge deployment!
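The mAP50 metric mentioned above counts a detection as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of the IoU computation (boxes as `(x1, y1, x2, y2)` corners; not tied to any particular framework's API):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

For example, two 10x10 boxes overlapping on half their width have IoU 1/3, which would not count as a match at the 0.5 threshold.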
-
Hello everyone,

Today, we've decided to postpone our next demo presentation to delve into some remarkable images from IMEC showcasing High-NA EUV work for Logic and DRAM. A big thank you to Subhash KM for sharing these.

While we don't currently have customers utilizing High-NA EUV tools, I took the initiative to run these images through our Measurement Utility (YieldPro 5.1.1). My goal was to see how our software handles the latest metrology challenges.

In about 20 minutes, I was able to create a 22nm-pitch 2D feature recipe for full cell segmentation. It wasn’t too challenging, but certainly not trivial. I used some deep learning pre-filtering along with simple gradient-based contour extraction to perform the segmentation. It's worth noting that the image had high SNR, making the use of deep learning somewhat excessive, but it was an interesting exercise nonetheless.

I've attached a video showcasing the process - please take a look and share your thoughts!

#gazadelendaest #deeplearning #resolution #beam #EHAR #DeepStructures #metrology #Fab #problemsolving #LVTailoring #innovation #measurement #SecondaryElectronDetection #BackScatteredElectronDetection #CriticalDimentionMetrology #noisereduction #artificialneuralnetworks #CAD2SEM #Die2DB #EPE #PSD #precision #accuracy #lithography #tmu #NoiselessPSD #TargetDesign #ai #BlindDenoising #neuralnetworks #weave #MassMeas #defects #Overlay #labview #python #noisereduction #appliedmaterials #hitachi #kla
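Gradient-based contour extraction, as mentioned in the post, amounts to thresholding the local gradient magnitude of the intensity image. A minimal pure-Python sketch of that idea on a 2D grid (the real YieldPro pipeline is of course far more involved; this only illustrates the central-difference gradient step):

```python
def gradient_magnitude(img):
    """Central-difference gradient magnitude for a 2D intensity grid
    (list of equal-length rows). Border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

def contour_mask(img, thresh):
    """Mark pixels whose gradient magnitude exceeds the threshold."""
    return [[v > thresh for v in row] for row in gradient_magnitude(img)]
```

On a high-SNR image like the one described, a fixed threshold already traces feature edges well, which is why the deep learning pre-filtering felt excessive.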
-
This is how Gaussian Splatting can leapfrog - big 🤘 #gaussiansplatting #3DGS #colmap
COLMAP-Free 3D Gaussian Splatting
UC San Diego, NVIDIA, UC Berkeley
CVPR 2024, Seattle ✨ Highlight ✨
page: https://lnkd.in/eNQMXtnS
arxiv: https://lnkd.in/ergUrMdP
video: https://lnkd.in/e363juUW
code: https://lnkd.in/exDrDdUJ
-
Expanding my skillset! Proud to share that I’ve completed the Image Processing Onramp course by MathWorks. Excited to apply these image processing techniques to real-world challenges. #MATLAB #ImageProcessing #Innovation #ContinuousGrowth
-
🚨 Latest group preprint: 🚨

FeNNol: an Efficient and Flexible Library for Building Force-field-enhanced Neural Network Potentials.

👉 : https://lnkd.in/enz4vrcb

A new #GPU-accelerated #opensource library for building, training, and running force-field-enhanced neural network potentials. It provides a flexible and modular system for building hybrid models, making it easy to combine state-of-the-art embeddings with ML-parameterized physical interaction terms without explicit programming.

FeNNol shrinks the performance gap between ML potentials and standard force fields. It can be used standalone or via Deep-HP within Tinker-HP; heavy #HPC optimization is underway for multi-node/multi-GPU runs.

Available at https://lnkd.in/e4MCDRp2

Great work by Thomas Plé, Olivier ADJOUA, Louis Lagardère. Funding: European Research Council (ERC) (project EMC2). Supercomputer time: GENCI.

#drugdesign #NeuralNetworks #GPU #supercomputing #HPC NVIDIA #machinelearning Sorbonne Université CNRS
-
🚨 CVPR 2024 Paper Alert 🚨

➡️ Paper Title: 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering

🌟 Few pointers from the paper:

🎯 To achieve real-time dynamic scene rendering with high training and storage efficiency, the authors propose 4D Gaussian Splatting (4D-GS) as a holistic representation for dynamic scenes, rather than applying 3D-GS to each frame individually.

🎯 4D-GS introduces a novel explicit representation containing both 3D Gaussians and 4D neural voxels.

🎯 A decomposed neural voxel encoding algorithm inspired by HexPlane efficiently builds Gaussian features from the 4D neural voxels; a lightweight MLP then predicts Gaussian deformations at novel timestamps.

🎯 4D-GS achieves real-time rendering at high resolutions: 82 FPS at 800×800 on an RTX 3090 GPU, while maintaining comparable or better quality than previous state-of-the-art methods.

🏢 Organization: Huazhong University of Science and Technology, Huawei Inc.
🧙 Paper Authors: Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, Xinggang Wang
1️⃣ Read the Full Paper here: https://lnkd.in/gxKg-5xn
2️⃣ Project Page: https://lnkd.in/gxMKFBas
3️⃣ Code: https://lnkd.in/gGvwsSCw
🎥 Be sure to watch the attached Demo Video - Sound on 🔊🔊
Music by Sergio Prosvirini from Pixabay

Find this valuable 💎? ♻️ REPOST and teach your network something new. Follow me, Naveen Manwani, for the latest updates on Tech and AI-related news, insightful research papers, and exciting announcements.

#cvpr2024
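The "lightweight MLP predicting Gaussian deformations" step can be sketched as a tiny one-hidden-layer network mapping a sampled 4D voxel feature to a 3D position offset. This is only an illustration of the data flow, with placeholder weights; the real 4D-GS deformation head is learned and also predicts rotation and scale changes.

```python
import math

def mlp_deform(feat, w1, b1, w2, b2):
    """Tiny 1-hidden-layer MLP: voxel feature -> (dx, dy, dz) offset.

    feat: feature vector sampled from the 4D voxel grid at (x, y, z, t).
    w1/b1, w2/b2: hidden and output layer weights (placeholders here;
    in 4D-GS these are trained end-to-end).
    """
    hidden = [math.tanh(sum(w * f for w, f in zip(row, feat)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]
```

At render time, each Gaussian's canonical mean is shifted by the predicted offset for the queried timestamp, which is what makes a single set of Gaussians cover a whole dynamic sequence.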
-
Sharing my implementation of the Neural Radiance Fields (NeRF) model! NeRF reconstructs and renders 3D scenes from a few 2D images, opening up new possibilities in 3D visualization. In this project, I utilized GPU acceleration with CUDA and PyTorch, and it was a great learning experience.

Special thanks to Maxime Vandegar and Quei-An Chen for their insightful repositories, which greatly assisted this implementation.

You can refer to these papers for NeRF:
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis: https://lnkd.in/duUz5qJd
NeRF: https://lnkd.in/d5-jZdD6

Or check out the project on my GitHub for more details and to explore the code: https://lnkd.in/dVJH27xR

Note: this model takes approximately 8 hours to train on an RTX 2080 Ti GPU (also the reason for my keyboard interrupt).

#MachineLearning #NeRF #CUDA #PyTorch #ResearchImplementation #CNN #deeplearning
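A small, self-contained piece of NeRF worth highlighting is the positional encoding from the original paper: each input coordinate is lifted to a vector of sines and cosines at exponentially growing frequencies so the MLP can represent high-frequency scene detail. A minimal scalar-coordinate sketch (real implementations apply this per component of position and view direction, typically as a batched tensor op):

```python
import math

def positional_encoding(x, num_freqs=4):
    """NeRF-style positional encoding of a scalar coordinate:
    [sin(2^k * pi * x), cos(2^k * pi * x)] for k = 0 .. num_freqs-1."""
    out = []
    for k in range(num_freqs):
        freq = (2 ** k) * math.pi  # frequencies double at each level
        out.append(math.sin(freq * x))
        out.append(math.cos(freq * x))
    return out
```

Without this encoding, the MLP tends to produce blurry reconstructions, since plain coordinates bias it toward low-frequency functions.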
-
Despite extensive research on jamming attacks, the potential of machine learning for amplifying the threat of such attacks, or our ability to mitigate them, remains untapped. A key obstacle to this kind of research has been the absence of a suitable framework.

To resolve this obstacle, we released PyJama, a fully-differentiable open-source library that adds jamming and anti-jamming functionality to NVIDIA Sionna. The accompanying paper, which will be presented at SPAWC 2024, demonstrates the utility of PyJama (i) for realistic MIMO simulations, with examples involving forward error correction, OFDM waveforms in the time and frequency domain, realistic channel models, and mobility; and (ii) for learning to jam. Specifically, we use stochastic gradient descent to optimize jamming power allocation over an OFDM resource grid. The learned strategies are non-trivial, intelligible, and effective.

PyJama was developed by Fabian Ulbricht during his Master's thesis in our research group, supervised by Gian Marti and Reinhard W.. The paper is co-authored by Fabian Ulbricht, Gian Marti, Reinhard Wiesmayr, and myself.

A preprint of our paper is available on arXiv https://lnkd.in/dwDi3xHn, and the code is available on GitHub https://lnkd.in/ds9BGQhP. Please also check out the PyJama project website http://pyjama.ethz.ch!
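The idea of gradient-based optimization of a jamming power allocation can be shown with a toy example. This is not PyJama's API (PyJama is built on Sionna/TensorFlow and optimizes through a full differentiable link simulation); here the per-subcarrier "damage" is a made-up log(1 + p·g) objective with invented channel gains, and the power budget is enforced by rescaling after each gradient step.

```python
# Toy sketch (NOT PyJama's API): gradient ascent on jamming power
# allocation across OFDM subcarriers, with a made-up objective
# sum_i log(1 + p_i * g_i) and a total power budget.

def optimize_allocation(gains, budget=1.0, lr=0.05, steps=200):
    n = len(gains)
    p = [budget / n] * n  # start from a uniform allocation
    for _ in range(steps):
        # d/dp_i log(1 + p_i * g_i) = g_i / (1 + p_i * g_i)
        grad = [g / (1.0 + pi * g) for pi, g in zip(p, gains)]
        p = [max(0.0, pi + lr * gi) for pi, gi in zip(p, grad)]
        total = sum(p)
        p = [pi * budget / total for pi in p]  # rescale onto the budget
    return p

alloc = optimize_allocation([2.0, 1.0, 0.1])
```

Even this toy version learns the intuitive strategy: power concentrates on subcarriers with stronger jammer-to-victim channel gains, which mirrors the "intelligible" learned strategies the paper reports.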
-