Showing 94 open source projects for "nvidia cuda"

View related business solutions
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AIโ€”on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like โ€œBuild me a revenue dashboard on my Stripe dataโ€ and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do bestโ€”building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    CV-CUDA

    CV-CUDA

    CV-CUDAโ„ข is an open-source, GPU accelerated library

    CV-CUDA is an open-source project that enables building efficient cloud-scale Artificial Intelligence (AI) imaging and computer vision (CV) applications. It uses graphics processing unit (GPU) acceleration to help developers build highly efficient pre- and post-processing pipelines. CV-CUDA originated as a collaborative effort between NVIDIA and ByteDance.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 2
    CUDA.jl

    CUDA.jl

    CUDA programming in Julia

    High-performance GPU programming in a high-level language. JuliaGPU is a GitHub organization created to unify the many packages for programming GPUs in Julia. With its high-level syntax and flexible compiler, Julia is well-positioned to productively program hardware accelerators like GPUs without sacrificing performance. The latest development version of CUDA.jl requires Julia 1.8 or higher. If you are using an older version of Julia, you need to use a previous version of CUDA.jl. This will...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    CUDA API Wrappers

    CUDA API Wrappers

    Thin, unified, C++-flavored wrappers for the CUDA APIs

    CUDA API Wrappers is a C++ library providing high-level, modern wrappers for NVIDIAโ€™s CUDA runtime and driver APIs, enhancing usability and efficiency. It is intended for those who would otherwise use these APIs directly, to make working with them more intuitive and consistent, making use of modern C++ language capabilities, programming idioms, and best practices. In a nutshell - making CUDA API work more fun.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    NVIDIA GPU Operator

    NVIDIA GPU Operator

    NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes

    ...These components include the NVIDIA drivers (to enable CUDA), Kubernetes device plugin for GPUs, the NVIDIA Container Runtime, automatic node labeling, DCGM-based monitoring, and others.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Enterprise-grade ITSM, for every business Icon
    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional servicesโ€”without the complexity.

    Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.
    Try it Free
  • 5
    CuPy

    CuPy

    A NumPy-compatible array library accelerated by CUDA

    CuPy is an open source implementation of NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class and many functions on it. CuPy offers GPU accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can even speed up some operations by more than 100X. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases. ...
    Downloads: 23 This Week
    Last Update:
    See Project
  • 6
    KeyKiller-Cuda

    KeyKiller-Cuda

    Solving the Satoshi Puzzle

    KeyKiller is a GPU-accelerated version of the KeyKiller project, designed to achieve extreme performance in solving Satoshi Nakamoto's puzzles using modern NVIDIA GPUs. KeyKiller CUDA pushes the limits of cryptographic key search performance by leveraging CUDA, thread-beam parallelism, and batch EC operations. The command-line version is open-source and free to use. For the paid advanced graphics version, please visit: https://gitlab.com/8891689/KeyKiller-Cuda/
    Downloads: 26 This Week
    Last Update:
    See Project
  • 7
    Tiny CUDA Neural Networks

    Tiny CUDA Neural Networks

    Lightning fast C++/CUDA neural network framework

    This is a small, self-contained framework for training and querying neural networks. Most notably, it contains a lightning-fast "fully fused" multi-layer perceptron (technical paper), a versatile multiresolution hash encoding (technical paper), as well as support for various other input encodings, losses, and optimizers. We provide a sample application where an image function (x,y) -> (R,G,B) is learned. The fully fused MLP component of this framework requires a very large amount of shared...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Nvitop

    Nvitop

    An interactive NVIDIA-GPU process viewer and beyond

    nvitop is an interactive NVIDIA device and process monitoring tool. It has a colorful and informative interface that continuously updates the status of the devices and processes. As a resource monitor, it includes many features and options, such as tree-view, environment variable viewing, process filtering, process metrics monitoring, etc. Beyond that, the package also ships a CUDA device selection tool nvisel for deep learning researchers.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    TensorRT

    TensorRT

    C++ library for high performance inference on NVIDIA GPUs

    ...TensorRT is built on CUDAยฎ, NVIDIAโ€™s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-Xโ„ข for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.
    Downloads: 30 This Week
    Last Update:
    See Project
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered appsโ€”without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • 10
    XMRig

    XMRig

    RandomX, KawPow, CryptoNight, AstroBWT and GhostRider unified miner

    High performance, open-source, cross-platform RandomX, KawPow, CryptoNight, and AstroBWT CPU/GPU miner, RandomX benchmark, and stratum proxy. XMRig is a high-performance, open-source, cross-platform RandomX, KawPow, CryptoNight, and AstroBWT unified CPU/GPU miner and RandomX benchmark. Official binaries are available for Windows, Linux, macOS, and FreeBSD. The preferred way to configure the miner is the JSON config file as it is more flexible and human-friendly. The command-line interface...
    Downloads: 107 This Week
    Last Update:
    See Project
  • 11
    Torch-TensorRT

    Torch-TensorRT

    PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

    Torch-TensorRT is a compiler for PyTorch/TorchScript, targeting NVIDIA GPUs via NVIDIAโ€™s TensorRT Deep Learning Optimizer and Runtime. Unlike PyTorchโ€™s Just-In-Time (JIT) compiler, Torch-TensorRT is an Ahead-of-Time (AOT) compiler, meaning that before you deploy your TorchScript code, you go through an explicit compile step to convert a standard TorchScript program into a module targeting a TensorRT engine. Torch-TensorRT operates as a PyTorch extension and compiles modules that integrate...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 12
    OuteTTS

    OuteTTS

    Interface for OuteTTS models

    ...The project supports multiple backends including llama.cpp (Python bindings and server), Hugging Face Transformers, ExLlamaV2, VLLM and a JavaScript interface via Transformers.js, allowing it to run on CPUs, NVIDIA CUDA GPUs, AMD ROCm, Vulkan-capable GPUs, and Apple Metal. It also includes a notion of speaker profiles: you can create a speaker from a short audio sample, save it as JSON, and reuse it for consistent voice identity across generations and sessions. For best quality, the model is designed to work with a reference speaker clip and will inherit emotion, style, and accent from that reference.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    Jan.ai

    Jan.ai

    Open source alternative to ChatGPT that runs 100% offline

    Jan.ai is an open-source, privacy-focused AI assistant that serves as an alternative to ChatGPT, running completely locally on your device. It allows you to download and run LLMs (local language models) offline while also offering optional integration with cloud-based model providersโ€”giving you full control over your data and AI interactions. Download and run LLMs (Llama, Gemma, Qwen, GPT-oss etc.) from HuggingFace. Connect to GPT models via OpenAI, Claude models via Anthropic, Mistral,...
    Downloads: 37 This Week
    Last Update:
    See Project
  • 14
    cuDF

    cuDF

    GPU DataFrame Library

    ...It relies on NVIDIAยฎ CUDAยฎ primitives for low-level compute optimization but exposing that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Simple StyleGan2 for Pytorch

    Simple StyleGan2 for Pytorch

    Simplest working implementation of Stylegan2

    Simple Pytorch implementation of Stylegan2 that can be completely trained from the command-line, no coding needed. You will need a machine with a GPU and CUDA installed. You can also specify the location where intermediate results and model checkpoints should be stored. You can increase the network capacity (which defaults to 16) to improve generation results, at the cost of more memory. By default, if the training gets cut off, it will automatically resume from the last checkpointed file....
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    InvokeAI

    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    ...InvokeAI offers an industry leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products. This fork is supported across Linux, Windows and Macintosh. Linux users can use either an Nvidia-based card (with CUDA support) or an AMD card (using the ROCm driver). We do not recommend the GTX 1650 or 1660 series video cards. They are unable to run in half-precision mode and do not have sufficient VRAM to render 512x512 images.
    Downloads: 27 This Week
    Last Update:
    See Project
  • 17
    ParallelStencil.jl

    ParallelStencil.jl

    Package for writing high-level code for parallel stencil computations

    ParallelStencil empowers domain scientists to write architecture-agnostic high-level code for parallel high-performance stencil computations on GPUs and CPUs. Performance similar to CUDA C / HIP can be achieved, which is typically a large improvement over the performance reached when using only CUDA.jl or AMDGPU.jl GPU Array programming. For example, a 2-D shallow ice solver presented at JuliaCon 2020 [1] achieved a nearly 20 times better performance than a corresponding GPU Array programming implementation; in absolute terms, it reached 70% of the theoretical upper performance bound of the used Nvidia P100 GPU, as defined by the effective throughput metric, T_eff. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    GoCV

    GoCV

    Go package for computer vision using OpenCV 4 and beyond

    GoCV gives programmers who use the Go programming language access to the OpenCV 4 computer vision library. The GoCV package supports the latest releases of Go and OpenCV v4.5.4 on Linux, macOS, and Windows. Our mission is to make the Go language a โ€œfirst-classโ€ client compatible with the latest developments in the OpenCV ecosystem. Computer Vision (CV) is the ability of computers to process visual information, and perform tasks normally associated with those performed by humans. CV software...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    AWS Deep Learning Containers

    AWS Deep Learning Containers

    A set of Docker images for training and serving models in TensorFlow

    AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet. Deep Learning Containers provide optimized environments with TensorFlow and MXNet, Nvidia CUDA (for GPU instances), and Intel MKL (for CPU instances) libraries and are available in the Amazon Elastic Container Registry (Amazon ECR). The AWS DLCs are used in Amazon SageMaker as the default vehicles for your SageMaker jobs such as training, inference, transforms etc. They've been tested for machine learning workloads on Amazon EC2, Amazon ECS and Amazon EKS services as well. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    OpenFace Face Recognition

    OpenFace Face Recognition

    Face recognition with deep neural networks

    OpenFace is a Python and Torch implementation of face recognition with deep neural networks and is based on the CVPR 2015 paper FaceNet: A Unified Embedding for Face Recognition and Clustering by Florian Schroff, Dmitry Kalenichenko, and James Philbin at Google. Torch allows the network to be executed on a CPU or with CUDA. This research was supported by the National Science Foundation (NSF) under grant number CNS-1518865. Additional support was provided by the Intel Corporation, Google, Vodafone, NVIDIA, and the Conklin Kistler family fund. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and should not be attributed to their employers or funding sources. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Dogecoin Mining - Software

    Dogecoin Mining - Software

    This is a multi-threaded multi-pool FPGA and ASIC miner for DOGECOIN

    Dogecoin Mining - Software is an open source miner for ASIC, GPU and FPGA. It works on Windows, Linux and macOS. This miner is extremely flexible in terms of platform and can work with a variety of hardware miners and GPUs including AMD, CUDA and NVIDIA platforms.
    Leader badge
    Downloads: 56 This Week
    Last Update:
    See Project
  • 22
    bitResurrector

    bitResurrector

    Bitcoin private key recovery tool with Bloom Filter & CUDA

    ...Technological Stack: - Zero-Latency Bloom Filter: Real-time matching against 58M+ active addresses (Loyce Club data). - Turbo Core: C++/AVX-512 optimization with processor affinity. - GPU Mode: Massive parallel computation via NVIDIA CUDA. - Memory Management: Zero disk I/O lag using mmap (Memory-Mapped Files). BitResurrector is a closed-source, proprietary tool for researchers and professionals.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 23
    VanitySearch

    VanitySearch

    VanitySearch is a Bitcoin address prefix lookup tool.

    VanitySearch is a Bitcoin address prefix lookup tool. If you want to generate a secure private key, use the `-s` option to enter your passphrase, which will be used to generate the base key conforming to the BIP38 standard (e.g., `VanitySearch.exe -s "my passphrase" 1MyPrefix"`). You can also use `VanitySearch.exe -ps "my passphrase"`, which adds a cryptographically secure seed to your passphrase.Fixed custom address matching errors and private key conversion errors, changed the randomizer,...
    Downloads: 40 This Week
    Last Update:
    See Project
  • 24
    HanoiVM

    HanoiVM

    HanoiVM is a recursive, AI-augmented ternary virtual machine

    ๐Ÿš€ HanoiVM โ€” Recursive Ternary Virtual Machine **HanoiVM** is a recursive, AI-augmented **ternary virtual machine** built on a symbolic base-81 architecture. It is the execution core of the **Axion + T81Lang** ecosystem, enabling stack-tier promotion, symbolic AI opcodes, and entropy-aware transformations across three levels of logic: - ๐Ÿ”น `T81`: 81-bit operand logic (register-like) - ๐Ÿ”ธ `T243`: Symbolic BigInt + FSM state logic - ๐Ÿ”บ `T729`: Tensor-based AI macros with semantic...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    GSvit

    GSvit

    Fast FDTD solver with graphics card support

    Fast FDTD solver with graphics card support. Optimized for nanoscale optics - scanning near field optical microscopy, rough surface scattering and solar cells. Uses CUDA environment for graphics card operation.
    Downloads: 12 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • Next