Showing 79 open source projects for "statistical learning"

View related business solutions
  • Build on Google Cloud with $300 in Free Credit Icon
    Build on Google Cloud with $300 in Free Credit

    New to Google Cloud? Get $300 in free credit to explore Compute Engine, BigQuery, Cloud Run, Vertex AI, and 150+ other products.

    Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query exabytes in BigQuery, or build AI apps with Vertex AI and Gemini. Once your credits are used, keep building with 20+ products with free monthly usage, including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. Sign up to start building right away.
    Start Free Trial
  • Deploy Apps in Seconds with Cloud Run Icon
    Deploy Apps in Seconds with Cloud Run

    Host and run your applications without the need to manage infrastructure. Scales up from and down to zero automatically.

    Cloud Run is the fastest way to deploy containerized apps. Push your code in Go, Python, Node.js, Java, or any language and Cloud Run builds and deploys it automatically. Get fast autoscaling, pay only when your code runs, and skip the infrastructure headaches. Two million requests free per month. And new customers get $300 in free credit.
    Try Cloud Run Free
  • 1
    stdlib

    stdlib

    Standard library for JavaScript and Node.js

    A standard library for javascript and node.js. High performance, rigorous, and robust mathematical and statistical functions. Build advanced statistical models and machine learning libraries. Plotting and graphics functionality for data visualization and exploratory data analysis. Analyze and understand your data. Comprehensively tested utilities for application and library development. Functions to assert, group, filter, map, pluck, and transform your data both in browsers and on the server. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    pmdarima

    pmdarima

    Statistical library designed to fill the void in Python's time series

    A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Smile

    Smile

    Statistical machine intelligence and learning engine

    Smile is a fast and comprehensive machine learning engine. With advanced data structures and algorithms, Smile delivers the state-of-art performance. Compared to this third-party benchmark, Smile outperforms R, Python, Spark, H2O, xgboost significantly. Smile is a couple of times faster than the closest competitor. The memory usage is also very efficient. If we can train advanced machine learning models on a PC, why buy a cluster? Write applications quickly in Java, Scala, or any JVM...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 4
    spaCy

    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it was designed to be used for real world applications-- for building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with an accuracy within 1% of the best available. It's blazing fast, easy to install and comes with a simple and productive API.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Cut Data Warehouse Costs up to 54% with BigQuery Icon
    Cut Data Warehouse Costs up to 54% with BigQuery

    Migrate from Snowflake, Databricks, or Redshift with free migration tools. Exabyte scale without the Exabyte price.

    BigQuery delivers up to 54% lower TCO than cloud alternatives. Migrate from legacy or competing warehouses using free BigQuery Migration Service with automated SQL translation. Get serverless scale with no infrastructure to manage, compressed storage, and flexible pricing—pay per query or commit for deeper discounts. New customers get $300 in free credit.
    Try BigQuery Free
  • 5
    TensorFlow Probability

    TensorFlow Probability

    Probabilistic reasoning and statistical analysis in TensorFlow

    TensorFlow Probability is a library for probabilistic reasoning and statistical analysis. TensorFlow Probability (TFP) is a Python library built on TensorFlow that makes it easy to combine probabilistic models and deep learning on modern hardware (TPU, GPU). It's for data scientists, statisticians, ML researchers, and practitioners who want to encode domain knowledge to understand data and make predictions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    DataFrame

    DataFrame

    C++ DataFrame for statistical, Financial, and ML analysis

    This is a C++ analytical library designed for data analysis similar to libraries in Python and R. For example, you would compare this to Pandas, R data.frame, or Polars. You can slice the data in many different ways. You can join, merge, and group-by the data. You can run various statistical, summarization, financial, and ML algorithms on the data. You can add your custom algorithms easily. You can multi-column sort, custom pick, and delete the data. DataFrame also includes a large...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 7
    Simd Library

    Simd Library

    C++ image processing and machine learning library with using of SIMD

    The Simd Library is a free open-source image processing and machine learning library, designed for C and C++ programmers. It provides many useful high-performance algorithms for image processing such as pixel format conversion, image scaling and filtration, extraction of statistical information from images, motion detection, object detection and classification, neural networks. The algorithms are optimized with using of different SIMD CPU extensions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    FinMind

    FinMind

    Open Data, more than 50 financial data

    ...Regardless of the program, you can download data through the api provided by FinMind, or you can download data directly from the website. After data is available, statistical analysis, regression analysis, time series analysis, machine learning, and deep learning can be performed. For individual stocks, provide visual analysis of technical, fundamental, and chip levels. According to different strategies, back-test analysis is performed to provide performance, profit and loss, and stock selection targets of different strategy investment portfolios.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    NVIDIA Earth2Studio

    NVIDIA Earth2Studio

    Open-source deep-learning framework

    ...The toolkit makes it easy to run deterministic and ensemble forecasts, swap models interchangeably, and process large geophysical datasets with Xarray structures, enabling experimentation with state-of-the-art deep learning models for climate and atmospheric prediction. Users can extend Earth2Studio with optional model packs, advanced data interfaces, statistical operators, and backend integrations that support flexible workflows from simple tests to large-scale operational inference.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Cut Cloud Costs with Google Compute Engine Icon
    Cut Cloud Costs with Google Compute Engine

    Save up to 91% with Spot VMs and get automatic sustained-use discounts. One free VM per month, plus $300 in credits.

    Save on compute costs with Compute Engine. Reduce your batch jobs and workload bill 60-91% with Spot VMs. Compute Engine's committed use offers customers up to 70% savings through sustained use discounts. Plus, you get one free e2-micro VM monthly and $300 credit to start.
    Try Compute Engine
  • 10
    StatsForecast

    StatsForecast

    Fast forecasting with statistical and econometric models

    StatsForecast is a Python library for time-series forecasting that delivers a suite of classical statistical and econometric forecasting models optimized for high performance and scalability. It is designed not just for academic experiments but for production-level time-series forecasting, meaning it handles forecasting for many series at once, efficiently, reliably, and with minimal overhead. The library implements a broad set of models, including AutoARIMA, ETS, CES, Theta, plus a battery...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    MiniSom

    MiniSom

    MiniSom is a minimalistic implementation of the Self Organizing Maps

    MiniSom is a minimalistic and Numpy-based implementation of the Self Organizing Maps (SOM). SOM is a type of Artificial Neural Network able to convert complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display. Minisom is designed to allow researchers to easily build on top of it and to give students the ability to quickly grasp its details. The project initially aimed for a minimalistic implementation of...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Darts

    Darts

    A python library for easy manipulation and forecasting of time series

    darts is a Python library for easy manipulation and forecasting of time series. It contains a variety of models, from classics such as ARIMA to deep neural networks. The models can all be used in the same way, using fit() and predict() functions, similar to scikit-learn. The library also makes it easy to backtest models, combine the predictions of several models, and take external data into account. Darts supports both univariate and multivariate time series and models. The ML-based models...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    ROOT

    ROOT

    Analyzing, storing and visualizing big data, scientifically

    ROOT is a unified software package for the storage, processing, and analysis of scientific data: from its acquisition to the final visualization in the form of highly customizable, publication-ready plots. It is reliable, performant and well supported, easy to use and obtain, and strives to maximize the quantity and impact of scientific results obtained per unit cost, both of human effort and computing resources. ROOT provides a very efficient storage system for data models, that...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 14
    mlforecast

    mlforecast

    Scalable machine learning for time series forecasting

    mlforecast is a time-series forecasting framework built around machine-learning models, designed to make forecasting both efficient and scalable. It lets you apply any regressor that follows the typical scikit-learn API, for example, gradient-boosted trees or linear models, to time-series data by automating much of the messy feature engineering and data preparation. Instead of writing custom code to build lagged features, rolling statistics, and date-based predictors, mlforecast generates...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Synthetic Data Vault (SDV)

    Synthetic Data Vault (SDV)

    Synthetic Data Generation for tabular, relational and time series data

    The Synthetic Data Vault (SDV) is a Synthetic Data Generation ecosystem of libraries that allows users to easily learn single-table, multi-table and timeseries datasets to later on generate new Synthetic Data that has the same format and statistical properties as the original dataset. Synthetic data can then be used to supplement, augment and in some cases replace real data when training Machine Learning models. Additionally, it enables the testing of Machine Learning or other data dependent software systems without the risk of exposure that comes with data disclosure. Underneath the hood it uses several probabilistic graphical modeling and deep learning based techniques. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    PostgresML

    PostgresML

    The GPU-powered AI application database

    ...Combine and automate the entire workflow from embedding generation to indexing and querying for the simplest (and fastest) knowledge-based chatbot implementation. Leverage multiple types of natural language processing and machine learning models such as vector search and personalization with embeddings to improve search results. Leverage your data with time series forecasting to garner key business insights. Build statistical and predictive models with the full power of SQL and dozens of regression algorithms. Return results and detect fraud faster with ML at the database layer. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    PyMC

    PyMC

    Bayesian Modeling and Probabilistic Programming in Python

    PyMC is a Python library for probabilistic programming focused on Bayesian statistical modeling and machine learning. Built on top of computational tools like Aesara and NumPy, PyMC allows users to define models using intuitive syntax and perform inference using MCMC, variational inference, and other advanced algorithms. It’s widely used in scientific research, data science, and decision modeling.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Potpie

    Potpie

    Create custom engineering agents for your codebase

    Potpie is an AI-powered data analysis tool that automates the exploration and visualization of datasets, assisting users in uncovering insights without extensive coding.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Orange Data Mining

    Orange Data Mining

    Orange: Interactive data analysis

    Open source machine learning and data visualization. Build data analysis workflows visually, with a large, diverse toolbox. Perform simple data analysis with clever data visualization. Explore statistical distributions, box plots and scatter plots, or dive deeper with decision trees, hierarchical clustering, heatmaps, MDS and linear projections. Even your multidimensional data can become sensible in 2D, especially with clever attribute ranking and selections.
    Downloads: 44 This Week
    Last Update:
    See Project
  • 20
    NeuralForecast

    NeuralForecast

    Scalable and user friendly neural forecasting algorithms.

    NeuralForecast offers a large collection of neural forecasting models focusing on their performance, usability, and robustness. The models range from classic networks like RNNs to the latest transformers: MLP, LSTM, GRU, RNN, TCN, TimesNet, BiTCN, DeepAR, NBEATS, NBEATSx, NHITS, TiDE, DeepNPTS, TSMixer, TSMixerx, MLPMultivariate, DLinear, NLinear, TFT, Informer, AutoFormer, FedFormer, PatchTST, iTransformer, StemGNN, and TimeLLM. There is a shared belief in Neural forecasting methods'...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Lingua-Py

    Lingua-Py

    The most accurate natural language detection library for Python

    Its task is simple: It tells you which language some text is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages. Language detection is often done as part of large machine learning frameworks or natural language processing...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    awesome-single-cell

    awesome-single-cell

    Community-curated list of software packages and data resources

    Community-curated list of software packages and data resources for single-cell, including RNA-seq, ATAC-seq, etc. List of software packages (and the people developing these methods) for single-cell data analysis, including RNA-seq, ATAC-seq, etc. Rapid, accurate and memory-frugal preprocessing of single-cell and single-nucleus RNA-seq data. Find bimodal, unimodal, and multimodal features in your data. Ascend is an R package comprised of fast, streamlined analysis functions optimized to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    sadsa

    sadsa

    SADSA (Software Application for Data Science and Analytics)

    SADSA (Software Application for Data Science and Analytics) is a Python-based desktop application designed to simplify statistical analysis, machine learning, and data visualization for students, researchers, and data professionals. Built using Python for the GUI, SADSA provides a menu-driven interface for handling datasets, applying transformations, running advanced statistical tests, machine learning algorithms, and generating insightful plots — all without writing code.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 24
    LakshmiDROP

    LakshmiDROP

    Fixed Income Analytics, Portfolio Construction, Asset Backed Cost

    DROP implements the model libraries and provides systems for fixed income valuation and adjustments, asset allocation and transaction cost analytics, and supporting libraries in numerical optimization and statistical learning. DROP is composed of four main libraries: [Asset Allocation] (https://lakshmidrip.github.io/DROP/AssetAllocationModule.html) [Fixed Income Analytics] (https://lakshmidrip.github.io/DROP/FixedIncomeModule.html) [Numerical Optimization] (https://lakshmidrip.github.io/DROP/NumericalOptimizerModule.html) [Statistical Learning] (https://lakshmidrip.github.io/DROP/StatisticalLearningModule.html)
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    dibnn

    dibnn

    Drop In the Bucket Neural Networks

    One more lightweight neural network in C.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • Next
MongoDB Logo MongoDB
Gen AI apps are built with MongoDB Atlas
Atlas offers built-in vector search and global availability across 125+ regions. Start building AI apps faster, all in one place.
Try Free →