
Scrapy, a fast high-level web crawling & scraping framework for Python.

Python 59,363 11,200 Updated Dec 31, 2025

A collaborative note taking, wiki and documentation platform that scales. Built with Django and React.

Python 15,380 493 Updated Dec 30, 2025

The batteries-included, No-Code FinOps automation platform, with the AI you trust.

TypeScript 974 166 Updated Dec 31, 2025

A Q&A platform software for teams at any scale. Whether it's a community forum, help center, or knowledge management platform, you can always count on Apache Answer.

Go 15,263 1,262 Updated Dec 31, 2025

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW

Python 2,847 313 Updated Jan 2, 2026

Open, Multi-modal Catalog for Data & AI

Java 3,244 557 Updated Dec 18, 2025

Dump the license list of packages installed with pip.

Python 360 57 Updated Dec 19, 2025

Let your Python tests travel through time

Python 4,480 288 Updated Aug 19, 2025

Streamlit — A faster way to build and share data apps.

Python 42,942 4,010 Updated Jan 3, 2026

PyGWalker: Turn your dataframe into an interactive UI for visual analysis

Python 15,536 852 Updated Dec 30, 2025

Automatically exported from code.google.com/p/passlib

Python 23 3 Updated May 3, 2015

MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license.

Go 59,564 6,849 Updated Dec 3, 2025

Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.

Python 22,933 2,248 Updated Oct 28, 2025

API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc), AS OF joins, downsampling, and interpolation

Jupyter Notebook 334 58 Updated Dec 4, 2025

Bonus materials, exercises, and example projects for our Python tutorials

HTML 5,071 5,333 Updated Jan 1, 2026

A simplified, lightweight ETL Framework based on Apache Spark

Scala 587 158 Updated Jan 24, 2024

A curated list of awesome Apache Spark packages and resources.

Shell 1,853 345 Updated Oct 24, 2024

Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. ⚡

Jupyter Notebook 505 200 Updated Nov 7, 2025

Python Sorted Container Types: Sorted List, Sorted Dict, and Sorted Set

Python 3,888 222 Updated Mar 8, 2024

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Scala 3,564 574 Updated Nov 4, 2025

Change data capture for a variety of databases. Please log issues at https://github.com/debezium/dbz/issues.

Java 12,251 2,816 Updated Dec 22, 2025
Python 525 69 Updated Jan 1, 2026

Native cross-platform MongoDB management tool

C++ 9,368 819 Updated Sep 22, 2022

A pure Python implementation of Apache Spark's RDD and DStream interfaces.

Python 270 45 Updated Sep 3, 2024

This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

Scala 1,369 782 Updated Jan 28, 2025

Apache Superset is a Data Visualization and Data Exploration Platform

TypeScript 69,616 16,418 Updated Jan 3, 2026

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

Python 28,125 4,537 Updated Jan 1, 2026

Apache Spark - A unified analytics engine for large-scale data processing

Scala 42,584 28,986 Updated Jan 3, 2026

A curated list of useful resources for gRPC

8,175 598 Updated Oct 28, 2025

Python Geohash Compression Tool

Python 192 16 Updated Apr 4, 2024