sibiryakov

Organizations

@scrapinghub


Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)

C++ 33,606 1,512 Updated Jan 5, 2026

🔍 A simple, fast fuzzy finder for the terminal

C 3,182 142 Updated Jul 29, 2025

usb-device implementation for Synopsys USB OTG IP cores

Rust 51 36 Updated Oct 10, 2025

Headless chrome/chromium automation library (unofficial port of puppeteer)

Python 3,562 370 Updated Aug 5, 2021

Fast Avro for Python

Python 692 177 Updated Dec 23, 2025

Python Stream Processing

Python 6,832 535 Updated Jul 27, 2024

The binary distribution of openHAB

PowerShell 1,373 398 Updated Dec 29, 2025

Zstandard - Fast real-time compression algorithm

C 26,342 2,371 Updated Dec 22, 2025

Cross-platform lib for process and system monitoring in Python

Python 11,008 1,458 Updated Jan 2, 2026

A game theoretic approach to explain the output of any machine learning model.

Jupyter Notebook 24,884 3,462 Updated Jan 1, 2026

Just the facts -- web page content extraction

Python 1,278 181 Updated Jul 8, 2025

Module for automatic summarization of text documents and HTML pages.

Python 3,656 541 Updated Dec 29, 2025

fast python port of arc90's readability tool, updated to match latest readability.js!

Python 2,877 356 Updated May 3, 2025

NER toolkit for HTML data

HTML 259 59 Updated May 3, 2024

Python3 compatible wrapper for RE2

Python 2 Updated Feb 26, 2017

a powerful DNS toolkit for python

Python 2,626 550 Updated Jan 2, 2026

Web crawling framework based on asyncio.

Python 2,026 205 Updated Jun 1, 2019

Extract text from HTML

HTML 135 22 Updated Jul 22, 2020

MessagePack serializer implementation for Java / msgpack.org[Java]

Java 1,462 321 Updated Jan 5, 2026

Python Non-cryptographic Hash Library

C 288 53 Updated Aug 19, 2023

Python bindings for FarmHash and CityHash

C++ 46 32 Updated Oct 9, 2025

MessagePack serializer implementation for Python msgpack.org[Python]

Python 2,049 236 Updated Dec 1, 2025

Fast, efficient, and scalable distributed map/reduce system, DAG execution, in memory or on disk, written in pure Go, runs standalone or distributedly.

Go 3,553 290 Updated Nov 20, 2025

Pure Python implementation of reading SequenceFile-s with pickles written by Spark's saveAsPickleFile()

Python 24 3 Updated Jun 8, 2017

A pure python HDFS client

Python 859 214 Updated Apr 19, 2022

Hadoop (Utilities, Patches and Examples)

Python 244 146 Updated Jun 21, 2016

Run MapReduce jobs on Hadoop or Amazon Web Services

Python 2,616 586 Updated Mar 24, 2023

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 16…

Python 6,394 779 Updated Dec 10, 2025

Tools for keeping your cloud operating in top form. Chaos Monkey is a resiliency tool that helps applications tolerate random instance failures.

Java 7,983 1,128 Updated Dec 18, 2018

Hubstorage crawl frontier backend for Frontera

Python 3 4 Updated Feb 22, 2017