View mattbernst's full-sized avatar

A fast and accurate POS and morphological tagging toolkit (EACL 2014)

HTML 149 48 Updated Feb 16, 2020

Flink Scala API is a thin wrapper on top of the Flink Java API that supports Scala types for serialisation as well as the latest Scala version

Scala 104 23 Updated Feb 4, 2026

Your files ready for Gen AI ✨🚀 AlcheMark is a lightweight PDF to Markdown, alchemical-inspired toolkit that transmutes PDF documents into structured Markdown pages—complete with rich metadata and n…

Python 77 7 Updated Apr 28, 2025

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.

TypeScript 40,315 2,509 Updated Feb 4, 2026

best way to save what you love

Svelte 38,412 3,174 Updated Jan 24, 2026

Automated, smooth, N'th order derivatives of non-uniformly sampled time series data

Python 230 8 Updated Oct 20, 2024

This is a python implementation for stitching images.

Jupyter Notebook 231 9 Updated Oct 3, 2024

Scala ZIO-powered Apache Parquet library

Scala 28 4 Updated Dec 13, 2025

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Python 70,179 9,785 Updated Feb 4, 2026

DOM to Semantic-Markdown for use with LLMs

TypeScript 954 26 Updated May 21, 2025

A CLI tool to convert your codebase into a single LLM prompt with source tree, prompt templating, and token counting.

Rust 7,105 402 Updated Feb 3, 2026

LLM-based text extraction from unstructured data like PDFs, Words and HTMLs. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster R&D!

Python 233 61 Updated Sep 24, 2025

Tag manager and captioner for image datasets

Python 1,259 65 Updated Oct 11, 2025

Utilities for latency measurement and reporting

Java 466 61 Updated May 5, 2024

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

JavaScript 13,261 432 Updated Jan 31, 2026

Tinfoil Chat - Onion-routed, endpoint secure messaging system

Python 1,305 88 Updated Jun 16, 2025

Simpler DynamoDB access for Scala

Scala 317 126 Updated Dec 12, 2025

SNIPER / AutoFocus is an efficient multi-scale object detection training / inference algorithm

Python 2,692 441 Updated Aug 22, 2021

Painless relocation of Linux binaries–and all of their dependencies–without containers.

Python 3,006 73 Updated Nov 5, 2023

Files to create the figures in the paper "Super-Convergence: Very Fast Training of Residual Networks Using Large Learning Rates"

Shell 191 29 Updated Dec 15, 2017

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Jupyter Notebook 10,047 1,578 Updated Sep 11, 2025

C++ 7 3 Updated May 1, 2017

A better build tool for Java, Scala and Kotlin: Simpler than Maven, easier than Gradle, with 3-7x faster dev workflows than other JVM build tools

Scala 2,682 429 Updated Feb 4, 2026

Python module for quantum chemistry

Python 1,518 664 Updated Feb 5, 2026

A RESTish web API for climate change related data 🌍

377 6 Updated Feb 4, 2019

Generates Fortran, C, and Python header files containing CODATA 2014 physical constants

Python 1 Updated Jan 21, 2018

NWChem: Open Source High-Performance Computational Chemistry

Fortran 584 182 Updated Feb 3, 2026

A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.

Python 2,253 371 Updated Jun 24, 2022