Open Source Linux Text Processing Software

Browse free open source Text Processing software and projects for Linux below.

  • 1
    KDiff3

    A graphical text difference analyzer

    This repository is no longer maintained and is kept for archival purposes. See https://invent.kde.org/sdk/kdiff3 for the newest code and https://download.kde.org/stable/kdiff3/ for release bundles. All bugs should be filed at bugs.kde.org. KDiff3 is a graphical text difference analyzer for up to 3 input files. It provides character-by-character analysis and a text merge tool with an integrated editor. It can also compare and merge directories. Platform-independent.
    Downloads: 2,212 This Week
  • 2
    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Downloads: 1,818 This Week
  • 3
    XMLStarlet is a set of command line utilities (tools) to transform, query, validate, and edit XML documents and files using a simple set of shell commands, in a similar way to how it is done for text files with the UNIX grep, sed, awk, diff, patch, join, etc. utilities.
    Downloads: 1,290 This Week
  • 4
    Utilities for general- and special-purpose documentation. Includes reStructuredText, the easy to read, easy to use, what-you-see-is-what-you-get plaintext markup language.
    Downloads: 132 This Week
  • 5
    Diffuse
    Diffuse is a graphical tool for comparing and merging text files. It can retrieve files for comparison from Bazaar, CVS, Darcs, Git, Mercurial, Monotone, RCS, Subversion, and SVK repositories.
    Downloads: 103 This Week
  • 6
    ANTLR

    Parser generator to read, process, or translate structured text

    ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. From a grammar, ANTLR generates a parser that can build and walk parse trees. It's widely used in academia and industry to build all sorts of languages, tools, and frameworks. Twitter search uses ANTLR for query parsing, with over 2 billion queries a day. The languages for Hive and Pig, the data warehouse and analysis systems for Hadoop, both use ANTLR. Lex Machina uses ANTLR for information extraction from legal texts. Oracle uses ANTLR within SQL Developer IDE and their migration tools. NetBeans IDE parses C++ with ANTLR. The HQL language in the Hibernate object-relational mapping framework is built with ANTLR.
    Downloads: 16 This Week
  • 7
    regexxer
    regexxer is a nifty GUI search/replace tool featuring Perl-style regular expressions. If you need project-wide substitution and you're tired of hacking sed command lines together, then you should definitely give regexxer a try.
    Downloads: 65 This Week
  • 8
    Notepad--

    Notepad for Windows, Linux, and Mac platforms

    Notepad-- is a text editor written in C++ that works seamlessly across Windows, Linux, and Mac platforms. Our aim is to eventually surpass Notepad++, with a particular focus on the macOS and Chinese UOS operating systems. Unlike Notepad++, our advantage lies in our cross-platform compatibility and support for various OSes. If you are using macOS and want to find a useful text editor, please try Ndd; it won't disappoint you. My GitHub homepage is: https://github.com/cxasm/notepad--
    Downloads: 155 This Week
  • 9
    Command-line/Ant-task/embeddable text file preprocessor. Macros, flow control, expressions. Recursive directory processing. Extensible in Java to display data from any data source (such as a database). Can generate complete homepages (trees of HTML files, images, etc.).
    Downloads: 94 This Week
  • 10
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest versions of iText build on the success of previous versions and feature an improved document engine, high and low-level programming capabilities, and a more efficient modular structure. iText represents the next level for developers looking to leverage PDF in document workflows. The main project page for iText is now on GitHub, and all the latest releases, code samples, open source add-ons and tools, etc. can be found at https://github.com/itext/.
    Downloads: 161 This Week
  • 11
    FAR - Find And Replace
    Search and replace operations on file content across multiple files. Recursive operations within entire directory trees. FAR comes with support for regular expressions (regex) over multiple lines, automatic backup and various character encodings. Run grep-like extractions to condense or rearrange sources, or perform bulk file renaming.
    Downloads: 32 This Week
  • 12
    The DITA Open Toolkit is an implementation of the OASIS DITA XML Specification. The Toolkit transforms DITA content into many deliverable formats. See https://www.dita-ot.org/ for documentation and links to downloads. The source code and issue trackers have been moved to https://github.com/dita-ot/dita-ot
    Downloads: 17 This Week
  • 13
    Vrapper

    Vim-like editing in Eclipse

    Vrapper is an Eclipse plugin which acts as a wrapper for existing Eclipse text editors to provide a Vim-like input scheme for moving around and editing text. Eclipse Update Site: http://vrapper.sourceforge.net/update-site/stable
    Downloads: 10 This Week
  • 14
    Ada Class Library

    Ada Class Library - an object orientated library for Ada.

    Text search and replace. Scripting (small tool programs). CGI scripts. Execution of external programs (incl. I/O redirection). Garbage Collection. Extended Booch Components. CD-Recorder
    Downloads: 61 This Week
  • 15
    Text Encoding Initiative

    TEI produces the TEI Guidelines and associated software

    The TEI is an international and interdisciplinary standard used by libraries, museums, publishers, and academics to represent all kinds of literary and linguistic texts, using an encoding scheme that is maximally expressive and minimally obsolescent.
    Downloads: 9 This Week
  • 16
    DrPython is a highly customizable cross-platform IDE to aid programming in Python. It was developed with teaching in mind, and has a clean, simple interface. It is written in Python, using wxPython for the GUI.
    Downloads: 9 This Week
  • 17
    Notepad3

    Light-weight Scintilla-based text editor with syntax highlighting

    Notepad3 is a fast and light-weight Scintilla-based text editor with syntax highlighting. Notepad3 is an excellent replacement for the default Windows text editor. Notepad3 offers many extra features over Notepad. It has a small memory footprint, but is powerful enough to handle most programming jobs.
    Downloads: 16 This Week
  • 18
    TextBlob

    TextBlob is a Python library for processing textual data

    Simple, Pythonic text processing. TextBlob provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both. It supports word inflection (pluralization and singularization) and lemmatization, as well as spelling correction. Add new models or languages through extensions. It also comes with WordNet integration. If you only intend to use TextBlob's default models (no model overrides), you can pass the lite argument when downloading corpora. This downloads only those corpora needed for basic functionality. TextBlob is also available as a conda package.
    Downloads: 1 This Week
  • 19
    RText is a customizable programmer's text editor written in Java. Some of its features include: syntax highlighting, editing multiple documents at once, printing and print preview, find/replace/find in files dialogs, undo/redo, and online help.
    Downloads: 4 This Week
  • 20
    Diff-ext is an extension for file managers such as Windows Explorer and Nautilus that lets you launch diff/merge tools on selected files.
    Downloads: 4 This Week
  • 21
    TEA is a text editor that provides a wide range of text-processing functions (over 100) and syntax highlighting. There are two branches of TEA: Qt-based and GTK-based.
    Downloads: 10 This Week
  • 22
    Camomile is a Unicode library for OCaml. Camomile provides a Unicode character type, UTF-8, UTF-16, and UTF-32 strings, conversion to/from about 200 encodings, collation and locale-sensitive case mappings, and more.
    Downloads: 10 This Week
  • 23
    Early Access iText, a PDF generation library in Java
    Downloads: 5 This Week
  • 24
    SciTECO

    Advanced TECO dialect and interactive screen editor based on Scintilla

    SciTECO is an interactive TECO dialect, similar to Video TECO. It also adds features from classic TECO-11, as well as unique new ideas. Project development takes place here: https://git.fmsbw.de/sciteco The download archive is mirrored at SourceForge, but for nightly builds check out: https://sciteco.fmsbw.de/downloads/nightly/
    Downloads: 5 This Week
  • 25
    bitext2tmx CAT bitext aligner/converter
    A free computer-aided translation / computer-assisted translation (CAT) tool to align and convert bitext into TMX translation memory format to be used in other CAT tools by translators and other language professionals.
    Downloads: 5 This Week