text-processing

OCR, extract, and classify documents. You can also annotate documents and, by downloading the data, build your own NLP and computer vision models in Python. Examples are available in our Colab notebooks, e.g. how to fine-tune Flair.

python nlp ocr computer-vision text-classification text-processing document-extraction document-annotate document-annotation document-annotation-tool

Updated Jun 5, 2024
Jupyter Notebook

aappleby / matcheroni

A minimalist single-header library for building pattern-matchers, lexers, and parsers.

c parser parsing pattern-matching regex regular-expression parsing-expression-grammar lexer lexing regular-expression-engine text-processing regular-expressions parsing-expression-grammars cplusplus-20

Updated Jun 4, 2024
C++

pyparsing / pyparsing

Python library for creating PEG parsers

python parsing parser-combinators python3 parsing-expression-grammar python-3 text-processing python-2 python2 parsing-library peg-parsers

Updated Jun 3, 2024
Python

MoSalem149 / PythonUtilities

A collection of Python scripts for common utility tasks including file manipulation, word counting, longest word detection, and grade categorization. Perfect for quick and easy solutions to everyday programming problems.

python utility text-analysis data-analysis text-processing file-io word-counting file-manipulation educational-tools grade-calculation

Updated Jun 3, 2024
Python

Gyanbardhan / DuplicateQuestionDetection

Knowledge-sharing platforms collect diverse content, but similarly worded questions are common. We use NLP techniques to identify duplicate questions, making it easier for users to find high-quality answers.

nlp tf-idf bow text-processing feature-engineering nlp-machine-learning

Updated Jun 3, 2024
Jupyter Notebook

sunsided / merge-whitespace-rs

Procedural macros for merging whitespace in const contexts

graphql rust macros text-processing procedural-macro

Updated Jun 3, 2024
Rust

ZeroX-DG / vi-rs

Vietnamese Input Method library

ime input-method text-processing vietnamese-language

Updated Jun 3, 2024
Rust

anguswg-ucsb / ingredient-slicer

Python 📦 package for extracting quantity, units, and (sometimes) food names from unstructured recipe ingredients

food recipes parser parsing text-processing ingredients ingredient

Updated Jun 3, 2024
Python

RMNCLDYO / gemini-ai-toolkit

A versatile CLI and Python wrapper for Google's Gemini Pro large language models. Streamline the creation of chatbots, generate dynamic text, analyze images, and transcribe audio with ease.