Name | Downloads | Categories | Status | Description
charset-normalizer | 6497378 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic,Utilities | No status | The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet.
docutils | 2587648 | Documentation,Software Development :: Documentation,Text Processing | 4 - Beta | Docutils -- Python Documentation Utilities
chardet | 2167450 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 5 - Production/Stable | Universal encoding detector for Python 3
lxml | 2074641 | Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML,Software Development :: Libraries :: Python Modules | 5 - Production/Stable | Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.
pyyaml | 6040296 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 5 - Production/Stable | YAML parser and emitter for Python
pygments | 1370394 | Text Processing :: Filters,Utilities | 6 - Mature | Pygments is a syntax highlighting package written in Python.
regex | 871433 | Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Processing,Text Processing :: General | 5 - Production/Stable | Alternative regular expression module, to replace re.
markdown | 737793 | Communications :: Email :: Filters,Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries,Internet :: WWW/HTTP :: Site Management,Software Development :: Documentation,Software Development :: Libraries :: Python Modules,Text Processing :: Filters,Text Processing :: Markup :: HTML,Text Processing :: Markup :: Markdown | 5 - Production/Stable | Python implementation of Markdown.
defusedxml | 702240 | Text Processing :: Markup :: XML | 5 - Production/Stable | XML bomb protection for Python stdlib modules
xmltodict | 692842 | Text Processing :: Markup :: XML | No status | Makes working with XML feel like you are working with JSON
jedi | 575564 | Software Development :: Libraries :: Python Modules,Text Editors :: Integrated Development Environments (IDE),Utilities | 4 - Beta | An autocompletion tool for Python that can be used for text editors.
parso | 571092 | Software Development :: Libraries :: Python Modules,Text Editors :: Integrated Development Environments (IDE),Utilities | 4 - Beta | A Python Parser
nltk | 447875 | Scientific/Engineering,Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Human Machine Interfaces,Scientific/Engineering :: Information Analysis,Text Processing,Text Processing :: Filters,Text Processing :: General,Text Processing :: Indexing,Text Processing :: Linguistic | 5 - Production/Stable | Natural Language Toolkit
mistune | 413606 | Text Processing :: Markup | 4 - Beta | A sane Markdown parser with useful plugins and renderers
humanfriendly | 391351 | Communications,Scientific/Engineering :: Human Machine Interfaces,Software Development,Software Development :: Libraries :: Python Modules,Software Development :: User Interfaces,System :: Shells,System :: System Shells,System :: Systems Administration,Terminals,Text Processing :: General,Text Processing :: Linguistic,Utilities | 6 - Mature | Human friendly output for text interfaces using Python
unidecode | 390692 | Text Processing,Text Processing :: Filters | No status | ASCII transliterations of Unicode text
pandocfilters | 342327 | Text Processing :: Filters | 3 - Alpha | Utilities for writing pandoc filters in python
tinycss2 | 318584 | Text Processing | 5 - Production/Stable | tinycss2
snowballstemmer | 313601 | Database,Internet :: WWW/HTTP :: Indexing/Search,Text Processing :: Indexing,Text Processing :: Linguistic | 5 - Production/Stable | This package provides 29 stemmers for 28 languages generated from Snowball algorithms.
sphinx | 294922 | Documentation,Documentation :: Sphinx,Internet :: WWW/HTTP :: Site Management,Printing,Software Development,Software Development :: Documentation,Text Processing,Text Processing :: General,Text Processing :: Indexing,Text Processing :: Markup,Text Processing :: Markup :: HTML,Text Processing :: Markup :: LaTeX,Utilities | 5 - Production/Stable | Python documentation generator
prettytable | 230490 | Text Processing | No status | A simple Python library for easily displaying tabular data in a visually appealing ASCII table format
sphinxcontrib-htmlhelp | 209172 | Documentation,Documentation :: Sphinx,Text Processing,Utilities | 5 - Production/Stable | sphinxcontrib-htmlhelp is a sphinx extension which renders HTML help files
humanize | 154685 | Text Processing,Text Processing :: General | 5 - Production/Stable | Python humanize utilities
texttable | 137487 | Software Development :: Libraries :: Python Modules,Text Processing,Utilities | 5 - Production/Stable | module for creating simple ASCII tables
pypandoc | 128106 | Text Processing,Text Processing :: Filters | 4 - Beta | Thin wrapper for pandoc.
docstring-parser | 121934 | Documentation :: Sphinx,Text Processing :: Markup,Software Development :: Libraries :: Python Modules | 4 - Beta | "Parse Python docstrings in reST, Google and Numpydoc format"
natsort | 114580 | Scientific/Engineering :: Information Analysis,Utilities,Text Processing | 5 - Production/Stable | Simple yet flexible natural sorting in Python.
jellyfish | 99910 | Text Processing :: Linguistic | 4 - Beta | a library for doing approximate and phonetic matching of strings.
markdown-it-py | 93965 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 3 - Alpha | Python port of markdown-it. Markdown parsing, done right!
mdit-py-plugins | 82218 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 3 - Alpha | Collection of plugins for markdown-it-py
urlextract | 72726 | Text Processing,Text Processing :: Markup :: HTML,Software Development :: Libraries :: Python Modules | 5 - Production/Stable | Collects and extracts URLs from given text.
terminaltables | 71530 | Software Development :: Libraries,Terminals,Text Processing :: Markup | 5 - Production/Stable | Generate simple tables in terminals from a nested list of strings. (has wheels!)
reportlab | 70085 | Printing,Text Processing :: Markup | 5 - Production/Stable | The Reportlab Toolkit
elementpath | 62591 | Software Development :: Libraries,Text Processing :: Markup :: XML | 5 - Production/Stable | XPath 1.0/2.0/3.0 parsers and selectors for ElementTree and lxml
feedparser | 58924 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: XML | 5 - Production/Stable | Universal feed parser, handles RSS 0.9x, RSS 1.0, RSS 2.0, CDF, Atom 0.3, and Atom 1.0 feeds
xmlschema | 56207 | Software Development :: Libraries,Text Processing :: Markup :: XML | 5 - Production/Stable | An XML Schema validator and decoder
python-crfsuite | 47885 | Software Development,Software Development :: Libraries :: Python Modules,Scientific/Engineering,Scientific/Engineering :: Information Analysis,Text Processing :: Linguistic | 4 - Beta | Python binding for CRFsuite
mkdocs | 42324 | Documentation,Text Processing | 5 - Production/Stable | Project documentation with Markdown.
parsimonious | 39288 | Scientific/Engineering :: Information Analysis,Software Development :: Libraries,Text Processing :: General | 3 - Alpha | (Soon to be) the fastest pure-Python PEG parser I could muster
junitparser | 38525 | Text Processing | 5 - Production/Stable | Manipulates JUnit/xUnit Result XML files
xmlsec | 36542 | Text Processing :: Markup :: XML | 5 - Production/Stable | Python bindings for the XML Security Library
pymdown-extensions | 35209 | Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries :: Python Modules,Text Processing :: Filters,Text Processing :: Markup :: HTML | 5 - Production/Stable | Extension pack for Python Markdown.
pyphen | 32889 | Text Processing,Text Processing :: Linguistic | 4 - Beta | Pure Python module to hyphenate text
mkdocs-material-extensions | 29980 | Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries :: Python Modules,Text Processing :: Filters,Text Processing :: Markup :: HTML | 5 - Production/Stable | Extension pack for Python Markdown.
tangled-up-in-unicode | 29669 | Database,Scientific/Engineering :: Information Analysis,Text Processing,Text Processing :: General,Utilities | No status | Access to the Unicode Character Database (UCD)
myst-parser | 28926 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 4 - Beta | An extended commonmark compliant parser, with bridges to docutils & sphinx.
weasyprint | 28271 | Internet :: WWW/HTTP,Text Processing :: Markup :: HTML,Multimedia :: Graphics :: Graphics Conversion,Printing | 5 - Production/Stable | The Awesome Document Factory
num2words | 27493 | Software Development :: Internationalization,Software Development :: Libraries :: Python Modules,Software Development :: Localization,Text Processing :: Linguistic | 5 - Production/Stable | Modules to convert numbers to words. Easily extensible.
dominate | 25459 | Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML | No status | Dominate is a Python library for creating and manipulating HTML documents using an elegant DOM API.
anyconfig | 24988 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup,Utilities | 4 - Beta | Library provides common APIs to load and dump configuration files in various formats
pdfkit | 23538 | Text Processing,Text Processing :: General,Text Processing :: Markup,Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML,Utilities | No status | Wkhtmltopdf python wrapper to convert html to pdf using the webkit rendering engine and qt
colorclass | 22793 | Software Development :: Libraries,Terminals,Text Processing :: Markup | 5 - Production/Stable | Colorful worry-free console applications for Linux, Mac OS X, and Windows.
markdown2 | 22282 | Software Development :: Libraries :: Python Modules,Software Development :: Documentation,Text Processing :: Filters,Text Processing :: Markup :: HTML | 5 - Production/Stable | A fast and complete Python implementation of Markdown
pathvalidate | 21173 | Software Development :: Libraries,Software Development :: Libraries :: Python Modules,System :: Filesystems,Text Processing | 5 - Production/Stable | pathvalidate is a Python library to sanitize/validate a string such as filenames/file-paths/etc.
python-stdnum | 20634 | Office/Business :: Financial,Software Development :: Libraries :: Python Modules,Text Processing :: General | 5 - Production/Stable | Python module to handle standardized numbers and codes
python-benedict | 19304 | Education :: Testing,Software Development :: Build Tools,System :: Filesystems,Text Processing :: Markup :: XML,Utilities | 5 - Production/Stable | python-benedict is a dict subclass with keylist/keypath support, normalized I/O operations (base64, csv, ini, json, pickle, plist, query-string, toml, xml, yaml) and many utilities... for humans, obviously.
sphinx-tabs | 18157 | Documentation :: Sphinx,Documentation,Software Development :: Documentation,Text Processing,Utilities | 5 - Production/Stable | Tabbed views for Sphinx
anyascii | 17683 | Text Processing,Text Processing :: Linguistic | No status | Unicode to ASCII transliteration
rjsmin | 16602 | Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing,Text Processing :: Filters,Utilities | 5 - Production/Stable | Javascript Minifier
rcssmin | 16003 | Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing,Text Processing :: Filters,Utilities | 5 - Production/Stable | CSS Minifier
jinja2 | 4726799 | Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML | 5 - Production/Stable | A small but fast and easy to use stand-alone template engine written in pure python.
markupsafe | 4122239 | Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML | 5 - Production/Stable | Implements a XML/HTML/XHTML Markup safe string for Python
beautifulsoup4 | 1750144 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML,Text Processing :: Markup :: SGML,Text Processing :: Markup :: XML | 5 - Production/Stable | Screen-scraping library
fonttools | 503237 | Text Processing :: Fonts,Multimedia :: Graphics,Multimedia :: Graphics :: Graphics Conversion | 5 - Production/Stable | Tools to manipulate font files
html5lib | 468819 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML | 5 - Production/Stable | HTML parser based on the WHATWG HTML specification
text-unidecode | 423272 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 5 - Production/Stable | The most basic Text::Unidecode port
commonmark | 362012 | Documentation,Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Documentation,Software Development :: Libraries :: Python Modules,Text Processing :: Markup,Utilities | 4 - Beta | Python parser for the CommonMark Markdown spec
bs4 | 324087 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML,Text Processing :: Markup :: SGML,Text Processing :: Markup :: XML | 5 - Production/Stable | Dummy package for Beautiful Soup
brotli | 283796 | Software Development :: Libraries,Software Development :: Libraries :: Python Modules,System :: Archiving,System :: Archiving :: Compression,Text Processing :: Fonts,Utilities | 4 - Beta | Python bindings for the Brotli compression library
sphinxcontrib-serializinghtml | 217443 | Documentation,Documentation :: Sphinx,Text Processing,Utilities | 5 - Production/Stable | sphinxcontrib-serializinghtml is a sphinx extension which outputs "serialized" HTML files (json and pickle).
sphinxcontrib-jsmath | 209392 | Documentation,Documentation :: Sphinx,Text Processing,Utilities | 5 - Production/Stable | A sphinx extension which renders display math in HTML via JavaScript
sphinxcontrib-devhelp | 209302 | Documentation,Documentation :: Sphinx,Text Processing,Utilities | 5 - Production/Stable | sphinxcontrib-devhelp is a sphinx extension which outputs Devhelp document.
sphinxcontrib-applehelp | 209302 | Documentation,Documentation :: Sphinx,Text Processing,Utilities | 5 - Production/Stable | sphinxcontrib-applehelp is a sphinx extension which outputs Apple help books
sphinxcontrib-qthelp | 209267 | Documentation,Documentation :: Sphinx,Text Processing,Utilities | 5 - Production/Stable | sphinxcontrib-qthelp is a sphinx extension which outputs QtHelp document.
gensim | 194153 | Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Information Analysis,Text Processing :: Linguistic | 5 - Production/Stable | Python framework for fast Vector Space Modelling
sentencepiece | 178320 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 5 - Production/Stable | SentencePiece python wrapper
pytimeparse | 149235 | Text Processing | 4 - Beta | Time expression parser
netaddr | 134000 | Communications,Documentation,Education,Education :: Testing,Home Automation,Internet,Internet :: Log Analysis,Internet :: Name Service (DNS),Internet :: Proxy Servers,Internet :: WWW/HTTP,Internet :: WWW/HTTP :: Indexing/Search,Internet :: WWW/HTTP :: Site Management,Security,Software Development,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Software Development :: Quality Assurance,Software Development :: Testing,Software Development :: Testing :: Traffic Generation,System :: Benchmark,System :: Clustering,System :: Distributed Computing,System :: Installation/Setup,System :: Logging,System :: Monitoring,System :: Networking,System :: Networking :: Firewalls,System :: Networking :: Monitoring,System :: Networking :: Time Synchronization,System :: Recovery Tools,System :: Shells,System :: Software Distribution,System :: Systems Administration,System :: System Shells,Text Processing,Text Processing :: Filters,Utilities | 5 - Production/Stable | Pythonic manipulation of IPv4, IPv6, CIDR, EUI and MAC network addresses
coloredlogs | 130672 | System :: Logging,Terminals,Text Processing :: Markup :: HTML | 5 - Production/Stable | Colored stream handler for the logging module
parsedatetime | 110047 | Software Development :: Libraries :: Python Modules,Text Processing | 5 - Production/Stable | Parse human-readable date/time text.
inflect | 75065 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 3 - Alpha | Correctly generate plurals, singular nouns, ordinals, indefinite articles; convert numbers to words
intervaltree | 61650 | Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Bio-Informatics,Scientific/Engineering :: Information Analysis,Software Development :: Libraries,Text Processing :: General,Text Processing :: Linguistic,Text Processing :: Markup | 4 - Beta | Editable interval tree data structure for Python 2 and 3
ftfy | 60690 | Software Development :: Libraries :: Python Modules,Text Processing :: Filters | 5 - Production/Stable | Fixes some problems with Unicode text after the fact
htmlmin | 53005 | Text Processing :: Markup :: HTML | 4 - Beta | An HTML Minifier
lark-parser | 51789 | Software Development :: Libraries :: Python Modules,Text Processing :: General,Text Processing :: Linguistic | 5 - Production/Stable | a modern parsing library
parsel | 51067 | Text Processing :: Markup,Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML | 5 - Production/Stable | Parsel is a library to extract data from HTML and XML using XPath and CSS selectors
mkdocs-material | 46316 | Documentation,Software Development :: Documentation,Text Processing :: Markup :: HTML | 5 - Production/Stable | A Material Design theme for MkDocs
lark | 42894 | Software Development :: Libraries :: Python Modules,Text Processing :: General,Text Processing :: Linguistic | 5 - Production/Stable | a modern parsing library |
cssutils | 39279 | Internet,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML | 4 - Beta | A CSS Cascading Style Sheets library for Python |
nameparser | 38673 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 5 - Production/Stable | A simple Python module for parsing human names into their individual components. |
textblob | 30376 | Text Processing :: Linguistic | 4 - Beta | Simple, Pythonic text processing. Sentiment analysis, POS tagging, noun phrase parsing, and more. |
url-normalize | 30235 | Internet,Software Development :: Libraries :: Python Modules,Text Processing :: Indexing,Utilities | No status | URL normalization for Python (python3) |
ansiwrap | 30126 | Software Development :: Libraries :: Python Modules,Text Processing,Text Processing :: Filters | 4 - Beta | textwrap, but savvy to ANSI colors and styles |
textwrap3 | 29969 | Software Development :: Libraries :: Python Modules,Text Processing,Text Processing :: Filters | 4 - Beta | textwrap from Python 3.6 backport (plus a few tweaks) |
pyxb | 29845 | Software Development :: Code Generators,Text Processing :: Markup :: XML | 5 - Production/Stable | Python XML Schema Bindings |
pyfiglet | 29530 | Text Processing,Text Processing :: Fonts | 5 - Production/Stable | Pure-python FIGlet implementation |
diff-match-patch | 27210 | Text Processing | 6 - Mature | The Diff Match and Patch libraries offer robust algorithms to perform the operations required for synchronizing plain text. |
ebcdic | 25632 | Text Processing | 4 - Beta | Additional EBCDIC codecs |
polib | 24055 | Software Development :: Internationalization,Software Development :: Libraries :: Python Modules,Software Development :: Localization,Text Processing :: Linguistic | 4 - Beta | A library to manipulate gettext files (po and mo files). |
jsmin | 22775 | Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Pre-processors,Text Processing :: Filters | 5 - Production/Stable | JavaScript minifier. |
chevron | 21771 | Text Processing :: Markup | 4 - Beta | Mustache templating language renderer |
sphinxcontrib-websupport | 20889 | Documentation,Documentation :: Sphinx,Text Processing,Utilities | 5 - Production/Stable | Sphinx API for Web Apps |
pattern | 18968 | Internet :: WWW/HTTP :: Indexing/Search,Multimedia :: Graphics,Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Visualization,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML | 5 - Production/Stable | Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization. |
sacrebleu | 18596 | Scientific/Engineering,Scientific/Engineering :: Artificial Intelligence,Text Processing | 5 - Production/Stable | Hassle-free computation of shareable, comparable, and reproducible BLEU scores |
restructuredtext-lint | 17356 | Text Processing :: Markup | 3 - Alpha | reStructuredText linter |
python-bidi | 16587 | Text Processing | 4 - Beta | Pure python implementation of the BiDi layout algorithm |
beautifulsoup | 15895 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML,Text Processing :: Markup :: SGML,Text Processing :: Markup :: XML | 5 - Production/Stable | Screen-scraping library |
jieba | 15883 | Text Processing,Text Processing :: Indexing,Text Processing :: Linguistic | No status | Chinese Words Segmentation Utilities |
stanza | 15296 | Scientific/Engineering,Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Information Analysis,Text Processing,Text Processing :: Linguistic,Software Development,Software Development :: Libraries | 4 - Beta | A Python NLP Library for Many Human Languages, by the Stanford NLP Group |
langid | 15103 | Scientific/Engineering :: Artificial Intelligence,Text Processing :: Linguistic | 5 - Production/Stable | langid.py is a standalone Language Identification (LangID) tool. |
svglib | 14917 | Documentation,Multimedia :: Graphics :: Graphics Conversion,Printing,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: XML,Utilities | 4 - Beta | An experimental library for reading and converting SVG. |
xhtml2pdf | 14762 | Documentation,Multimedia,Office/Business,Printing,Text Processing,Text Processing :: Filters,Text Processing :: Fonts,Text Processing :: General,Text Processing :: Indexing,Text Processing :: Markup,Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML,Utilities | 4 - Beta | PDF generator using HTML and CSS |
cheetah3 | 14560 | Internet :: WWW/HTTP,Internet :: WWW/HTTP :: Dynamic Content,Internet :: WWW/HTTP :: Site Management,Software Development :: Code Generators,Software Development :: Libraries :: Python Modules,Software Development :: User Interfaces,Text Processing | 4 - Beta | Cheetah is a template engine and code generation tool |
ptable | 13843 | Text Processing | No status | A simple Python library for easily displaying tabular data in a visually appealing ASCII table format |
parsy | 13789 | Software Development :: Compilers,Software Development :: Interpreters,Text Processing | 5 - Production/Stable | easy-to-use parser combinators, for parsing in pure Python |
pyahocorasick | 13752 | Software Development :: Libraries,Text Editors :: Text Processing | 5 - Production/Stable | pyahocorasick is a fast and memory efficient library for exact or approximate multi-pattern string search. With the ahocorasick.Automaton class, you can find multiple key strings occurrences at once in some input text. You can use it as a plain dict-like Trie or convert a Trie to an automaton for efficient Aho-Corasick search. Implemented in C and tested on Python 2.7 and 3.4+. Works on Linux, Mac and Windows. BSD-3-clause license. |
ansi2html | 13470 | Text Processing,Text Processing :: Markup,Text Processing :: Markup :: HTML | 5 - Production/Stable | Convert text with ANSI color codes to HTML |
demoji | 13466 | Text Processing,Utilities | 3 - Alpha | Accurately remove and replace emojis in text strings. |
publicsuffixlist | 13389 | Internet :: Name Service (DNS),Text Processing :: Filters | 4 - Beta | publicsuffixlist implement |
htmldate | 12704 | Scientific/Engineering,Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic,Text Processing :: Markup :: HTML | 4 - Beta | This module can handle all the steps needed from web page download to HTML parsing, including scraping and textual analysis. Its goal is to find the creation date of a page using common structural patterns, text-based heuristics and robust date extraction. It takes URLs, HTML files or trees as input and outputs a date.
segtok | 12594 | Scientific/Engineering :: Information Analysis,Software Development :: Libraries,Text Processing,Text Processing :: Linguistic | 3 - Alpha | sentence segmentation and word tokenization tools |
strictyaml | 12565 | Text Processing :: Markup,Software Development :: Libraries | 4 - Beta | Strict, typed YAML parser |
jupytext | 12203 | Text Processing :: Markup | 5 - Production/Stable | Jupyter notebooks as Markdown documents, Julia, Python or R scripts |
pyenchant | 11783 | Software Development :: Libraries,Text Processing :: Linguistic | 5 - Production/Stable | Python bindings for the Enchant spellchecking system |
breathe | 10708 | Documentation,Text Processing,Utilities | 4 - Beta | Sphinx Doxygen renderer |
jaconv | 10529 | Text Processing | 4 - Beta | Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku and Zenkaku |
trafilatura | 10141 | Internet :: WWW/HTTP,Scientific/Engineering :: Information Analysis,Text Processing :: Linguistic,Text Processing :: Markup :: HTML | 4 - Beta | Downloads web pages, scrapes main text and comments while preserving some structure, and converts to TXT, CSV, XML & TEI-XML |
marisa-trie | 9976 | Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 4 - Beta | Static memory-efficient & fast Trie-like structures for Python (based on marisa-trie C++ library) |
titlecase | 9885 | Text Processing :: Filters | 4 - Beta | Python Port of John Gruber's titlecase.pl |
dartsclone | 9845 | Text Processing :: Linguistic | No status | Python binding of Darts Clone |
pyhcl | 9528 | Text Processing | 5 - Production/Stable | HCL configuration parser for python |
pyrtf3 | 9356 | Software Development :: Libraries :: Python Modules,Text Editors :: Text Processing | 4 - Beta | PyRTF - Rich Text Format Document Generation |
tempita | 8827 | Text Processing | 4 - Beta | A very small text templating language |
pyarabic | 8790 | Text Processing :: Linguistic | 5 - Production/Stable | Arabic text tools for Python |
vadersentiment | 8762 | Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Information Analysis,Text Processing :: General,Text Processing :: Linguistic | 4 - Beta | VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains. |
ngram | 8471 | Text Processing,Text Processing :: Indexing,Text Processing :: Linguistic | 5 - Production/Stable | A `set` subclass providing fuzzy search based on N-grams. |
mecab-python3 | 8368 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | No status | python wrapper for mecab: Morphological Analysis engine |
fpdf2 | 8289 | Printing,Software Development :: Libraries :: Python Modules,Text Processing :: Markup,Multimedia :: Graphics,Multimedia :: Graphics :: Presentation | 5 - Production/Stable | Simple PDF generation for Python |
hachoir | 8187 | Multimedia,Scientific/Engineering :: Information Analysis,Software Development :: Disassemblers,Software Development :: Interpreters,Software Development :: Libraries :: Python Modules,System :: Filesystems,Text Processing,Utilities | 5 - Production/Stable | Package of Hachoir parsers used to open binary files |
yake | 8052 | Scientific/Engineering :: Information Analysis,Software Development :: Libraries,Text Processing,Text Processing :: Linguistic | 3 - Alpha | Keyword extraction Python package |
m2r | 7768 | Text Processing | 4 - Beta | Markdown and reStructuredText in a single file. |
whoosh | 7731 | Software Development :: Libraries :: Python Modules,Text Processing :: Indexing | 3 - Alpha | Fast, pure-Python full text indexing, search, and spell checking library. |
pdfrw | 7725 | Multimedia :: Graphics :: Graphics Conversion,Printing,Software Development :: Libraries,Text Processing,Utilities | 4 - Beta | PDF file reader/writer library |
httpie | 7654 | Internet :: WWW/HTTP,Software Development,System :: Networking,Terminals,Text Processing,Utilities | 5 - Production/Stable | HTTPie - a CLI, cURL-like tool for humans. |
cdifflib | 7334 | Software Development,Text Processing :: General | 4 - Beta | C implementation of parts of difflib |
pybtex | 6906 | Text Editors :: Text Processing,Text Processing :: Markup :: LaTeX,Text Processing :: Markup :: XML | 4 - Beta | A BibTeX-compatible bibliography processor in Python |
latexcodec | 6809 | Text Processing :: Filters,Text Processing :: Markup :: LaTeX | 5 - Production/Stable | A lexer and codec to work with LaTeX code in Python. |
iso-639 | 6773 | Text Processing :: Linguistic | 5 - Production/Stable | Python library for ISO 639 standard |
m2r2 | 6584 | Text Processing | 4 - Beta | Markdown and reStructuredText in a single file. |
bashlex | 6410 | Software Development :: Libraries :: Python Modules,System :: System Shells,Text Processing | 4 - Beta | Python parser for bash |
goslate | 6304 | Software Development :: Internationalization,Software Development :: Localization,Text Processing :: Linguistic | 5 - Production/Stable | Goslate: Free Google Translate API |
mechanize | 6042 | Internet,Internet :: File Transfer Protocol (FTP),Internet :: WWW/HTTP,Internet :: WWW/HTTP :: Browsers,Internet :: WWW/HTTP :: Indexing/Search,Internet :: WWW/HTTP :: Site Management,Internet :: WWW/HTTP :: Site Management :: Link Checking,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Software Development :: Testing,Software Development :: Testing :: Traffic Generation,System :: Archiving :: Mirroring,System :: Networking :: Monitoring,System :: Systems Administration,Text Processing,Text Processing :: Markup,Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML | 5 - Production/Stable | Stateful programmatic web browsing. |
pansi | 6012 | Software Development,Software Development :: Libraries,System :: Console Fonts,System :: Shells,Terminals,Terminals :: Terminal Emulators/X Terminals,Text Processing,Text Processing :: Markup,Utilities | 1 - Planning | ANSI escape code library for Python |
myst-nb | 5978 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 3 - Alpha | A Jupyter Notebook Sphinx reader built on top of the MyST markdown parser. |
draftjs-exporter | 5917 | Internet :: WWW/HTTP,Internet :: WWW/HTTP :: Dynamic Content,Internet :: WWW/HTTP :: Site Management,Software Development :: Libraries :: Application Frameworks,Software Development :: Libraries :: Python Modules,Text Editors :: Word Processors | No status | Library to convert the Facebook Draft.js editor's raw ContentState to HTML |
tinysegmenter | 5859 | Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Information Analysis,Text Processing :: Linguistic | 4 - Beta | Very compact Japanese tokenizer |
mdformat | 5825 | Documentation,Software Development :: Libraries :: Python Modules,Text Processing :: Markup | No status | CommonMark compliant Markdown formatter |
pybtex-docutils | 5751 | Text Editors :: Text Processing,Text Processing :: Markup :: XML | 4 - Beta | A docutils backend for pybtex. |
tcolorpy | 5621 | Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Terminals,Text Processing | 3 - Alpha | tcolorpy is a Python library to apply true color for terminal text. |
beautifultable | 5491 | Printing,Text Processing | 3 - Alpha | Utility package to print visually appealing ASCII tables to terminal |
parsley | 5469 | Software Development :: Code Generators,Text Processing :: General | 6 - Mature | Parsing and pattern matching made easy. |
tatsu | 5243 | Software Development :: Code Generators,Software Development :: Compilers,Software Development :: Interpreters,Text Processing :: General | 5 - Production/Stable | TatSu takes a grammar in a variation of EBNF as input, and outputs a memoizing PEG/Packrat parser in Python. |
tailer | 5223 | System :: Logging,Text Processing | No status | Python tail is a simple implementation of GNU tail and head. |
vobject | 5143 | Text Processing | 5 - Production/Stable | A full-featured Python package for parsing and creating iCalendar and vCard files |
imgkit | 5133 | Text Processing,Text Processing :: General,Text Processing :: Markup,Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML,Utilities | No status | Wkhtmltopdf python wrapper to convert html to image using the webkit rendering engine and qt |
deep-translator | 5071 | Education,Software Development,Communications,Text Processing,Text Processing :: Linguistic | 5 - Production/Stable | A flexible python tool to translate between different languages in a simple way. |
tabula-py | 5067 | Text Processing :: General | 5 - Production/Stable | Simple wrapper for tabula-java, read tables from PDF into DataFrame |
pytils | 5020 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 4 - Beta | Utils for easy processing of strings in Russian. |
pycld3 | 5013 | Text Processing :: Linguistic | 3 - Alpha | CLD3 Python bindings |
pypinyin | 4995 | Utilities,Text Processing | 5 - Production/Stable | Chinese character (Hanzi) to Pinyin conversion module/tool. |
bbcode | 4912 | Text Processing :: Markup | 5 - Production/Stable | A pure python bbcode parser and formatter. |
readability-lxml | 4882 | Internet,Software Development :: Libraries :: Python Modules,Text Processing :: Indexing,Utilities | No status | fast html to text parser (article readability tool) with python3 support |
xmldiff | 4817 | Text Processing :: Markup :: XML | 4 - Beta | Creates diffs of XML files |
compressed-rtf | 4797 | System :: Archiving :: Compression,Text Processing | 5 - Production/Stable | Compressed Rich Text Format (RTF) compression and decompression package |
pdfminer | 4771 | Text Processing | 4 - Beta | PDF parser and analyzer |
parse-torrent-name | 4699 | Text Processing | 5 - Production/Stable | Parse torrent name of a movie or TV show |
in-place | 4505 | System :: Filesystems,Text Processing :: Filters | 4 - Beta | In-place file processing |
empy | 4473 | Software Development :: Libraries :: Python Modules,Text Editors :: Text Processing,Text Processing :: Filters,Text Processing :: General,Text Processing :: Markup,Utilities | 6 - Mature | A powerful and robust templating system for Python. |
art | 4405 | Text Processing :: Fonts,Text Editors,Text Processing :: General,Utilities,Multimedia,Printing | 5 - Production/Stable | ASCII Art Library For Python |
pycld2 | 4396 | Text Processing :: Linguistic | 4 - Beta | Python bindings around Google Chromium's embedded compact language detection library (CLD2) |
pandoc | 4345 | Software Development,Text Processing | 5 - Production/Stable | Pandoc Documents for Python |
django-autoslug | 4309 | Software Development :: Libraries :: Python Modules,Text Processing :: General | 5 - Production/Stable | An automated slug field for Django. |
dawg-python | 4163 | Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 3 - Alpha | Pure-python reader for DAWGs created by dawgdic C++ library or DAWG Python extension. |
word2number | 4160 | Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Editors :: Text Processing,Text Processing :: Linguistic | No status | Convert number words, e.g. three hundred and forty two, to numbers (342). |
smartypants | 4116 | Text Processing :: Filters | 5 - Production/Stable | Python with the SmartyPants |
funcparserlib | 4115 | Software Development :: Compilers,Software Development :: Interpreters,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing | 5 - Production/Stable | Recursive descent parsing library based on functional combinators |
anytemplate | 4062 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup,Utilities | 4 - Beta | A python template abstraction layer module |
pymorphy2 | 4049 | Software Development :: Libraries :: Python Modules,Scientific/Engineering :: Information Analysis,Text Processing :: Linguistic | 4 - Beta | Morphological analyzer (POS tagger + inflection engine) for Russian language. |
jsonpath | 3990 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 4 - Beta | An XPath for JSON |
justext | 3895 | Internet :: WWW/HTTP,Software Development :: Pre-processors,Text Processing :: Filters,Text Processing :: Markup :: HTML | 5 - Production/Stable | Heuristic based boilerplate removal tool |
unicode | 3753 | Text Editors :: Text Processing,Utilities | 5 - Production/Stable | Display unicode character properties |
textstat | 3694 | Text Processing | No status | Calculate statistical features from text |
mdx-truly-sane-lists | 3669 | Text Processing :: Markup,Utilities | No status | Extension for Python-Markdown that makes lists truly sane. Custom indents for nested lists and fix for messy linebreaks. |
mdutils | 3638 | Utilities,Software Development :: Documentation,Text Processing :: Markup | 5 - Production/Stable | Useful package for creating Markdown files while executing python code. |
meld3 | 3558 | Text Processing :: Markup :: HTML | 5 - Production/Stable | meld3 is an HTML/XML templating engine. |
biplist | 3557 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 5 - Production/Stable | biplist is a library for reading/writing binary plists. |
lunr | 3481 | Text Processing | 3 - Alpha | A Python implementation of Lunr.js |
english | 3468 | Text Processing,Text Processing :: Linguistic | 1 - Planning | English language utility library for Python |
pyjson5 | 3467 | Text Processing :: General | 4 - Beta | JSON5 serializer and parser for Python 3 written in Cython. |
genshi | 3466 | Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML | 4 - Beta | A toolkit for generation of output for the web |
chameleon | 3364 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML | No status | XML-based template compiler. |
dict2xml | 3362 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: XML | 5 - Production/Stable | Small utility to convert a python dictionary into an XML string |
pyyaml-include | 3302 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 5 - Production/Stable | Extending PyYAML with a custom constructor for including YAML files within YAML files |
pymorphy2-dicts-ru | 3164 | Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 4 - Beta | Russian dictionaries for pymorphy2 |
csscompressor | 3061 | Software Development :: Build Tools,Text Processing :: General,Utilities | 5 - Production/Stable | A python port of YUI CSS Compressor |
mwparserfromhell | 3031 | Text Processing :: Markup | 4 - Beta | MWParserFromHell is a parser for MediaWiki wikicode. |
blockdiag | 3028 | Software Development,Software Development :: Documentation,Text Processing :: Markup | 4 - Beta | blockdiag generates a block-diagram image file from a spec-text file. |
plantuml-markdown | 2973 | Software Development :: Documentation,Software Development :: Libraries :: Python Modules,Text Processing :: Filters,Text Processing :: Markup :: HTML | 5 - Production/Stable | A PlantUML plugin for Markdown |
sphinx-design | 2943 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 3 - Alpha | A sphinx extension for designing beautiful, view size responsive web components. |
pylatexenc | 2938 | Scientific/Engineering,Text Processing :: General,Text Processing :: Markup :: LaTeX | 5 - Production/Stable | Simple LaTeX parser providing latex-to-unicode and unicode-to-latex conversion |
lorem | 2893 | Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing,Text Processing :: Linguistic | 4 - Beta | Generator for random text that looks like Latin. |
stopwordsiso | 2892 | Text Processing :: Linguistic | 3 - Alpha | Collection of stopwords for multiple languages. Using ISO 639-1 language code. |
detect-delimiter | 2864 | Software Development :: Libraries,Text Processing,Utilities | 4 - Beta | Detects the delimiter used in CSV, TSV and other ad hoc file formats. |
courlan | 2862 | Internet :: WWW/HTTP,Scientific/Engineering :: Information Analysis,Text Processing :: Filters | 4 - Beta | Clean, filter, normalize, and sample URLs |
pynliner | 2794 | Communications :: Email,Text Processing :: Markup :: HTML | 3 - Alpha | Python CSS-to-inline-styles conversion tool for HTML using BeautifulSoup and cssutils |
pyscss | 2790 | Software Development :: Code Generators,Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 5 - Production/Stable | pyScss, a Scss compiler for Python |
whatthepatch | 2770 | Software Development,Software Development :: Libraries :: Python Modules,Software Development :: Version Control,Text Processing | 2 - Pre-Alpha | A patch parsing library |
sphinx-external-toc | 2741 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 3 - Alpha | A sphinx extension that allows the site-map to be defined in a single YAML file. |
emoji-country-flag | 2685 | Internet :: WWW/HTTP :: Dynamic Content,Communications :: Chat,Printing,Text Processing :: General | 5 - Production/Stable | En/Decode unicode country flags emoji |
hellosign-python-sdk | 2657 | Text Processing :: Linguistic | 3 - Alpha | An API wrapper written in Python to interact with HelloSign's API (http://www.hellosign.com) |
mf2py | 2551 | Text Processing :: Markup :: HTML | 3 - Alpha | Python Microformats2 parser |
kitchen | 2546 | Software Development :: Libraries :: Python Modules,Text Processing :: General | 3 - Alpha | Kitchen contains a cornucopia of useful code |
konlpy | 2491 | Scientific/Engineering,Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Information Analysis,Text Processing,Text Processing :: Filters,Text Processing :: General,Text Processing :: Linguistic | 4 - Beta | Python package for Korean natural language processing. |
esprima | 2473 | Software Development :: Code Generators,Software Development :: Compilers,Software Development :: Libraries :: Python Modules,Text Processing :: General | 4 - Beta | ECMAScript parsing infrastructure for multipurpose analysis in Python |
junos-eznc | 2465 | Software Development :: Libraries,Software Development :: Libraries :: Application Frameworks,Software Development :: Libraries :: Python Modules,System :: Networking,Text Processing :: Markup :: XML | 5 - Production/Stable | Junos 'EZ' automation for non-programmers |
natto-py | 2448 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 4 - Beta | A Tasty Python Binding with MeCab (FFI-based, no SWIG or compiler necessary) |
jinja2schema | 2444 | Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML | 4 - Beta | Type inference for Jinja2 templates. |
yorm | 2391 | Software Development :: Libraries :: Python Modules,Software Development :: Version Control,System :: Filesystems,Text Editors :: Text Processing | 4 - Beta | Automatic object YAML mapping for Python. |
capturer | 2362 | Communications,Scientific/Engineering :: Human Machine Interfaces,Software Development,Software Development :: Libraries :: Python Modules,Software Development :: User Interfaces,System :: Shells,System :: Systems Administration,System :: System Shells,Terminals,Text Processing :: General | 4 - Beta | Easily capture stdout/stderr of the current process and subprocesses |
jupyter-book | 2350 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 4 - Beta | Jupyter Book: Create an online book with Jupyter Notebooks |
coala | 2341 | Scientific/Engineering :: Information Analysis,Software Development :: Quality Assurance,Text Processing :: Linguistic | 4 - Beta | Linting and Fixing Code for All Languages |
benepar | 2229 | Scientific/Engineering :: Artificial Intelligence,Text Processing :: Linguistic | No status | Berkeley Neural Parser |
polyglot | 2228 | Scientific/Engineering :: Artificial Intelligence,Text Processing :: Linguistic | 4 - Beta | Polyglot is a natural language pipeline that supports massive multilingual applications. |
pythainlp | 2198 | Scientific/Engineering :: Artificial Intelligence,Text Processing,Text Processing :: General,Text Processing :: Linguistic | 5 - Production/Stable | Thai Natural Language Processing library |
markdown-include | 2175 | Internet :: WWW/HTTP :: Site Management,Software Development :: Documentation,Software Development :: Libraries :: Python Modules,Text Processing :: Filters,Text Processing :: Markup :: HTML | 5 - Production/Stable | This is an extension to Python-Markdown which provides an "include" function, similar to that found in LaTeX (and also the C pre-processor and Fortran). I originally wrote it for my FORD Fortran auto-documentation generator. |
ucca | 2167 | Text Processing :: Linguistic | 4 - Beta | Universal Conceptual Cognitive Annotation |
misaka | 2122 | Text Processing :: Markup,Text Processing :: Markup :: HTML,Utilities | 4 - Beta | A Python binding for Sundown. |
prettierfier | 2115 | Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML | No status | Intelligently pretty-print HTML/XML with inline tags. |
edit-distance | 2100 | Scientific/Engineering,Software Development,Text Processing,Utilities | 4 - Beta | Computing edit distance on arbitrary Python sequences. |
postal-address | 2098 | Office/Business,Software Development :: Libraries :: Python Modules,Software Development :: Localization,Text Processing,Utilities | 4 - Beta | Parse, normalize and render postal addresses. |
feedgen | 2048 | Communications,Internet,Text Processing,Text Processing :: Markup,Text Processing :: Markup :: XML | 4 - Beta | Feed Generator (ATOM, RSS, Podcasts) |
bitmath | 2036 | Scientific/Engineering :: Mathematics,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Software Development :: User Interfaces,Software Development :: Widget Sets,System :: Filesystems,Text Processing :: Filters,Utilities | 5 - Production/Stable | Pythonic module for representing and manipulating file sizes with different prefix notations. |
esmre | 2022 | Software Development :: Libraries :: Python Modules,Text Processing :: Indexing | 4 - Beta | Regular expression accelerator |
mkdocs-include-markdown-plugin | 1977 | Software Development :: Documentation,Documentation,Text Processing,Text Processing :: Markup :: Markdown | 5 - Production/Stable | Mkdocs Markdown includer plugin. |
jxmlease | 1970 | Text Processing :: Markup :: XML | No status | jxmlease converts between XML and intelligent Python data structures. |
py-gfm | 1892 | Text Processing :: Markup | No status | An implementation of Github-Flavored Markdown written as an extension to the Python Markdown library. |
quantulum3 | 1873 | Scientific/Engineering,Text Processing :: Linguistic | 3 - Alpha | Extract quantities from unstructured text. |
phonetics | 1852 | Internet :: WWW/HTTP :: Indexing/Search,Text Processing :: General,Text Processing :: Indexing,Text Processing :: Linguistic | 3 - Alpha | None |
mistletoe | 1791 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 3 - Alpha | A fast, extensible Markdown parser in pure Python. |
markdown-inline-graphviz-extension | 1778 | Documentation,Text Processing | 5 - Production/Stable | Render inline graphs with Markdown and Graphviz (python3 version) |
pytextrank | 1776 | Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Human Machine Interfaces,Scientific/Engineering :: Information Analysis,Text Processing :: General,Text Processing :: Indexing,Text Processing :: Linguistic | 5 - Production/Stable | Python implementation of TextRank for text document NLP parsing and summarization |
riotwatcher | 1769 | Software Development :: Libraries :: Python Modules,Text Processing,Games/Entertainment :: Real Time Strategy,Games/Entertainment :: Role-Playing | No status | RiotWatcher is a thin wrapper on top of the Riot Games API for League of Legends. |
python-card-me | 1746 | Text Processing | 5 - Production/Stable | UNKNOWN |
django-htmlmin | 1742 | Text Processing :: Markup :: HTML | 5 - Production/Stable | HTML minifier for Python frameworks (not only Django, despite the name). |
jaro-winkler | 1739 | Text Processing,Text Processing :: General,Text Processing :: Indexing | No status | Original, standard and customisable versions of the Jaro-Winkler functions. |
sphinxext-opengraph | 1736 | Documentation :: Sphinx,Documentation,Software Development :: Documentation,Text Processing,Utilities | 5 - Production/Stable | Sphinx Extension to enable OGP support |
pyuca | 1705 | Text Processing | 5 - Production/Stable | a Python implementation of the Unicode Collation Algorithm |
rst2ansi | 1619 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup,Utilities | 4 - Beta | A rst converter to ansi-decorated console output |
hanziconv | 1607 | System,Text Processing,Utilities | 4 - Beta | Hanzi Converter for Traditional and Simplified Chinese |
srt | 1600 | Multimedia :: Video,Software Development :: Libraries,Text Processing | 5 - Production/Stable | A tiny library for parsing, modifying, and composing SRT files. |
asciimatics | 1583 | Text Processing :: General | 5 - Production/Stable | A cross-platform package to create ASCII art and cinematic storyboard sequences |
jsoncomment | 1581 | Software Development :: Pre-processors,Text Editors :: Text Processing | 4 - Beta | A wrapper to JSON parsers allowing comments, multiline strings and trailing commas |
stemming | 1580 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 5 - Production/Stable | Python implementations of various stemming algorithms. |
xsdata | 1573 | Software Development :: Code Generators,Text Processing :: Markup :: XML | No status | Python XML Binding |
nwdiag | 1569 | Software Development,Software Development :: Documentation,Text Processing :: Markup | 3 - Alpha | nwdiag generates a network-diagram image file from a spec-text file. |
fuzzy | 1556 | Text Processing,Text Processing :: General,Text Processing :: Indexing,Text Processing :: Linguistic | 4 - Beta | Fast Python phonetic algorithms |
selectolax | 1555 | Text Processing :: Markup :: HTML,Internet,Internet :: WWW/HTTP | 5 - Production/Stable | Fast HTML5 parser with CSS selectors. |
django-inlinecss | 1542 | Communications :: Email,Text Processing :: Markup :: HTML | No status | A Django app useful for inlining CSS (primarily for e-mails) |
overpunch | 1523 | Text Processing | 5 - Production/Stable | Overpunch Parser/Formatter |
mdformat-gfm | 1522 | Documentation,Text Processing :: Markup | No status | Mdformat plugin for GitHub Flavored Markdown compatibility |
mdv | 1479 | Text Processing :: Markup | No status | Terminal Markdown Viewer |
html5-parser | 1474 | Text Processing,Text Processing :: Markup,Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML | 5 - Production/Stable | Fast C based HTML 5 parsing for python |
coala-bears | 1439 | Scientific/Engineering :: Information Analysis,Software Development :: Quality Assurance,Text Processing :: Linguistic | 4 - Beta | Bears for coala (Code Analysis Application) |
lepl | 1431 | Software Development,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing,Text Processing :: Filters,Text Processing :: General,Utilities | 5 - Production/Stable | A Parser Library for Python 3 (and 2.6): Recursive Descent; Full Backtracking |
humanreadable | 1337 | Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing | 3 - Alpha | DESCRIPTION |
seqdiag | 1333 | Software Development,Software Development :: Documentation,Text Processing :: Markup | 3 - Alpha | seqdiag generates a sequence-diagram image file from a spec-text file. |
opencc | 1315 | Scientific/Engineering,Software Development,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Software Development :: Localization,Text Processing :: Linguistic | 5 - Production/Stable | Conversion between Traditional and Simplified Chinese |
js-regex | 1294 | Software Development :: Testing,Text Processing | 4 - Beta | A thin compatibility layer to use Javascript regular expressions in Python |
base58check | 1261 | Internet :: WWW/HTTP,Security :: Cryptography,Software Development,Software Development :: Libraries :: Python Modules,Text Processing,Utilities | 4 - Beta | Base58check encoding and decoding of binary data |
cheetah | 1261 | Internet :: WWW/HTTP,Internet :: WWW/HTTP :: Dynamic Content,Internet :: WWW/HTTP :: Site Management,Software Development :: Code Generators,Software Development :: Libraries :: Python Modules,Software Development :: User Interfaces,Text Processing | 5 - Production/Stable | Cheetah is a template engine and code generation tool. |
pyxdameraulevenshtein | 1256 | Scientific/Engineering :: Bio-Informatics,Scientific/Engineering :: Information Analysis,Text Processing :: Linguistic | 5 - Production/Stable | pyxDamerauLevenshtein implements the Damerau-Levenshtein (DL) edit distance algorithm for Python in Cython for high performance. |
rst2pdf | 1255 | Documentation,Software Development :: Libraries :: Python Modules,Text Processing,Utilities | 4 - Beta | Convert restructured text to PDF via reportlab. |
vdf | 1252 | Software Development :: Libraries :: Python Modules,Text Processing | 5 - Production/Stable | Library for working with Valve's VDF text format |
textacy | 1240 | Text Processing :: Linguistic | 4 - Beta | Higher-level text processing, built on Spacy |
rinohtype | 1239 | Printing,Software Development :: Libraries :: Python Modules,Text Processing :: Fonts,Text Processing :: Markup,Text Processing :: Markup :: XML | 4 - Beta | The Python document processor |
pymorphy2-dicts | 1221 | Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 4 - Beta | OpenCorpora.org dictionaries pre-compiled for pymorphy2 |
stanfordcorenlp | 1201 | Scientific/Engineering,Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Information Analysis,Text Processing,Text Processing :: Linguistic | 5 - Production/Stable | Python wrapper for Stanford CoreNLP. |
pystempel | 1199 | Scientific/Engineering,Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Human Machine Interfaces,Scientific/Engineering :: Information Analysis,Text Processing,Text Processing :: General,Text Processing :: Linguistic | No status | Polish stemmer. |
rinoh-typeface-texgyrepagella | 1199 | Text Processing :: Fonts | No status | TeX Gyre Pagella typeface |
rinoh-typeface-texgyrecursor | 1199 | Text Processing :: Fonts | No status | TeX Gyre Cursor typeface |
rinoh-typeface-texgyreheros | 1198 | Text Processing :: Fonts | No status | TeX Gyre Heros typeface |
rinoh-typeface-dejavuserif | 1184 | Text Processing :: Fonts | No status | DejaVu Serif typeface |
syntok | 1172 | Scientific/Engineering :: Information Analysis,Software Development :: Libraries,Text Processing,Text Processing :: Linguistic | 3 - Alpha | sentence segmentation and word tokenization toolkit |
tableprint | 1163 | Text Processing :: General | 3 - Alpha | Pretty ASCII printing of tabular data |
pycm | 1079 | Text Processing :: Fonts | 3 - Alpha | Confusion Matrix In Python |
recurrent | 1077 | Text Processing :: Linguistic | No status | Natural language parsing and formatting of recurring events |
wordfreq | 1074 | Scientific/Engineering,Software Development,Text Processing :: Linguistic | No status | wordfreq is a Python library for looking up the frequencies of words in many |
pylatex | 1069 | Software Development :: Code Generators,Text Processing :: Markup :: LaTeX | 4 - Beta | A Python library for creating LaTeX files |
acora | 1067 | Text Processing | No status | Fast multi-keyword search engine for text strings |
sas7bdat | 1065 | Text Processing,Utilities | 5 - Production/Stable | A sas7bdat file reader for Python |
naturalsort | 1040 | Scientific/Engineering :: Human Machine Interfaces,Software Development,Software Development :: Libraries :: Python Modules,System :: Systems Administration,Text Processing :: General | 6 - Mature | Simple natural order sorting API for Python |
markdown-strings | 1040 | Text Processing :: Markup | 5 - Production/Stable | Create markdown formatted text |
epitran | 1034 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | No status | Tools for transcribing languages into IPA. |
money-parser | 1029 | Text Processing,Text Processing :: General | No status | Price and currency parsing utility |
spark | 1022 | Internet :: WWW/HTTP,Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Code Generators,Software Development :: Libraries :: Python Modules,Text Processing | 3 - Alpha | A Super-Small, Super-Fast, and Super-Easy web framework |
pytest-codeblocks | 1012 | Text Processing | 4 - Beta | Extract code blocks from markdown |
semstr | 1000 | Text Processing :: Linguistic | 4 - Beta | Scheme Evaluation and Mapping for Structural Text Representation |
subword-nmt | 998 | Scientific/Engineering :: Artificial Intelligence,Text Processing | No status | Unsupervised Word Segmentation for Neural Machine Translation and Text Generation |
abydos | 997 | Scientific/Engineering :: Artificial Intelligence,Software Development :: Libraries :: Python Modules,Text Processing :: Indexing,Text Processing :: Linguistic | 4 - Beta | Abydos NLP/IR library |
ultimate-sitemap-parser | 988 | Internet :: WWW/HTTP :: Indexing/Search,Text Processing :: Indexing,Text Processing :: Markup :: XML | 3 - Alpha | Ultimate Sitemap Parser |
docx-mailmerge | 975 | Text Processing | No status | Performs a Mail Merge on docx (Microsoft Office Word) files |
wordfilter | 972 | Communications,Text Processing :: Linguistic | No status | A small module meant for use in text generators that lets you filter strings for bad words. |
rtfde | 971 | Text Processing :: Markup :: HTML,Text Processing :: Filters,Communications :: Email :: Filters | No status | A library for extracting HTML content from RTF encapsulated HTML as commonly found in the exchange MSG email format. |
mwtypes | 953 | Scientific/Engineering,Software Development :: Libraries :: Python Modules,Text Processing :: General,Utilities | No status | A set of types for processing MediaWiki data. |
isbnlib | 931 | Software Development :: Libraries :: Python Modules,Text Processing :: General | 5 - Production/Stable | Extract, clean, transform, hyphenate and metadata for ISBNs (International Standard Book Number). |
pweave | 921 | Text Processing :: Markup | 4 - Beta | Literate programming with reST or LaTeX in Sweave style |
xmlunittest | 918 | Software Development :: Libraries :: Python Modules,Software Development :: Testing,Text Processing :: Markup :: XML | 4 - Beta | Library using lxml and unittest for unit testing XML. |
datrie | 907 | Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 4 - Beta | Super-fast, efficiently stored Trie for Python. |
django-autoslug-iplweb | 904 | Software Development :: Libraries :: Python Modules,Text Processing :: General | 5 - Production/Stable | An automated slug field for Django. |
tailhead | 884 | Software Development :: Libraries :: Python Modules,System :: Logging,System :: Systems Administration,System :: System Shells,Text Processing | No status | tailhead is a simple implementation of GNU tail and head. |
camel-tools | 883 | Scientific/Engineering,Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Information Analysis,Text Processing,Text Processing :: Linguistic | 5 - Production/Stable | A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi. |
scrubadub | 881 | Software Development :: Libraries,Scientific/Engineering :: Information Analysis,Text Processing,Utilities | 5 - Production/Stable | Clean personally identifiable information from dirty dirty text. |
weighted-levenshtein | 874 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 3 - Alpha | Library providing functions to calculate Levenshtein distance, Optimal String Alignment distance, and Damerau-Levenshtein distance, where the cost of each operation can be weighted by letter. |
pydocx | 874 | Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML | 3 - Alpha | docx (OOXML) to html converter |
es2csv | 873 | Database,Internet,System :: Systems Administration,Text Processing,Utilities | 5 - Production/Stable | A CLI tool for exporting data from Elasticsearch into a CSV file. |
nagisa | 871 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | No status | A Japanese tokenizer based on recurrent neural networks |
humps | 855 | Software Development,Text Editors,Text Editors :: Word Processors,Text Processing,Text Processing :: General,Utilities | 4 - Beta | camelCase converter |
sphinx-intl | 852 | Documentation,Software Development :: Documentation,Text Processing :: General,Utilities | 4 - Beta | Sphinx utility that makes it easy to translate and to apply translations. |
pytest-reporter-html1 | 846 | Software Development :: Testing,Text Processing :: Markup :: HTML | 4 - Beta | A basic HTML report template for Pytest |
optimus | 842 | Internet :: WWW/HTTP,Internet :: WWW/HTTP :: Site Management,Software Development :: Libraries :: Python Modules,Text Processing :: Markup | 5 - Production/Stable | A simple building environment to produce static HTML from Jinja templates, with asset compression managed by webassets |
ocrmypdf | 835 | Scientific/Engineering :: Image Recognition,Text Processing :: Indexing,Text Processing :: Linguistic | 5 - Production/Stable | OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched |
docraptor | 819 | Internet :: WWW/HTTP :: Dynamic Content,Multimedia :: Graphics :: Presentation,Office/Business,Office/Business :: Financial :: Spreadsheet,Printing,Text Processing :: Markup :: HTML | 5 - Production/Stable | A wrapper for the DocRaptor HTML to PDF/XLS service. |
documenttemplate | 817 | Text Processing :: Markup | 6 - Mature | Document Templating Markup Language (DTML) |
dexml | 813 | Text Processing,Text Processing :: Markup,Text Processing :: Markup :: XML | 5 - Production/Stable | a dead-simple Object-XML mapper for Python |
postal | 810 | Internet :: WWW/HTTP :: Indexing/Search,Scientific/Engineering :: GIS,Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | No status | Python bindings to libpostal for fast international address parsing/normalization |
pytidylib | 805 | Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML,Utilities | 5 - Production/Stable | Python wrapper for HTML Tidy (tidylib) on Python 2 and 3 |
fake-bpy-module-latest | 800 | Multimedia :: Graphics :: 3D Modeling,Multimedia :: Graphics :: 3D Rendering,Text Editors :: Integrated Development Environments (IDE) | No status | Collection of the fake Blender Python API module for the code completion. |
actdiag | 794 | Software Development,Software Development :: Documentation,Text Processing :: Markup | 3 - Alpha | actdiag generates an activity-diagram image file from a spec-text file. |
json-diff | 786 | Software Development :: Libraries :: Python Modules,Text Processing :: General | 4 - Beta | Generates diff between two JSON files |
pyrush | 779 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 3 - Alpha | A fast implementation of RuSH (Rule-based sentence Segmenter using Hashing). |
pysrt | 766 | Multimedia :: Video,Software Development :: Libraries,Text Processing :: Markup | 4 - Beta | SubRip (.srt) subtitle parser and writer |
breadability | 761 | Internet :: WWW/HTTP,Software Development :: Pre-processors,Text Processing :: Filters,Text Processing :: Markup :: HTML | 5 - Production/Stable | Port of Readability HTML parser in Python |
jedi-language-server | 759 | Software Development :: Code Generators,Software Development :: Libraries :: Python Modules,Text Editors :: Integrated Development Environments (IDE),Utilities | 3 - Alpha | A language server for Jedi! |
django-rest-framework-mongoengine | 744 | Internet,Internet :: WWW/HTTP :: Site Management,Software Development :: Libraries :: Python Modules,Software Development :: Testing,Text Processing :: Markup :: HTML | 5 - Production/Stable | Model Serializer that supports MongoEngine, for Django Rest Framework. |
rst2txt | 739 | Documentation,Internet :: WWW/HTTP :: Site Management,Printing,Software Development,Software Development :: Documentation,Text Processing,Text Processing :: General,Text Processing :: Markup,Utilities | 5 - Production/Stable | Convert reStructuredText to plain text |
html | 734 | Software Development :: Code Generators,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML | No status | simple, elegant HTML, XHTML and XML generation |
metadata-parser | 734 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML | No status | A module to parse metadata out of urls and html documents |
fastdameraulevenshtein | 731 | Scientific/Engineering :: Bio-Informatics,Scientific/Engineering :: Information Analysis,Text Processing :: Linguistic | No status | Cython implementation of true Damerau-Levenshtein algorithm. |
pyrxp | 722 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: XML | 5 - Production/Stable | Python RXP interface - fast validating XML parser |
headerparser | 720 | Communications :: Email,Communications :: Usenet News,Internet :: WWW/HTTP,Text Processing | 4 - Beta | argparse for mail-style headers |
area4 | 712 | Software Development :: Libraries,Software Development :: Libraries :: Python Modules,System,Terminals,Text Processing,Utilities | 5 - Production/Stable | Dividers in Python, the easy way! |
kytea | 711 | Text Processing :: Linguistic | No status | A binding for the KyTea text analysis toolkit |
endesive | 708 | Communications :: Email,Multimedia :: Graphics,Office/Business,Security :: Cryptography,Software Development :: Libraries :: Python Modules,Text Processing | 4 - Beta | Library for digital signing and verification of digital signatures in mail, PDF and XML documents. |
xmpppy | 704 | Communications,Communications :: Chat,Database,Internet,Software Development :: Libraries,System :: Networking,Text Processing,Utilities | 4 - Beta | XMPP implementation in Python |
flask-htmlmin | 703 | Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML | No status | Minimize rendered template HTML |
panphon | 695 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | No status | Tools for using the International Phonetic Alphabet with phonological features |
sentence-splitter | 688 | Database,Internet :: WWW/HTTP :: Indexing/Search,Text Processing :: Indexing,Text Processing :: Linguistic | 3 - Alpha | Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder |
address | 683 | Software Development :: Libraries,Text Processing | No status | address is an address parsing library, taking the guesswork out of using addresses in your applications. |
hanlp | 682 | Scientific/Engineering :: Artificial Intelligence,Text Processing :: Linguistic | 3 - Alpha | HanLP: Han Language Processing |
entry-points-txt | 675 | Software Development :: Libraries :: Python Modules,Text Processing | 4 - Beta | Read & write entry_points.txt files |
kafka-helper | 674 | Text Processing :: Linguistic | 4 - Beta | Makes it easy to use the kafka-python library with Apache Kafka on Heroku |
atoma | 671 | Software Development :: Libraries,Text Processing :: Markup :: XML | 4 - Beta | Atom and RSS feed parser for Python 3 |
sumy | 665 | Education,Internet,Scientific/Engineering :: Information Analysis,Text Processing :: Filters,Text Processing :: Linguistic,Text Processing :: Markup :: HTML | 4 - Beta | Module for automatic summarization of text documents and HTML pages. |
rouge | 663 | Text Processing :: Linguistic | No status | Full Python ROUGE Score Implementation (not a wrapper) |
codechat | 662 | Software Development :: Documentation,Text Processing :: Markup | 3 - Alpha | The CodeChat system for software documentation |
pdfminer2 | 660 | Text Processing | 5 - Production/Stable | PDF parser and analyzer |
django-pagedown | 653 | Text Editors | No status | A Django app that allows the easy addition of Stack Overflow's 'PageDown' markdown editor to a django form field |
upsidedown | 650 | Text Processing | 5 - Production/Stable | "Flip" characters in a string to create an "upside-down" impression. |
inlinestyler | 645 | Communications :: Email,Text Processing :: Markup :: HTML | 5 - Production/Stable | Inlines external CSS into HTML elements. |
sphinxext-rediraffe | 639 | Documentation :: Sphinx,Documentation,Software Development :: Documentation,Text Processing,Utilities | No status | Sphinx Extension that redirects non-existent pages to working pages. |
plum-py | 633 | Text Processing | 2 - Pre-Alpha | Pack/Unpack Memory. |
fortune | 625 | Games/Entertainment,Text Processing :: Filters | No status | Python version of old BSD Unix fortune program |
svgutils | 619 | Multimedia :: Graphics :: Editors :: Vector-Based,Scientific/Engineering :: Visualization,Text Processing :: Markup | 4 - Beta | Python SVG editor |
pynlpl | 618 | Text Processing :: Linguistic | 5 - Production/Stable | PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used, for example, for the computation of n-grams, frequency lists and distributions, and language models. There are also more complex data types, such as Priority Queues, and search algorithms, such as Beam Search. |
coverage2clover | 618 | Text Processing :: Markup :: XML,Software Development :: Quality Assurance,Software Development :: Testing | 5 - Production/Stable | A tool to convert python-coverage xml report to Atlassian Clover xml report format |
py3langid | 616 | Scientific/Engineering :: Artificial Intelligence,Text Processing :: Linguistic | 3 - Alpha | Fork of the language identification tool langid.py, featuring a modernized codebase and faster execution times. |
chardet2 | 595 | Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic | 4 - Beta | Character encoding auto-detection in Python 3 |
pangu | 591 | Education,Software Development :: Internationalization,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing,Text Processing :: General,Text Processing :: Linguistic,Utilities | 5 - Production/Stable | Paranoid text spacing for good readability, to automatically insert whitespace between CJK (Chinese, Japanese, Korean) and half-width characters (alphabetical letters, numerical digits and symbols). |
naya | 587 | Software Development :: Libraries,Text Processing :: Markup | 4 - Beta | A fast streaming JSON parser written in pure python |
citeurl | 577 | Text Processing :: Filters,Text Processing :: Markup :: HTML,Text Processing :: Markup :: Markdown,Internet :: WWW/HTTP :: WSGI :: Application,Software Development :: Libraries :: Python Modules | 5 - Production/Stable | an extensible tool to process legal citations in text |
fake-bge-module-latest | 573 | Multimedia :: Graphics :: 3D Modeling,Multimedia :: Graphics :: 3D Rendering,Text Editors :: Integrated Development Environments (IDE) | No status | Collection of the fake Blender Game Engine (BGE) Python API module for the code completion. |
mkdocs-git-authors-plugin | 568 | Documentation,Text Processing | No status | Mkdocs plugin to display git authors of a page |
clam | 565 | Internet :: WWW/HTTP :: WSGI :: Application,Text Processing :: Linguistic | 4 - Beta | Turns command-line NLP tools into fully-fledged RESTful webservices with an auto-generated web-interface for human end-users. |
case-converter | 563 | Text Processing,Utilities | 5 - Production/Stable | A string case conversion package. |
excel2json-3 | 561 | Internet :: WWW/HTTP :: Indexing/Search,Software Development :: Libraries :: Python Modules,System :: Archiving :: Packaging,Text Processing | 1 - Planning | Convert MS Excel file formats to JSON |
uniseg | 560 | Text Processing | 4 - Beta | Determine Unicode text segmentations |
pystemmer | 559 | Database,Internet :: WWW/HTTP :: Indexing/Search,Text Processing :: Indexing,Text Processing :: Linguistic | 5 - Production/Stable | Snowball stemming algorithms, for information retrieval |
tree-sitter | 547 | Software Development :: Compilers,Text Processing :: Linguistic | No status | Python bindings to the Tree-sitter parsing library |
wikitextparser | 545 | Text Processing | No status | A simple parsing tool for MediaWiki's wikitext markup. |
pysubs2 | 525 | Software Development :: Libraries :: Python Modules,Text Processing :: Markup,Multimedia :: Video | 5 - Production/Stable | A library for editing subtitle files |
gender | 524 | Text Processing,Text Processing :: Filters | 3 - Alpha | Get gender from name and email address |
coinaddrvalidator | 518 | Internet :: WWW/HTTP,Software Development,Security :: Cryptography,Text Processing,Utilities,Software Development :: Libraries :: Python Modules | 4 - Beta | A crypto-currency address inspection/validation library. |