charset-normalizer | 6497378 | Software Development :: Libraries :: Python Modules, Text Processing :: Linguistic, Utilities | No status | The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet.
docutils | 2587648 | Documentation, Software Development :: Documentation, Text Processing | 4 - Beta | Docutils -- Python Documentation Utilities
chardet | 2167450 | Software Development :: Libraries :: Python Modules, Text Processing :: Linguistic | 5 - Production/Stable | Universal encoding detector for Python 3
lxml | 2074641 | Text Processing :: Markup :: HTML, Text Processing :: Markup :: XML, Software Development :: Libraries :: Python Modules | 5 - Production/Stable | Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.
pyyaml | 6040296 | Software Development :: Libraries :: Python Modules, Text Processing :: Markup | 5 - Production/Stable | YAML parser and emitter for Python
pygments | 1370394 | Text Processing :: Filters, Utilities | 6 - Mature | Pygments is a syntax highlighting package written in Python.
regex | 871433 | Scientific/Engineering :: Information Analysis, Software Development :: Libraries :: Python Modules, Text Processing, Text Processing :: General | 5 - Production/Stable | Alternative regular expression module, to replace re.
markdown | 737793 | Communications :: Email :: Filters, Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries, Internet :: WWW/HTTP :: Site Management, Software Development :: Documentation, Software Development :: Libraries :: Python Modules, Text Processing :: Filters, Text Processing :: Markup :: HTML, Text Processing :: Markup :: Markdown | 5 - Production/Stable | Python implementation of Markdown.
defusedxml | 702240 | Text Processing :: Markup :: XML | 5 - Production/Stable | XML bomb protection for Python stdlib modules
xmltodict | 692842 | Text Processing :: Markup :: XML | No status | Makes working with XML feel like you are working with JSON
jedi | 575564 | Software Development :: Libraries :: Python Modules, Text Editors :: Integrated Development Environments (IDE), Utilities | 4 - Beta | An autocompletion tool for Python that can be used for text editors.
parso | 571092 | Software Development :: Libraries :: Python Modules, Text Editors :: Integrated Development Environments (IDE), Utilities | 4 - Beta | A Python Parser
nltk | 447875 | Scientific/Engineering, Scientific/Engineering :: Artificial Intelligence, Scientific/Engineering :: Human Machine Interfaces, Scientific/Engineering :: Information Analysis, Text Processing, Text Processing :: Filters, Text Processing :: General, Text Processing :: Indexing, Text Processing :: Linguistic | 5 - Production/Stable | Natural Language Toolkit
mistune | 413606 | Text Processing :: Markup | 4 - Beta | A sane Markdown parser with useful plugins and renderers
humanfriendly | 391351 | Communications, Scientific/Engineering :: Human Machine Interfaces, Software Development, Software Development :: Libraries :: Python Modules, Software Development :: User Interfaces, System :: Shells, System :: System Shells, System :: Systems Administration, Terminals, Text Processing :: General, Text Processing :: Linguistic, Utilities | 6 - Mature | Human friendly output for text interfaces using Python
unidecode | 390692 | Text Processing, Text Processing :: Filters | No status | ASCII transliterations of Unicode text
pandocfilters | 342327 | Text Processing :: Filters | 3 - Alpha | Utilities for writing pandoc filters in python
tinycss2 | 318584 | Text Processing | 5 - Production/Stable | tinycss2
snowballstemmer | 313601 | Database, Internet :: WWW/HTTP :: Indexing/Search, Text Processing :: Indexing, Text Processing :: Linguistic | 5 - Production/Stable | This package provides 29 stemmers for 28 languages generated from Snowball algorithms.
sphinx | 294922 | Documentation, Documentation :: Sphinx, Internet :: WWW/HTTP :: Site Management, Printing, Software Development, Software Development :: Documentation, Text Processing, Text Processing :: General, Text Processing :: Indexing, Text Processing :: Markup, Text Processing :: Markup :: HTML, Text Processing :: Markup :: LaTeX, Utilities | 5 - Production/Stable | Python documentation generator
prettytable | 230490 | Text Processing | No status | A simple Python library for easily displaying tabular data in a visually appealing ASCII table format
sphinxcontrib-htmlhelp | 209172 | Documentation, Documentation :: Sphinx, Text Processing, Utilities | 5 - Production/Stable | sphinxcontrib-htmlhelp is a sphinx extension which renders HTML help files
humanize | 154685 | Text Processing, Text Processing :: General | 5 - Production/Stable | Python humanize utilities
texttable | 137487 | Software Development :: Libraries :: Python Modules, Text Processing, Utilities | 5 - Production/Stable | module for creating simple ASCII tables
pypandoc | 128106 | Text Processing, Text Processing :: Filters | 4 - Beta | Thin wrapper for pandoc.
docstring-parser | 121934 | Documentation :: Sphinx, Text Processing :: Markup, Software Development :: Libraries :: Python Modules | 4 - Beta | "Parse Python docstrings in reST, Google and Numpydoc format"
natsort | 114580 | Scientific/Engineering :: Information Analysis, Utilities, Text Processing | 5 - Production/Stable | Simple yet flexible natural sorting in Python.
jellyfish | 99910 | Text Processing :: Linguistic | 4 - Beta | a library for doing approximate and phonetic matching of strings.
markdown-it-py | 93965 | Software Development :: Libraries :: Python Modules, Text Processing :: Markup | 3 - Alpha | Python port of markdown-it. Markdown parsing, done right!
mdit-py-plugins | 82218 | Software Development :: Libraries :: Python Modules, Text Processing :: Markup | 3 - Alpha | Collection of plugins for markdown-it-py
urlextract | 72726 | Text Processing, Text Processing :: Markup :: HTML, Software Development :: Libraries :: Python Modules | 5 - Production/Stable | Collects and extracts URLs from given text.
terminaltables | 71530 | Software Development :: Libraries, Terminals, Text Processing :: Markup | 5 - Production/Stable | Generate simple tables in terminals from a nested list of strings. (has wheels!)
reportlab | 70085 | Printing, Text Processing :: Markup | 5 - Production/Stable | The Reportlab Toolkit
elementpath | 62591 | Software Development :: Libraries, Text Processing :: Markup :: XML | 5 - Production/Stable | XPath 1.0/2.0/3.0 parsers and selectors for ElementTree and lxml
feedparser | 58924 | Software Development :: Libraries :: Python Modules, Text Processing :: Markup :: XML | 5 - Production/Stable | Universal feed parser, handles RSS 0.9x, RSS 1.0, RSS 2.0, CDF, Atom 0.3, and Atom 1.0 feeds
xmlschema | 56207 | Software Development :: Libraries, Text Processing :: Markup :: XML | 5 - Production/Stable | An XML Schema validator and decoder
python-crfsuite | 47885 | Software Development, Software Development :: Libraries :: Python Modules, Scientific/Engineering, Scientific/Engineering :: Information Analysis, Text Processing :: Linguistic | 4 - Beta | Python binding for CRFsuite
mkdocs | 42324 | Documentation, Text Processing | 5 - Production/Stable | Project documentation with Markdown.
parsimonious | 39288 | Scientific/Engineering :: Information Analysis, Software Development :: Libraries, Text Processing :: General | 3 - Alpha | (Soon to be) the fastest pure-Python PEG parser I could muster
junitparser | 38525 | Text Processing | 5 - Production/Stable | Manipulates JUnit/xUnit Result XML files
xmlsec | 36542 | Text Processing :: Markup :: XML | 5 - Production/Stable | Python bindings for the XML Security Library
pymdown-extensions | 35209 | Internet :: WWW/HTTP :: Dynamic Content, Software Development :: Libraries :: Python Modules, Text Processing :: Filters, Text Processing :: Markup :: HTML | 5 - Production/Stable | Extension pack for Python Markdown.
pyphen | 32889 | Text Processing, Text Processing :: Linguistic | 4 - Beta | Pure Python module to hyphenate text
mkdocs-material-extensions | 29980 | Internet :: WWW/HTTP :: Dynamic Content, Software Development :: Libraries :: Python Modules, Text Processing :: Filters, Text Processing :: Markup :: HTML | 5 - Production/Stable | Extension pack for Python Markdown.
tangled-up-in-unicode | 29669 | Database, Scientific/Engineering :: Information Analysis, Text Processing, Text Processing :: General, Utilities | No status | Access to the Unicode Character Database (UCD)
myst-parser | 28926 | Software Development :: Libraries :: Python Modules, Text Processing :: Markup | 4 - Beta | An extended commonmark compliant parser, with bridges to docutils & sphinx.
weasyprint | 28271 | Internet :: WWW/HTTP, Text Processing :: Markup :: HTML, Multimedia :: Graphics :: Graphics Conversion, Printing | 5 - Production/Stable | The Awesome Document Factory
num2words | 27493 | Software Development :: Internationalization, Software Development :: Libraries :: Python Modules, Software Development :: Localization, Text Processing :: Linguistic | 5 - Production/Stable | Modules to convert numbers to words. Easily extensible.
dominate | 25459 | Internet :: WWW/HTTP :: Dynamic Content, Software Development :: Libraries :: Python Modules, Text Processing :: Markup :: HTML | No status | Dominate is a Python library for creating and manipulating HTML documents using an elegant DOM API.
anyconfig | 24988 | Software Development :: Libraries :: Python Modules, Text Processing :: Markup, Utilities | 4 - Beta | Library provides common APIs to load and dump configuration files in various formats
pdfkit | 23538 | Text Processing, Text Processing :: General, Text Processing :: Markup, Text Processing :: Markup :: HTML, Text Processing :: Markup :: XML, Utilities | No status | Wkhtmltopdf python wrapper to convert html to pdf using the webkit rendering engine and qt
colorclass | 22793 | Software Development :: Libraries, Terminals, Text Processing :: Markup | 5 - Production/Stable | Colorful worry-free console applications for Linux, Mac OS X, and Windows.
markdown2 | 22282 | Software Development :: Libraries :: Python Modules, Software Development :: Documentation, Text Processing :: Filters, Text Processing :: Markup :: HTML | 5 - Production/Stable | A fast and complete Python implementation of Markdown
pathvalidate | 21173 | Software Development :: Libraries, Software Development :: Libraries :: Python Modules, System :: Filesystems, Text Processing | 5 - Production/Stable | pathvalidate is a Python library to sanitize/validate a string such as filenames/file-paths/etc.
python-stdnum | 20634 | Office/Business :: Financial, Software Development :: Libraries :: Python Modules, Text Processing :: General | 5 - Production/Stable | Python module to handle standardized numbers and codes
python-benedict | 19304 | Education :: Testing, Software Development :: Build Tools, System :: Filesystems, Text Processing :: Markup :: XML, Utilities | 5 - Production/Stable | python-benedict is a dict subclass with keylist/keypath support, normalized I/O operations (base64, csv, ini, json, pickle, plist, query-string, toml, xml, yaml) and many utilities... for humans, obviously.
sphinx-tabs | 18157 | Documentation :: Sphinx, Documentation, Software Development :: Documentation, Text Processing, Utilities | 5 - Production/Stable | Tabbed views for Sphinx
anyascii | 17683 | Text Processing, Text Processing :: Linguistic | No status | Unicode to ASCII transliteration
rjsmin | 16602 | Internet :: WWW/HTTP :: Dynamic Content, Software Development :: Libraries, Software Development :: Libraries :: Python Modules, Text Processing, Text Processing :: Filters, Utilities | 5 - Production/Stable | Javascript Minifier
rcssmin | 16003 | Internet :: WWW/HTTP :: Dynamic Content, Software Development :: Libraries, Software Development :: Libraries :: Python Modules, Text Processing, Text Processing :: Filters, Utilities | 5 - Production/Stable | CSS Minifier
jinja2 | 4726799 | Internet :: WWW/HTTP :: Dynamic Content, Software Development :: Libraries :: Python Modules, Text Processing :: Markup :: HTML | 5 - Production/Stable | A small but fast and easy to use stand-alone template engine written in pure python.
markupsafe | 4122239 | Internet :: WWW/HTTP :: Dynamic Content, Software Development :: Libraries :: Python Modules, Text Processing :: Markup :: HTML | 5 - Production/Stable | Implements a XML/HTML/XHTML Markup safe string for Python
beautifulsoup4 | 1750144 | Software Development :: Libraries :: Python Modules, Text Processing :: Markup :: HTML, Text Processing :: Markup :: SGML, Text Processing :: Markup :: XML | 5 - Production/Stable | Screen-scraping library
fonttools | 503237 | Text Processing :: Fonts, Multimedia :: Graphics, Multimedia :: Graphics :: Graphics Conversion | 5 - Production/Stable | Tools to manipulate font files
html5lib | 468819 | Software Development :: Libraries :: Python Modules, Text Processing :: Markup :: HTML | 5 - Production/Stable | HTML parser based on the WHATWG HTML specification
text-unidecode | 423272 | Software Development :: Libraries :: Python Modules, Text Processing :: Linguistic | 5 - Production/Stable | The most basic Text::Unidecode port
commonmark | 362012 | Documentation, Internet :: WWW/HTTP :: Dynamic Content, Software Development :: Documentation, Software Development :: Libraries :: Python Modules, Text Processing :: Markup, Utilities | 4 - Beta | Python parser for the CommonMark Markdown spec
bs4 | 324087 | Software Development :: Libraries :: Python Modules, Text Processing :: Markup :: HTML, Text Processing :: Markup :: SGML, Text Processing :: Markup :: XML | 5 - Production/Stable | Dummy package for Beautiful Soup
brotli | 283796 | Software Development :: Libraries, Software Development :: Libraries :: Python Modules, System :: Archiving, System :: Archiving :: Compression, Text Processing :: Fonts, Utilities | 4 - Beta | Python bindings for the Brotli compression library
sphinxcontrib-serializinghtml | 217443 | Documentation, Documentation :: Sphinx, Text Processing, Utilities | 5 - Production/Stable | sphinxcontrib-serializinghtml is a sphinx extension which outputs "serialized" HTML files (json and pickle).
sphinxcontrib-jsmath | 209392 | Documentation, Documentation :: Sphinx, Text Processing, Utilities | 5 - Production/Stable | A sphinx extension which renders display math in HTML via JavaScript
sphinxcontrib-devhelp | 209302 | Documentation, Documentation :: Sphinx, Text Processing, Utilities | 5 - Production/Stable | sphinxcontrib-devhelp is a sphinx extension which outputs Devhelp document.
sphinxcontrib-applehelp | 209302 | Documentation, Documentation :: Sphinx, Text Processing, Utilities | 5 - Production/Stable | sphinxcontrib-applehelp is a sphinx extension which outputs Apple help books
sphinxcontrib-qthelp | 209267 | Documentation, Documentation :: Sphinx, Text Processing, Utilities | 5 - Production/Stable | sphinxcontrib-qthelp is a sphinx extension which outputs QtHelp document.
gensim | 194153 | Scientific/Engineering :: Artificial Intelligence, Scientific/Engineering :: Information Analysis, Text Processing :: Linguistic | 5 - Production/Stable | Python framework for fast Vector Space Modelling
sentencepiece | 178320 | Software Development :: Libraries :: Python Modules, Text Processing :: Linguistic | 5 - Production/Stable | SentencePiece python wrapper
pytimeparse | 149235 | Text Processing | 4 - Beta | Time expression parser
netaddr | 134000 | Communications, Documentation, Education, Education :: Testing, Home Automation, Internet, Internet :: Log Analysis, Internet :: Name Service (DNS), Internet :: Proxy Servers, Internet :: WWW/HTTP, Internet :: WWW/HTTP :: Indexing/Search, Internet :: WWW/HTTP :: Site Management, Security, Software Development, Software Development :: Libraries, Software Development :: Libraries :: Python Modules, Software Development :: Quality Assurance, Software Development :: Testing, Software Development :: Testing :: Traffic Generation, System :: Benchmark, System :: Clustering, System :: Distributed Computing, System :: Installation/Setup, System :: Logging, System :: Monitoring, System :: Networking, System :: Networking :: Firewalls, System :: Networking :: Monitoring, System :: Networking :: Time Synchronization, System :: Recovery Tools, System :: Shells, System :: Software Distribution, System :: Systems Administration, System :: System Shells, Text Processing, Text Processing :: Filters, Utilities | 5 - Production/Stable | Pythonic manipulation of IPv4, IPv6, CIDR, EUI and MAC network addresses
coloredlogs | 130672 | System :: Logging, Terminals, Text Processing :: Markup :: HTML | 5 - Production/Stable | Colored stream handler for the logging module
parsedatetime | 110047 | Software Development :: Libraries :: Python Modules, Text Processing | 5 - Production/Stable | Parse human-readable date/time text.
inflect | 75065 | Software Development :: Libraries :: Python Modules, Text Processing :: Linguistic | 3 - Alpha | Correctly generate plurals, singular nouns, ordinals, indefinite articles; convert numbers to words
intervaltree | 61650 | Scientific/Engineering :: Artificial Intelligence, Scientific/Engineering :: Bio-Informatics, Scientific/Engineering :: Information Analysis, Software Development :: Libraries, Text Processing :: General, Text Processing :: Linguistic, Text Processing :: Markup | 4 - Beta | Editable interval tree data structure for Python 2 and 3
ftfy | 60690 | Software Development :: Libraries :: Python Modules, Text Processing :: Filters | 5 - Production/Stable | Fixes some problems with Unicode text after the fact
htmlmin | 53005 | Text Processing :: Markup :: HTML | 4 - Beta | An HTML Minifier
lark-parser | 51789 | Software Development :: Libraries :: Python Modules, Text Processing :: General, Text Processing :: Linguistic | 5 - Production/Stable | a modern parsing library
parsel | 51067 | Text Processing :: Markup, Text Processing :: Markup :: HTML, Text Processing :: Markup :: XML | 5 - Production/Stable | Parsel is a library to extract data from HTML and XML using XPath and CSS selectors
mkdocs-material | 46316 | Documentation, Software Development :: Documentation, Text Processing :: Markup :: HTML | 5 - Production/Stable | A Material Design theme for MkDocs
lark | 42894 | Software Development :: Libraries :: Python Modules, Text Processing :: General, Text Processing :: Linguistic | 5 - Production/Stable | a modern parsing library
cssutils | 39279 | Internet, Software Development :: Libraries :: Python Modules, Text Processing :: Markup :: HTML | 4 - Beta | A CSS Cascading Style Sheets library for Python
nameparser | 38673 | Software Development :: Libraries :: Python Modules, Text Processing :: Linguistic | 5 - Production/Stable | A simple Python module for parsing human names into their individual components.
textblob | 30376 | Text Processing :: Linguistic | 4 - Beta | Simple, Pythonic text processing. Sentiment analysis, POS tagging, noun phrase parsing, and more.
url-normalize | 30235 | Internet, Software Development :: Libraries :: Python Modules, Text Processing :: Indexing, Utilities | No status | URL normalization for Python (python3)
ansiwrap | 30126 | Software Development :: Libraries :: Python Modules, Text Processing, Text Processing :: Filters | 4 - Beta | textwrap, but savvy to ANSI colors and styles
textwrap3 | 29969 | Software Development :: Libraries :: Python Modules, Text Processing, Text Processing :: Filters | 4 - Beta | textwrap from Python 3.6 backport (plus a few tweaks)
pyxb | 29845 | Software Development :: Code Generators, Text Processing :: Markup :: XML | 5 - Production/Stable | Python XML Schema Bindings
pyfiglet | 29530 | Text Processing, Text Processing :: Fonts | 5 - Production/Stable | Pure-python FIGlet implementation
diff-match-patch | 27210 | Text Processing | 6 - Mature | The Diff Match and Patch libraries offer robust algorithms to perform the operations required for synchronizing plain text.
ebcdic | 25632 | Text Processing | 4 - Beta | Additional EBCDIC codecs
polib | 24055 | Software Development :: Internationalization, Software Development :: Libraries :: Python Modules, Software Development :: Localization, Text Processing :: Linguistic | 4 - Beta | A library to manipulate gettext files (po and mo files).
jsmin | 22775 | Internet :: WWW/HTTP :: Dynamic Content, Software Development :: Pre-processors, Text Processing :: Filters | 5 - Production/Stable | JavaScript minifier.
chevron | 21771 | Text Processing :: Markup | 4 - Beta | Mustache templating language renderer
sphinxcontrib-websupport | 20889 | Documentation, Documentation :: Sphinx, Text Processing, Utilities | 5 - Production/Stable | Sphinx API for Web Apps
pattern | 18968 | Internet :: WWW/HTTP :: Indexing/Search, Multimedia :: Graphics, Scientific/Engineering :: Artificial Intelligence, Scientific/Engineering :: Visualization, Software Development :: Libraries :: Python Modules, Text Processing :: Markup :: HTML | 5 - Production/Stable | Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
sacrebleu | 18596 | Scientific/Engineering, Scientific/Engineering :: Artificial Intelligence, Text Processing | 5 - Production/Stable | Hassle-free computation of shareable, comparable, and reproducible BLEU scores
restructuredtext-lint | 17356 | Text Processing :: Markup | 3 - Alpha | reStructuredText linter
python-bidi | 16587 | Text Processing | 4 - Beta | Pure python implementation of the BiDi layout algorithm
beautifulsoup | 15895 | Software Development :: Libraries :: Python Modules, Text Processing :: Markup :: HTML, Text Processing :: Markup :: SGML, Text Processing :: Markup :: XML | 5 - Production/Stable | Screen-scraping library
jieba | 15883 | Text Processing, Text Processing :: Indexing, Text Processing :: Linguistic | No status | Chinese Words Segmentation Utilities
stanza | 15296 | Scientific/Engineering, Scientific/Engineering :: Artificial Intelligence, Scientific/Engineering :: Information Analysis, Text Processing, Text Processing :: Linguistic, Software Development, Software Development :: Libraries | 4 - Beta | A Python NLP Library for Many Human Languages, by the Stanford NLP Group
langid | 15103 | Scientific/Engineering :: Artificial Intelligence, Text Processing :: Linguistic | 5 - Production/Stable | langid is a standalone Language Identification (LangID) tool.
svglib | 14917 | Documentation, Multimedia :: Graphics :: Graphics Conversion, Printing, Software Development :: Libraries :: Python Modules, Text Processing :: Markup :: XML, Utilities | 4 - Beta | An experimental library for reading and converting SVG.
xhtml2pdf | 14762 | Documentation, Multimedia, Office/Business, Printing, Text Processing, Text Processing :: Filters, Text Processing :: Fonts, Text Processing :: General, Text Processing :: Indexing, Text Processing :: Markup, Text Processing :: Markup :: HTML, Text Processing :: Markup :: XML, Utilities | 4 - Beta | PDF generator using HTML and CSS
cheetah3 | 14560 | Internet :: WWW/HTTP, Internet :: WWW/HTTP :: Dynamic Content, Internet :: WWW/HTTP :: Site Management, Software Development :: Code Generators, Software Development :: Libraries :: Python Modules, Software Development :: User Interfaces, Text Processing | 4 - Beta | Cheetah is a template engine and code generation tool
ptable | 13843 | Text Processing | No status | A simple Python library for easily displaying tabular data in a visually appealing ASCII table format
parsy | 13789 | Software Development :: Compilers, Software Development :: Interpreters, Text Processing | 5 - Production/Stable | easy-to-use parser combinators, for parsing in pure Python
pyahocorasick | 13752 | Software Development :: Libraries, Text Editors :: Text Processing | 5 - Production/Stable | pyahocorasick is a fast and memory efficient library for exact or approximate multi-pattern string search. With the ahocorasick.Automaton class, you can find multiple key strings occurrences at once in some input text. You can use it as a plain dict-like Trie or convert a Trie to an automaton for efficient Aho-Corasick search. Implemented in C and tested on Python 2.7 and 3.4+. Works on Linux, Mac and Windows. BSD-3-clause license.
ansi2html | 13470 | Text Processing, Text Processing :: Markup, Text Processing :: Markup :: HTML | 5 - Production/Stable | Convert text with ANSI color codes to HTML
demoji | 13466 | Text Processing, Utilities | 3 - Alpha | Accurately remove and replace emojis in text strings.
publicsuffixlist | 13389 | Internet :: Name Service (DNS), Text Processing :: Filters | 4 - Beta | publicsuffixlist implement
htmldate | 12704 | Scientific/Engineering, Scientific/Engineering :: Information Analysis, Software Development :: Libraries :: Python Modules, Text Processing :: Linguistic, Text Processing :: Markup :: HTML | 4 - Beta | This module can handle all the steps needed from web page download to HTML parsing, including scraping and textual analysis. Its goal is to find the creation date of a page using all common structural patterns, text-based heuristics and robust date extraction. It takes URLs, HTML files or trees as input and outputs a date.
segtok | 12594 | Scientific/Engineering :: Information Analysis, Software Development :: Libraries, Text Processing, Text Processing :: Linguistic | 3 - Alpha | sentence segmentation and word tokenization tools
strictyaml | 12565 | Text Processing :: Markup, Software Development :: Libraries | 4 - Beta | Strict, typed YAML parser
jupytext | 12203 | Text Processing :: Markup | 5 - Production/Stable | Jupyter notebooks as Markdown documents, Julia, Python or R scripts
pyenchant | 11783 | Software Development :: Libraries, Text Processing :: Linguistic | 5 - Production/Stable | Python bindings for the Enchant spellchecking system
breathe | 10708 | Documentation, Text Processing, Utilities | 4 - Beta | Sphinx Doxygen renderer
jaconv | 10529 | Text Processing | 4 - Beta | Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku and Zenkaku
trafilatura | 10141 | Internet :: WWW/HTTP, Scientific/Engineering :: Information Analysis, Text Processing :: Linguistic, Text Processing :: Markup :: HTML | 4 - Beta | Downloads web pages, scrapes main text and comments while preserving some structure, and converts to TXT, CSV, XML & TEI-XML
marisa-trie | 9976 | Scientific/Engineering :: Information Analysis, Software Development :: Libraries :: Python Modules, Text Processing :: Linguistic | 4 - Beta | Static memory-efficient & fast Trie-like structures for Python (based on marisa-trie C++ library)
titlecase | 9885 | Text Processing :: Filters | 4 - Beta | Python Port of John Gruber's
dartsclone | 9845 | Text Processing :: Linguistic | No status | Python binding of Darts Clone
pyhcl | 9528 | Text Processing | 5 - Production/Stable | HCL configuration parser for python
pyrtf3 | 9356 | Software Development :: Libraries :: Python Modules, Text Editors :: Text Processing | 4 - Beta | PyRTF - Rich Text Format Document Generation
tempita | 8827 | Text Processing | 4 - Beta | A very small text templating language
pyarabic | 8790 | Text Processing :: Linguistic | 5 - Production/Stable | Arabic text tools for Python
vadersentiment | 8762 | Scientific/Engineering :: Artificial Intelligence, Scientific/Engineering :: Information Analysis, Text Processing :: General, Text Processing :: Linguistic | 4 - Beta | VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
ngram | 8471 | Text Processing, Text Processing :: Indexing, Text Processing :: Linguistic | 5 - Production/Stable | A `set` subclass providing fuzzy search based on N-grams.
mecab-python3 | 8368 | Software Development :: Libraries :: Python Modules, Text Processing :: Linguistic | No status | python wrapper for mecab: Morphological Analysis engine
fpdf2 | 8289 | Printing, Software Development :: Libraries :: Python Modules, Text Processing :: Markup, Multimedia :: Graphics, Multimedia :: Graphics :: Presentation | 5 - Production/Stable | Simple PDF generation for Python
hachoir | 8187 | Multimedia, Scientific/Engineering :: Information Analysis, Software Development :: Disassemblers, Software Development :: Interpreters, Software Development :: Libraries :: Python Modules, System :: Filesystems, Text Processing, Utilities | 5 - Production/Stable | Package of Hachoir parsers used to open binary files
yake | 8052 | Scientific/Engineering :: Information Analysis, Software Development :: Libraries, Text Processing, Text Processing :: Linguistic | 3 - Alpha | Keyword extraction Python package
m2r | 7768 | Text Processing | 4 - Beta | Markdown and reStructuredText in a single file.
whoosh | 7731 | Software Development :: Libraries :: Python Modules, Text Processing :: Indexing | 3 - Alpha | Fast, pure-Python full text indexing, search, and spell checking library.
pdfrw | 7725 | Multimedia :: Graphics :: Graphics Conversion, Printing, Software Development :: Libraries, Text Processing, Utilities | 4 - Beta | PDF file reader/writer library
httpie | 7654 | Internet :: WWW/HTTP, Software Development, System :: Networking, Terminals, Text Processing, Utilities | 5 - Production/Stable | HTTPie - a CLI, cURL-like tool for humans.
cdifflib | 7334 | Software Development, Text Processing :: General | 4 - Beta | C implementation of parts of difflib
pybtex | 6906 | Text Editors :: Text Processing, Text Processing :: Markup :: LaTeX, Text Processing :: Markup :: XML | 4 - Beta | A BibTeX-compatible bibliography processor in Python
latexcodec | 6809 | Text Processing :: Filters, Text Processing :: Markup :: LaTeX | 5 - Production/Stable | A lexer and codec to work with LaTeX code in Python.
iso-639 | 6773 | Text Processing :: Linguistic | 5 - Production/Stable | Python library for ISO 639 standard
m2r2 | 6584 | Text Processing | 4 - Beta | Markdown and reStructuredText in a single file.
bashlex | 6410 | Software Development :: Libraries :: Python Modules, System :: System Shells, Text Processing | 4 - Beta | Python parser for bash
goslate | 6304 | Software Development :: Internationalization, Software Development :: Localization, Text Processing :: Linguistic | 5 - Production/Stable | Goslate: Free Google Translate API
mechanize | 6042 | Internet, Internet :: File Transfer Protocol (FTP), Internet :: WWW/HTTP, Internet :: WWW/HTTP :: Browsers, Internet :: WWW/HTTP :: Indexing/Search, Internet :: WWW/HTTP :: Site Management, Internet :: WWW/HTTP :: Site Management :: Link Checking, Software Development :: Libraries, Software Development :: Libraries :: Python Modules, Software Development :: Testing, Software Development :: Testing :: Traffic Generation, System :: Archiving :: Mirroring, System :: Networking :: Monitoring, System :: Systems Administration, Text Processing, Text Processing :: Markup, Text Processing :: Markup :: HTML, Text Processing :: Markup :: XML | 5 - Production/Stable | Stateful programmatic web browsing.
pansi | 6012 | Software Development, Software Development :: Libraries, System :: Console Fonts, System :: Shells, Terminals, Terminals :: Terminal Emulators/X Terminals, Text Processing, Text Processing :: Markup, Utilities | 1 - Planning | ANSI escape code library for Python
myst-nb | 5978 | Software Development :: Libraries :: Python Modules, Text Processing :: Markup | 3 - Alpha | A Jupyter Notebook Sphinx reader built on top of the MyST markdown parser.
draftjs-exporter | 5917 | Internet :: WWW/HTTP, Internet :: WWW/HTTP :: Dynamic Content, Internet :: WWW/HTTP :: Site Management, Software Development :: Libraries :: Application Frameworks, Software Development :: Libraries :: Python Modules, Text Editors :: Word Processors | No status | Library to convert the Facebook Draft.js editor's raw ContentState to HTML
tinysegmenter | 5859 | Scientific/Engineering :: Artificial Intelligence, Scientific/Engineering :: Information Analysis, Text Processing :: Linguistic | 4 - Beta | Very compact Japanese tokenizer
mdformat | 5825 | Documentation, Software Development :: Libraries :: Python Modules, Text Processing :: Markup | No status | CommonMark compliant Markdown formatter
pybtex-docutils | 5751 | Text Editors :: Text Processing, Text Processing :: Markup :: XML | 4 - Beta | A docutils backend for pybtex.
tcolorpy | 5621 | Software Development :: Libraries, Software Development :: Libraries :: Python Modules, Terminals, Text Processing | 3 - Alpha | tcolorpy is a Python library to apply true color for terminal text.
beautifultable | 5491 | Printing, Text Processing | 3 - Alpha | Utility package to print visually appealing ASCII tables to terminal
parsley | 5469 | Software Development :: Code Generators, Text Processing :: General | 6 - Mature | Parsing and pattern matching made easy.
tatsu | 5243 | Software Development :: Code Generators, Software Development :: Compilers, Software Development :: Interpreters, Text Processing :: General | 5 - Production/Stable | TatSu takes a grammar in a variation of EBNF as input, and outputs a memoizing PEG/Packrat parser in Python.
tailer | 5223 | System :: Logging, Text Processing | No status | Python tail is a simple implementation of GNU tail and head.
vobject | 5143 | Text Processing | 5 - Production/Stable | A full-featured Python package for parsing and creating iCalendar and vCard files
imgkit | 5133 | Text Processing, Text Processing :: General, Text Processing :: Markup, Text Processing :: Markup :: HTML, Text Processing :: Markup :: XML, Utilities | No status | Wkhtmltopdf python wrapper to convert html to image using the webkit rendering engine and qt
deep-translator | 5071 | Education, Software Development, Communications, Text Processing, Text Processing :: Linguistic | 5 - Production/Stable | A flexible python tool to translate between different languages in a simple way.
tabula-py | 5067 | Text Processing :: General | 5 - Production/Stable | Simple wrapper for tabula-java, read tables from PDF into DataFrame
pytils5020Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic4 - BetaUtils for easy processing of strings in Russian.
pycld35013Text Processing :: Linguistic3 - AlphaCLD3 Python bindings
pypinyin4995Utilities,Text Processing5 - Production/StableChinese character to pinyin conversion module/tool.
bbcode4912Text Processing :: Markup5 - Production/StableA pure python bbcode parser and formatter.
readability-lxml4882Internet,Software Development :: Libraries :: Python Modules,Text Processing :: Indexing,UtilitiesNo statusfast html to text parser (article readability tool) with python3 support
xmldiff4817Text Processing :: Markup :: XML4 - BetaCreates diffs of XML files
compressed-rtf4797System :: Archiving :: Compression,Text Processing5 - Production/StableCompressed Rich Text Format (RTF) compression and decompression package
pdfminer4771Text Processing4 - BetaPDF parser and analyzer
parse-torrent-name4699Text Processing5 - Production/StableParse torrent name of a movie or TV show
in-place4505System :: Filesystems,Text Processing :: Filters4 - BetaIn-place file processing
empy4473Software Development :: Libraries :: Python Modules,Text Editors :: Text Processing,Text Processing :: Filters,Text Processing :: General,Text Processing :: Markup,Utilities6 - MatureA powerful and robust templating system for Python.
art4405Text Processing :: Fonts,Text Editors,Text Processing :: General,Utilities,Multimedia,Printing5 - Production/StableASCII Art Library For Python
pycld24396Text Processing :: Linguistic4 - BetaPython bindings around Google Chromium's embedded compact language detection library (CLD2)
pandoc4345Software Development,Text Processing5 - Production/StablePandoc Documents for Python
django-autoslug4309Software Development :: Libraries :: Python Modules,Text Processing :: General5 - Production/StableAn automated slug field for Django.
dawg-python4163Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic3 - AlphaPure-python reader for DAWGs created by dawgdic C++ library or DAWG Python extension.
word2number4160Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Editors :: Text Processing,Text Processing :: LinguisticNo statusConvert number words eg. three hundred and forty two to numbers (342).
smartypants4116Text Processing :: Filters5 - Production/StablePython with the SmartyPants
funcparserlib4115Software Development :: Compilers,Software Development :: Interpreters,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing5 - Production/StableRecursive descent parsing library based on functional combinators
anytemplate4062Software Development :: Libraries :: Python Modules,Text Processing :: Markup,Utilities4 - BetaA python template abstraction layer module
pymorphy24049Software Development :: Libraries :: Python Modules,Scientific/Engineering :: Information Analysis,Text Processing :: Linguistic4 - BetaMorphological analyzer (POS tagger + inflection engine) for Russian language.
jsonpath3990Software Development :: Libraries :: Python Modules,Text Processing :: Markup4 - BetaAn XPath for JSON
justext3895Internet :: WWW/HTTP,Software Development :: Pre-processors,Text Processing :: Filters,Text Processing :: Markup :: HTML5 - Production/StableHeuristic based boilerplate removal tool
unicode3753Text Editors :: Text Processing,Utilities5 - Production/StableDisplay unicode character properties
textstat3694Text ProcessingNo statusCalculate statistical features from text
mdx-truly-sane-lists3669Text Processing :: Markup,UtilitiesNo statusExtension for Python-Markdown that makes lists truly sane. Custom indents for nested lists and fix for messy linebreaks.
mdutils3638Utilities,Software Development :: Documentation,Text Processing :: Markup5 - Production/StableUseful package for creating Markdown files while executing python code.
meld33558Text Processing :: Markup :: HTML5 - Production/Stablemeld3 is an HTML/XML templating engine.
biplist3557Software Development :: Libraries :: Python Modules,Text Processing :: Markup5 - Production/Stablebiplist is a library for reading/writing binary plists.
lunr3481Text Processing3 - AlphaA Python implementation of Lunr.js
english3468Text Processing,Text Processing :: Linguistic1 - PlanningEnglish language utility library for Python
pyjson53467Text Processing :: General4 - BetaJSON5 serializer and parser for Python 3 written in Cython.
genshi3466Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML4 - BetaA toolkit for generation of output for the web
chameleon3364Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML,Text Processing :: Markup :: XMLNo statusXML-based template compiler.
dict2xml3362Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: XML5 - Production/StableSmall utility to convert a python dictionary into an XML string
pyyaml-include3302Software Development :: Libraries :: Python Modules,Text Processing :: Markup5 - Production/StableExtending PyYAML with a custom constructor for including YAML files within YAML files
pymorphy2-dicts-ru3164Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic4 - BetaRussian dictionaries for pymorphy2
csscompressor3061Software Development :: Build Tools,Text Processing :: General,Utilities5 - Production/StableA python port of YUI CSS Compressor
mwparserfromhell3031Text Processing :: Markup4 - BetaMWParserFromHell is a parser for MediaWiki wikicode.
blockdiag3028Software Development,Software Development :: Documentation,Text Processing :: Markup4 - Betablockdiag generates block-diagram image file from spec-text file.
plantuml-markdown2973Software Development :: Documentation,Software Development :: Libraries :: Python Modules,Text Processing :: Filters,Text Processing :: Markup :: HTML5 - Production/StableA PlantUML plugin for Markdown
sphinx-design2943Software Development :: Libraries :: Python Modules,Text Processing :: Markup3 - AlphaA sphinx extension for designing beautiful, view size responsive web components.
pylatexenc2938Scientific/Engineering,Text Processing :: General,Text Processing :: Markup :: LaTeX5 - Production/StableSimple LaTeX parser providing latex-to-unicode and unicode-to-latex conversion
lorem2893Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing,Text Processing :: Linguistic4 - BetaGenerator for random text that looks like Latin.
stopwordsiso2892Text Processing :: Linguistic3 - AlphaCollection of stopwords for multiple languages. Using ISO 639-1 language code.
detect-delimiter2864Software Development :: Libraries,Text Processing,Utilities4 - BetaDetects the delimiter used in CSV, TSV and other ad hoc file formats.
courlan2862Internet :: WWW/HTTP,Scientific/Engineering :: Information Analysis,Text Processing :: Filters4 - BetaClean, filter, normalize, and sample URLs
pynliner2794Communications :: Email,Text Processing :: Markup :: HTML3 - AlphaPython CSS-to-inline-styles conversion tool for HTML using BeautifulSoup and cssutils
pyscss2790Software Development :: Code Generators,Software Development :: Libraries :: Python Modules,Text Processing :: Markup5 - Production/StablepyScss, a Scss compiler for Python
whatthepatch2770Software Development,Software Development :: Libraries :: Python Modules,Software Development :: Version Control,Text Processing2 - Pre-AlphaA patch parsing library
sphinx-external-toc2741Software Development :: Libraries :: Python Modules,Text Processing :: Markup3 - AlphaA sphinx extension that allows the site-map to be defined in a single YAML file.
emoji-country-flag2685Internet :: WWW/HTTP :: Dynamic Content,Communications :: Chat,Printing,Text Processing :: General5 - Production/StableEn/Decode unicode country flags emoji
hellosign-python-sdk2657Text Processing :: Linguistic3 - AlphaAn API wrapper written in Python to interact with HelloSign's API (
mf2py2551Text Processing :: Markup :: HTML3 - AlphaPython Microformats2 parser
kitchen2546Software Development :: Libraries :: Python Modules,Text Processing :: General3 - AlphaKitchen contains a cornucopia of useful code
konlpy2491Scientific/Engineering,Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Information Analysis,Text Processing,Text Processing :: Filters,Text Processing :: General,Text Processing :: Linguistic4 - BetaPython package for Korean natural language processing.
esprima2473Software Development :: Code Generators,Software Development :: Compilers,Software Development :: Libraries :: Python Modules,Text Processing :: General4 - BetaECMAScript parsing infrastructure for multipurpose analysis in Python
junos-eznc2465Software Development :: Libraries,Software Development :: Libraries :: Application Frameworks,Software Development :: Libraries :: Python Modules,System :: Networking,Text Processing :: Markup :: XML5 - Production/StableJunos 'EZ' automation for non-programmers
natto-py2448Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic4 - BetaA Tasty Python Binding with MeCab (FFI-based, no SWIG or compiler necessary)
jinja2schema2444Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTML4 - BetaType inference for Jinja2 templates.
yorm2391Software Development :: Libraries :: Python Modules,Software Development :: Version Control,System :: Filesystems,Text Editors :: Text Processing4 - BetaAutomatic object YAML mapping for Python.
capturer2362Communications,Scientific/Engineering :: Human Machine Interfaces,Software Development,Software Development :: Libraries :: Python Modules,Software Development :: User Interfaces,System :: Shells,System :: Systems Administration,System :: System Shells,Terminals,Text Processing :: General4 - BetaEasily capture stdout/stderr of the current process and subprocesses
jupyter-book2350Software Development :: Libraries :: Python Modules,Text Processing :: Markup4 - BetaJupyter Book: Create an online book with Jupyter Notebooks
coala2341Scientific/Engineering :: Information Analysis,Software Development :: Quality Assurance,Text Processing :: Linguistic4 - BetaLinting and Fixing Code for All Languages
benepar2229Scientific/Engineering :: Artificial Intelligence,Text Processing :: LinguisticNo statusBerkeley Neural Parser
polyglot2228Scientific/Engineering :: Artificial Intelligence,Text Processing :: Linguistic4 - BetaPolyglot is a natural language pipeline that supports massive multilingual applications.
pythainlp2198Scientific/Engineering :: Artificial Intelligence,Text Processing,Text Processing :: General,Text Processing :: Linguistic5 - Production/StableThai Natural Language Processing library
markdown-include2175Internet :: WWW/HTTP :: Site Management,Software Development :: Documentation,Software Development :: Libraries :: Python Modules,Text Processing :: Filters,Text Processing :: Markup :: HTML5 - Production/StableThis is an extension to Python-Markdown which provides an "include" function, similar to that found in LaTeX (and also the C pre-processor and Fortran). I originally wrote it for my FORD Fortran auto-documentation generator.
ucca2167Text Processing :: Linguistic4 - BetaUniversal Conceptual Cognitive Annotation
misaka2122Text Processing :: Markup,Text Processing :: Markup :: HTML,Utilities4 - BetaA Python binding for Sundown.
prettierfier2115Text Processing :: Markup :: HTML,Text Processing :: Markup :: XMLNo statusIntelligently pretty-print HTML/XML with inline tags.
edit-distance2100Scientific/Engineering,Software Development,Text Processing,Utilities4 - BetaComputing edit distance on arbitrary Python sequences.
postal-address2098Office/Business,Software Development :: Libraries :: Python Modules,Software Development :: Localization,Text Processing,Utilities4 - BetaParse, normalize and render postal addresses.
feedgen2048Communications,Internet,Text Processing,Text Processing :: Markup,Text Processing :: Markup :: XML4 - BetaFeed Generator (ATOM, RSS, Podcasts)
bitmath2036Scientific/Engineering :: Mathematics,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Software Development :: User Interfaces,Software Development :: Widget Sets,System :: Filesystems,Text Processing :: Filters,Utilities5 - Production/StablePythonic module for representing and manipulating file sizes with different prefix notations.
esmre2022Software Development :: Libraries :: Python Modules,Text Processing :: Indexing4 - BetaRegular expression accelerator
mkdocs-include-markdown-plugin1977Software Development :: Documentation,Documentation,Text Processing,Text Processing :: Markup :: Markdown5 - Production/StableMkdocs Markdown includer plugin.
jxmlease1970Text Processing :: Markup :: XMLNo statusjxmlease converts between XML and intelligent Python data structures.
py-gfm1892Text Processing :: MarkupNo statusAn implementation of Github-Flavored Markdown written as an extension to the Python Markdown library.
quantulum31873Scientific/Engineering,Text Processing :: Linguistic3 - AlphaExtract quantities from unstructured text.
phonetics1852Internet :: WWW/HTTP :: Indexing/Search,Text Processing :: General,Text Processing :: Indexing,Text Processing :: Linguistic3 - AlphaNone
mistletoe1791Software Development :: Libraries :: Python Modules,Text Processing :: Markup3 - AlphaA fast, extensible Markdown parser in pure Python.
markdown-inline-graphviz-extension1778Documentation,Text Processing5 - Production/StableRender inline graphs with Markdown and Graphviz (python3 version)
pytextrank1776Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Human Machine Interfaces,Scientific/Engineering :: Information Analysis,Text Processing :: General,Text Processing :: Indexing,Text Processing :: Linguistic5 - Production/StablePython implementation of TextRank for text document NLP parsing and summarization
riotwatcher1769Software Development :: Libraries :: Python Modules,Text Processing,Games/Entertainment :: Real Time Strategy,Games/Entertainment :: Role-PlayingNo statusRiotWatcher is a thin wrapper on top of the Riot Games API for League of Legends.
python-card-me1746Text Processing5 - Production/StableUNKNOWN
django-htmlmin1742Text Processing :: Markup :: HTML5 - Production/StableHTML minifier for Python frameworks (not only Django, despite the name).
jaro-winkler1739Text Processing,Text Processing :: General,Text Processing :: IndexingNo statusOriginal, standard and customisable versions of the Jaro-Winkler functions.
sphinxext-opengraph1736Documentation :: Sphinx,Documentation,Software Development :: Documentation,Text Processing,Utilities5 - Production/StableSphinx Extension to enable OGP support
pyuca1705Text Processing5 - Production/Stablea Python implementation of the Unicode Collation Algorithm
rst2ansi1619Software Development :: Libraries :: Python Modules,Text Processing :: Markup,Utilities4 - BetaA rst converter to ansi-decorated console output
hanziconv1607System,Text Processing,Utilities4 - BetaHanzi Converter for Traditional and Simplified Chinese
srt1600Multimedia :: Video,Software Development :: Libraries,Text Processing5 - Production/StableA tiny library for parsing, modifying, and composing SRT files.
asciimatics1583Text Processing :: General5 - Production/StableA cross-platform package to create ASCII art and cinematic storyboard sequences
jsoncomment1581Software Development :: Pre-processors,Text Editors :: Text Processing4 - BetaA wrapper to JSON parsers allowing comments, multiline strings and trailing commas
stemming1580Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic5 - Production/StablePython implementations of various stemming algorithms.
xsdata1573Software Development :: Code Generators,Text Processing :: Markup :: XMLNo statusPython XML Binding
nwdiag1569Software Development,Software Development :: Documentation,Text Processing :: Markup3 - Alphanwdiag generates network-diagram image file from spec-text file.
fuzzy1556Text Processing,Text Processing :: General,Text Processing :: Indexing,Text Processing :: Linguistic4 - BetaFast Python phonetic algorithms
selectolax1555Text Processing :: Markup :: HTML,Internet,Internet :: WWW/HTTP5 - Production/StableFast HTML5 parser with CSS selectors.
django-inlinecss1542Communications :: Email,Text Processing :: Markup :: HTMLNo statusA Django app useful for inlining CSS (primarily for e-mails)
overpunch1523Text Processing5 - Production/StableOverpunch Parser/Formatter
mdformat-gfm1522Documentation,Text Processing :: MarkupNo statusMdformat plugin for GitHub Flavored Markdown compatibility
mdv1479Text Processing :: MarkupNo statusTerminal Markdown Viewer
html5-parser1474Text Processing,Text Processing :: Markup,Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML5 - Production/StableFast C based HTML 5 parsing for python
coala-bears1439Scientific/Engineering :: Information Analysis,Software Development :: Quality Assurance,Text Processing :: Linguistic4 - BetaBears for coala (Code Analysis Application)
lepl1431Software Development,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing,Text Processing :: Filters,Text Processing :: General,Utilities5 - Production/StableA Parser Library for Python 3 (and 2.6): Recursive Descent; Full Backtracking
humanreadable1337Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing3 - AlphaDESCRIPTION
seqdiag1333Software Development,Software Development :: Documentation,Text Processing :: Markup3 - Alphaseqdiag generates sequence-diagram image file from spec-text file.
opencc1315Scientific/Engineering,Software Development,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Software Development :: Localization,Text Processing :: Linguistic5 - Production/StableConversion between Traditional and Simplified Chinese
js-regex1294Software Development :: Testing,Text Processing4 - BetaA thin compatibility layer to use Javascript regular expressions in Python
base58check1261Internet :: WWW/HTTP,Security :: Cryptography,Software Development,Software Development :: Libraries :: Python Modules,Text Processing,Utilities4 - BetaBase58check encoding and decoding of binary data
cheetah1261Internet :: WWW/HTTP,Internet :: WWW/HTTP :: Dynamic Content,Internet :: WWW/HTTP :: Site Management,Software Development :: Code Generators,Software Development :: Libraries :: Python Modules,Software Development :: User Interfaces,Text Processing5 - Production/StableCheetah is a template engine and code generation tool.
pyxdameraulevenshtein1256Scientific/Engineering :: Bio-Informatics,Scientific/Engineering :: Information Analysis,Text Processing :: Linguistic5 - Production/StablepyxDamerauLevenshtein implements the Damerau-Levenshtein (DL) edit distance algorithm for Python in Cython for high performance.
rst2pdf1255Documentation,Software Development :: Libraries :: Python Modules,Text Processing,Utilities4 - BetaConvert restructured text to PDF via reportlab.
vdf1252Software Development :: Libraries :: Python Modules,Text Processing5 - Production/StableLibrary for working with Valve's VDF text format
textacy1240Text Processing :: Linguistic4 - BetaHigher-level text processing, built on Spacy
rinohtype1239Printing,Software Development :: Libraries :: Python Modules,Text Processing :: Fonts,Text Processing :: Markup,Text Processing :: Markup :: XML4 - BetaThe Python document processor
pymorphy2-dicts1221Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic4 - dictionaries pre-compiled for pymorphy2
stanfordcorenlp1201Scientific/Engineering,Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Information Analysis,Text Processing,Text Processing :: Linguistic5 - Production/StablePython wrapper for Stanford CoreNLP.
pystempel1199Scientific/Engineering,Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Human Machine Interfaces,Scientific/Engineering :: Information Analysis,Text Processing,Text Processing :: General,Text Processing :: LinguisticNo statusPolish stemmer.
rinoh-typeface-texgyrepagella1199Text Processing :: FontsNo statusTeX Gyre Pagella typeface
rinoh-typeface-texgyrecursor1199Text Processing :: FontsNo statusTeX Gyre Cursor typeface
rinoh-typeface-texgyreheros1198Text Processing :: FontsNo statusTeX Gyre Heros typeface
rinoh-typeface-dejavuserif1184Text Processing :: FontsNo statusDejaVu Serif typeface
syntok1172Scientific/Engineering :: Information Analysis,Software Development :: Libraries,Text Processing,Text Processing :: Linguistic3 - Alphasentence segmentation and word tokenization toolkit
tableprint1163Text Processing :: General3 - AlphaPretty ASCII printing of tabular data
pycm1079Text Processing :: Fonts3 - AlphaConfusion Matrix In Python
recurrent1077Text Processing :: LinguisticNo statusNatural language parsing and formatting of recurring events
wordfreq1074Scientific/Engineering,Software Development,Text Processing :: LinguisticNo statuswordfreq is a Python library for looking up the frequencies of words in many
pylatex1069Software Development :: Code Generators,Text Processing :: Markup :: LaTeX4 - BetaA Python library for creating LaTeX files
acora1067Text ProcessingNo statusFast multi-keyword search engine for text strings
sas7bdat1065Text Processing,Utilities5 - Production/StableA sas7bdat file reader for Python
naturalsort1040Scientific/Engineering :: Human Machine Interfaces,Software Development,Software Development :: Libraries :: Python Modules,System :: Systems Administration,Text Processing :: General6 - MatureSimple natural order sorting API for Python
markdown-strings1040Text Processing :: Markup5 - Production/StableCreate markdown formatted text
epitran1034Software Development :: Libraries :: Python Modules,Text Processing :: LinguisticNo statusTools for transcribing languages into IPA.
money-parser1029Text Processing,Text Processing :: GeneralNo statusPrice and currency parsing utility
spark1022Internet :: WWW/HTTP,Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Code Generators,Software Development :: Libraries :: Python Modules,Text Processing3 - AlphaA Super-Small, Super-Fast, and Super-Easy web framework
pytest-codeblocks1012Text Processing4 - BetaExtract code blocks from markdown
semstr1000Text Processing :: Linguistic4 - BetaScheme Evaluation and Mapping for Structural Text Representation
subword-nmt998Scientific/Engineering :: Artificial Intelligence,Text ProcessingNo statusUnsupervised Word Segmentation for Neural Machine Translation and Text Generation
abydos997Scientific/Engineering :: Artificial Intelligence,Software Development :: Libraries :: Python Modules,Text Processing :: Indexing,Text Processing :: Linguistic4 - BetaAbydos NLP/IR library
ultimate-sitemap-parser988Internet :: WWW/HTTP :: Indexing/Search,Text Processing :: Indexing,Text Processing :: Markup :: XML3 - AlphaUltimate Sitemap Parser
docx-mailmerge975Text ProcessingNo statusPerforms a Mail Merge on docx (Microsoft Office Word) files
wordfilter972Communications,Text Processing :: LinguisticNo statusA small module meant for use in text generators that lets you filter strings for bad words.
rtfde971Text Processing :: Markup :: HTML,Text Processing :: Filters,Communications :: Email :: FiltersNo statusA library for extracting HTML content from RTF encapsulated HTML as commonly found in the exchange MSG email format.
mwtypes953Scientific/Engineering,Software Development :: Libraries :: Python Modules,Text Processing :: General,UtilitiesNo statusA set of types for processing MediaWiki data.
isbnlib931Software Development :: Libraries :: Python Modules,Text Processing :: General5 - Production/StableExtract, clean, transform, hyphenate and metadata for ISBNs (International Standard Book Number).
pweave921Text Processing :: Markup4 - BetaLiterate programming with reST or LaTeX in Sweave style
xmlunittest918Software Development :: Libraries :: Python Modules,Software Development :: Testing,Text Processing :: Markup :: XML4 - BetaLibrary using lxml and unittest for unit testing XML.
datrie907Scientific/Engineering :: Information Analysis,Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic4 - BetaSuper-fast, efficiently stored Trie for Python.
django-autoslug-iplweb904Software Development :: Libraries :: Python Modules,Text Processing :: General5 - Production/StableAn automated slug field for Django.
tailhead884Software Development :: Libraries :: Python Modules,System :: Logging,System :: Systems Administration,System :: System Shells,Text ProcessingNo statustailhead is a simple implementation of GNU tail and head.
camel-tools883Scientific/Engineering,Scientific/Engineering :: Artificial Intelligence,Scientific/Engineering :: Information Analysis,Text Processing,Text Processing :: Linguistic5 - Production/StableA suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
scrubadub881Software Development :: Libraries,Scientific/Engineering :: Information Analysis,Text Processing,Utilities5 - Production/StableClean personally identifiable information from dirty dirty text.
weighted-levenshtein874Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic3 - AlphaLibrary providing functions to calculate Levenshtein distance, Optimal String Alignment distance, and Damerau-Levenshtein distance, where the cost of each operation can be weighted by letter.
pydocx874Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML3 - Alphadocx (OOXML) to html converter
es2csv873Database,Internet,System :: Systems Administration,Text Processing,Utilities5 - Production/StableA CLI tool for exporting data from Elasticsearch into a CSV file.
nagisa871Software Development :: Libraries :: Python Modules,Text Processing :: LinguisticNo statusA Japanese tokenizer based on recurrent neural networks
humps855Software Development,Text Editors,Text Editors :: Word Processors,Text Processing,Text Processing :: General,Utilities4 - BetacamelCase converter
sphinx-intl852Documentation,Software Development :: Documentation,Text Processing :: General,Utilities4 - BetaSphinx utility that makes it easy to translate and to apply translation.
pytest-reporter-html1846Software Development :: Testing,Text Processing :: Markup :: HTML4 - BetaA basic HTML report template for Pytest
optimus842Internet :: WWW/HTTP,Internet :: WWW/HTTP :: Site Management,Software Development :: Libraries :: Python Modules,Text Processing :: Markup5 - Production/StableA simple building environment to produce static HTML from Jinja templates and with assets compress managing with webassets
ocrmypdf835Scientific/Engineering :: Image Recognition,Text Processing :: Indexing,Text Processing :: Linguistic5 - Production/StableOCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
docraptor819Internet :: WWW/HTTP :: Dynamic Content,Multimedia :: Graphics :: Presentation,Office/Business,Office/Business :: Financial :: Spreadsheet,Printing,Text Processing :: Markup :: HTML5 - Production/StableA wrapper for the DocRaptor HTML to PDF/XLS service.
documenttemplate817Text Processing :: Markup6 - MatureDocument Templating Markup Language (DTML)
dexml813Text Processing,Text Processing :: Markup,Text Processing :: Markup :: XML5 - Production/Stablea dead-simple Object-XML mapper for Python
postal810Internet :: WWW/HTTP :: Indexing/Search,Scientific/Engineering :: GIS,Software Development :: Libraries :: Python Modules,Text Processing :: LinguisticNo statusPython bindings to libpostal for fast international address parsing/normalization
pytidylib805Text Processing :: Markup :: HTML,Text Processing :: Markup :: XML,Utilities5 - Production/StablePython wrapper for HTML Tidy (tidylib) on Python 2 and 3
fake-bpy-module-latest800Multimedia :: Graphics :: 3D Modeling,Multimedia :: Graphics :: 3D Rendering,Text Editors :: Integrated Development Environments (IDE)No statusCollection of the fake Blender Python API module for the code completion.
actdiag794Software Development,Software Development :: Documentation,Text Processing :: Markup3 - Alphaactdiag generates activity-diagram image file from spec-text file.
json-diff786Software Development :: Libraries :: Python Modules,Text Processing :: General4 - BetaGenerates diff between two JSON files
pyrush779Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic3 - AlphaA fast implementation of RuSH (Rule-based sentence Segmenter using Hashing).
pysrt766Multimedia :: Video,Software Development :: Libraries,Text Processing :: Markup4 - BetaSubRip (.srt) subtitle parser and writer
breadability761Internet :: WWW/HTTP,Software Development :: Pre-processors,Text Processing :: Filters,Text Processing :: Markup :: HTML5 - Production/StablePort of Readability HTML parser in Python
jedi-language-server759Software Development :: Code Generators,Software Development :: Libraries :: Python Modules,Text Editors :: Integrated Development Environments (IDE),Utilities3 - AlphaA language server for Jedi!
django-rest-framework-mongoengine744Internet,Internet :: WWW/HTTP :: Site Management,Software Development :: Libraries :: Python Modules,Software Development :: Testing,Text Processing :: Markup :: HTML5 - Production/StableModel Serializer that supports MongoEngine, for Django Rest Framework.
rst2txt739Documentation,Internet :: WWW/HTTP :: Site Management,Printing,Software Development,Software Development :: Documentation,Text Processing,Text Processing :: General,Text Processing :: Markup,Utilities5 - Production/StableConvert reStructuredText to plain text
html734Software Development :: Code Generators,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTMLNo statussimple, elegant HTML, XHTML and XML generation
metadata-parser734Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTMLNo statusA module to parse metadata out of urls and html documents
fastdameraulevenshtein731Scientific/Engineering :: Bio-Informatics,Scientific/Engineering :: Information Analysis,Text Processing :: LinguisticNo statusCython implementation of true Damerau-Levenshtein algorithm.
pyrxp722Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: XML5 - Production/StablePython RXP interface - fast validating XML parser
headerparser720Communications :: Email,Communications :: Usenet News,Internet :: WWW/HTTP,Text Processing4 - Betaargparse for mail-style headers
area4712Software Development :: Libraries,Software Development :: Libraries :: Python Modules,System,Terminals,Text Processing,Utilities5 - Production/StableDividers in Python, the easy way!
kytea711Text Processing :: LinguisticNo statusA text analysis toolkit KyTea binding
endesive708Communications :: Email,Multimedia :: Graphics,Office/Business,Security :: Cryptography,Software Development :: Libraries :: Python Modules,Text Processing4 - BetaLibrary for digital signing and verification of digital signatures in mail, PDF and XML documents.
xmpppy704Communications,Communications :: Chat,Database,Internet,Software Development :: Libraries,System :: Networking,Text Processing,Utilities4 - BetaXMPP implementation in Python
flask-htmlmin703Internet :: WWW/HTTP :: Dynamic Content,Software Development :: Libraries :: Python Modules,Text Processing :: Markup :: HTMLNo statusMinimize render templates html
panphon695Software Development :: Libraries :: Python Modules,Text Processing :: LinguisticNo statusTools for using the International Phonetic Alphabet with phonological features
sentence-splitter688Database,Internet :: WWW/HTTP :: Indexing/Search,Text Processing :: Indexing,Text Processing :: Linguistic3 - AlphaText to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder
address683Software Development :: Libraries,Text ProcessingNo statusaddress is an address parsing library, taking the guesswork out of using addresses in your applications.
hanlp682Scientific/Engineering :: Artificial Intelligence,Text Processing :: Linguistic3 - AlphaHanLP: Han Language Processing
entry-points-txt675Software Development :: Libraries :: Python Modules,Text Processing4 - BetaRead & write entry_points.txt files
kafka-helper674Text Processing :: Linguistic4 - BetaMakes it easy to use the kafka-python library with Apache Kafka on Heroku
atoma671Software Development :: Libraries,Text Processing :: Markup :: XML4 - BetaAtom and RSS feed parser for Python 3
sumy665Education,Internet,Scientific/Engineering :: Information Analysis,Text Processing :: Filters,Text Processing :: Linguistic,Text Processing :: Markup :: HTML4 - BetaModule for automatic summarization of text documents and HTML pages.
rouge663Text Processing :: LinguisticNo statusFull Python ROUGE Score Implementation (not a wrapper)
codechat662Software Development :: Documentation,Text Processing :: Markup3 - AlphaThe CodeChat system for software documentation
pdfminer2660Text Processing5 - Production/StablePDF parser and analyzer
django-pagedown653Text EditorsNo statusA Django app that allows the easy addition of Stack Overflow's 'PageDown' markdown editor to a django form field
upsidedown650Text Processing5 - Production/Stable"Flip" characters in a string to create an "upside-down" impression.
inlinestyler645Communications :: Email,Text Processing :: Markup :: HTML5 - Production/StableInlines external CSS into HTML elements.
sphinxext-rediraffe639Documentation :: Sphinx,Documentation,Software Development :: Documentation,Text Processing,UtilitiesNo statusSphinx Extension that redirects non-existent pages to working pages.
plum-py633Text Processing2 - Pre-AlphaPack/Unpack Memory.
fortune625Games/Entertainment,Text Processing :: FiltersNo statusPython version of old BSD Unix fortune program
svgutils619Multimedia :: Graphics :: Editors :: Vector-Based,Scientific/Engineering :: Visualization,Text Processing :: Markup4 - BetaPython SVG editor
pynlpl618Text Processing :: Linguistic5 - Production/StablePyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used, for example, for the computation of n-grams, frequency lists and distributions, and language models. There are also more complex data types, such as Priority Queues, and search algorithms, such as Beam Search.
coverage2clover618Text Processing :: Markup :: XML,Software Development :: Quality Assurance,Software Development :: Testing5 - Production/StableA tool to convert python-coverage xml report to Atlassian Clover xml report format
py3langid616Scientific/Engineering :: Artificial Intelligence,Text Processing :: Linguistic3 - AlphaFork of the language identification tool, featuring a modernized codebase and faster execution times.
chardet2595Software Development :: Libraries :: Python Modules,Text Processing :: Linguistic4 - BetaCharacter encoding auto-detection in Python 3
pangu591Education,Software Development :: Internationalization,Software Development :: Libraries,Software Development :: Libraries :: Python Modules,Text Processing,Text Processing :: General,Text Processing :: Linguistic,Utilities5 - Production/StableParanoid text spacing for good readability, to automatically insert whitespace between CJK (Chinese, Japanese, Korean) and half-width characters (alphabetical letters, numerical digits and symbols).
naya587Software Development :: Libraries,Text Processing :: Markup4 - BetaA fast streaming JSON parser written in pure python
citeurl577Text Processing :: Filters,Text Processing :: Markup :: HTML,Text Processing :: Markup :: Markdown,Internet :: WWW/HTTP :: WSGI :: Application,Software Development :: Libraries :: Python Modules5 - Production/Stablean extensible tool to process legal citations in text
fake-bge-module-latest573Multimedia :: Graphics :: 3D Modeling,Multimedia :: Graphics :: 3D Rendering,Text Editors :: Integrated Development Environments (IDE)No statusCollection of the fake Blender Game Engine (BGE) Python API module for the code completion.
mkdocs-git-authors-plugin568Documentation,Text ProcessingNo statusMkdocs plugin to display git authors of a page
clam565Internet :: WWW/HTTP :: WSGI :: Application,Text Processing :: Linguistic4 - BetaTurns command-line NLP tools into fully-fledged RESTful webservices with an auto-generated web-interface for human end-users.
case-converter563Text Processing,Utilities5 - Production/StableA string case conversion package.
excel2json-3561Internet :: WWW/HTTP :: Indexing/Search,Software Development :: Libraries :: Python Modules,System :: Archiving :: Packaging,Text Processing1 - PlanningConvert MS Excel file formats to JSON
uniseg560Text Processing4 - BetaDetermine Unicode text segmentations
pystemmer559Database,Internet :: WWW/HTTP :: Indexing/Search,Text Processing :: Indexing,Text Processing :: Linguistic5 - Production/StableSnowball stemming algorithms, for information retrieval
tree-sitter547Software Development :: Compilers,Text Processing :: LinguisticNo statusPython bindings to the Tree-sitter parsing library
wikitextparser545Text ProcessingNo statusA simple parsing tool for MediaWiki's wikitext markup.
pysubs2525Software Development :: Libraries :: Python Modules,Text Processing :: Markup,Multimedia :: Video5 - Production/StableA library for editing subtitle files
gender524Text Processing,Text Processing :: Filters3 - AlphaGet gender from name and email address
coinaddrvalidator518Internet :: WWW/HTTP,Software Development,Security :: Cryptography,Text Processing,Utilities,Software Development :: Libraries :: Python Modules4 - BetaA crypto-currency address inspection/validation library.