cihai

Python library for CJK (Chinese, Japanese, Korean) character data. Look up readings, definitions, and variants from the UNIHAN database and beyond.

Quickstart

Install and make your first lookup in 5 minutes.

Topics

Features, examples, extending, troubleshooting.

API Reference

Every public class, function, and exception.

Datasets

UNIHAN and planned data sources.

Internals

Private APIs – no stability guarantee.

Contributing

Development setup, code style, release process.


Install

$ pip install cihai
$ uv add cihai

At a glance

from cihai.core import Cihai

c = Cihai()

if not c.unihan.is_bootstrapped:  # download and install UNIHAN to db
    c.unihan.bootstrap()

query = c.unihan.lookup_char('好')
glyph = query.first()
print(f"lookup for 好: {glyph.kDefinition}")
# lookup for 好: good, excellent, fine; well

query = c.unihan.reverse_char('good')
matches = ', '.join(g.char for g in query)
print(f'matches for "good": {matches}')
# matches for "good": 㑘, 㑤, 㓛, 㘬, 㙉, 㚃, ...
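
A reverse lookup such as the one above can return many matches. The following is a minimal sketch of trimming that output for display, assuming only that each result exposes a `char` attribute as in the example above. `format_matches` and its `limit` parameter are hypothetical names for illustration and are not part of cihai; the sample objects are stand-ins so the sketch runs without a bootstrapped database.

```python
from itertools import islice
from types import SimpleNamespace

_END = object()  # sentinel marking an exhausted iterator


def format_matches(glyphs, limit=5):
    """Join the `char` attribute of at most `limit` results.

    Hypothetical helper, not part of cihai. Appends "..." when
    further results were left out.
    """
    iterator = iter(glyphs)
    shown = [g.char for g in islice(iterator, limit)]
    # Peek one item further to see whether anything was truncated.
    if next(iterator, _END) is not _END:
        shown.append("...")
    return ", ".join(shown)


# Stand-in objects that mimic results with a `char` attribute.
sample = [SimpleNamespace(char=c) for c in "㑘㑤㓛㘬㙉㚃"]
print(format_matches(sample, limit=3))
# 㑘, 㑤, 㓛, ...
```

Because the helper only iterates its argument, passing it the `query` object from the example above should work the same way, though that is an assumption based on how the example iterates `query` directly.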

See Quickstart for detailed installation and first steps.