cihai - Python library for CJK (Chinese, Japanese, Korean) data

API / Library (this repository)

$ pip install --user cihai

from cihai.core import Cihai

c = Cihai()

if not c.unihan.is_bootstrapped:  # download and install Unihan to db
    c.unihan.bootstrap()

query = c.unihan.lookup_char('好')
glyph = query.first()
print("lookup for 好: %s" % glyph.kDefinition)
# lookup for 好: good, excellent, fine; well

query = c.unihan.reverse_char('good')
print('matches for "good": %s' % ', '.join(glyph.char for glyph in query))
# matches for "good": 㑘, 㑤, 㓛, 㘬, 㙉, 㚃, 㚒, 㚥, 㛦, 㜴, 㜺, 㝖, 㤛, 㦝, ...

See API documentation and /examples.
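
The kDefinition value printed above packs several senses into one string. As a minimal sketch, assuming the Unihan convention that semicolons separate major senses and commas separate glosses within a sense, the string can be split into nested lists. split_definition is a hypothetical helper for illustration, not part of cihai, and it does not handle commas inside parenthetical notes.

```python
def split_definition(definition):
    """Split a kDefinition string into major senses, each a list of glosses.

    Assumes ';' separates major senses and ',' separates glosses within a
    sense. split_definition is a hypothetical helper, not a cihai API.
    """
    senses = []
    for major in definition.split(";"):
        # Drop empty fragments left by stray or trailing separators.
        glosses = [g.strip() for g in major.split(",") if g.strip()]
        if glosses:
            senses.append(glosses)
    return senses


print(split_definition("good, excellent, fine; well"))
# [['good', 'excellent', 'fine'], ['well']]
```

The helper takes a plain string, so it can be applied to glyph.kDefinition from the lookup shown earlier.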

CLI (cihai-cli)

$ pip install --user cihai[cli]
# character lookup
$ cihai info 好
char: 好
kCantonese: hou2 hou3
kDefinition: good, excellent, fine; well
kHangul: 호
kJapaneseOn: KOU
kKorean: HO
kMandarin: hǎo
kTang: '*xɑ̀u *xɑ̌u'
kTotalStrokes: '6'
kVietnamese: háo
ucn: U+597D

# reverse lookup
$ cihai reverse library
char: 圕
kCangjie: WLGA
kCantonese: syu1
kCihaiT: '308.302'
kDefinition: library
kMandarin: tú
kTotalStrokes: '13'
ucn: U+5715
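
The CLI output above is a list of key: value lines, with ucn holding the Unicode code point of char. The following is a rough sketch of reading that output from Python, assuming one key: value pair per line and optional single quotes around values. parse_info and ucn_to_char are hypothetical names introduced here, not cihai functions.

```python
def parse_info(output):
    """Parse `key: value` lines into a dict, dropping surrounding quotes."""
    record = {}
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "$")):
            continue  # skip blanks, comments, and the shell prompt line
        key, sep, value = line.partition(": ")
        if sep:
            record[key] = value.strip().strip("'\"")
    return record


def ucn_to_char(ucn):
    """Convert a code point string such as 'U+597D' to its character."""
    return chr(int(ucn[2:], 16))


# A shortened copy of the `cihai info` output shown above.
sample = """\
char: 好
kDefinition: good, excellent, fine; well
kTotalStrokes: '6'
ucn: U+597D
"""

info = parse_info(sample)
print(info["kTotalStrokes"])  # 6
print(ucn_to_char(info["ucn"]) == info["char"])  # True
```

The quoted values suggest the output may be YAML; if so, a YAML parser would be the more robust choice, and this sketch only avoids the extra dependency.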


All datasets that cihai uses have stand-alone tools to export their data. No library required.


poetry is required for development.

git clone

cd cihai

poetry install -E "docs test coverage lint format"

Makefile commands prefixed with watch_ will watch files and rerun.
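
To illustrate the watch-and-rerun idea only, here is a small polling sketch in Python. It is a stand-in under stated assumptions, not how entr(1) or the project's Makefile is implemented; snapshot and changed_paths are hypothetical helpers.

```python
import os
import tempfile
from pathlib import Path


def snapshot(root, pattern="*.py"):
    """Map each matching file under root to its modification time (ns)."""
    return {str(p): p.stat().st_mtime_ns for p in Path(root).rglob(pattern)}


def changed_paths(before, after):
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )


with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "example.py"
    target.write_text("x = 1\n")
    before = snapshot(root)

    # Simulate an edit by bumping the modification time explicitly.
    stat = target.stat()
    os.utime(target, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    after = snapshot(root)

    modified = changed_paths(before, after)
    print([Path(p).name for p in modified])  # ['example.py']
```

A watcher would take a snapshot in a loop and rerun the test or build command whenever changed_paths returns a non-empty list.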


poetry run py.test

Helpers: make test

Rerun tests on file change: make watch_test (requires entr(1))


Default preview server: http://localhost:8035

To build, cd docs/ and run make html. Run make serve to start the HTTP server.

Helpers: make build_docs, make serve_docs

Rebuild docs on file change: make watch_docs (requires entr(1))

Rebuild docs and run server via one terminal: make dev_docs (requires above, and a make(1) with -j support, e.g. GNU Make)

Formatting / Linting

The project uses black and isort (one after the other) and runs flake8 via CI. See the configuration in pyproject.toml and setup.cfg:

make black isort: Run black first, then isort to handle import nuances

make flake8, to watch (requires entr(1)): make watch_flake8


As of 0.10, poetry handles virtualenv creation, package requirements, versioning, building, and publishing. Therefore there are no requirements files.

Update __version__ and the version in pyproject.toml:

git commit -m 'build(cihai): Tag v0.1.1'
git tag v0.1.1
git push
git push --tags
poetry build
poetry publish
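
Because the version is recorded in two places, a mismatch is easy to miss before tagging. The sketch below shows one possible pre-release check, assuming __version__ is assigned as a string literal and pyproject.toml has a version = "..." line. The helper names and the inlined file contents are hypothetical, and the file that holds __version__ is not named here, so the functions take text rather than paths.

```python
import re


def about_version(source):
    """Extract the value assigned to __version__ in Python source text."""
    match = re.search(r"""^__version__\s*=\s*['"]([^'"]+)['"]""", source, re.M)
    return match.group(1) if match else None


def pyproject_version(text):
    """Extract the first `version = "..."` line from TOML text."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, re.M)
    return match.group(1) if match else None


def release_tag(source, pyproject):
    """Return the git tag name if both versions agree, else raise."""
    a, b = about_version(source), pyproject_version(pyproject)
    if a is None or a != b:
        raise ValueError(f"version mismatch: {a!r} != {b!r}")
    return f"v{a}"


# Hypothetical file contents, inlined so the sketch runs on its own.
python_source = '__version__ = "0.1.1"\n'
pyproject_toml = '[tool.poetry]\nname = "cihai"\nversion = "0.1.1"\n'

print(release_tag(python_source, pyproject_toml))  # v0.1.1
```

The regular expressions keep the sketch free of dependencies; a TOML parser would be stricter about which table the version line belongs to.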