Vibe Coding is Fast, But Fixing Chinese Localization Isn’t — Meet zhtw

Vibe Coding has changed how we write software. Cursor, GitHub Copilot, Claude — these AI tools have made development several times faster.

But there’s a catch for Taiwanese developers: AI-generated Chinese content often uses Mainland Chinese terminology.

This isn’t about character sets (Simplified vs Traditional). It’s about regional vocabulary differences.

For example:

  • Mainland: 軟件 (ruǎnjiàn) → Taiwan: 軟體 (ruǎntǐ) — both mean “software”
  • Mainland: 數據庫 (shùjùkù) → Taiwan: 資料庫 (zīliàokù) — both mean “database”
  • Mainland: 調用 (diàoyòng) → Taiwan: 呼叫 (hūjiào) — both mean “to call (a function)”

Same meaning, different words. And for Taiwanese developers, reading Mainland terminology in their codebase just feels… off.

I recently took over a Vibe Coding project. Opened it up: 400 files, filled with Mainland Chinese terminology.

Fix them by hand? That would take forever.

So I built zhtw.

zhtw: One Command, All Fixed

pip install zhtw

After installation, two commands cover the whole workflow: one reports Mainland terms, the other fixes them.

# Check
zhtw check ./src

# Auto-fix
zhtw fix ./src

That 400-file project? Fixed in seconds.

Key Insight: Vibe Coding speeds up development, zhtw speeds up localization. AI handles the output, you handle quality control, zhtw handles terminology conversion.

Want to preview changes before applying them? Use --dry-run:

zhtw fix ./src --dry-run

This lists all changes without modifying any files.

Why Not Just Use a “Simplified to Traditional” Converter?

Most Chinese conversion tools only handle character conversion, not terminology conversion.

What’s the difference?

Simplified | Character Conversion | zhtw (Terminology Conversion)
软件 | 軟件 | 軟體 ✅
数据库 | 數據庫 | 資料庫 ✅
调用 | 調用 | 呼叫 ✅
异步 | 異步 | 非同步 ✅
内存 | 內存 | 記憶體 ✅

zhtw includes 400+ curated terminology mappings, all manually reviewed for accuracy.

Key Insight: Terminology conversion ≠ character conversion. Taiwanese developers want “資料庫” (Taiwan term), not “數據庫” (Mainland term written in Traditional characters).
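
To make the difference concrete, here is a minimal sketch of the two kinds of conversion. The tiny `CHAR_MAP` and `TERM_MAP` tables and both function names are made up for illustration. They are not zhtw's actual data or API, and zhtw may not work in two separate stages like this.

```python
# Toy tables for illustration only; not zhtw's real dictionaries.
CHAR_MAP = {"软": "軟", "数": "數", "据": "據", "库": "庫", "调": "調"}
TERM_MAP = {"軟件": "軟體", "數據庫": "資料庫", "調用": "呼叫"}


def convert_characters(text: str) -> str:
    """Map each Simplified character to its Traditional form, one at a time."""
    return "".join(CHAR_MAP.get(ch, ch) for ch in text)


def convert_terms(text: str) -> str:
    """Replace whole Mainland terms with their Taiwan equivalents."""
    for mainland, taiwan in TERM_MAP.items():
        text = text.replace(mainland, taiwan)
    return text


simplified = "调用数据库软件"
traditional = convert_characters(simplified)
print(traditional)                 # 調用數據庫軟件 (Traditional characters, Mainland terms)
print(convert_terms(traditional))  # 呼叫資料庫軟體 (Taiwan terms)
```

Character conversion alone stops at the first printed line, which is exactly the output that still reads as Mainland usage.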

Conservative by Design: Better to Under-Convert Than Over-Convert

zhtw follows a conservative strategy: only convert terms that are explicitly defined in the terminology list.

Why? Over-conversion causes more problems than it solves.

For example, “權限” (permissions) is used in both regions. A tool that rewrites it to “許可權” breaks text that was already correct, so zhtw leaves such terms alone.

Key Insight: The conservative approach means you can safely run zhtw fix on your entire project without worrying about breaking correctly used terms.
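
The allowlist idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `TERMS` table, `find_changes`, and `fix_text` are hypothetical names, and zhtw's internals may look quite different. The point is that anything not listed, such as “權限”, passes through untouched.

```python
import re

# Hypothetical allowlist: only these terms are ever rewritten.
TERMS = {"數據庫": "資料庫", "調用": "呼叫", "軟件": "軟體"}

# Longest terms first, so a longer entry wins over a shorter one it contains.
_PATTERN = re.compile(
    "|".join(re.escape(term) for term in sorted(TERMS, key=len, reverse=True))
)


def find_changes(text: str) -> list[tuple[int, str, str]]:
    """List (offset, original, replacement) without modifying the text."""
    return [(m.start(), m.group(0), TERMS[m.group(0)]) for m in _PATTERN.finditer(text)]


def fix_text(text: str) -> str:
    """Rewrite only the terms that appear in the allowlist."""
    return _PATTERN.sub(lambda m: TERMS[m.group(0)], text)


sample = "檢查權限並調用數據庫"
print(find_changes(sample))  # [(5, '調用', '呼叫'), (7, '數據庫', '資料庫')]
print(fix_text(sample))      # 檢查權限並呼叫資料庫
```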

Use Cases

zhtw works beyond just code:

Vibe Coding Project Cleanup

AI-generated code, comments, and documentation — process everything at once.

Technical Documentation Localization

Convert Simplified Chinese docs to Taiwan terminology for open source contributions or internal use.

i18n String Files

Your zh-TW locale files have Mainland terms mixed in? zhtw can check and fix them.

Personal Knowledge Base

Batch-convert notes in Obsidian, Notion, or any markdown-based system.

Advanced Features

Exclude Files: .zhtwignore

Create a .zhtwignore file, similar to .gitignore, to exclude paths:

# .zhtwignore
vendor/
node_modules/
*.min.js
test-fixtures/
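
For a rough idea of how patterns like these can be matched, here is a small sketch that handles only two cases: directory names ending in `/` and filename globs. It is a simplified subset of .gitignore behavior written for illustration. `parse_ignore` and `is_ignored` are hypothetical helpers, and zhtw's real matching rules may differ.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def parse_ignore(text: str) -> list[str]:
    """Return the patterns in an ignore file, skipping blanks and comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Check a relative POSIX-style file path against the patterns."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: matches when any parent directory has that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pattern):
            # Glob pattern: matched against the file name only.
            return True
    return False


patterns = parse_ignore("# .zhtwignore\nvendor/\nnode_modules/\n*.min.js\ntest-fixtures/\n")
print(is_ignored("src/vendor/lib.py", patterns))  # True
print(is_ignored("src/app.min.js", patterns))     # True
print(is_ignored("src/app.js", patterns))         # False
```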

Inline Ignoring

Need to keep Simplified Chinese in specific places (e.g., test fixtures)? Use comments:

# Single line ignore
test_data = "软件"  # zhtw:disable-line

# Block ignore
# zhtw:disable
legacy_strings = {
    "软件": "software",
    "硬件": "hardware",
}
# zhtw:enable
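
If you are curious how directives like these can be honored, the sketch below filters out the lines a checker should skip. It is an assumption-laden illustration: `lines_to_check` is a hypothetical helper, and the real tool may parse comments more strictly than this plain substring test.

```python
def lines_to_check(source: str):
    """Yield (line_number, text) for lines not covered by an ignore directive."""
    disabled = False
    for number, line in enumerate(source.splitlines(), start=1):
        # Test disable-line first: it contains "zhtw:disable" as a prefix.
        if "zhtw:disable-line" in line:
            continue
        if "zhtw:enable" in line:
            disabled = False
            continue
        if "zhtw:disable" in line:
            disabled = True
            continue
        if not disabled:
            yield number, line


source = (
    'title = "軟件"\n'
    'test_data = "软件"  # zhtw:disable-line\n'
    "# zhtw:disable\n"
    'legacy = "硬件"\n'
    "# zhtw:enable\n"
    'note = "調用"\n'
)
print([number for number, _ in lines_to_check(source)])  # [1, 6]
```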

JSON Output

For CI integration or programmatic use:

zhtw check ./src --json

Outputs structured JSON for easy parsing.
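
A CI script could consume that output along these lines. This sketch deliberately assumes nothing about the JSON schema or the meaning of the exit code, since neither is described here. It only runs the command and parses standard output as UTF-8 JSON (the encoding is my assumption). `run_json_command` and `main` are hypothetical names.

```python
import json
import subprocess
import sys


def run_json_command(command: list[str]):
    """Run a command and parse its standard output as JSON."""
    result = subprocess.run(command, capture_output=True, text=True, encoding="utf-8")
    return result.returncode, json.loads(result.stdout)


def main() -> int:
    # Requires zhtw to be installed and on PATH.
    code, report = run_json_command(["zhtw", "check", "./src", "--json"])
    # Pretty-print the report; inspect it to decide how your CI job should react.
    json.dump(report, sys.stdout, ensure_ascii=False, indent=2)
    return code
```

Call `main()` from your CI entry point, then add whatever pass/fail logic fits the report structure you actually see.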

Technical Details

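zhtw uses the Aho-Corasick algorithm, which searches for every term in the dictionary during a single pass over the text. Matching time grows linearly with the input size, O(n), rather than with the number of terms, so it stays fast even on large projects.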
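
For readers who have not met Aho-Corasick before, here is a compact textbook-style sketch in Python. It is not zhtw's source code. `build_automaton` and `find_terms` are names I chose for illustration, and the term list is a toy one.

```python
from collections import deque


def build_automaton(terms):
    """Build trie transitions, failure links, and per-state outputs."""
    goto = [{}]  # goto[state][char] -> next state
    fail = [0]   # fail[state] -> state for the longest proper suffix in the trie
    out = [[]]   # out[state] -> terms that end at this state

    # Phase 1: insert every term into the trie.
    for term in terms:
        state = 0
        for ch in term:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(term)

    # Phase 2: breadth-first pass to fill in failure links.
    queue = deque(goto[0].values())  # depth-1 states fall back to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit terms that end at the suffix state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_terms(text, automaton):
    """Return (start_index, term) for every dictionary term found in text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for index, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for term in out[state]:
            matches.append((index - len(term) + 1, term))
    return matches


automaton = build_automaton(["軟件", "數據庫", "調用"])
print(find_terms("調用數據庫的軟件", automaton))
# [(0, '調用'), (2, '數據庫'), (6, '軟件')]
```

The text is read once from left to right. Failure links let the matcher fall back to the longest suffix that is still a prefix of some term instead of rescanning earlier characters.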

It runs completely offline — your code never leaves your machine. This is especially important for enterprise projects.

Conclusion

In the Vibe Coding era, AI writes most of our code. But quality control is still our responsibility.

zhtw solves a specific problem: making AI-generated content match Taiwanese developers’ terminology preferences.

If you work with Traditional Chinese and care about regional consistency, give it a try:

pip install zhtw

The project is open source. Contributions welcome: https://github.com/rajatim/zhtw

Questions or suggestions? Open an issue.

