TOCFL

The Test of Chinese as a Foreign Language (TOCFL) (Chinese: 華語文能力測驗; pinyin: Huáyǔwén Nénglì Cèyàn) is a standardized test of Taiwanese Mandarin language proficiency for non-native speakers, including foreign students. Many vocabulary lists are available online, but a lot of them are incomplete, outdated, or behind paywalls.

This repo provides a dataset based on the following source, which is linked from the official TOCFL website:

coct.naer.edu.tw/download/tech_report

Excel Sheet

Vocabulary

Taiwan Chinese Language Proficiency Benchmark Vocabulary List_111-11-14.xlsx

The vocabulary list is especially useful: it gives frequency figures for both written and spoken usage. It also provides pinyin, which distinguishes entries that are written with the same characters but differ in meaning or pronunciation.
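As a rough illustration of why the pinyin column matters, the sketch below groups rows by word and reports the words that carry more than one reading. The field names `word` and `pinyin` and the sample rows are assumptions made for this example; they are not the actual column headers or contents of the spreadsheet, so adjust them after inspecting the file.

```python
from collections import defaultdict


def group_readings(rows):
    """Map each word to its distinct pinyin readings, in first-seen order.

    `rows` is an iterable of dicts with hypothetical keys "word" and "pinyin".
    """
    readings = defaultdict(list)
    for row in rows:
        word, pinyin = row["word"], row["pinyin"]
        if pinyin not in readings[word]:
            readings[word].append(pinyin)
    return dict(readings)


def ambiguous_words(rows):
    """Return only the words that appear with more than one reading."""
    return {w: p for w, p in group_readings(rows).items() if len(p) > 1}


# Hand-written sample rows for illustration; not taken from the spreadsheet.
sample_rows = [
    {"word": "行", "pinyin": "xíng"},
    {"word": "行", "pinyin": "háng"},
    {"word": "好", "pinyin": "hǎo"},
]

print(ambiguous_words(sample_rows))
```

With real data, the rows could come from any spreadsheet reader that yields one dict per row, as long as the keys are mapped to the names used here.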

Characters

Taiwan Chinese Language Proficiency Benchmark Chinese Character List_111-09-20.xlsx

Other

https://github.com/tomcumming/tocfl-word-list also provides TOCFL lists, but they appear to be incomplete or outdated, and the source used to compile them is not entirely clear.