2024-10-09 22:44:40 +00:00
# KankenOnline
This project aims to provide a website that generates practice questions for Kanji Kentei level 1.
I draw on both definition data and numerous texts from Aozora Bunko to create each question style, 1 through 9.
## Running
- `wget https://huggingface.co/datasets/globis-university/aozorabunko-clean/resolve/main/aozorabunko-dedupe-clean.jsonl.gz` to get the Aozora data
- `gunzip aozorabunko-dedupe-clean.jsonl.gz` to extract the data into a single file, `aozorabunko-dedupe-clean.jsonl`
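
Once the data is extracted, it can be read one JSON object per line. The following is only a rough sketch of how that might look, not the code this project uses: `iter_records` and `find_passages` are hypothetical helper names, and the `text` field name is an assumption about the dataset layout that should be checked against the Hugging Face dataset card.

```python
import gzip
import json
from pathlib import Path
from typing import Iterator


def iter_records(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file.

    Works on the extracted .jsonl file and on the original .jsonl.gz download.
    """
    opener = gzip.open if Path(path).suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def find_passages(path: str, kanji: str, limit: int = 5,
                  text_field: str = "text") -> list[str]:
    """Collect up to `limit` lines of text that contain `kanji`.

    `text_field` is an assumed field name; records without it are skipped.
    """
    hits: list[str] = []
    if limit <= 0:
        return hits
    for record in iter_records(path):
        text = record.get(text_field) or ""
        for passage in text.splitlines():
            passage = passage.strip()
            if kanji in passage:
                hits.append(passage)
                if len(hits) >= limit:
                    return hits
    return hits
```

For example, `find_passages("aozorabunko-dedupe-clean.jsonl", "猫")` would return up to five lines containing that character, which could then serve as raw material for a question. Reading line by line keeps memory use low, since the whole corpus never has to be loaded at once.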
## Sources
- [Aozora Bunko cleaned corpus on GitHub](https://github.com/globis-org/aozorabunko-extractor?tab=readme-ov-file)
- [Hugging Face download](https://huggingface.co/datasets/globis-university/aozorabunko-clean)
# This early build is based on the Flask tutorial