Interface of v7, an AI-integrated keyboard. Photo: NVCC.
Speaking with Tri Thuc - Znews, Tri Duc (born in 2003) described his idea of using artificial intelligence to change how Vietnamese is typed. The v7 typing tool, which began as his student project, has since grown into a research paper accepted at IJCAI 2025, a prestigious AI conference.
Although they have been popular for decades, the Telex and VNI input methods still have many limitations in user experience. v7 was created as a lightweight predictive tool that uses AI to shorten the time needed to type Vietnamese.
Passion for languages and technology
His love of languages and technology led him to major in Applied Artificial Intelligence at Ho Chi Minh City University of Technology.
During his studies, he worked on projects such as a large language model (LLM) for Vietnamese, software for translating ethnic minority languages, and a chatbot to support admissions. “Those experiences helped me build a solid foundation of knowledge and nurtured my passion and desire to apply AI to create useful products for the community,” he shared.
Tri Duc wants to bring value to everyday life by applying AI. Photo: NVCC.
In addition, having studied Mandarin and Cantonese, Duc noticed parallels between pinyin/jyutping romanization and Vietnamese spelling. He also observed that, despite the complexity of Chinese characters, a pinyin input method needs only the two keys “yn” to produce the name of Vietnam in Chinese characters, whereas Telex or VNI needs 10 keys to type “Việt Nam”.
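The keystroke comparison can be checked with a small counter. This is a minimal sketch, assuming the common Telex conventions (doubled letters for â, ê, ô and đ, "w" for ă, ơ, ư, and one tone key typed at the end of the syllable). The function names `telex_syllable` and `telex_keystrokes` are hypothetical and are not part of v7 or of any Telex implementation.

```python
import unicodedata

# Combining tone marks and the Telex key that types each of them.
TONE_KEYS = {"\u0301": "s", "\u0300": "f", "\u0309": "r", "\u0303": "x", "\u0323": "j"}
# Combining vowel-quality marks: circumflex, breve, horn.
CIRCUMFLEX, BREVE, HORN = "\u0302", "\u0306", "\u031b"


def telex_syllable(syllable: str) -> str:
    """Return one plausible Telex key sequence for a single syllable."""
    keys, tone_key = [], ""
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_KEYS:
            tone_key = TONE_KEYS[ch]   # assumed to be typed once, at the end
        elif ch == CIRCUMFLEX:
            keys.append(keys[-1])      # â, ê, ô -> aa, ee, oo
        elif ch in (BREVE, HORN):
            keys.append("w")           # ă, ơ, ư -> aw, ow, uw
        elif ch == "đ":
            keys.append("dd")
        else:
            keys.append(ch)
    return "".join(keys) + tone_key


def telex_keystrokes(text: str) -> int:
    """Count keys for a phrase, including one space between syllables."""
    syllables = text.split()
    return sum(len(telex_syllable(s)) for s in syllables) + len(syllables) - 1


print(telex_syllable("Việt"))        # vieetj
print(telex_keystrokes("Việt Nam"))  # 10
```

Under these assumptions, “Việt Nam” costs 6 keys for “vieetj”, 1 for the space, and 3 for “nam”, which matches the 10 keys quoted above.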
Duc also observed that when communicating quickly, users often abbreviate words by keeping only the initial consonant of each syllable, such as “hs” for “học sinh” (student). “If humans can easily understand this writing style, AI can completely understand it if trained with the right data,” he said of how the idea came about.
Traditional input methods such as Telex and VNI work additively: the user types the full base letters and then adds the diacritics. v7 instead uses AI to suggest the intended word, aiming to predict the complete word from as few keystrokes as possible.
In Vietnamese orthography, a syllable consists of an initial consonant, a rhyme, and a tone. For example, “Nguyễn” is composed of “ng”, “uyên”, and the ngã tone. Based on this principle, the v7 typing engine is built to predict complete words from only the initial consonant and the tone, which significantly reduces the number of keystrokes while maintaining accuracy.
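The decomposition described above can be sketched in a few lines. This is an illustrative example rather than v7's actual code: the `decompose` function and its tables are hypothetical, and special cases such as “gi” and “qu” are handled only crudely.

```python
import unicodedata

# Combining tone marks mapped to the Vietnamese tone names.
TONE_MARKS = {"\u0301": "sắc", "\u0300": "huyền", "\u0309": "hỏi",
              "\u0303": "ngã", "\u0323": "nặng"}
# Initial consonants, longest first so that "ngh" wins over "ng" and "n".
INITIALS = ["ngh", "ch", "gh", "gi", "kh", "ng", "nh", "ph", "qu", "th", "tr",
            "b", "c", "d", "đ", "g", "h", "k", "l", "m", "n", "p", "r", "s",
            "t", "v", "x"]


def decompose(syllable: str) -> tuple[str, str, str]:
    """Split one syllable into (initial consonant, rhyme, tone name)."""
    tone = "ngang"  # the unmarked, level tone
    kept = []
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    # Recompose so vowel-quality marks (ê, ô, ơ, ...) stay attached to the letter.
    base = unicodedata.normalize("NFC", "".join(kept))
    initial = next((c for c in INITIALS if base.startswith(c)), "")
    return initial, base[len(initial):], tone


print(decompose("Nguyễn"))  # ('ng', 'uyên', 'ngã')
print(decompose("Việt"))    # ('v', 'iêt', 'nặng')
```

A prediction engine of the kind described would only need the first and third elements of each tuple as the typed key. The rhyme is the part the model has to fill in.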
The challenge of teaching Vietnamese to AI
According to Duc, the biggest challenge was teaching AI to "understand" Vietnamese well enough to power the typing tool. He tried many models before choosing GPT-2 as the foundation, since its Transformer architecture offers good context understanding and accurate word prediction.
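To illustrate how context can rank the words that match a typed key, here is a minimal sketch in which a toy bigram count table stands in for the language model. The names `BIGRAM_COUNTS`, `VOCAB`, and `suggest` are hypothetical, the counts are invented, and none of this reflects how v7 queries GPT-2 internally.

```python
from collections import Counter

# Invented counts of (previous word, next word) pairs, standing in for model scores.
BIGRAM_COUNTS = Counter({
    ("tiếng", "việt"): 50, ("tiếng", "vọng"): 8, ("tiếng", "vang"): 5,
    ("nước", "việt"): 20, ("nước", "vôi"): 3,
})
VOCAB = ["việt", "vang", "vọng", "vôi", "nam", "nước", "tiếng"]


def suggest(previous_word: str, initial: str, top_k: int = 3) -> list[str]:
    """Rank vocabulary words starting with `initial` by how well they follow the context."""
    # Simplification: a plain prefix test, so "n" would also match words in "ng" or "nh".
    candidates = [w for w in VOCAB if w.startswith(initial)]
    # Unseen pairs count as 0; the sort is stable, so ties keep vocabulary order.
    candidates.sort(key=lambda w: BIGRAM_COUNTS[(previous_word, w)], reverse=True)
    return candidates[:top_k]


print(suggest("tiếng", "v"))  # ['việt', 'vọng', 'vang']
```

A Transformer model plays the same role as the count table here, but it scores each candidate against the whole preceding sentence rather than only the previous word.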
After choosing the underlying architecture, Duc completely replaced the tokenizer (the component that maps text to vocabulary entries) with a Vietnamese vocabulary he built himself. He compiled all valid, correctly spelled Vietnamese words so that the model can handle, and predict, any word the user wants to write.
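A word-level vocabulary of this kind can be sketched as a simple lookup table. This is a minimal illustration, assuming one vocabulary entry per whitespace-separated syllable. The `SyllableTokenizer` class is hypothetical and is neither v7's tokenizer nor the API of any tokenizer library.

```python
import unicodedata


class SyllableTokenizer:
    """Toy word-level tokenizer: one ID per known syllable, 0 for anything unknown."""

    def __init__(self, syllables):
        normalized = {unicodedata.normalize("NFC", s.lower()) for s in syllables}
        self.id_to_token = ["<unk>"] + sorted(normalized)
        self.token_to_id = {tok: i for i, tok in enumerate(self.id_to_token)}

    def encode(self, text: str) -> list[int]:
        words = unicodedata.normalize("NFC", text.lower()).split()
        return [self.token_to_id.get(w, 0) for w in words]

    def decode(self, ids: list[int]) -> str:
        return " ".join(self.id_to_token[i] for i in ids)


tokenizer = SyllableTokenizer(["việt", "nam", "tiếng"])
ids = tokenizer.encode("Tiếng Việt")
print(tokenizer.decode(ids))  # tiếng việt
```

With a closed list of valid syllables, every prediction is a correctly spelled word by construction, which a subword vocabulary does not guarantee.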
Another challenge lay in balancing predictive performance against response speed: the model must run in real time on both computers and phones while remaining powerful enough to make good predictions. After two months of continuous testing, the current version ranks the intended word first for nearly 70% of the words users type, with a latency of just 0.03 seconds.
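The two figures quoted above correspond to top-1 accuracy and mean latency per prediction. A minimal sketch of how such numbers could be measured follows; the `evaluate` function, the toy predictor, and the sample data are all hypothetical and say nothing about the authors' actual evaluation setup.

```python
import time


def evaluate(predict, samples):
    """Return (top-1 accuracy, mean latency in seconds).

    `predict(context, key)` must return a ranked list of candidate words;
    `samples` is a list of (context, key, target word) triples.
    """
    hits, elapsed = 0, 0.0
    for context, key, target in samples:
        start = time.perf_counter()
        ranked = predict(context, key)
        elapsed += time.perf_counter() - start
        if ranked and ranked[0] == target:
            hits += 1
    return hits / len(samples), elapsed / len(samples)


def toy_predict(context, key):
    # Stand-in for a real model: a fixed ranking per key, ignoring context.
    table = {"v": ["việt", "và"], "n": ["nam", "người"]}
    return table.get(key, [])


samples = [("tiếng", "v", "việt"), ("việt", "n", "nam"), ("bạn", "v", "và")]
accuracy, latency = evaluate(toy_predict, samples)
print(f"top-1 accuracy: {accuracy:.2f}, mean latency: {latency:.6f} s")
```

On this toy data the predictor gets two of the three targets right, so the reported top-1 accuracy is 0.67.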
As for the keyboard's input scheme, Duc drew on the work of linguists such as Cao Xuan Hao and Henri Maspero, according to whom Vietnamese has not just 6 tones but 8. To take advantage of this, v7 uses an 8-tone system instead of the usual 6 (a level tone, ngang, plus 5 marked tones: sắc, huyền, hỏi, ngã, and nặng). On this keyboard, typing “v7” makes the model suggest the word “Việt”, which is also where the product name comes from.
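One common eight-tone analysis treats sắc and nặng on syllables ending in the stops -p, -t, -c, or -ch as two additional tones. The sketch below classifies a syllable on that basis, assuming this is the kind of split v7 relies on. The article does not say which digit v7 assigns to each tone, so the hypothetical `eight_tone_label` function returns descriptive labels rather than key numbers.

```python
import unicodedata

# Combining tone marks mapped to the six orthographic tone names.
TONE_MARKS = {"\u0301": "sắc", "\u0300": "huyền", "\u0309": "hỏi",
              "\u0303": "ngã", "\u0323": "nặng"}
STOP_FINALS = ("ch", "c", "p", "t")


def eight_tone_label(syllable: str) -> str:
    """Return one of eight tone labels for a single syllable."""
    decomposed = unicodedata.normalize("NFD", syllable.lower())
    tone = next((TONE_MARKS[ch] for ch in decomposed if ch in TONE_MARKS), "ngang")
    # Drop every combining mark to inspect the bare final consonant.
    letters = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if tone in ("sắc", "nặng") and letters.endswith(STOP_FINALS):
        return tone + " (stop-final)"
    return tone


print(eight_tone_label("Việt"))  # nặng (stop-final)
print(eight_tone_label("má"))    # sắc
```

Splitting the two stop-final tones gives each typed key a narrower set of matching syllables, which is presumably why an eight-way scheme helps prediction.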
After sharing v7 on social media, Duc said he was both happy and surprised at the attention and support the model received, and at how many people wanted to try it. "That gave me a clear sense of the need for a smarter and faster Vietnamese typing tool," he said.
The authors of the research paper. From left: Nhat Khang, Hieu Nghia, and Tri Duc. Photo: NVCC.
Currently, the keyboard is still at the prototype stage, with its source code open on GitHub so that programmers and tech-savvy users can test it and contribute. A full application for Windows and macOS is also in development so that general users can install and use it easily.
In the future, the top priority for v7 is the keyboard version on iPhone, to improve the way Vietnamese text is entered on smartphones. In addition, the model will be improved in accuracy by training more on daily conversation data, helping the AI better understand common contexts.
Duc's journey adds a fresh, creative contribution that keeps pace with technology trends at a time when Vietnam is investing heavily in AI infrastructure. One moment he is especially proud of is when v7 first produced a complete sentence. "That was when a small model, probably only 1/10,000 the size of ChatGPT today, could still think like a human," Duc said.
Source: https://znews.vn/ky-su-tre-dung-ai-thay-doi-cach-go-tieng-viet-post1552246.html