
CMC Technology Application Institute (CMC ATI) recently announced CATI-VLM, a visual document understanding model that its research team developed from a 5 TB data warehouse. The model ranked in the Top 12 worldwide and first in Vietnam in the Document Visual Question Answering (DocVQA) category of the Robust Reading Competition (RRC) rankings announced in June 2025.
Mr. Dang Minh Tuan, Director of CMC ATI, shared: "We are very happy that the research capacity of the CMC team has been affirmed in a prestigious global arena like RRC. We are proud that, in just a short time, the team was able to achieve a high ranking, standing shoulder to shoulder with big names from developed countries. More importantly, this clearly demonstrates our ability to master technology to solve problems specific to the Vietnamese language and to specialized fields in Vietnam."
As digital and AI transformation accelerates in Vietnam, OCR (Optical Character Recognition) technology plays an increasingly important role in digitizing documents, automating business processes, cutting costs, and improving management efficiency.
However, because Vietnamese uses diacritics and documents are often handwritten, recognition cannot stop at "reading words"; the model must also understand the context as a whole.
CATI-VLM differs from traditional OCR in that it not only extracts characters but also understands multiple layers of information: text content, non-text elements (tick boxes, checkboxes, charts, signatures, formulas), layout (page structure, tables, forms), and style (fonts, highlights, and so on).
The model can answer questions posed about document images, in a manner similar to ChatGPT, without needing to learn specific forms in advance.

The Robust Reading Competition (RRC) is a prestigious scientific competition organized by the Computer Vision Center of the Universitat Autònoma de Barcelona (UAB), Spain, a leading research center in the field of computer vision.
Launched in 2011 and held alongside the International Conference on Document Analysis and Recognition (ICDAR), one of the world's largest forums on document analysis and computer vision, the competition has become an important event that attracts researchers and engineers from prestigious universities, research institutes, and technology companies such as Tsinghua University, Hyundai Motor Group, and Tencent.
RRC's tasks are designed to drive technological advancement and are grounded in practical problems ranging from translation and enterprise data management to urban analytics and historical document processing.

Source: https://vietnamnet.vn/ai-loi-make-in-vietnam-cua-cmc-duoc-xep-hang-top-12-the-gioi-2417479.html