Many candidates commented that the math questions were long and difficult to complete in 90 minutes. However, this was not an obstacle for AI. Photo: Duy Hieu.
On the afternoon of June 26, candidates completed the math test in the 2025 high school graduation exam, with a time limit of 90 minutes. This was the first test after the Ministry of Education and Training applied a new format, which is said to be more difficult than in previous years.
This year's math problems may be difficult for candidates because they are long and time-consuming, but AI chatbots need little time to process them. To test how well AI performs, Tri Thuc - Znews used four chatbots (ChatGPT, Google Gemini, Claude AI and Grok AI) to solve some of the short-answer questions in this year's high school graduation exam.
Fast processing, "hit or miss" results
The chatbots were asked to answer the short-answer questions of test code 0109. ChatGPT and Gemini gave the most correct results with the least delay: both answered 6 questions, taking 7-15 seconds per question. Notably, Gemini solved these problems with the 2.5 Flash model (no reasoning), which helps it respond quickly and comprehensively.
Meanwhile, Claude completely failed at its calculations, giving all the wrong answers. Despite being asked to recalculate, Anthropic’s chatbot kept giving the same answers. Grok answered about half of the questions correctly, but with a long response time (over 2 minutes per question).
For ChatGPT and Grok, solving these questions requires the reasoning version of the model, which takes much longer. Gemini is very fast, taking only about 5 seconds on its fastest question while using only the 2.5 Flash model.
ChatGPT presents the thought process very vividly.
In terms of speed, Gemini was the fastest, averaging less than 10 seconds per problem, but its solutions were more complex, wordy, and difficult to follow. Next was ChatGPT's reasoning model, which averaged 25 seconds. Grok also reached correct results but spent a long time reasoning, taking 148 seconds on a moderately difficult question.
Although the questions were asked in Vietnamese, the three models (ChatGPT, Gemini and Grok) all presented their reasoning process in English. ChatGPT had the briefest description, with many illustrations, graphs, and easy-to-understand analysis. Gemini also laid out the model's thinking clearly and in order.
Grok, in particular, has the most human-like thought process. The model constantly uses phrases such as “however,” “wait,” and “on the other hand” to question itself, much like a student would when solving a math problem. This can cause the chatbot to overthink the problem and slow down its response time.
It took Grok 148 seconds to elaborate on its results.
AI solves math differently than humans
A study from Apple found that reasoning models do not actually reason, but instead rely on rote learning from available data. The study also suggests that AI has a completely different thought process from humans and only tries to imitate the way we solve a problem. It is even possible that the reasoning process a model displays is just a fabrication of the model.
As the high school graduation exam becomes increasingly difficult and demands strong analytical thinking, using AI for reference and learning is no longer unusual for students. Among the chatbots tested above, ChatGPT and Gemini are two suitable options for self-learners who want to consult solutions to difficult problems.
Hanoi students in the 2025 high school graduation exam. Photo: Viet Ha.
However, although AI produces results quickly and easily, its reasoning process is not yet fully understood by developers. In an academic environment, human thinking ability is still the core factor. Mr. Tuan Nguyen, a lecturer at an international university in Ho Chi Minh City, said that using AI is normal, but students need to understand the lesson, practice critical thinking skills and master smart tools to study more effectively.
Mr. Tran Manh Tung, Head of the Mathematics Department at Newton Secondary School, commented that the exam was similar in format to the sample exam previously released by the Ministry of Education and Training. “However, if we put it on the scale, the real exam was more difficult and more differentiated than the mock exam,” he said.
This year’s exam consists of three parts, numbered with Roman numerals. The first two parts are multiple choice and not too difficult, so candidates can score points easily, said Mr. Tung. The remaining part, however, consists of short-answer questions, similar to the essay format from many years ago, except that candidates only need to fill in the final results and do not need to show their work.
Source: https://znews.vn/ai-chi-mat-10-giay-de-giai-bai-toan-thi-tot-nghiep-thpt-post1563990.html