Zalo is the pioneer in AI in Vietnam

Just one year after OpenAI launched ChatGPT, Vietnam produced a large language model (LLM) of its own.

Technology experts have been surprised by Zalo’s LLM. Within just six months, Zalo deployed a computing infrastructure and built a 7-billion-parameter LLM for the Vietnamese language.

As of late 2023, Zalo’s LLM had reached 150 percent of the capability of OpenAI’s GPT-3.5 as measured on VMLU (Vietnamese Multitask Language Understanding Benchmark Suite), which consists of 10,000 questions across 50 fields, including natural sciences, social sciences and STEM (Science, Technology, Engineering, and Mathematics).

Difficulties for Zalo’s engineers

Asked about the difficulties Zalo’s engineers had to cope with when training an LLM in Vietnamese, Nguyen Ba Dat, product director of Zalo AI, said there were three "lacks": the lack of computing infrastructure, the lack of data, and the lack of human resources.

Regarding computing infrastructure, Vietnamese engineers faced an unequal competition with their rivals.

While large corporations such as OpenAI and Meta have thousands of Nvidia’s newest GPUs (graphics processing units), Vietnam, at the time LLM training started, was still not fully equipped with the necessary server infrastructure.

Regarding the second "lack", Dat said it is more difficult to train an LLM in Vietnamese than in languages such as English and Chinese, because those languages have rich digitized data, while Vietnamese is listed among ‘low-resource’ languages, with a data resource tens of times ‘poorer’.

Compared with developed countries, Vietnam is also at a disadvantage in human resources and lacks experience in LLM training.

It was also necessary to have a powerful computing infrastructure. With considerable effort, within the second half of 2023 alone, Zalo built a system of eight DGX H100 servers, built on Nvidia’s newest GPUs.

While waiting for the GPUs from Nvidia, Zalo engineers took full advantage of the small consumer-grade GPUs they already had to carry out a series of research projects.

Thanks to this, by the time Zalo received the large computing infrastructure, its engineers had the knowledge and capability to start training the LLM immediately.

According to Dat, Zalo succeeded with its LLM thanks to a sound training-data strategy that compensated for the shortcomings of a ‘low-resource’ language.

Opportunities and challenges

Training an LLM is just the first step on the AI path. In addition to researching and training larger and better models, Zalo is applying its LLM to create value for users, which is the development team’s ultimate goal.

The applications could be smart chatbots that support customer care, or tools that help increase productivity and content creation capability.

Zalo has recently trialed the Kiki Giao Thong app, which is integrated into an Official Account on the Zalo platform. The app has been praised by the community for the outstanding accuracy of its answers to questions about violations of Vietnamese traffic laws.

“For Zalo AI engineers, challenge doesn’t mean ‘difficulty’, but ‘opportunity’ to do significant things. This not only boosts development, but also brings joy and motivation,” Dat said.

However, while feeling encouraged by the achievements, he said the development team still needs to make great efforts, because developing AI for Vietnam is a "path with great challenges".

Vietnam now ranks 59th out of 193 countries in the Government AI Readiness Index published by Oxford Insights. In the 2023 index, it moved up one place among ASEAN countries, ranking fifth out of the 10 countries in the region.

Zalo has been the pioneer in AI in Vietnam since 2017. It now runs four AI labs with 80 researchers and a strong infrastructure, including a server system of eight DGX H100 units that offers the leading processing capability in Vietnam, with performance of 256 petaFLOPS (peta floating-point operations per second).
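
The 256 petaFLOPS figure is consistent with simple multiplication across the eight systems. The sketch below is illustrative only. It assumes a peak of about 32 petaFLOPS per DGX H100, which is Nvidia's advertised FP8 figure and is not stated in this article. It also assumes perfect linear scaling, which real training workloads do not reach. The function and constant names are hypothetical.

```python
# Assumption (not from the article): each DGX H100 system peaks at about
# 32 petaFLOPS at FP8 precision, per Nvidia's published specification.
PETAFLOPS_PER_DGX_H100 = 32

# From the article: the Zalo system comprises eight DGX H100 units.
NUM_SYSTEMS = 8


def cluster_peak_petaflops(num_systems, per_system_petaflops):
    """Aggregate theoretical peak, assuming perfect linear scaling."""
    return num_systems * per_system_petaflops


total = cluster_peak_petaflops(NUM_SYSTEMS, PETAFLOPS_PER_DGX_H100)
print(f"{NUM_SYSTEMS} systems x {PETAFLOPS_PER_DGX_H100} petaFLOPS = {total} petaFLOPS")
```

This is a theoretical peak at low precision. Sustained throughput during LLM training is typically well below it.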

Zalo’s most outstanding AI products include the Kiki voice assistant, dictation and voice-to-text technology, text-to-speech technology, FaceID, eKYC (Electronic Know Your Customer) and GenAI (AI Avatar, AI Sticker).

Dau Linh