Word Segmentation Service for Vietnamese

Unlike English, Vietnamese is written syllable by syllable: spaces separate syllables rather than words, and a single word may span several syllables. When processing Vietnamese text, we cannot just split words on spaces and punctuation.

For example, âm and tính each mean something on their own, but together as âm tính they form a word with a totally different meaning ("negative", as in a test result).

We need to segment the text into words correctly before other analysis tasks can work properly.
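To make the problem concrete, here is a minimal Python sketch that compares plain whitespace splitting with a toy greedy longest-match segmenter. The lexicon, the function names and the sample sentence are all made up for illustration, and this is not how VnCoreNLP works internally; it only shows why syllables have to be grouped into words. Joining the syllables of a word with an underscore follows the output convention VnCoreNLP uses.

```python
# Toy lexicon of multi-syllable words (hypothetical, for illustration only).
LEXICON = {("âm", "tính"), ("xét", "nghiệm"), ("kết", "quả")}
MAX_WORD_SYLLABLES = max(len(word) for word in LEXICON)


def naive_split(text):
    """Split on whitespace, which yields syllables rather than words."""
    return text.split()


def greedy_segment(text):
    """Group syllables into words by greedy longest match against LEXICON."""
    syllables = text.split()
    words = []
    i = 0
    while i < len(syllables):
        longest = min(MAX_WORD_SYLLABLES, len(syllables) - i)
        # Try the longest candidate first, fall back to a single syllable.
        for n in range(longest, 0, -1):
            candidate = tuple(syllables[i:i + n])
            if n == 1 or candidate in LEXICON:
                words.append("_".join(candidate))
                i += n
                break
    return words


sentence = "kết quả xét nghiệm âm tính"  # "the test result is negative"
print(naive_split(sentence))     # 6 separate syllables
print(greedy_segment(sentence))  # 3 words: kết_quả, xét_nghiệm, âm_tính
```

A real segmenter has to deal with ambiguity that a fixed lexicon cannot resolve, which is why a trained model is worth its loading time.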

VnCoreNLP is one of the excellent libraries developed for that purpose, and I find it the most accurate. But it’s slow as hell when loading models, so I’ve created this service as a wrapper that loads the models only once.
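The load-once idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: `FakeSegmenter` is a stand-in that only simulates a slow model load, and `get_segmenter` and `handle_request` are hypothetical names, not the actual service code or the VnCoreNLP API.

```python
import time
from functools import lru_cache

LOAD_COUNT = 0  # counts how many times the "model" has been loaded


class FakeSegmenter:
    """Stand-in for a real model; not the VnCoreNLP API."""

    def __init__(self):
        global LOAD_COUNT
        LOAD_COUNT += 1
        time.sleep(0.1)  # simulate an expensive model load

    def segment(self, text):
        # Placeholder logic: a real segmenter would group syllables into words.
        return text.split()


@lru_cache(maxsize=1)
def get_segmenter():
    """Create the segmenter on first use and reuse it afterwards."""
    return FakeSegmenter()


def handle_request(text):
    """Hypothetical request handler that reuses the already loaded model."""
    return get_segmenter().segment(text)


for text in ["xin chào", "cảm ơn"]:
    handle_request(text)

print(LOAD_COUNT)  # 1: the load cost was paid only once
```

A long-running service gets the same effect by keeping the loaded model in memory for the lifetime of the process, so only the first start is slow.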

It’s a simple Spring Boot application. Check it out here: https://github.com/ndthuan/vi-word-segmenter.

A forked version of the library with some improvements: https://github.com/ndthuan/VnCoreNLP.

Pre-built Docker images: https://hub.docker.com/r/ndthuan/vi-word-segmenter.

Go client: https://github.com/ndthuan/go-vi-wordseg-client.

Pictures Taken at Youthcan! Launch Ceremony

So proud to have been selected as one of the volunteer photographers for SOS Children’s Villages Vietnam at the Youthcan! Launch Ceremony in Ho Chi Minh City on October 25th, 2019.

Youthcan! is a global initiative that was first launched in 2017. It aims to empower underprivileged young people by improving their existing skills and teaching them new ones, in order to enhance their employability as well as their salaries.

This long-term program cannot succeed without corporate partnerships. Therefore, the event in HCMC welcomed leaders and representatives from a number of regional and local companies. You could join too! Please reach out to SOS Children’s Villages Vietnam for more information.

From my side, it’s always a great honor to contribute my little support in spreading this positive energy. I want the subjects of my pictures to shine, and I hope someday I can take pictures of you doing good things. 👍

Have a look at some of the pictures I took (hosted on OneDrive):