VocaDB translations system



One of the unique features of VocaDB (at least I’m not aware of any other website that has this) is the option to choose your content display language: whether you want titles displayed and sorted in the original language, Romanized or English. Although it’s one of my favorite features, it’s still a compromise and leaves a lot of room for improvement.

First, it’s important to note that for VocaDB, user interface language and content language (titles of entries such as songs) are two completely different concepts and can be changed separately. A user might want to view the user interface in English, but display titles in the original language (not translated). Someone else might want to see translated titles whenever possible. User interface translations are built into the code, while entry titles are stored in a relational (SQL) database. The number of user interface strings is large, but there are still far more entry titles to translate. The user interface also doesn’t need to be user-editable, unlike the database content. This means that the user interface can be translated into more languages than the content.

Early in VocaDB’s development I had to make a choice: should the translations of the primary (display) name be entered in concrete languages such as Japanese, English and Chinese, or in “virtual languages”? A common problem in programming is that variable-length lists are difficult to implement efficiently. A small, fixed number of choices can be inlined, which is much more efficient when it comes to display and sorting. In database design it’s beneficial to minimize the number of joins, even if the tables are indexed. Additionally, not all names will be translated into all languages, so a fallback name would have to be generated, making the database queries even more complicated. Thus, I would’ve had to limit the list of languages to a very small number, basically Japanese, Romanized and English, but this would’ve left out Chinese, Korean and Spanish names. Instead, I made the third choice “non-English”, meaning the title in its native language. Other names can be entered as aliases, but aliases cannot be used for display or sorting.
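To make the trade-off concrete, here is a minimal sketch of the variable-length design described above, using Python’s built-in sqlite3 module. The table and column names (songs, song_names, original_language) and the sample rows are assumptions made up for this illustration, not VocaDB’s actual schema. It shows that simply listing titles in a chosen language already needs two joins plus a fallback expression.

```python
import sqlite3

# Hypothetical normalized schema: one row per (song, language) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE songs (
        id INTEGER PRIMARY KEY,
        original_language TEXT NOT NULL
    );
    CREATE TABLE song_names (
        song_id  INTEGER NOT NULL REFERENCES songs(id),
        language TEXT NOT NULL,
        value    TEXT NOT NULL,
        PRIMARY KEY (song_id, language)
    );
    INSERT INTO songs VALUES (1, 'Japanese'), (2, 'Japanese');
    INSERT INTO song_names VALUES
        (1, 'Japanese', 'メルト'),
        (1, 'English',  'Melt'),
        (2, 'Japanese', 'カンタレラ');
""")


def titles_in(language: str) -> list[tuple[int, str]]:
    """List (song id, display name) sorted by name in the requested language.

    Songs without a name in that language fall back to the name in their
    original language, which is what forces the second join.
    """
    return conn.execute(
        """
        SELECT s.id, COALESCE(wanted.value, fallback.value) AS display_name
        FROM songs AS s
        LEFT JOIN song_names AS wanted
               ON wanted.song_id = s.id AND wanted.language = ?
        LEFT JOIN song_names AS fallback
               ON fallback.song_id = s.id
              AND fallback.language = s.original_language
        ORDER BY display_name
        """,
        (language,),
    ).fetchall()


english_titles = titles_in("English")  # song 2 falls back to its Japanese name
korean_titles = titles_in("Korean")    # nothing entered, so every song falls back
```

With inlined name columns, the same listing becomes a single-table SELECT ordered by one column, with no joins and no COALESCE.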

Database model for songs and names

As can be seen above, names are inlined for the entities: there’s EnglishName, JapaneseName (that’s the non-English name) and RomajiName. There’s also the OriginalName field, which can be in any language depending on the default name language option. Matching artist strings (which contain the producers and vocalists) are also included. AdditionalNamesString is a comma-separated list of aliases. With these, it’s possible to get a list of all names for a song without joining the SongNames table. Additionally, names are automatically substituted when no name is available for a language option. For example, if only the non-English name is entered, the same name is copied to the RomajiName and EnglishName fields, so there’s no need for conditional operators when sorting songs by name.
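The substitution described above can be sketched roughly as follows. This is a simplified illustration in Python, not VocaDB’s actual code: the Song class, the fill_missing_names and all_names helpers, and the rule “copy the first name that was entered” are assumptions. Only the field names mirror the ones mentioned in the text.

```python
from dataclasses import dataclass


@dataclass
class Song:
    japanese_name: str = ""     # JapaneseName, the "non-English" slot
    romaji_name: str = ""       # RomajiName
    english_name: str = ""      # EnglishName
    additional_names: str = ""  # AdditionalNamesString, comma-separated aliases


NAME_FIELDS = ("japanese_name", "romaji_name", "english_name")


def fill_missing_names(song: Song) -> Song:
    """Copy the first entered name into every empty slot (assumed rule)."""
    entered = [getattr(song, f) for f in NAME_FIELDS if getattr(song, f)]
    if entered:
        for f in NAME_FIELDS:
            if not getattr(song, f):
                setattr(song, f, entered[0])
    return song


def all_names(song: Song) -> list[str]:
    """Every distinct name of a song, read from the song record alone."""
    aliases = [a.strip() for a in song.additional_names.split(",") if a.strip()]
    names = [getattr(song, f) for f in NAME_FIELDS] + aliases
    return list(dict.fromkeys(names))  # drop duplicates, keep order


songs = [
    fill_missing_names(Song(japanese_name="メルト", romaji_name="Meruto",
                            english_name="Melt")),
    fill_missing_names(Song(japanese_name="カンタレラ",
                            additional_names="Cantarella")),
]

# Because every slot is filled, sorting is a plain key lookup with no
# per-row "if this name is empty, use that one" logic.
by_english = sorted(songs, key=lambda s: s.english_name)
```

The cost of this approach is paid at write time: the copies have to be refreshed whenever a name is edited. Note also that a comma-separated alias string cannot represent an alias that itself contains a comma.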

I’d very much like all languages to be treated equally. The main problem with the current system is that it gives unfair preference to English and, to a lesser extent, Japanese. All names can be translated into English, and Japanese is used as the “de facto” non-English name for language-neutral concepts such as genres.

Would it be possible to allow entering multiple translations for the primary name? Yes, but it would require much more time spent on performance optimization. Maybe some day…
