Soo I’m working on a database that needs to support multiple languages (two for now, but who knows). I stumbled across this blog post that explains how to develop what it calls a “translation subschema” (haven’t seen it called like this anywhere else so I don’t know if it’s actually how you’d call it), which seems like a very nice way of dealing with things.

I’m not very experienced with DB stuff, so it took me a while to fully understand what it was doing, but now that (I think) I do, I’m wondering if I could just ignore the Languages table, and just use a language field in the tables TextContent and Translations, without loosing any functionality. (except of course having a table listing the available languages, which is not however something I’m interested in)

In my head everything would still work, I’d insert stuff with

INSERT INTO TextContent (OriginalText, OriginalLanguage)
VALUES ("Ciao", "it");

DECLARE TextContentId INT = SCOPE_IDENTITY();

INSERT INTO Translations (TextContentId, Language, Translation)
VALUES (@TextContentId, "it", "Ciao");
INSERT INTO Translations (TextContentId, Language, Translation)
VALUES (@TextContentId, "en", "Hello");

and given a TextContentId, i’d retrieve the correct translation with

SELECT Translation FROM Translations WHERE TextContentId = TCId AND Language = "en"

At this point, I’m thinking I could drop TextContent too, and just have a Translations table with TextContentId, Language, and Translation, with (TextContentId, Language) as primary key.

Am I missing something? I’m trying to simplify this solution but I don’t want to risk making random errors.

Edit: translations on the DB are for user inserted text, which will also provide translations. The client then will only receive text it the correct language.

  • talkingpumpkin@lemmy.world
    link
    fedilink
    arrow-up
    2
    arrow-down
    1
    ·
    1 day ago
    INSERT INTO TextContent (OriginalText, OriginalLanguage)
    VALUES ("Ciao", "it");
    

    Shouldn’t that be TextContent(TextContentId, OriginalText)? Something like

    (then you should make the id a primary key, index originaltext and make the id in the other table a foreign key)

    I could drop TextContent too, and just have a Translations table with TextContentId

    Sure, but the you would have to reference the text via TextContentId in your code, which would be very annoying.

    Instead you could have a function, say t("Ciao") that kinda runs something like (of course loading all the translations in ram at startup and referencing that would be better than running a query for each and every string).

    select t.translation
      from textcontent tc
           join translations t on t.textcontentid = tc.textcontentid
     where tc.originaltext = ?
       and t.language = ?
    

    The function could also return the originaltext and log an error message if a translation is not found.

    BTW 1: most frameworks/languages have i18n facilities/libraries - you may investigate one and use it instead of rolling your own.

    BTW 2: why would you put the translations in a database? what’s the advantage compared to files?

    • Eager Eagle@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      ·
      edit-2
      1 day ago

      Instead you could have a function, say t(“Ciao”) that kinda runs something like (of course loading all the translations in ram at startup and referencing that would be better than running a query for each and every string).

      this backfires when the same text translates to different strings depending on the context

      e.g. EN “Play” may translate to PT “Jogar” (as in play a game) or “Reproduzir” (as in play a video)

    • orsetto@lemmy.dbzer0.comOP
      link
      fedilink
      arrow-up
      2
      ·
      edit-2
      1 day ago

      Shouldn’t that be TextContent(TextContentId, OriginalText)? Something like

      (then you should make the id a primary key, index originaltext and make the id in the other table a foreign key)

      Yup, TextContentId is PK, and I’d set to auto_increment so that i wouldn’t have to worrya bout it.

      Sure, but the you would have to reference the text via TextContentId in your code, which would be very annoying

      Not really, the translations I keep in the database are for entity that would be retrueved from the client with already the translated values instead of ID (should have specified this in the post maybe)

      BTW1/2: the translations on the database are for entities on the db (stuff like user generated content, which will also input the translations). For general text on the page I’m using a specific framework, which would be difficult to use for the user generated content, at least how it’s set up now (it’s angular’s ngx-translate)

      • talkingpumpkin@lemmy.world
        link
        fedilink
        arrow-up
        1
        ·
        1 day ago

        the translations on the database are for entities on the db

        Oh, then you could consider having one extra table per entity (one-to-many) with the translatable stuff:

        create table some_entity (
            id .. primary key,
            -- fields for attributes that are not translated
            price ..,
            created_on ..,
            deleted_on ..,
            ..
        );
        create table some_entity_i18n(
            id .. primary key,
            some_entity_id .. foreign key references some_entity(id),
            locale ..,
            -- one field per translatable attribute
            title ..,
            description ..,
            ..
        );
        

        IMHO putting everything in one big table will only complicate things in the long run.