Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: bake translations as part of the build processes #28483

Merged
merged 18 commits into from
May 29, 2024

Conversation

mistercrunch
Copy link
Member

@mistercrunch mistercrunch commented May 13, 2024

There's been some confusion around the state of the translations in the repo as well as within different builds (docker, pypi, official releases,...).

In this PR, we assume that "compiled" or redundant files that can be generated don't belong in the repository, and should be baked as part of the build process.

Done here:

  • removed all translation json files from the repo, so the same way as .mo (binary files compile from .po files, and used by the backend / pybable), .json files (used by the frontend/jed) aren't in the repo anymore
  • updated CI / docs / docker builds to build the translation the right way, make scripts DRY and referenced the scripts in ci/docs/docker builds
  • moved po2json stuff for the frontend under a npm run build-translation script so it's bundled with the app in superset-frontend
  • created an empty language pack for en that's empty, as opposed to a large json file of about 200kb that has a map of empty translation, this has a positive impact on the bootstrap payload size in original page load for the en locale

@github-actions github-actions bot added i18n Namespace | Anything related to localization i18n:spanish Translation related to Spanish language i18n:italian Translation related to Italian language i18n:french Translation related to French language i18n:chinese Translation related to Chinese language i18n:japanese Translation related to Japanese language i18n:russian Translation related to Russian language i18n:korean Translation related to Korean language i18n:dutch i18n:slovak i18n:ukrainian i18n:portuguese i18n:brazilian i18n:traditional-chinese labels May 13, 2024
@mistercrunch mistercrunch changed the title feat: bake transactions into Docker images feat: bake transactions as part of the build processes May 13, 2024
@rusackas rusackas changed the title feat: bake transactions as part of the build processes feat: bake translations as part of the build processes May 13, 2024
@rusackas
Copy link
Member

rusackas commented May 13, 2024

Tweaked the title. :)

I fear that this won't cover all use cases (e.g. people running from the repo). I suppose this could be done in package.json as part of npm run build and (probably) npm run dev / npm run dev-server to cover all the bases. Those are the processes used when installing superset via docker or otherwise anyway, so it might be a nice catch-all.

@mistercrunch
Copy link
Member Author

I suppose this could be done in package.json

Oh I like the idea of bringing the frontend part in there, looks like there's already other translation-related stuff there we can centralize/simplify

@github-actions github-actions bot added doc Namespace | Anything related to documentation dependencies:npm github_actions Pull requests that update GitHub Actions code labels May 16, 2024
@mistercrunch mistercrunch marked this pull request as ready for review May 18, 2024 22:22

```bash
pybabel extract -F superset/translations/babel.cfg -o superset/translations/messages.pot -k _ -k __ -k t -k tn -k tct .
./scripts/translations/babel_update.sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just capturing a note for one of us to open a PR later. It seems that if we touch the .pot file, the .po and .json/.mo files have a huge diff, as things aren't ordered deterministically. It might be good to add a sorting step in the file output, via this script or some other place, e.g.:

JSON:
jq -S . input.json > output.json

PO:

import polib

po = polib.pofile('path_to_file.po')
po.save_as_mofile('path_to_file.mo')  # Save if needed
sorted_entries = sorted(po, key=lambda x: x.msgid)
po[:] = sorted_entries
po.save('path_to_file.po')

@rusackas
Copy link
Member

I just want to test this locally before approving, but this is looking great!!!

@michael-s-molina / @villebro I think this could be considered non-breaking, but do you have any thoughts/objections?

@@ -1128,6 +1128,9 @@ def test_get_column_names_from_metric(self):
"my_col"
]

@pytest.mark.skip(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dpgaspar I think you may have written those 2 tests before that I'm disabling here. I believe these tests were looking for a failure of some kind, but were (are at least currently were) picking up on the language pack entry for "Error message". Turns out the "bootstrap payload" is including the language pack for the current locale, and if it happens to be en it included a lot of empty strings with empty mappings. I'm now making sure that the en language pack has an empty payload as opposed to a map with empty targets.

Now i'm thinking these 2 unit tests are wrong, and maybe unneeded, so I'm skipping them here and sending this message in a bottle to figure out if we want to follow up with anything on those.

@mistercrunch
Copy link
Member Author

@michael-s-molina @villebro , looking to merge this - let me know if you see any breaking change. Note that I added an entry in UPDATING.md giving a heads up that translations are now bundled.

Copy link
Member

@rusackas rusackas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we're having a hard time getting more eyes on this, but from what I see, this seems like a good improvement. If there is anything that would be considered a breaking change here, it seems like it'll be patchable to address those issues before 5.0.

@mistercrunch mistercrunch merged commit 8d57a35 into master May 29, 2024
31 checks passed
@mistercrunch mistercrunch deleted the build_translations branch May 29, 2024 23:58
@mistercrunch
Copy link
Member Author

I think it comes down to - if you build from repo, you have to run an extra [documented] command now. Oh also for apache releases I added a command to run, so as long as the release manager follows instructions we're good.

EnxDev pushed a commit to EnxDev/superset that referenced this pull request May 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies:npm doc Namespace | Anything related to documentation github_actions Pull requests that update GitHub Actions code i18n:brazilian i18n:chinese Translation related to Chinese language i18n:dutch i18n:french Translation related to French language i18n:general Related to translations i18n:italian Translation related to Italian language i18n:japanese Translation related to Japanese language i18n:korean Translation related to Korean language i18n:portuguese i18n:russian Translation related to Russian language i18n:slovak i18n:spanish Translation related to Spanish language i18n:traditional-chinese i18n:ukrainian i18n Namespace | Anything related to localization preset-io size/XXL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants