Combining LaTeX, Jinja2 and Babel for a multi-version, multi-language Curriculum Vitae

jjpk.me

Some people simply cannot have just one CV. If you are open to several sectors and types of jobs, you will most likely end up with several variants of your CV. If, on top of that, you're open to opportunities in several countries, each of these variants will need to be translated into every language you might end up working in.

Having to edit all your variants and translations every time you want to add an item to your CV is quite cumbersome. Some elements will be common across your variants (e.g. your degrees), and even without that, keeping track of translations when you have to apply edits regularly can be a little tricky. In this article, we're looking at how I handle my own CV, combining LaTeX, Jinja2 templates and Babel/gettext message catalogs, and using a Python script for the build.

Two variants

I'm going to use my own needs as an example here. I need to maintain two variants of my CV: one for academia, and one that's more "corporate" for more techy positions. For languages, I need both versions translated into English and French.

For the variants, let's keep it simple:

  • The academic CV is two-column with widths at 50/50. It will include: education, teaching, publications and a research summary.
  • The corporate CV is also two-column, but at 30/60. It will mention skills in the small column, while education and experience will go into the larger one.

Note that education is common to both variants. They will also share an "about me" section which will serve as a header (with your name, address, phone number and so on).

Overview (and dependencies)

We'll be using a Python script to build the CVs. The Python dependencies are given below.

# requirements.txt
Jinja2
Babel

Here's the build process:

  1. The CV is broken into small sections, which we store in Jinja2 template files.
  2. Babel extracts translatable strings from the templates into a PO template (.pot) file and one catalog (.po) file per supported locale.
  3. You translate the .po files.
  4. The Python script compiles the catalogs and processes the templates to create one LaTeX source file per variant and per language.
  5. We simply compile each of these into a PDF (here, I'll be using LuaLaTeX).
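To make the "one source file per variant and per language" idea concrete, here is a minimal sketch that lists the files the build is expected to produce. The helper name build_targets is hypothetical; the variant names, locale codes and file naming scheme are the ones used later in this article.

```python
from itertools import product

VARIANTS = ("academic", "corporate")
LOCALES = ("en_GB", "fr_FR")


def build_targets(variants=VARIANTS, locales=LOCALES):
    """Return the .tex file name for every (variant, locale) pair."""
    return [f"{variant}_{locale}.tex"
            for variant, locale in product(variants, locales)]


if __name__ == "__main__":
    for name in build_targets():
        print(name)
```

Adding a third variant or a third language only means extending one of the two tuples.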

Structure and build

Now, let's go through the various components we need.

Directory structure and samples

Here's what I suggest for the project. For now, I'm leaving i18n out of this.

/
`- common/
|  `- parts/
|  |  `- aboutme.tex.j2
|  |  `- education.tex.j2
|  `- base.tex.j2
`- academic/
|  `- parts/
|  |  `- teaching.tex.j2
|  |  `- publications.tex.j2
|  |  `- research.tex.j2
|  `- academic.tex.j2
`- corporate/
|  `- parts/
|  |  `- skills.tex.j2
|  |  `- experience.tex.j2
|  `- corporate.tex.j2
`- build.py
`- requirements.txt

The root files are sitting right under academic/ and corporate/. These extend common/base.tex.j2, which contains basic LaTeX configuration (document class, packages, ...). The parts/ subdirectories contain the various "components" of our CVs: some of them are common to both variants, the others are variant-specific. These templates are included by academic.tex.j2 and corporate.tex.j2. Here are some sample files:

% base.tex.j2
\documentclass{article}

\begin{document}
\JINJA{block body}\JINJA{endblock}
\end{document}

But wait! you'll say. What's that \JINJA macro? Indeed! It is not a LaTeX command, and Jinja2 blocks are typically written as {% block body %}{% endblock %}. But here's the problem: LaTeX is full of curly braces, and those would conflict with Jinja2's syntax. Luckily for us, the template engine allows us to redefine its delimiters, so we will be writing \JINJA{ in place of {% and a plain } in place of %}. We'll see how this is set up once we get to the Python build script.

% academic.tex.j2
\JINJA{extends "common/base.tex.j2"}
\JINJA{block body}
\begin{minipage}[t]{.5\linewidth}
    \JINJA{include "common/parts/aboutme.tex.j2"}
    \JINJA{include "common/parts/education.tex.j2"}
    \JINJA{include "academic/parts/teaching.tex.j2"}
\end{minipage}%
\begin{minipage}[t]{.5\linewidth}
    \JINJA{include "academic/parts/research.tex.j2"}
    \JINJA{include "academic/parts/publications.tex.j2"}
\end{minipage}
\JINJA{endblock}

% corporate.tex.j2
\JINJA{extends "common/base.tex.j2"}
\JINJA{block body}
\begin{minipage}[t]{.3\linewidth}
    \JINJA{include "common/parts/aboutme.tex.j2"}
    \JINJA{include "corporate/parts/skills.tex.j2"}
\end{minipage}%
\begin{minipage}[t]{.6\linewidth}
    \JINJA{include "common/parts/education.tex.j2"}
    \JINJA{include "corporate/parts/experience.tex.j2"}
\end{minipage}
\JINJA{endblock}

% education.tex.j2
\section{Education}
\begin{itemize}
    \item Some degree
    \item Some other degree
    \item Yet another degree
\end{itemize}

The rest of the files are pretty similar, and well, most of it is up to you so I'll stop here.
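Before moving on to the full build script, here is a small self-contained check that the \JINJA{...} delimiters behave as described. It is only a sketch: the templates live in an in-memory DictLoader instead of on disk, the template name cv.tex.j2 and the title variable are made up for the example, and \JINJAVAR{...} is the variable delimiter used by the build script further down.

```python
from jinja2 import DictLoader, Environment

# Hypothetical in-memory templates, standing in for the files on disk.
TEMPLATES = {
    "base.tex.j2": "\n".join([
        r"\documentclass{article}",
        r"\begin{document}",
        r"\JINJA{block body}\JINJA{endblock}",
        r"\end{document}",
    ]),
    "cv.tex.j2": "\n".join([
        r'\JINJA{extends "base.tex.j2"}',
        r"\JINJA{block body}",
        r"\section{\JINJAVAR{title}}",
        r"\JINJA{endblock}",
    ]),
}

# Raw strings keep the backslashes literal.
env = Environment(
    block_start_string=r"\JINJA{",
    block_end_string="}",
    variable_start_string=r"\JINJAVAR{",
    variable_end_string="}",
    comment_start_string=r"\#{",
    comment_end_string="}",
    trim_blocks=True,
    autoescape=False,
    loader=DictLoader(TEMPLATES),
)

rendered = env.get_template("cv.tex.j2").render(title="Education")
print(rendered)
```

The ordinary LaTeX braces around \section{...} pass through untouched, because Jinja2 only reacts to the redefined start strings.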

The Python build script

Both Jinja2 and Babel are Python components. For this reason, it makes sense to script our build with that language, so we can use the full extent of those dependencies. I will explain the various parts of the script in the comments.

from jinja2 import Environment, FileSystemLoader
from os.path import dirname, abspath, basename, join
from os import makedirs
from subprocess import run, DEVNULL
import sys

BASE_PATH = dirname(abspath(__file__))
DIST_DIR  = 'dist'
DIST_PATH = join(BASE_PATH, DIST_DIR)


def build():
    # Create a Jinja2 environment. This is where the Jinja2 syntax is redefined.
    # Note that we're going a little further than just {% here and eliminating a
    # few other conflicts.
    env = Environment(
        # Raw strings: '\J' and '\#' are not valid Python escape sequences
        block_start_string=r'\JINJA{',
        block_end_string='}',
        variable_start_string=r'\JINJAVAR{',
        variable_end_string='}',
        comment_start_string=r'\#{',
        comment_end_string='}',
        line_statement_prefix='%%',
        line_comment_prefix='%#',
        trim_blocks=True,
        autoescape=False,
        loader=FileSystemLoader(BASE_PATH)
    )

    # Create the build directory if necessary
    makedirs(DIST_PATH, exist_ok=True)

    # Process each CV variant in turn
    for cv in ('academic', 'corporate'):
        # Get the base template for the variant (e.g. academic.tex.j2)
        template = env.get_template('%s/%s.tex.j2' % (cv, cv))
        # Render the template to a string
        rendered = template.render()

        # Write the resulting .tex source into a file
        tex_output_path = join(DIST_PATH, '%s.tex' % cv)
        with open(tex_output_path, 'w', encoding='utf-8') as output:
            output.write(rendered)

        # Call the LuaLaTeX compiler (or whichever you prefer) on this source file
        cmd = ['lualatex', '-interaction', 'nonstopmode',
               basename(tex_output_path)]
        status = run(cmd, cwd=DIST_PATH, stdout=DEVNULL, stderr=DEVNULL)

        # Check the result
        if status.returncode != 0:
            return 1

    return 0


if __name__ == "__main__":
    sys.exit(build())

Now, if we run this script, it will indeed generate our two variants in dist/ : academic.pdf and corporate.pdf.
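One drawback of sending the compiler output to DEVNULL is that a failed build tells you nothing about what went wrong. As a possible variation (not what the script above does), the sketch below captures the output and prints its last lines when the command fails. The helper name compile_tex and the 20-line tail are arbitrary choices.

```python
import subprocess
import sys


def compile_tex(cmd, cwd=None, tail_lines=20):
    """Run a compiler command; on failure, print the end of its output."""
    result = subprocess.run(
        cmd,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        encoding="utf-8",
        errors="replace",  # LaTeX logs may contain non-UTF-8 bytes
    )
    if result.returncode != 0:
        tail = result.stdout.splitlines()[-tail_lines:]
        print("\n".join(tail), file=sys.stderr)
    return result.returncode == 0
```

In the build loop, this could be called as compile_tex(['lualatex', '-interaction', 'nonstopmode', 'academic.tex'], cwd=DIST_PATH), returning 1 from build() when it yields False.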

Translation (i18n)

Right now, all we have is one CV per variant. To also get one file per language, we need to add translation logic into all of this. To get there, we will be using Babel, which in turn uses gettext. Here's how it typically goes:

  1. In our templates, we isolate our translatable text so that it may be extracted by Babel/gettext.
  2. With that we generate PO files, which we then translate.
  3. We use Babel/gettext again, this time to compile the catalogs (MO files).
  4. We adjust our build script so that it can find the MO catalogs and process templates accordingly.

The templates

There is a Jinja2 extension to handle i18n, so we won't have much to do here. The usual syntax is {% trans %}This is translatable.{% endtrans %} but we must not forget our earlier syntax changes!

% education.tex.j2
\section{\JINJA{trans}Education\JINJA{endtrans}}
\begin{itemize}
    \item \JINJA{trans}Some degree\JINJA{endtrans}
    \item \JINJA{trans}Some other degree\JINJA{endtrans}
    \item \JINJA{trans}Yet another degree\JINJA{endtrans}
\end{itemize}

How you split your text into translatable units is really up to you, but I would suggest avoiding very long sentences/blocks.
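To see the trans blocks at work without any catalog files, here is a minimal sketch that plugs a plain dictionary in as the translation source. The FRENCH dictionary and the two helper functions are stand-ins invented for this example; the real build uses compiled Babel catalogs instead, as shown further down.

```python
from jinja2 import Environment

env = Environment(
    block_start_string=r"\JINJA{",
    block_end_string="}",
    variable_start_string=r"\JINJAVAR{",
    variable_end_string="}",
    comment_start_string=r"\#{",
    comment_end_string="}",
    extensions=["jinja2.ext.i18n"],
)

# Hypothetical stand-in for a message catalog.
FRENCH = {"Education": "Éducation", "Some degree": "Un diplôme"}


def gettext(message):
    """Return the translation, or the original string if there is none."""
    return FRENCH.get(message, message)


def ngettext(singular, plural, n):
    """Plural-aware variant, required by the i18n extension."""
    return gettext(singular if n == 1 else plural)


env.install_gettext_callables(gettext, ngettext)

template = env.from_string(r"\section{\JINJA{trans}Education\JINJA{endtrans}}")
print(template.render())
```

A string that is missing from the dictionary is rendered as written in the template, which mirrors how gettext falls back to the original text.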

Configuring Babel for extraction

Now that our templates are ready, we can ask Babel to extract our translatable strings and gather them into PO files. There's an issue though: we may or may not have completely changed the Jinja syntax. In other words: the default Babel configuration will look for {% trans %} blocks and will not detect our contents!

Luckily for us, Babel plays nice, and will work with a modified Jinja syntax, provided that you describe it, much like we did in the Python build script. To do this, we need a Babel mapping file. I'll be putting it into a new common/locale directory.

# common/locale/babel.ini
[jinja2: **.j2]
extensions=jinja2.ext.i18n
block_start_string=\JINJA{
block_end_string=}
variable_start_string=\JINJAVAR{
variable_end_string=}
comment_start_string=\#{
comment_end_string=}
line_statement_prefix=%%
line_comment_prefix=%#

The first line tells Babel to process Jinja templates (*.j2 files). To do so, it needs to load the Jinja2 i18n extension. Note that jinja2.ext.autoescape and jinja2.ext.with_, which you may see in older configurations, have been built into Jinja2 since 2.9 and no longer exist as separate extensions from 3.1 onwards, so they must not be listed with a current Jinja2. The other lines are basically the same as those we have in the build script! We're simply telling Babel how to read our Jinja. Now that this is done, we can have our strings extracted and readied for translation:

$ pybabel extract -F common/locale/babel.ini -o common/locale/messages.pot .
$ pybabel init -i common/locale/messages.pot -d common/locale -l en_GB
$ pybabel init -i common/locale/messages.pot -d common/locale -l fr_FR

The first command will create a template PO file, while the other two actually create the translatable catalogs (.po files). Once you've run all that, you should get the following structure:

/
`- common/
|  `- locale/
|  |  `- babel.ini
|  |  `- messages.pot
|  |  `- en_GB/
|  |  |  `- messages.po
|  |  `- fr_FR/
|  |  |  `- messages.po
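If messages.pot comes out empty, the usual suspect is a mismatch between the mapping file and the syntax used in the templates. A quick way to check the syntax side from Python is the extract_translations method that the i18n extension adds to the environment. The sketch below runs it on an inline sample rather than on the real files; the SOURCE string is just the education template from above, shortened.

```python
from jinja2 import Environment

env = Environment(
    block_start_string=r"\JINJA{",
    block_end_string="}",
    variable_start_string=r"\JINJAVAR{",
    variable_end_string="}",
    comment_start_string=r"\#{",
    comment_end_string="}",
    extensions=["jinja2.ext.i18n"],
)

SOURCE = "\n".join([
    r"\section{\JINJA{trans}Education\JINJA{endtrans}}",
    r"\begin{itemize}",
    r"    \item \JINJA{trans}Some degree\JINJA{endtrans}",
    r"\end{itemize}",
])

# Each entry is a (line number, function name, message) tuple.
found = list(env.extract_translations(SOURCE))
messages = [message for _lineno, _function, message in found]
print(messages)
```

If the strings show up here but not in messages.pot, the problem is more likely in babel.ini or in the path given to pybabel extract.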

Translating and compiling message catalogs

Since the "raw" text in our templates is already in English, I will leave the en_GB catalog untranslated: when an entry has no translation, gettext falls back to the original string. The fr_FR catalog does need translating, though:

msgid "Education"
msgstr "Éducation"

msgid "Some degree"
msgstr "Un diplôme"

msgid "Some other degree"
msgstr "Un autre diplôme"

msgid "Yet another degree"
msgstr "Encore un autre diplôme"

When all the translating is done, you can compile your catalogs with:

$ pybabel compile -d common/locale

This will create binary .mo files alongside the readable .po ones. Those will be used by Babel and Jinja to translate our CV.
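To get a feel for what a compiled catalog provides at run time, the following sketch builds a tiny catalog in memory with Babel, compiles it to the MO format and loads it back. It is an illustration only: the real workflow goes through the .po and .mo files on disk, and the single entry here is taken from the fr_FR example above.

```python
from io import BytesIO

from babel.messages.catalog import Catalog
from babel.messages.mofile import write_mo
from babel.support import Translations

# Build a one-entry catalog instead of reading a .po file.
catalog = Catalog(locale="fr_FR")
catalog.add("Education", "Éducation")

# Compile it to the binary MO format, in memory.
buffer = BytesIO()
write_mo(buffer, catalog)
buffer.seek(0)

# Load the compiled catalog, as the build script does from disk.
translations = Translations(fp=buffer)

print(translations.gettext("Education"))
print(translations.gettext("Some degree"))
```

The second lookup has no entry in the catalog, so gettext returns the original string unchanged.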

Adjusting the build script

In order to get our translations into the LaTeX source files, we need to make a few adjustments to our build script. Again, the details are in the comments.

from babel.support import Translations

# ...

LOCALE_PATH = join(BASE_PATH, 'common', 'locale')
LOCALES     = ['en_GB', 'fr_FR']

# ...

def build():
    # Activating the Jinja i18n extension. The autoescape and with_ extensions
    # are built into Jinja2 (2.9+) and can no longer be imported from 3.1 on.
    env = Environment(
        # ...
        extensions=['jinja2.ext.i18n']
    )

    # ...

    # Process each locale and each CV variant in turn
    for locale in LOCALES:
        # Load the translations for the current locale
        translations = Translations.load(LOCALE_PATH, [locale])
        env.install_gettext_translations(translations)

        for cv in ('academic', 'corporate'):
            # Get the base template for the variant (e.g. academic.tex.j2)
            template = env.get_template('%s/%s.tex.j2' % (cv, cv))

            # Render the template to a string
            rendered = template.render()

            # Write the resulting .tex source into a file
            tex_output_path = join(DIST_PATH, '%s_%s.tex' % (cv, locale))
            with open(tex_output_path, 'w', encoding='utf-8') as output:
                output.write(rendered)

            # ...

Now, whenever an update is made, the following needs to happen:

# Update the PO template
$ pybabel extract -F common/locale/babel.ini -o common/locale/messages.pot .

# Update the associated PO catalogs
$ pybabel update -i common/locale/messages.pot -d common/locale -l en_GB
$ pybabel update -i common/locale/messages.pot -d common/locale -l fr_FR

# ... translate new entries...

# Compile the PO catalogs
$ pybabel compile -d common/locale

# Build the CVs
$ python3 build.py

This will update the corresponding PDFs in dist/: academic_en_GB.pdf, academic_fr_FR.pdf, corporate_en_GB.pdf and corporate_fr_FR.pdf. All that's left for you to do is to send them!

Bonus: you can actually have all of this integrated into the Python script (after all, subprocess.run is already imported). This, of course, makes the script a little lengthy, so I'm not going to put it here. The script I'm using is, however, available as a GitLab snippet.
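For the curious, here is a rough sketch of what that integration could look like. It is not the script from the snippet: the function names are invented for this example, and it simply replays the pybabel commands listed above through subprocess.run.

```python
import subprocess

LOCALE_DIR = "common/locale"
MAPPING = f"{LOCALE_DIR}/babel.ini"
POT_FILE = f"{LOCALE_DIR}/messages.pot"


def catalog_commands(locales):
    """Build the pybabel command lines: extract, update per locale, compile."""
    commands = [["pybabel", "extract", "-F", MAPPING, "-o", POT_FILE, "."]]
    for locale in locales:
        commands.append(["pybabel", "update", "-i", POT_FILE,
                         "-d", LOCALE_DIR, "-l", locale])
    commands.append(["pybabel", "compile", "-d", LOCALE_DIR])
    return commands


def refresh_catalogs(locales):
    """Run each command in order; check=True stops at the first failure."""
    for cmd in catalog_commands(locales):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    refresh_catalogs(["en_GB", "fr_FR"])
```

Keep in mind that compiling straight after an update means any new entries are still untranslated, so they will show up in English until the .po files have been edited and the script is run again.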