Partial colouring of text in Matplotlib with LaTeX

Matplotlib doesn't natively support complex formatting of strings, such as partially colouring a label or bolding some words in an annotation. As discussed in the previous post, Complex formatting in Matplotlib, the reason is that Matplotlib's underlying text object handles a string as a whole. The previous post showed how to use LaTeX to perform complex formatting. This post builds on it to cover colouring parts of a string, first using LaTeX and then using Matplotlib's standard objects.

Imagine an annotation along these lines:

This text is one string, this is red, this is green, this is bold and this is italic.

Using LaTeX we can define the font and mark text as bold or italic, but adding colour to our comment is a new challenge. Possible solutions people try are:

  1. Split the sentence into multiple text objects, one for each element that's coloured, and then stack them vertically or horizontally [1]
  2. Use the PostScript backend and LaTeX's \color command, then convert the output to SVG with Inkscape [2]
  3. Use the PGF backend and LaTeX's \color command, then convert the output to SVG with Inkscape
  4. Create the text directly in Inkscape or a similar external application

For a single comment it's easiest to take the last option and simply edit the plot in Inkscape. This post shows two of the others: LaTeX with the PGF backend, and multiple Matplotlib text objects. The advantage of LaTeX is that it supports every sort of formatting, but it is another system to set up, which adds complexity to plot creation. Multiple Matplotlib text objects can also produce all the formatting types, but the approach is very low-level and fiddly.

Partial colouring with LaTeX

Figure 1: LaTeX partial colour example

I'm doing partial colouring with LaTeX using the PGF backend rather than the standard PostScript one; the details of how to set it up are in the previous post. The constraint of this method is that the output is a PDF, which we then convert to SVG manually.
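The manual conversion step could also be scripted. The following is a minimal sketch, assuming Inkscape 1.x is installed and on the PATH (older 0.9x releases used different command-line flags). The helper name inkscape_pdf_to_svg_cmd is hypothetical and not part of the original workflow.

```python
import os
import shutil
import subprocess


def inkscape_pdf_to_svg_cmd(pdf_path, svg_path, inkscape='inkscape'):
    # Hypothetical helper: builds the command for the Inkscape 1.x CLI.
    # Inkscape 0.9x used different flags, so this is an assumption about
    # the installed version.
    return [inkscape, '--export-type=svg',
            '--export-filename=' + svg_path, pdf_path]


pdf_file = '20160314colour1.pdf'
cmd = inkscape_pdf_to_svg_cmd(pdf_file, '20160314colour1.svg')

# Only run the conversion when both Inkscape and the PDF are available
if shutil.which(cmd[0]) and os.path.exists(pdf_file):
    subprocess.run(cmd, check=True)
else:
    print('Skipping conversion; would run:', ' '.join(cmd))
```

Building the command as a list, rather than a single shell string, avoids quoting problems if a file name contains spaces.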

LaTeX supports colours [3]; it's just a matter of choosing which package to use. The xcolor package has the most flexibility, so we load it as part of pgf.preamble. Note that the order of loading is significant because of the way LaTeX packages interact with each other.

The xcolor [4] package supports a wealth of options. I've chosen to use HTML colours, which gives the format:

\textcolor[HTML]{0BFF01}{Text to colour}

It will recognise any standard hexadecimal Web colour, but note that the digits have to be in capital letters. As with the previous LaTeX examples, mark-up cannot span multiple lines of our string: each line is sent to LaTeX separately, so incomplete mark-up on a line causes errors.
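Typing the mark-up by hand gets repetitive, so a small helper can build it. This is only a sketch: the function name textcolor is my own (hypothetical) choice, and it simply upper-cases the hex digits to satisfy the capital-letter requirement described above.

```python
import re


def textcolor(text, hex_colour):
    # Hypothetical helper: accepts 'rrggbb' or '#RRGGBB' and returns
    # xcolor mark-up that uses the HTML colour model.
    digits = hex_colour.lstrip('#').upper()
    if not re.fullmatch(r'[0-9A-F]{6}', digits):
        raise ValueError('expected a six-digit hex colour, got %r' % hex_colour)
    return r'\textcolor[HTML]{%s}{%s}' % (digits, text)


print(textcolor('Q3', '#bbbbbb'))   # prints \textcolor[HTML]{BBBBBB}{Q3}
```

The returned string can be concatenated into an annotation or label in the same way as the hand-written mark-up.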

The real advantage with this method is that it's easy to have complex formatting in every element of text, including labels. Figure 1 shows the results.

Example code

#!/usr/bin/env python3
# Set-up PGF as the backend for saving a PDF
import matplotlib
from matplotlib.backends.backend_pgf import FigureCanvasPgf
matplotlib.backend_bases.register_backend('pdf', FigureCanvasPgf)

import matplotlib.pyplot as plt

plt.style.use('fivethirtyeight')

pgf_with_latex = {
    "pgf.texsystem": "xelatex",     # Use xetex for processing
    "text.usetex": True,            # use LaTeX to write all text
    "font.family": "serif",         # use serif rather than sans-serif
    "font.serif": "Ubuntu",
    "font.sans-serif": [],          # Unset sans-serif
    "font.monospace": "Ubuntu Mono",
    "axes.labelsize": 10,
    "font.size": 10,
    "legend.fontsize": 8,
    "axes.titlesize": 14,           # Title size when one figure
    "xtick.labelsize": 8,
    "ytick.labelsize": 8,
    "figure.titlesize": 12,         # Overall figure title
    "pgf.rcfonts": False,           # Ignore Matplotlibrc
    # "text.latex.unicode" is no longer a valid rcParam in recent
    # Matplotlib releases, so it is left out here.
    # Recent releases also expect the preamble as a single string.
    "pgf.preamble": "\n".join([
        r'\usepackage{xcolor}',     # xcolor for colours
        r'\usepackage{fontspec}',
        r'\setmainfont{Ubuntu}',
        r'\setmonofont{Ubuntu Mono}',
        r'\usepackage{unicode-math}',
        r'\setmathfont{Ubuntu}'
    ])
}
matplotlib.rcParams.update(pgf_with_latex)

fig = plt.figure(figsize=(8, 6), dpi=400)
plt.bar([1, 2, 3, 4], [125, 100, 90, 110], label="Product A",
        width=0.5, align='center')
ax1 = plt.gca()     # the current Axes; plt.axis() returns the limits instead

# LaTeX \newline doesn't work, but we can add multiple lines together
annot1_txt = r'Our \textcolor[HTML]{0BFF01}{\textit{"Green Shoots"}} '
annot1_txt += r'Marketing campaign, started '
annot1_txt += '\n'
annot1_txt += r'in \textcolor[HTML]{BBBBBB}{Q3}, showed some impact in '
annot1_txt += r'\textcolor[HTML]{BBBBBB}{Q4}. Further '
annot1_txt += r'\textcolor[HTML]{011EFE}{\textbf{positive}} '
annot1_txt += '\n'
annot1_txt += r' \textcolor[HTML]{011EFE}{\textbf{impact}} is expected in '
annot1_txt += r'\textit{later quarters.}'

# Annotate using an altered arrowstyle for the head_width, the rest
# of the arguments are standard
plt.annotate(annot1_txt, xy=(4, 80), xytext=(1.50, 105),
            arrowprops=dict(arrowstyle='-|>, head_width=0.3',
                            linewidth=1, color='black'),
            bbox=dict(boxstyle="round", fc='yellow', ec="0.5",
                      alpha=1))

# Standard description of the plot
# Set xticks, Font is set globally
plt.xticks([1, 2, 3, 4], ['Q1', 'Q2', 'Q3', 'Q4'])
plt.xlabel(r'\textbf{Time} - FY quarters')
plt.ylabel(r'\textbf{Sales} - unadjusted')
plt.title('Total sales by quarter')
plt.legend(loc='best')

plt.savefig('20160314colour1.pdf', bbox_inches='tight', transparent=True)

Partial colouring using text objects

Figure 2: Text area partial colour example

This alternative approach splits the text string across multiple text objects: one text object for each part of the sentence that we want to colour or format [5]. We pack the text elements into boxes to create the lines of the comment.

This method is quite fiddly to get right as you have to split the text manually and make sure you have the correct part of the formatting dictionary associated with the right section of text. The advantages are that we're using the standard Matplotlib configuration and text handling capabilities; as we're not handing off to LaTeX it's also much faster to create a plot.
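One way to reduce the risk of mismatching text and formats is to keep each piece of text next to its own properties, rather than in two parallel lists. The sketch below is an assumption about how that could look: segment and DEFAULT_PROPS are hypothetical names, and the defaults are the alignment settings repeated throughout this post's example.

```python
# Hypothetical defaults shared by every piece of text in the comment
DEFAULT_PROPS = {'color': 'black', 'ha': 'left', 'va': 'bottom'}


def segment(text, **props):
    """Return (text, properties) with the shared defaults filled in."""
    merged = dict(DEFAULT_PROPS)    # copy, so the defaults stay untouched
    merged.update(props)
    return text, merged


# One line of the comment: each string sits beside its own formatting
line2 = [
    segment('Q3 ', color='#BBBBBB'),
    segment('showed some positive impact in '),
    segment('Q4', color='#BBBBBB'),
    segment('. Further '),
    segment('positive ', color='#011EFE', fontweight='bold'),
]

for text, props in line2:
    print(repr(text), props['color'])
```

Each (text, properties) pair can then be passed on as the string and textprops dictionary of one text object.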

The results are shown in Figure 2.

The steps are:

The TextArea class [6] lets you define a string and a set of text properties as a dictionary. The text properties can be any of the standard text properties, such as style, weight or color. This allows us to format each part of the string as we require.

Packing them into a multiline comment is more complicated. First we create the individual lines of text, and then we stack them on top of each other. To create each line we pack its text areas into a horizontal box with HPacker [7], which takes a list of TextArea objects as its children. We then use VPacker [8] to put the individual lines into a box, with each line stacked on top of the next.
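The packing steps can be wrapped in one function. This is a sketch only: pack_lines is a hypothetical helper, not part of Matplotlib, and it assumes each line is supplied as a list of (text, properties) pairs. The sample text is made up for illustration.

```python
import matplotlib.offsetbox as of


def pack_lines(lines, line_sep=0):
    # Hypothetical helper: one HPacker per line, all stacked in a VPacker
    rows = []
    for line in lines:
        areas = [of.TextArea(text, textprops=dict(props))
                 for text, props in line]
        rows.append(of.HPacker(children=areas, pad=0, sep=0))
    return of.VPacker(children=rows, pad=2, sep=line_sep)


align = {'ha': 'left', 'va': 'bottom'}
box = pack_lines([
    [('Sales were ', dict(align)),
     ('up', dict(align, color='#0BFF01', weight='bold'))],
    [('in ', dict(align)),
     ('Q4', dict(align, color='#BBBBBB'))],
])

# The vertical box holds one horizontal box per line of text
print(len(box.get_children()))
```

The returned box is an ordinary offset box, so it can be handed to AnnotationBbox in the same way as a box built step by step.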

The last step is to use AnnotationBbox [9] to place the text annotation on the figure. As with a standard annotation, the xy parameter defines the point we're annotating, and xybox specifies where the box should appear. The parameters xycoords and boxcoords define which co-ordinate system each position uses; in this case both are 'data', which means positions are given in the units of the plot's X and Y axes. The last required call is add_artist(), applied to the Axes, which tells Matplotlib to add the new artist to the figure.
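To make the difference between the two co-ordinate parameters concrete, here is a small sketch that points at a data position but places the box in 'axes fraction' co-ordinates, so the box keeps its place if the axis limits change. The text and positions are illustrative assumptions, and the Agg backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')               # assumption: render off-screen
import matplotlib.pyplot as plt
import matplotlib.offsetbox as of

fig, ax = plt.subplots()
ax.bar([1, 2, 3, 4], [125, 100, 90, 110], width=0.5)

note = of.TextArea('Q4 sales', textprops={'color': '#011EFE'})

# xy is in data units; xybox is a fraction of the Axes width and height
annot = of.AnnotationBbox(note, xy=(4, 80), xybox=(0.5, 0.9),
                          xycoords='data', boxcoords='axes fraction',
                          arrowprops=dict(arrowstyle='->'))
ax.add_artist(annot)

fig.canvas.draw()                   # force a render to check it all works
```

With boxcoords='data', as in the full example, the box position would instead be read against the X and Y axes.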

Example code

#!/usr/bin/env python3
import matplotlib.pyplot as plt
import matplotlib.offsetbox as of

plt.style.use('fivethirtyeight')

# Set-up in a similar way to the LaTeX example
plt.rcParams['font.family'] = 'serif'
plt.rcParams['font.serif'] = 'Ubuntu'
plt.rcParams['font.sans-serif'] = []
plt.rcParams['font.monospace'] = 'Ubuntu Mono'
plt.rcParams['axes.labelsize'] = 10
plt.rcParams['font.size'] = 10
plt.rcParams['legend.fontsize'] = 8
plt.rcParams['axes.titlesize'] = 14
plt.rcParams['xtick.labelsize'] = 8
plt.rcParams['ytick.labelsize'] = 8
plt.rcParams['figure.titlesize'] = 12

fig = plt.figure(figsize=(8, 6), dpi=400)
plt.bar([1, 2, 3, 4], [125, 100, 90, 110], label="Product A",
        width=0.5, align='center')
ax1 = plt.gca()

# Split each line of text into a list of strings and formats
cmt_line1_txt = ["Our '", "Green Shoots", "' Marketing campaign, started in ",]
cmt_line1_fmt = [{'ha': 'left', 'va': 'bottom'},
            {'color':'#0BFF01', 'weight':'bold', 'ha':'left', 'va':'bottom'},
            {'color':'black', 'ha':'left', 'va':'bottom'},
            ]
cmt_line2_txt = ["Q3 ", "showed some positive impact in ",
            "Q4", ". Further ", "positive ",
            ]
cmt_line2_fmt = [
    {'color':'#BBBBBB', 'ha':'left', 'va':'bottom'},
    {'color':'black', 'ha':'left', 'va':'bottom'},
    {'color':'#BBBBBB', 'ha':'left', 'va':'bottom'},
    {'color':'black', 'ha':'left', 'va':'bottom'},
    {'color':'#011EFE', 'fontweight': 'bold', 'ha':'left', 'va':'bottom'},
]

cmt_line3_txt = ["impact", " is expected in ", "later quarters."]
cmt_line3_fmt = [
    # Same colour as "positive " on the previous line
    {'color':'#011EFE', 'fontweight': 'bold', 'ha':'left', 'va':'bottom'},
    {'color':'black', 'ha':'left', 'va':'bottom'},
    {'style': 'italic', 'ha':'left', 'va':'bottom'},
]

# Create a single list for each line of text and format
# The list for each line is a number of TextArea objects
cmt_line1_lst = []
for txt, frmt in zip(cmt_line1_txt, cmt_line1_fmt):
    cmt_line1_lst.append(of.TextArea(txt, textprops=dict(frmt)))

cmt_line2_lst = []
for txt, frmt in zip(cmt_line2_txt, cmt_line2_fmt):
    cmt_line2_lst.append(of.TextArea(txt, textprops=dict(frmt)))

cmt_line3_lst = []
for txt, frmt in zip(cmt_line3_txt, cmt_line3_fmt):
    cmt_line3_lst.append(of.TextArea(txt, textprops=dict(frmt)))

# Pack each Text object in a line into a Horizontal Packer
texts_hbox1 = of.HPacker(children=cmt_line1_lst, pad=0, sep=0)
texts_hbox2 = of.HPacker(children=cmt_line2_lst, pad=0, sep=0)
texts_hbox3 = of.HPacker(children=cmt_line3_lst, pad=0, sep=0)

# Put the Horizontally Packed lines into a list
texts_line_store = []
texts_line_store.append(texts_hbox1)
texts_line_store.append(texts_hbox2)
texts_line_store.append(texts_hbox3)

# Put the lines of text into a Vertical Box to get the final result
texts_vbox = of.VPacker(children=texts_line_store, pad=2, sep=0)

# Put the final box on the figure
annot2_bbox=dict(facecolor='yellow', boxstyle='round', edgecolor='0.5', alpha=1)
arrow2=dict(arrowstyle='-|>, head_width=0.3', linewidth=1, facecolor='black')
annot2 = of.AnnotationBbox(texts_vbox, xy=(4, 80),xybox=(2.50, 120),
                        xycoords='data', boxcoords='data',
                        arrowprops=arrow2, bboxprops=annot2_bbox)
ax1.add_artist(annot2)

# Standard description of the plot
# Set xticks, Font is set globally
plt.xticks([1, 2, 3, 4], ['Q1', 'Q2', 'Q3', 'Q4'])
plt.xlabel('Time - FY quarters')
plt.ylabel('Sales - unadjusted')
plt.title('Total sales by quarter')
plt.legend(loc='best')

plt.savefig('20160314colour2.svg', bbox_inches='tight', transparent=True)

Final thoughts

Done! Over the last few posts we've covered everything needed to format Matplotlib text elements, both styling and colours. There's lots more that can be done with the underlying text areas and packers to create complex figures; the examples are worth looking through. If you enjoyed the post, or think I've missed some styling elements, please leave a comment!


[1] Partial coloring of text using text objects
[2] Second answer of Partial coloring of text in matplotlib using Postscript backend
[3] Wikibooks - LaTeX colours
[4] xcolor LaTeX package
[5] A few people have presented solutions using this method. Esmit's solution to Box around text in matplotlib is a good one. Paul Ivanov contributed a Rainbow text example to the Matplotlib documentation.
[6] The definition is in the matplotlib.offsetbox.TextArea documentation; see matplotlib.text.Text for the text properties
[7] See the matplotlib.offsetbox.HPacker documentation
[8] See the matplotlib.offsetbox.VPacker documentation
[9] The Demo AnnotationBbox is a short example; see the matplotlib.offsetbox.AnnotationBbox documentation for the parameters

Posted in Tech Monday 14 March 2016
Tagged with python matplotlib