String interpolation in Python and R (the best ways)

2020-10-06 useful package tips

Here we go

In layman’s terms, string interpolation means embedding code inside a string (text) so that the code is evaluated and its result is inserted into the text.

Let’s keep this post short and to the point. We’ll look at some R code, then move on to Python. I’ll simply show how to use each method of string interpolation, and highlight my preferred method for each language.

String interpolation in R

Good

paste is a good way to paste together text and variables, although not my favorite.

name <- 'Avery'
age <- 24
paste('Hello! My name is', name, 'and I am', age, 'years old.')
## [1] "Hello! My name is Avery and I am 24 years old."

Remember that R is vectorized, so no need for a for loop in cases like this:

name <- c('Avery', 'Susan', 'Joe')
age <- c(24, 20, 40)
paste('Hello! My name is', name, 'and I am', age, 'years old.')
## [1] "Hello! My name is Avery and I am 24 years old."
## [2] "Hello! My name is Susan and I am 20 years old."
## [3] "Hello! My name is Joe and I am 40 years old."

The default separator in paste is a space " ", but you can change it with the sep argument.

x <- 25
y <- 15
paste('x + y', x + y, sep = ' = ')
## [1] "x + y = 40"

Run ?paste for more information.

Best

paste is good, but glue is best. Ever since I discovered the glue function from the glue package, I rarely use paste anymore.

Don’t forget to load the package:

library(glue) # package for easy string interpolation

glue is easy to use. Just put code that you want to execute inside of braces { }. Also, everything goes inside of quotes.

size <- c("Small", "Medium", "Large")
cyls <- sort(unique(mtcars$cyl)) # mtcars is a built-in dataset that comes with R

glue("{size} cars sometimes have {cyls} cylinders. But don't quote me, I'm not a car guy.")
## Small cars sometimes have 4 cylinders. But don't quote me, I'm not a car guy.
## Medium cars sometimes have 6 cylinders. But don't quote me, I'm not a car guy.
## Large cars sometimes have 8 cylinders. But don't quote me, I'm not a car guy.

Personally, I find the glue { } syntax cleaner, easier to read and type, and more intuitive than the base R paste. For tidyverse users, glue-style syntax is also popping up elsewhere in the tidyverse (for example, see the .names argument in the relatively new dplyr::across function).


String interpolation in Python

library(reticulate) # package for running Python within R

Good

Similar to R’s paste:

name = 'Avery'
age = 24
print('Hello! My name is ' + name + ' and I am ' + str(age) + ' years old!')
## Hello! My name is Avery and I am 24 years old!
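Here is a small sketch of why the str(age) call is needed: Python's + operator does not convert numbers to text for you. The variables are reused from the example above, and the message name is only for illustration.

```python
name = 'Avery'
age = 24

# Concatenating a str and an int raises TypeError, so convert explicitly.
try:
    message = 'I am ' + age + ' years old!'
except TypeError:
    message = 'I am ' + str(age) + ' years old!'

print(message)
## I am 24 years old!
```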

This method is also pretty clunky. Let’s try something better.

Better

Using the format method is not too shabby. Things are starting to look like R’s glue.

print('Hello! My name is {name} and I am {age} years old!'.format(name = name, age = age))
## Hello! My name is Avery and I am 24 years old!

Notice above how we specify name = name inside of the format method. The placeholders are not automatically linked to variables of the same name, as you might expect. You, the programmer, have to map them explicitly with placeholder = some_variable. You also don’t have to put anything inside of the {}. If you leave the curly braces empty, Python fills them in using the order of the arguments that you pass to the format method.

emotion = 'sad'
print('I am sick and tired of {}! I am so {}.'.format('Covid', emotion))
## I am sick and tired of Covid! I am so sad.
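To illustrate the difference between matching by position and matching by name, here is a minimal sketch. The placeholder names topic and feeling are made up for this example.

```python
emotion = 'sad'

# Numbered placeholders refer to positional arguments by index,
# so the same argument can be reused.
numbered = '{0} here, {0} there. I am so {1}.'.format('Covid', emotion)

# Named placeholders are matched by keyword, regardless of argument order.
named = 'I am so {feeling} about {topic}.'.format(topic='Covid', feeling=emotion)

print(numbered)
## Covid here, Covid there. I am so sad.
print(named)
## I am so sad about Covid.
```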

format works fine, but I think Python really knocks it out of the park with something called f-strings.

Best

The syntax is almost exactly the same as glue. Instead of writing glue('some text {code}'), you just add the letter f before any string. This allows you to use the same curly brace syntax as before, easily executing the code within.

language = 'French'
time = '3 years'

print(f'I have been speaking {language} for about {time}. I feel accomplished.')
## I have been speaking French for about 3 years. I feel accomplished.
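The braces are not limited to variable names. As a quick sketch (the price and quantity values are invented for illustration), any expression can go inside, and an optional format spec after a colon controls how the result is displayed.

```python
price = 4.5
quantity = 3

# Any expression can go inside the braces, not just a variable name.
total = f'{quantity} coffees cost {price * quantity} dollars.'

# A format spec after the colon controls the display, here two decimal places.
rounded = f'Total: {price * quantity:.2f}'

print(total)
## 3 coffees cost 13.5 dollars.
print(rounded)
## Total: 13.50
```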

Careful though. Python isn’t vectorized like R is, so the following code might not work as expected.

languages = ['French', 'Spanish', 'English']
times = ['3 years', '1 year', 'my entire life'] 

print(f'I have been speaking {languages} for {times}. I feel accomplished.')
## I have been speaking ['French', 'Spanish', 'English'] for ['3 years', '1 year', 'my entire life']. I feel accomplished.

You have to do more work, which isn’t too terrible.

for (l, t) in zip(languages, times):
  print(f'I have been speaking {l} for {t}. I feel accomplished.')
## I have been speaking French for 3 years. I feel accomplished.
## I have been speaking Spanish for 1 year. I feel accomplished.
## I have been speaking English for my entire life. I feel accomplished.

Many experienced programmers would say that if you are using a for loop, you probably shouldn’t be, because there is usually a better option. Loops in general are quite error-prone. It’s probably not apparent with this toy example, but in case you are curious, here is the same thing as above accomplished with map and a lambda function.

list(
  map(
    lambda l, t: print(f'I have been speaking {l} for {t}. I feel accomplished.'),
    languages, times
    )
  )
## I have been speaking French for 3 years. I feel accomplished.
## I have been speaking Spanish for 1 year. I feel accomplished.
## I have been speaking English for my entire life. I feel accomplished.
## [None, None, None]

I won’t get into map and lambda here, but there are tons of great resources out there on the web. If you don’t understand the code above, just google “python map and lambda.”
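If map and lambda feel like too much, a list comprehension is another common option. This is only a sketch of one alternative, with a shortened sentence and a made-up sentences variable. It builds the strings first and prints them afterwards, which also avoids the list of None values that print returns.

```python
languages = ['French', 'Spanish', 'English']
times = ['3 years', '1 year', 'my entire life']

# Build the strings first; printing is a separate step.
sentences = [
    f'I have been speaking {l} for {t}.'
    for l, t in zip(languages, times)
]

print('\n'.join(sentences))
## I have been speaking French for 3 years.
## I have been speaking Spanish for 1 year.
## I have been speaking English for my entire life.
```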

That’s all for now folks

Like I said, short and to the point. If you learned something here, especially if you didn’t know about glue and f-strings and you think they are useful, well then that is awesome. Thanks for reading. Stay safe and happy coding!
