A little math can go a long way

I’m exploring the following problem as a potential exercise for students in my Scientific Computing course:

Problem

Calculate the sum of all positive integers smaller or equal to N which are divisible by 3 or 5. For example, if N = 20, then the answer is 3+5+6+9+10+12+15+18+20 = 98

Unit tests

test <- function(.f) {
  structure(
    list(
      tinytest::expect_equal(.f(3), 3, info = "Input: N = 3"),
      tinytest::expect_equal(.f(-100), 0, info = "Input: N = -100"),
      tinytest::expect_equal(.f(20), 98, info = "Input: N = 20"),
      tinytest::expect_equal(.f(6), 14, info = "Input: N = 6"),
      tinytest::expect_equal(.f(0), 0, info = "Input: N = 0")
    ),
    class = "tinytests"
  )
}

Code smell

Here is a horrible way to implement this in R:

sum_multiples_loop <- function(N) {
  out <- 0
  for (i in 1:N) {
    if (i %% 3 == 0 || i %% 5 == 0) {
      out <- out + i
    }
  }
  return(out)
}

sum_multiples_loop(20)

[1] 98

It works, but there are a ton of problems that make this code really slow and we can make it several thousand times faster by the end of this post. But it is also a good idea to test more than one case, and since I don’t want to copy paste calls to each function for many different scenarios, I have written a small suite of unit tests:

test(sum_multiples_loop)

----- FAILED[data]: <-->
 call| test(sum_multiples_loop)
 diff| Expected '0', got '-2418'
 info| Input: N = -100
 
Showing 1 out of 5 results: 1 fails, 4 passes fubar!

Not surprisingly, our function fails when the input is less than 0, so this is something that we need to address later.

Standard vectorization

We can of course vectorize a lot and simplify the whole thing dramatically by using standard R vectorization. Instead of looping over positive integers from 1 to N, we create a vector x which stores those integers. Then we take advantage some vectorizations

sum_multiples_vectorized <- function(N) {
1  if (N <= 0) return(0)
2  x <- seq_len(N)
3  is_divisible <- x %% 3 == 0 | x %% 5 == 0
4  sum(x[is_divisible])
}

test(sum_multiples_vectorized)

1: Take care of edge cases with an early return; alternative is to give an error
2: seq_len(N) is a safer alternative to 1:N (doesn’t matter here because we return early if N is not-positive, but good practice in general)
3: Boolean operations on vectors produce vectors of TRUE and FALSE values, so no need for loops and if statements
4: Subseting a vector with a boolean vector will return a new vector with only those values for which is_divisible is TRUE; sum is vectorized and will return the sum of all elements in x

All ok, 5 results fubar!

We can now compare the performance of the two functions. There are many ways to do this. Here is one popular option:

bench::mark(
  sum_multiples_loop(1e6),
  sum_multiples_vectorized(1e6)
)

Warning: Some expressions had a GC in every iteration; so filtering is
disabled.

# A tibble: 2 × 6
  expression                           min   median `itr/sec` mem_alloc `gc/sec`
  <bch:expr>                      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
1 sum_multiples_loop(1e+06)        108.9ms  112.4ms      8.85    6.85KB    104. 
2 sum_multiples_vectorized(1e+06)   11.9ms   12.8ms     69.5     37.9MB     87.3

but we can do the math and get much better results:

sum_multiples_math <- function(N) {
  if (N <= 0) {
    return(0)
  }
  triangular_number <- function(n) {
    n * (n + 1) / 2
  }

  a <- 3
  b <- 5
  c <- 15

  a * triangular_number(floor(N / a)) +
    b * triangular_number(floor(N / b)) -
    c * triangular_number(floor(N / c))
}

test(sum_multiples_math)

All ok, 5 results fubar!

compare to the other two functions:

bench::mark(
  sum_multiples_loop(1e6),
  sum_multiples_vectorized(1e6),
  sum_multiples_math(1e6)
)

Warning: Some expressions had a GC in every iteration; so filtering is
disabled.

# A tibble: 3 × 6
  expression                           min   median `itr/sec` mem_alloc `gc/sec`
  <bch:expr>                      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
1 sum_multiples_loop(1e+06)        111.7ms  113.3ms    8.74e0    6.85KB    101. 
2 sum_multiples_vectorized(1e+06)   11.9ms   12.9ms    6.84e1    37.9MB     86.0
3 sum_multiples_math(1e+06)          615ns    697ns    1.36e6   11.52KB      0

of course, in all cases we can parameterize the function rather than hard-coding it. The challenge would be to find the least common multiple. You could use the function pracma::Lcm

sum_series_divisors <- function(N, a, b) {
  if (N <= 0) {
    return(0)
  }
  triangular_number <- function(n) {
    n * (n + 1) / 2
  }

  lcm <- pracma::Lcm(a, b)

  a * triangular_number(floor(N / a)) +
    b * triangular_number(floor(N / b)) -
    lcm * triangular_number(floor(N / lcm))
}

test(\(x) sum_series_divisors(x, a = 3, b = 5))

All ok, 5 results fubar!

which incurs a cost of course, but that cost is minuscule relative to the other approaches above:

bench::mark(
  sum_multiples_math(1e6),
  sum_series_divisors(1e6, a = 3, b = 5)
)

# A tibble: 2 × 6
  expression                            min  median `itr/sec` mem_alloc `gc/sec`
  <bch:expr>                        <bch:t> <bch:t>     <dbl> <bch:byt>    <dbl>
1 sum_multiples_math(1e+06)           615ns   697ns  1326018.    11.5KB      0  
2 sum_series_divisors(1e+06, a = 3…   5.7µs   6.4µs   151825.   432.8KB     60.8

but let’s not use any libraries and write our own lcm function…