scale_minmax()

An R function to recode variables using min-max scaling.

Author

Joshua A. Goode

Published

July 18, 2024

A Brief Introduction to Min-Max Scaling

In min-max scaling, values are linearly rescaled so that their minimum and maximum match the endpoints of a chosen range. The simplest approach is to recode values to the range \([0, 1]\) (see Eq. 1).

\[\begin{equation}\tag{Eq. 1: Range [0, 1]} x' = \frac{x - \min(x)}{\max(x) - \min(x)} \end{equation}\]

Alternatively, values can be recoded to have range \([a, b]\) (see Eq. 2).

\[\begin{equation}\tag{Eq. 2: Range [a, b]} x' = \frac{\left(x - \min(x)\right)\left(b - a\right)}{\max(x) - \min(x)} + a \end{equation}\]
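To make the two equations concrete, here is a small worked example in base R. The toy vector `x` and the choice of `a = -1` and `b = 1` are assumptions made purely for illustration.

```r
# Toy vector chosen only for illustration
x <- c(2, 4, 10)

# Eq. 1: rescale to [0, 1]
eq1 <- (x - min(x)) / (max(x) - min(x))
eq1
# [1] 0.00 0.25 1.00

# Eq. 2: rescale to [a, b], here with a = -1 and b = 1
a <- -1
b <- 1
eq2 <- (x - min(x)) * (b - a) / (max(x) - min(x)) + a
eq2
# [1] -1.0 -0.5  1.0
```

Note that Eq. 1 is just Eq. 2 with \(a = 0\) and \(b = 1\).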

Min-Max Scaling in R

Tips

All code blocks on this page can be copied by clicking the copy icon in the upper right corner of the block.
As with all content on my site, please feel free to reach out if you have any questions.

The Function

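# Rescale x to the range [to_min, to_max] (Eq. 2 with a = to_min, b = to_max).
# NAs are ignored when finding the min and max and stay NA in the result.
scale_minmax <- function(x, to_min = 0, to_max = 1){
  (x - min(x, na.rm = TRUE)) * (to_max - to_min) /
    (max(x, na.rm = TRUE) - min(x, na.rm = TRUE)) +
    to_min
}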
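One edge case is worth knowing about: if every non-missing value of `x` is the same, the denominator is zero and the function returns `NaN`. The sketch below demonstrates this and shows one possible guard. The wrapper `scale_minmax_checked` is a hypothetical helper, not part of the original function, and stopping with an error is only one of several reasonable choices.

```r
scale_minmax <- function(x, to_min = 0, to_max = 1){
  (x - min(x, na.rm = TRUE)) * (to_max - to_min) /
    (max(x, na.rm = TRUE) - min(x, na.rm = TRUE)) +
    to_min
}

# A constant vector has max(x) == min(x), so the result is 0 / 0 = NaN
constant <- scale_minmax(c(5, 5, 5))
all(is.nan(constant))
# [1] TRUE

# Hypothetical wrapper: fail loudly instead of silently returning NaN
scale_minmax_checked <- function(x, to_min = 0, to_max = 1){
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    stop("x has zero range, so min-max scaling is undefined")
  }
  scale_minmax(x, to_min = to_min, to_max = to_max)
}
```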

Examples

Example 1

To use the function, we just call scale_minmax and pass it the vector to be recoded. By default, it will recode values to the range \([0,1]\).

ex1 <- scale_minmax(mtcars$mpg)

Let’s check a few things to make sure everything is working as we expect. First, we’ll check to make sure that the range matches the values we specified.

c(min(ex1, na.rm = TRUE), max(ex1, na.rm = TRUE))

[1] 0 1

That looks good, so let’s check that the correlation between the original and recoded values is 1.

cor(mtcars$mpg[!is.na(mtcars$mpg)], ex1[!is.na(ex1)])

[1] 1

That also looks good! Finally, let’s check we have the same number of missing values in our new variable.

sum(is.na(mtcars$mpg)) == sum(is.na(ex1))

[1] TRUE

We’re good to go!
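Because mtcars$mpg has no missing values, the last check passes trivially here. As a rough sketch of how the function behaves when values are missing, we can blank out a couple of observations first. The copy `mpg_na` and the positions set to `NA` are arbitrary choices made for illustration.

```r
scale_minmax <- function(x, to_min = 0, to_max = 1){
  (x - min(x, na.rm = TRUE)) * (to_max - to_min) /
    (max(x, na.rm = TRUE) - min(x, na.rm = TRUE)) +
    to_min
}

# Copy mpg and set two arbitrarily chosen observations to NA
mpg_na <- mtcars$mpg
mpg_na[c(3, 10)] <- NA

ex_na <- scale_minmax(mpg_na)

# The non-missing values still span [0, 1]
range(ex_na, na.rm = TRUE)
# [1] 0 1

# The correlation over the complete pairs should still be 1
cor(mpg_na, ex_na, use = "complete.obs")

# Missing values sit in exactly the same positions as before
identical(is.na(mpg_na), is.na(ex_na))
# [1] TRUE
```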

Example 2

Alternatively, we can specify the range for our new variable using the to_min and to_max arguments. A common approach is to recode values to \([-1, 1]\).

ex2 <- scale_minmax(mtcars$mpg, to_min = -1, to_max = 1)

Let’s run through our checks again to convince ourselves that everything still works.

c(min(ex2, na.rm = TRUE), max(ex2, na.rm = TRUE))

[1] -1  1

cor(mtcars$mpg[!is.na(mtcars$mpg)], ex2[!is.na(ex2)])

[1] 1

sum(is.na(mtcars$mpg)) == sum(is.na(ex2))

[1] TRUE

Everything looks good!

Example 3

We can also choose arbitrary values for to_min and to_max if we want to. This example recodes values to the range \([1981, 1984]\).

ex3 <- scale_minmax(mtcars$mpg, to_min = 1981, to_max = 1984)

We’ll check everything one last time just to be sure.

c(min(ex3, na.rm = TRUE), max(ex3, na.rm = TRUE))

[1] 1981 1984

cor(mtcars$mpg[!is.na(mtcars$mpg)], ex3[!is.na(ex3)])

[1] 1

sum(is.na(mtcars$mpg)) == sum(is.na(ex3))

[1] TRUE

Everything still looks good!
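Since the same three checks are repeated in every example, they could be bundled into a small helper. The function `check_minmax` below is a hypothetical convenience, not something from the original post. It uses all.equal() rather than exact equality to allow for floating-point rounding.

```r
scale_minmax <- function(x, to_min = 0, to_max = 1){
  (x - min(x, na.rm = TRUE)) * (to_max - to_min) /
    (max(x, na.rm = TRUE) - min(x, na.rm = TRUE)) +
    to_min
}

# Hypothetical helper: run the range, correlation, and NA-count checks at once
check_minmax <- function(original, scaled, to_min = 0, to_max = 1){
  c(
    range_ok = isTRUE(all.equal(range(scaled, na.rm = TRUE), c(to_min, to_max))),
    cor_ok   = isTRUE(all.equal(cor(original, scaled, use = "complete.obs"), 1)),
    na_ok    = sum(is.na(original)) == sum(is.na(scaled))
  )
}

# Re-run the checks for Example 3
ex3 <- scale_minmax(mtcars$mpg, to_min = 1981, to_max = 1984)
checks <- check_minmax(mtcars$mpg, ex3, to_min = 1981, to_max = 1984)
checks
```

If all three elements of the result are TRUE, the recoded variable passed the same checks used in the examples above.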