Sometimes I get questions about Python from C/C++ programmers, and I have found that people are often confused by a few functions they run into when reading older Python code. I’m going to start a series of posts to make these easier to understand. These posts won’t be about very complicated or fancy things; they are quick reads to give a sense of how something works. This first post is about Python’s reduce().
I also plan to cover other Python features rooted in functional programming: lambda, filter(), and map(). Personally, I do not think anyone needs to change the way they think just because they are using a different language. There is no need to force ourselves to code with unfamiliar features unless there is a good reason, for example when a feature gives a performance gain that really matters. In all cases, though, it is good to know what these features mean when reading other people’s code. What matters most is keeping our productivity high. So my goal here is to give quick explanations and examples, so that we can understand Python code written with these functions when we see it.
There is another reason I am saying this: in the old days there was talk of removing reduce(), filter(), and map() in Python 3, but they were still very popular, so eventually they stayed. map() and filter() remain built-ins, while reduce() was moved to the functools module. As you can see, even the Python community had debates about these things.
An example of reduce()
OK, now back to reduce(). A quick search on Google turns up a post from RealPython, which is a pretty good starting point. RealPython says:
Python’s reduce() operates on any iterable—not just lists—and performs the following steps:
1. Apply a function (or callable) to the first two items in an iterable and generate a partial result.
2. Use that partial result, together with the third item in the iterable, to generate another partial result.
3. Repeat the process until the iterable is exhausted and then return a single cumulative value.
RealPython
This is a great explanation, but it still felt a bit ambiguous when I first read it, and I had to finish the entire post to be sure of what I was learning. I like to understand things through my own real examples, so here is one: I want to calculate the GCD (greatest common divisor) of an array of integers. I put these integers in a list:
my_list = [10, 15, 20, 25]
The algorithm to calculate the GCD can be explained this way: I take the first two integers in the array, 10 and 15, and calculate their GCD (the answer is 5). I then calculate the GCD of the previous answer (5) and the next integer in the array (20), and then again of the previous answer (5) and the next integer (25), at which point the array is exhausted. The final answer is the GCD of the entire array.
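Before bringing in reduce, the steps just described can be written as a plain loop. This is only a sketch to make the description concrete: the helper name gcd_of_list is my own, made up for illustration, and it assumes the list is not empty.

```python
import math

def gcd_of_list(numbers):
    """Fold math.gcd over a non-empty list, left to right (hypothetical helper)."""
    result = numbers[0]                # start with the first integer
    for n in numbers[1:]:              # then take each following integer in turn
        result = math.gcd(result, n)   # previous answer combined with the next integer
    return result

my_list = [10, 15, 20, 25]
print(gcd_of_list(my_list))  # 5
```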
Now going back to the explanation I quoted above, with reduce we can:
- Apply the function (GCD) to the first two items (10, 15) in the iterable (my_list) and generate a partial result.
- Use the partial result, together with the third item (20) to generate another partial result.
- Repeat the process until the iterable (the list) is exhausted and return the single value.
As you can see, the key is to identify the “function” and the “iterable”. Once we have those, the entire algorithm can be written in one line:
reduce(math.gcd, my_list)
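I am using math.gcd here, so the line above needs import math, plus from functools import reduce, because in Python 3 reduce() is no longer a built-in. Of course, we can use reduce with any user-defined function that takes two arguments as well.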
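Here is the one-liner as a complete script, next to a user-defined function doing the same job. The name my_gcd is my own for illustration; it is a small version of Euclid's algorithm and assumes non-negative integers.

```python
from functools import reduce
import math

def my_gcd(a, b):
    """Euclid's algorithm for two non-negative integers (illustrative user-defined function)."""
    while b:
        a, b = b, a % b
    return a

my_list = [10, 15, 20, 25]

print(reduce(math.gcd, my_list))  # 5, using the standard library function
print(reduce(my_gcd, my_list))    # 5, using our own two-argument function
```

Any callable that takes two arguments and returns a value of the same kind can be passed to reduce in this way.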
Using reduce() with initializer
There is another way to use reduce. If we look at the definition of reduce, we see a third, optional argument called initializer. If the initializer is provided, reduce will:
- Use initializer as the initial partial result
- Apply the function to the partial result and the next element, starting from the first element of the iterable.
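To make the two cases concrete, here is a rough pure-Python sketch of what reduce does with and without an initializer. The names my_reduce and _MISSING are hypothetical, and this is not the real implementation in functools, only an approximation of its behavior as I understand it.

```python
from functools import reduce
import operator

_MISSING = object()  # sentinel, so that None can still be passed as an initializer

def my_reduce(function, iterable, initializer=_MISSING):
    """Approximate the behavior of functools.reduce (illustrative sketch only)."""
    it = iter(iterable)
    if initializer is _MISSING:
        try:
            result = next(it)          # no initializer: start from the first element
        except StopIteration:
            raise TypeError("my_reduce() of empty iterable with no initializer") from None
    else:
        result = initializer           # initializer: it becomes the first partial result
    for item in it:
        result = function(result, item)
    return result

print(my_reduce(operator.mul, [1, 2, 3, 4], 10))  # 240
print(my_reduce(operator.mul, [], 10))            # 10
```

One thing the sketch shows is that with an initializer, an empty iterable is fine and the initializer itself comes back. Without one, an empty iterable raises TypeError, in the sketch and in the real reduce.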
One example is adding a list of numbers to a known base number:
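from functools import reduce
from operator import add
base_number = 1000
my_list = [1, 2, 3, 4]
# add(a, b) takes two arguments; the built-in sum() expects an iterable and would raise TypeError here
total = reduce(add, my_list, base_number)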
In this example, reduce computes 1000 + 1, then 1001 + 2, then 1003 + 3, then 1006 + 4, and finally returns 1010.
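To check the arithmetic step by step, itertools.accumulate can list every partial result instead of only the last one. This is a small sketch: it uses operator.add as the two-argument addition function (my choice for the illustration) and the initial= keyword of accumulate, which needs Python 3.8 or later.

```python
from functools import reduce
from itertools import accumulate
import operator

base_number = 1000
my_list = [1, 2, 3, 4]

# Every partial result, starting with the initializer itself
steps = list(accumulate(my_list, operator.add, initial=base_number))
print(steps)  # [1000, 1001, 1003, 1006, 1010]

# reduce returns only the final value of that sequence
print(reduce(operator.add, my_list, base_number))  # 1010

# With an empty list, the initializer is returned unchanged
print(reduce(operator.add, [], base_number))  # 1000
```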