List Processing with map, filter, and reduce

Publikováno: 11.4.2018

Functional programming in JavaScript is all the rage these days.

And it should be: The benefits for readability, maintainability, and testability are dramatic.

"Functional programmmin...

Celý článek

Functional programming in JavaScript is all the rage these days.

And it should be: The benefits for readability, maintainability, and testability are dramatic.

"Functional programmming" means many things to many people. One of my favorite tools from the FP mindset is programming in a list-processing style. This entails taking arrays—or lists, as I prefer to call them—as your fundamental data structure.

Then, your program becomes a series of operations on elements in the list. And there are many contexts in which this is useful: Mapping AJAX results to React components with map. Removing extraneous data with filter. Doing...Well, fancy things with reduce.

These functions, called "Array Extras", are abstractions over for loops. There is nothing you can do with with these functions that you can't achieve with for, and vice-versa.

There are some finer points as to their usage, though, which are important to keep in mind as you cultivate an increasingly functional mindsetn.

Today, we'll lay some ground work by taking a look at all three of these functions. We might even meet an extra function or two, just for good measure.

But, we'll never figure out what's in store if we don't get started. So let's get started with something you already know: The good ol' for loop.

Iteration with for

We use for loops to iterate over every item in an array. Usually, we do something to each item along the way.

One simple example would be capitalizing every string in an array.

const strings = ['arielle', 'are', 'you', 'there'];
const capitalizedStrings = [];

for (let i = 0; i < strings.length; i += 1) {
    const string = strings[i];
    capitalizedStrings.push(string.toUpperCase());
}

console.log(capitalizedStrings);

In this snippet, we start a list of lowercase names. Then, we initialize an empty array, in which we'll store our capitalized strings.

Inside of the for loop, we simply grab the next string on every iteration; capitalize it; and push it into capitalizedStrings. At the end of the loop, capitalizedStrings contains every word in strings, but...You know, capitalized.

This brings us to our first function: forEach. This is a method on arrays that "automatically" loops through the list for us. In other words, it handles the details of initializing and incrementing a counter for us.

Instead of the above, where we manually index into strings, we can simply call forEach, and receive the next string on every iteration. The updated version would look something like:

const strings = ['arielle', 'are', 'you', 'there'];
const capitalizedStrings = [];

strings.forEach(function (string) {
    capitalizedStrings.push(string.toUpperCase());
})

...This is almost the same as what we started with. But getting rid of that i is already a big win.

This also introduces a major pattern we'll see time and time again. Namely: We prefer to use methods on Array.prototype that abstract away details like initializing and incrementing counters. That way, we can focus on the important logic, rather than the boilerplate.

But first! A message from our sponsor.

Caesar, Known for Brute Force

In the snippets below, we'll use encrypting and decrypting strings in our examples of map, reduce, and filter.

For example:

// Thanks to EvanHahn for this: https://gist.github.com/EvanHahn/2587465
const caesar = require('./caesar');

// We can encrypt the string: 'this is my secret message' with `caesar`
// Here, we scramble the message by shifting each letter forward by 2 letters: 'a' becomes 'c'; 's' becomes  'u'; etc.
const encryptedString = caesar('this is my super secret message.', 2);
console.log(encryptedString); // 'vjku ku oa uwrgt ugetgv uvtkpi.'

The idea is that, if I send you a normal message, like 'this is my super secret message', and someone else gets their hands on it, they can read the secret immediately. This obviously sucks if you're sending sensitive information, like passwords, which someone might be listening for.

...And, yes. Someone is always listening.

Encrypting a string means: "scrambling it to make it hard to read without unscrambling." This way, even if someone is listening, and even if they do intercept your message, it will remain unreadable until they unscramble it. To quote the example above, it's pretty non-obvious what the string 'vjku ku oa uwrgt ugetgv uvtkpi.' is supposed to mean.

The Caesar cipher is one of the simplest ways to scramble a string like this. To encrypt with the Caesar cipher, we pick a key, n, between 1 and 24, and replace every letter in the original string with the one n letters further down the alphabet.

So, if we choose the key 2, a becomes c; b becomes d; c becomes e; etc.

Substituting letters like this makes the original string pretty unreadable. But, we scrambled by simply shifting letters. So, all you need to do to unscramble is shift them back. If you get a message you know was encrypted with the key 2, all you need to do to decrypt is shift letters back two spaces. So, c becomes a; d becomes b; etc.

Unfortunately, this is a pretty useless form of encryption nowadays. It's extremely easy to break. An easy way to decrypt any string encrypted with a Caesar cipher is to just try to decrypt it with every possible key. One of the results will be correct. All you need to do is scan the list of 24 outputs for the one that's English. The others will be just as unreadable as the original string.

Below, I'll use a function called tryAll, which does just that. Namely: It takes an encrypted string, and returns a list of every possible decryption. One of those results will be the string we want. So, this always breaks the cipher.

Of course, scanning a list of 24 possible decryptions...Kind of sucks. It sort of feels like we should be able to automatically throw away the ones that are obviously wrong.

In fact, it's easy to do that. In the section on filter, you'll see a function, called isEnglish, which does just this. In particular, it reads a string; counts how many of the most common 1,000 words in English occur in that string; and, if it finds more than 3 of those words in the sentence, it classifies the string as English. If the string contains fewer than 3 words from that list, it throws it out as "junk".

There are more sophisticated ways to check if a string is English, of course, but this is fine for now.

The implementations of tryAll and isEnglish aren't important, but if you're curious, you can find them at this gist: tryAll / isEnglish.

Transforming Lists with map

Refactoring a for loop to use forEach hints at advantages of this style. But there's still room for improvement in the updated version.

In the example above, we're updating capitalizedStrings within the callback to forEach. There's nothing inherently wrong with this. But, it saves a lot of headache to avoid side effects like that whenever possible.

If we don't have to update data structures that live in a different scope...We probably shouldn't.

In this particular case, we wanted to turn every string in strings into its capitalized version. This is a very common use case for a for loop: Take everything in a list; turn it into something else; and collect the results in a new list.

Transforming every element in a list into a new one and collecting the results is called mapping. JavaScript has a built-in function for just this use case, called, appropriately enough, map.

We used forEach because it abstracts away the need to manage the iteration variable, i. We don't have to manually index into the array, etc., and so we can focus on the logic that really matters.

Similarly, we use map because it abstracts away initializing an empty array, and pushing to it. Just like forEach accepts a callback that does something with each string value, map accepts a callback that does something with each string value.

Let's look at a quick demo before a final explanation. In this example, I'm using a function to "encrypt" a list of messages. We could use a for loop, and we could use forEach...But, it's better with map.

const caesar = require('../caesar');

const key = 12;
const messages = [
    'arielle, are you there?',
    'the ghost has killed the shell',
    'the liziorati attack at dawn'
]

const encryptedMessages = messages.map(function (string) {
    return caesar(string, key);
})

console.log(encryptedMessages);

Note what happened here.

  • We start with a list of messages, in plain English.
  • We use the map method on messages to encrypt each string with the caesar function, and automagically store the result in a new array.

After the above code runs, encryptedMessages looks like: ['mduqxxq, mdq kag ftqdq?', 'ftq staef tme wuxxqp ftq etqxx', 'ftq xuluadmfu mffmow mf pmiz'].

This is a much higher level of abstraction than manually pushing to a list.

Now is a good time to point out that we can use arrow functions to express this sort of thing more concisely:

const caesar = require('../caesar');

const key = 12;
const messages = [
    'arielle, are you there?',
    'the ghost has killed the shell',
    'the liziorati attack at dawn'
];

const encryptedMessages = messages.map(string => string.toUpperCase()):

console.log(encryptedMessages);

Even better.

This is generally how I prefer to write, but I'll stick with function syntax for the sake of clarity.

Throwing Things Away with filter

Another common pattern is using a for loop to process items in a list, but only pushing/preserving some of them.

We usually decide which items to keep, and which to throw away, by doing an if check.

In raw JS, this might look like:

// Secret message! This was a string encrypted with a key between 1 and 24.
const encryptedMessage = 'mduqxxq, mdq kag ftqdq?'];

// We can break this code by just decrypting with _every_ possible key.
const possibilities = tryAll(encryptedMessage);

// Most of these decryption attempts aren't readable. Sad.
// We can speed up finding the ones that are probably junk with an if check
const likelyPossibilities = [];
possibilities.forEach(function (decryptionAttempt) {
    // Keep the string if it looks like an English sentence
    if (isEnglish(decryptionAttempt)) {
        likelyPossibilities.push(decryptionAttempt);
    }
})

First, we try to decrypt the encrypted message with every possible key. That means we end up with an array of 24 possibilities.

In this loop, we test each one to see if it looks like an English string. If so, we keep it. If not, we throw it away.

This is clearly a common use case. So, there's a built-in for it, aptly named filter.

As with map, we pass filter a callback, which also gets each string. The difference is that, filter will only save items in an array if the callback returns true. So, we could express the above as:

// Secret message! This was a string encrypted with a key between 1 and 24.
const encryptedMessage = 'mduqxxq, mdq kag ftqdq?'];

// We can break this code by just decrypting with _every_ possible key.
const possibilities = tryAll(encryptedMessage);

// Most of these decryption attempts aren't readable. Sad.
// We can speed up finding the ones that are probably junk with an if check
const likelyPossibilities = possibilities.filter(function (string) {
    // If isEnglish(string) returns true, filter saves in the output array
    return isEnglish(string);
})

Since this callback just calls isEnglish, we could actually write it even more concisely, like this:

// Secret message! This was a string encrypted with a key between 1 and 24.
const encryptedMessage = 'mduqxxq, mdq kag ftqdq?'];

// We can break this code by just decrypting with _every_ possible key.
const possibilities = tryAll(encryptedMessage);

// Most of these decryption attempts aren't readable. Sad.
// We can speed up finding the ones that are probably junk with an if check
const likelyPossibilities = possibilities.filter(isEnglish);

...Bang.

Bringing Things Together with reduce

So far, we've seen abstractions for three common use cases around arrays:

  • forEach, which allows us to iterate over a list as with for, but eliminates the need to manually manage the iteration variable i; index into the list; etc.
  • map, which lets us transform each element of a list, and collect the results in an array
  • filter, which lets us choose which elements of a list to keep

Another common use case is: Iterating over a list to collect its elements into a single result.

The prototypical example is adding up a bunch of numbers.

// prices of: (big) box of oreos, girl scout cookies, fidget spinner, gold-plated Arch linux magnet
const prices = [12, 19, 7, 209];

// variable to store our total prices in
let totalPrice = 0;
for (let i = 0; i < prices.length; i += 1) {
    totalPrice += prices[i];
}

// Report our total prices: $247
console.log(`Your total is ${totalPrice}.`);

As I'm sure you'd guess, there's an abstraction for this, too: reduce.

Refactoring the above loop with reduce looks like:

const prices = [12, 19, 7, 209];

prices.reduce(function (totalPrice, nextPrice) {
    // totalPrice is the price so far
    console.log(`Total price so far: ${totalPrice}`)
    console.log(`Next price to add: ${nextPrice}`)

    // update totalPrice by adding the next price
    totalPrice += nextPrice

    // return the total price, and start over again
    return totalPrice
}, 0)
// ^ the second argument to `reduce` is the totalPrice to start with on the first iteration

Like map and filter, reduce accepts a callback, which it runs on each element of the array.

Unlike map and filter, the callback we pass to reduce accepts two arguments: a totalPrice, and a nextPrice.

totalPrice is like total in the first example: It's the price we've gotten by adding up all the prices we've seen so far.

nextPrice is the number we got by doing prices[i] in the first example. Recall that map and reduce automatically index into the array for us, and pass this value to their callbacks automatically. reduce does the same thing, but passes that value as the second argument to its callback.

Just like with map and reduce, on each iteration, we have to return the value we're interested in getting back at the end of the loop. Namely, totalPrice.

Finally, note that we pass another argument after the callback to reduce. In this case, we pass 0. This is analogous to the line in the first example where we set const total = 0. That second argument is the initial value of totalPrice.


reduce is harder to grok than map or filter, so let's take a look at another example.

In the previous example, we used reduce to collect an array of numbers into a sum. But reduce is versatile. We can use it to turn a list into a any single result.

For example, we can use it to build up a string.

const courses = ['Introduction to Programming', 'Algorithms & Data Structures', 'Discrete Math'];

const curriculum = courses.reduce(function (courseList, course) {
    // add the name to the class, with a preceding newline and tab for indentation
    return courseList += `\n\t${course}`;
}, 'The Computer Science curriculum consists of:');

console.log(curriculum);

This generates the output:

The Computer Science curriculum consists of:
    Introduction to Programming
    Algorithms & Data Structures
    Discrete Math

I don't see a ton of examples of using reduce to build strings like this, for some reason, but it's common practice in a functional style.


Again, reduce is versatile: We can use it to turn a list into any kind of "single" result.

Even if that single result is another array.

Check it:

const names = ['arielle', 'jung', 'scheherazade'];

const titleCase = function (name) {
    // uppercase first letter of name
    const first = name[0];
    const capitalizedFirst = first.toUpperCase();

    // get rest of name
    const rest = name.slice(1);

    // create list with all letters of the name, including capitalized first letter
    const letters = [capitalizedFirst].concat(rest);

    // join letters, return result
    return letters.join('');
}

const titleCased = names.reduce(function (titleCasedNames, name) {
    // title-case the next name
    const titleCasedName = titleCase(name);

    // add the title-cased name to a list
    titleCasedNames.push(titleCasedName);

    // return list of capitalizedNames
    return titleCasedNames
}, [])
// ^ start with an empty list

console.log(titleCased);
// ["Arielle", "Jung", "Scheherazade"]

So...This is neat. We used reduce to turn a list of lower-cased names into a list of title-cased names.

Previously, we turned a list of numbers into a single sum, and a list of strings into a single string. Here, we turned a list of names into a single list of title-case names.

This is still allowed, because the single list of title-cased names is still a single result. It just happens to be a collection, rather than a primitive type.


You might have noticed something about that last example: We used reduce for the same task that we used map for earlier.

It might not be obvious, yet, but—this is a big deal.

You might guess that we can write map in terms of reduce. You'd be right.

const map = (list, mapFunction) => {
    const output = list.reduce((transformedList, nextElement) => {
        // use the mapFunction to transform the nextElement in the list 
        const transformedElement = mapFunction(nextElement);

        // add transformedElement to our list of transformed elements
        transformedList.push(transformedElement);

        // return list of transformed elements
        return transformedList;
    }, [])
    // ^ start with an empty list

    return output;
}

We can get the same result as above with:

// congratulations...you just implemented `map` with `reduce`. badass.
const titleCased = map(names, titleCase);

console.log(titleCased);
// ["Arielle", "Jung", "Scheherazade"]

Obligatory disclaimer: If you were actually implementing a production-ready map, you'd need to add some features to this implementation. We'll leave all that off in the interest of clarity.


Alright—last one. If we can use reduce to implement map...What about filter?

const filter = (list, predicate) => {
    const output = list.reduce(function (filteredElements, nextElement) {
        // only add `nextElement` if it passes our test
        if (predicate(nextElement)) {
            filteredElements.push(nextElement);
        }

        // return the list of filtered elements on each iteration
        return filteredElements;
        }, [])
    })
}

We'd use this like:

It's alright if this one feels confusing. It's a new idea even for more experienced developers. But, it's worth the work to wrap your head around it: We'll need this little bit of insight to make sense of transduction a little bit later.

...And, trust me, transduction alone is cool enough to be worth the work.

Onwards

Today, you've learned how to use map, filter, and reduce to write more readable code.

Not, of course, that there's anything wrong with using for. But raising the level of abstraction with these functions brings immediate benefits to readability, and maintainability, by corollary.

Next time, we'll round out our arsenal of array methods with a couple more favorites: flatten, concat, and flatMap.

Stay tuned!

Nahoru
Tento web používá k poskytování služeb a analýze návštěvnosti soubory cookie. Používáním tohoto webu s tímto souhlasíte. Další informace