Eloquent JavaScript
4th edition
Marijn Haverbeke

Copyright © 2024 by Marijn Haverbeke

This work is licensed under a Creative Commons attribution-noncommercial license (http://creativecommons.org/licenses/by-nc/3.0/). All code in the book may also be considered licensed under an MIT license (https://eloquentjavascript.net/code/LICENSE).

The illustrations are contributed by various artists: Cover by Péchane Sumi-e. Chapter illustrations by Madalina Tantareanu. Pixel art in Chapters 7 and 16 by Antonio Perdomo Pastor. Regular expression diagrams in Chapter 9 generated with regexper.com (http://regexper.com) by Jeff Avallone. Game concept for Chapter 16 by Thomas Palef (http://lessmilk.com).

You can buy a print version of this book, with an extra bonus chapter included, printed by No Starch Press at http://a-fwd.com/com=marijhaver-20&asin-com=1593279507.

Contents

Introduction
    On programming
    Why language matters
    What is JavaScript?
    Code, and what to do with it
    Overview of this book
    Typographic conventions

1 Values, Types, and Operators
    Values
    Numbers
    Strings
    Unary operators
    Boolean values
    Empty values
    Automatic type conversion
    Summary

2 Program Structure
    Expressions and statements
    Bindings
    Binding names
    The environment
    Functions
    The console.log function
    Return values
    Control flow
    Conditional execution
    while and do loops
    Indenting Code
    for loops
    Breaking Out of a Loop
    Updating bindings succinctly
    Dispatching on a value with switch
    Capitalization
    Comments
    Summary
    Exercises

3 Functions
    Defining a function
    Bindings and scopes
    Nested scope
    Functions as values
    Declaration notation
    Arrow functions
    The call stack
    Optional Arguments
    Closure
    Recursion
    Growing functions
    Functions and side effects
    Summary
    Exercises

4 Data Structures: Objects and Arrays
    The weresquirrel
    Data sets
    Properties
    Methods
    Objects
    Mutability
    The lycanthrope’s log
    Computing correlation
    Array loops
    The final analysis
    Further arrayology
    Strings and their properties
    Rest parameters
    The Math object
    Destructuring
    Optional property access
    JSON
    Summary
    Exercises

5 Higher-Order Functions
    Abstraction
    Abstracting repetition
    Higher-order functions
    Script data set
    Filtering arrays
    Transforming with map
    Summarizing with reduce
    Composability
    Strings and character codes
    Recognizing text
    Summary
    Exercises

6 The Secret Life of Objects
    Abstract Data Types
    Methods
    Prototypes
    Classes
    Private Properties
... input.

Functions

A lot of the values provided in the default environment have the type function. A function is a piece of program wrapped in a value. Such values can be applied in order to run the wrapped program. For example, in a browser environment, the binding prompt holds a function that shows a little dialog box asking for user input. It is used like this:

    prompt("Enter passcode");

Executing a function is called invoking, calling, or applying it. You can call a function by putting parentheses after an expression that produces a function value. Usually you’ll directly use the name of the binding that holds the function. The values between the parentheses are given to the program inside the function. In the example, the prompt function uses the string that we give it as the text to show in the dialog box. Values given to functions are called arguments. Different functions might need a different number or different types of arguments.

The prompt function isn’t used much in modern web programming, mostly because you have no control over the way the resulting dialog looks, but it can be helpful in toy programs and experiments.

The console.log function

In the examples, I used console.log to output values. Most JavaScript systems (including all modern web browsers and Node.js) provide a console.log function that writes out its arguments to some text output device. In browsers, the output lands in the JavaScript console. This part of the browser interface is hidden by default, but most browsers open it when you press F12 or, on a Mac, command-option-I. If that does not work, search through the menus for an item named Developer Tools or similar.

Though binding names cannot contain period characters, console.log does have one. This is because console.log isn’t a simple binding, but an expression that retrieves the log property from the value held by the console binding. We’ll find out exactly what this means in Chapter 4.

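To make the point about the period in console.log a little more concrete, here is a small sketch. The binding names viaDot, viaBrackets, sameFunction, and kind are made up for this illustration and do not come from the text. It only checks that console.log is an ordinary function value reached through a property of console.

```javascript
// console.log is the `log` property of the value held by the `console` binding.
// The binding names below are illustrative only.
const viaDot = console.log;
const viaBrackets = console["log"];

// Both expressions retrieve the same function value.
const sameFunction = viaDot === viaBrackets;
const kind = typeof console.log;

// console.log accepts several arguments and writes them all out.
console.log("same function:", sameFunction, "type:", kind);
```

The bracket form is shown only to hint at what "retrieving a property" means; Chapter 4 covers it properly.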
Return values

Showing a dialog box or writing text to the screen is a side effect. Many functions are useful because of the side effects they produce. Functions may also produce values, in which case they don’t need to have a side effect to be useful. For example, the function Math.max takes any amount of number arguments and gives back the greatest:

    console.log(Math.max(2, 4));
    // → 4

When a function produces a value, it is said to return that value. Anything that produces a value is an expression in JavaScript, which means that function calls can be used within larger expressions. In the following code, a call to Math.min, which is the opposite of Math.max, is used as part of a plus expression:

    console.log(Math.min(2, 4) + 100);
    // → 102

Chapter 3 will explain how to write your own functions.

Control flow

When your program contains more than one statement, the statements are executed as though they were a story, from top to bottom. For example, the following program has two statements. The first asks the user for a number, and the second, which is executed after the first, shows the square of that number:

    let theNumber = Number(prompt("Pick a number"));
    console.log("Your number is the square root of " +
                theNumber * theNumber);

The function Number converts a value to a number. We need that conversion because the result of prompt is a string value, and we want a number. There are similar functions called String and Boolean that convert values to those types.

Here is the rather trivial schematic representation of straight-line control flow:

    [Diagram: straight-line control flow]

Conditional execution

Not all programs are straight roads. We may, for example, want to create a branching road where the program takes the proper branch based on the situation at hand. This is called conditional execution.

Conditional execution is created with the if keyword in JavaScript. In the simple case, we want some code to be executed if, and only if, a certain condition holds.
We might, for example, want to show the square of the input only if the input is actually a number:

    let theNumber = Number(prompt("Pick a number"));
    if (!Number.isNaN(theNumber)) {
      console.log("Your number is the square root of " +
                  theNumber * theNumber);
    }

With this modification, if you enter “parrot”, no output is shown.

The if keyword executes or skips a statement depending on the value of a Boolean expression. The deciding expression is written after the keyword, between parentheses, followed by the statement to execute.

The Number.isNaN function is a standard JavaScript function that returns true only if the argument it is given is NaN. The Number function happens to return NaN when you give it a string that doesn’t represent a valid number. Thus, the condition translates to “unless theNumber is not-a-number, do this”.

The statement after the if is wrapped in braces ({ and }) in this example. The braces can be used to group any number of statements into a single statement, called a block. You could also have omitted them in this case, since they hold only a single statement, but to avoid having to think about whether they are needed, most JavaScript programmers use them in every wrapped statement like this. We’ll mostly follow that convention in this book, except for the occasional one-liner.

    if (1 + 1 == 2) console.log("It's true");
    // → It's true

You often won’t just have code that executes when a condition holds true, but also code that handles the other case. This alternate path is represented by the second arrow in the diagram. You can use the else keyword, together with if, to create two separate, alternative execution paths:

    let theNumber = Number(prompt("Pick a number"));
    if (!Number.isNaN(theNumber)) {
      console.log("Your number is the square root of " +
                  theNumber * theNumber);
    } else {
      console.log("Hey. Why didn't you give me a number?");
    }

If you have more than two paths to choose from, you can “chain” multiple if/else pairs together.
Here’s an example:

    let num = Number(prompt("Pick a number"));

    if (num [...]

[...] we can now write a program that calculates and shows the value of 2^10 (2 to the 10th power). We use two bindings: one to keep track of our result and one to count how often we have multiplied this result by 2. The loop tests whether the second binding has reached 10 yet and, if not, updates both bindings.

    let result = 1;
    let counter = 0;
    while (counter < 10) {
      result = result * 2;
      counter = counter + 1;
    }

[...] It will become clear what a constructor is in Chapter 6. For now, the important thing is not to be bothered by this apparent lack of consistency.

Comments

Often, raw code does not convey all the information you want a program to convey to human readers, or it conveys it in such a cryptic way that people might not understand it. At other times, you might just want to include some related thoughts as part of your program. This is what comments are for.

A comment is a piece of text that is part of a program but is completely ignored by the computer. JavaScript has two ways of writing comments. To write a single-line comment, you can use two slash characters (//) and then the comment text after it:

    let accountBalance = calculateBalance(account);
    // It's a green hollow where a river sings
    accountBalance.adjust();
    // Madly catching white tatters in the grass.
    let report = new Report();
    // Where the sun on the proud mountain rings:
    addToReport(accountBalance, report);
    // It's a little valley, foaming like light in a glass.

A // comment goes only to the end of the line. A section of text between /* and */ will be ignored in its entirety, regardless of whether it contains line breaks. This is useful for adding blocks of information about a file or a chunk of program:

    /*
      I first found this number scrawled on the back of an old
      notebook. Since then, it has often dropped by, showing up in
      phone numbers and the serial numbers of products that I've
      bought. It obviously likes me, so I've decided to keep it.
    */
    const myNumber = 11213;

Summary

You now know that a program is built out of statements, which themselves sometimes contain more statements. Statements tend to contain expressions, which themselves can be built out of smaller expressions.

Putting statements after one another gives you a program that is executed from top to bottom. You can introduce disturbances in the flow of control by using conditional (if, else, and switch) and looping (while, do, and for) statements.

Bindings can be used to file pieces of data under a name, and they are useful for tracking state in your program. The environment is the set of bindings that are defined. JavaScript systems always put a number of useful standard bindings into your environment.

Functions are special values that encapsulate a piece of program. You can invoke them by writing functionName(argument1, argument2). Such a function call is an expression and may produce a value.

Exercises

If you are unsure how to test your solutions to the exercises, refer to the Introduction.

Each exercise starts with a problem description. Read this description and try to solve the exercise. If you run into problems, consider reading the hints at the end of the book. You can find full solutions to the exercises online at https://eloquentjavascript.net/code. If you want to learn something from the exercises, I recommend looking at the solutions only after you’ve solved the exercise, or at least after you’ve attacked it long and hard enough to have a slight headache.

Looping a triangle

Write a loop that makes seven calls to console.log to output the following triangle:

    #
    ##
    ###
    ####
    #####
    ######
    #######

It may be useful to know that you can find the length of a string by writing .length after it.

    let abc = "abc";
    console.log(abc.length);
    // → 3

FizzBuzz

Write a program that uses console.log to print all the numbers from 1 to 100, with two exceptions.
For numbers divisible by 3, print "Fizz" instead of the number, and for numbers divisible by 5 (and not 3), print "Buzz" instead.

When you have that working, modify your program to print "FizzBuzz" for numbers that are divisible by both 3 and 5 (and still print "Fizz" or "Buzz" for numbers divisible by only one of those).

(This is actually an interview question that has been claimed to weed out a significant percentage of programmer candidates. So if you solved it, your labor market value just went up.)

Chessboard

Write a program that creates a string that represents an 8×8 grid, using newline characters to separate lines. At each position of the grid there is either a space or a "#" character. The characters should form a chessboard.

Passing this string to console.log should show something like this:

     # # # #
    # # # #
     # # # #
    # # # #
     # # # #
    # # # #
     # # # #
    # # # #

When you have a program that generates this pattern, define a binding size = 8 and change the program so that it works for any size, outputting a grid of the given width and height.

“People think that computer science is the art of geniuses but the actual reality is the opposite, just many people doing things that build on each other, like a wall of mini stones.”
—Donald Knuth

Chapter 3
Functions

Functions are one of the most central tools in JavaScript programming. The concept of wrapping a piece of program in a value has many uses. It gives us a way to structure larger programs, to reduce repetition, to associate names with subprograms, and to isolate these subprograms from each other.

The most obvious application of functions is defining new vocabulary. Creating new words in prose is usually bad style, but in programming, it is indispensable.

Typical adult English speakers have some 20,000 words in their vocabulary. Few programming languages come with 20,000 commands built in.
And the vocabulary that is available tends to be more precisely defined, and thus less flexible, than in human language. Therefore, we have to introduce new words to avoid excessive verbosity.

Defining a function

A function definition is a regular binding where the value of the binding is a function. For example, this code defines square to refer to a function that produces the square of a given number:

    const square = function(x) {
      return x * x;
    };

    console.log(square(12));
    // → 144

A function is created with an expression that starts with the keyword function. Functions have a set of parameters (in this case, only x) and a body, which contains the statements that are to be executed when the function is called. The body of a function created this way must always be wrapped in braces, even when it consists of only a single statement.

A function can have multiple parameters or no parameters at all. In the following example, makeNoise does not list any parameter names, whereas roundTo (which rounds n to the nearest multiple of step) lists two:

    const makeNoise = function() {
      console.log("Pling!");
    };

    makeNoise();
    // → Pling!

    const roundTo = function(n, step) {
      let remainder = n % step;
      return n - remainder + (remainder < step / 2 ? 0 : step);
    };

[...] own little world (its local environment) and can often be understood without knowing a lot about what’s going on in the global environment.

Bindings declared with let and const are in fact local to the block in which they are declared, so if you create one of those inside of a loop, the code before and after the loop cannot “see” it. In pre-2015 JavaScript, only functions created new scopes, so old-style bindings, created with the var keyword, are visible throughout the whole function in which they appear—or throughout the global scope, if they are not in a function.


    let x = 10;   // global
    if (true) {
      let y = 20; // local to block
      var z = 30; // also global
    }

Each scope can “look out” into the scope around it, so x is visible inside the block in the example. The exception is when multiple bindings have the same name—in that case, code can see only the innermost one. For example, when the code inside the halve function refers to n, it is seeing its own n, not the global n.

    const halve = function(n) {
      return n / 2;
    };

    let n = 10;
    console.log(halve(100));
    // → 50
    console.log(n);
    // → 10

Nested scope

JavaScript distinguishes not just global and local bindings. Blocks and functions can be created inside other blocks and functions, producing multiple degrees of locality.

For example, this function—which outputs the ingredients needed to make a batch of hummus—has another function inside it:

    const hummus = function(factor) {
      const ingredient = function(amount, unit, name) {
        let ingredientAmount = amount * factor;
        if (ingredientAmount > 1) {
          unit += "s";
        }
        console.log(`${ingredientAmount} ${unit} ${name}`);
      };
      ingredient(1, "can", "chickpeas");
      ingredient(0.25, "cup", "tahini");
      ingredient(0.25, "cup", "lemon juice");
      ingredient(1, "clove", "garlic");
      ingredient(2, "tablespoon", "olive oil");
      ingredient(0.5, "teaspoon", "cumin");
    };

The code inside the ingredient function can see the factor binding from the outer function, but its local bindings, such as unit or ingredientAmount, are not visible in the outer function.

The set of bindings visible inside a block is determined by the place of that block in the program text. Each local scope can also see all the local scopes that contain it, and all scopes can see the global scope. This approach to binding visibility is called lexical scoping.

Functions as values

A function binding usually simply acts as a name for a specific piece of the program. Such a binding is defined once and never changed. This makes it easy to confuse the function and its name. But the two are different.
A function value can do all the things that other values can do—you can use it in arbitrary expressions, not just call it. It is possible to store a function value in a new binding, pass it as an argument to a function, and so on. Similarly, a binding that holds a function is still just a regular binding and can, if not constant, be assigned a new value, like so:

    let launchMissiles = function() {
      missileSystem.launch("now");
    };
    if (safeMode) {
      launchMissiles = function() {/* do nothing */};
    }

In Chapter 5, we’ll discuss the interesting things that we can do by passing around function values to other functions.

Declaration notation

There is a slightly shorter way to create a function binding. When the function keyword is used at the start of a statement, it works differently:

    function square(x) {
      return x * x;
    }

This is a function declaration. The statement defines the binding square and points it at the given function. It is slightly easier to write and doesn’t require a semicolon after the function.

There is one subtlety with this form of function definition.

    console.log("The future says:", future());

    function future() {
      return "You'll never have flying cars";
    }

The preceding code works, even though the function is defined below the code that uses it. Function declarations are not part of the regular top-to-bottom flow of control. They are conceptually moved to the top of their scope and can be used by all the code in that scope. This is sometimes useful because it offers the freedom to order code in a way that seems the clearest, without worrying about having to define all functions before they are used.

Arrow functions

There’s a third notation for functions, which looks very different from the others.
Instead of the function keyword, it uses an arrow (=>) made up of an equal sign and a greater-than character (not to be confused with the greater-than-or-equal operator, which is written >=):

    const roundTo = (n, step) => {
      let remainder = n % step;
      return n - remainder + (remainder < step / 2 ? 0 : step);
    };

[...]

    [...] { return x * x; };
    const square2 = x => x * x;

When an arrow function has no parameters at all, its parameter list is just an empty set of parentheses.

    const horn = () => {
      console.log("Toot");
    };

There’s no deep reason to have both arrow functions and function expressions in the language. Apart from a minor detail, which we’ll discuss in Chapter 6, they do the same thing. Arrow functions were added in 2015, mostly to make it possible to write small function expressions in a less verbose way. We’ll use them often in Chapter 5.

The call stack

The way control flows through functions is somewhat involved. Let’s take a closer look at it. Here is a simple program that makes a few function calls:

    function greet(who) {
      console.log("Hello " + who);
    }
    greet("Harry");
    console.log("Bye");

A run through this program goes roughly like this: the call to greet causes control to jump to the start of that function (line 2). The function calls console.log, which takes control, does its job, and then returns control to line 2. There, it reaches the end of the greet function, so it returns to the place that called it—line 4. The line after that calls console.log again. After that returns, the program reaches its end.

We could show the flow of control schematically like this:

    not in function
      in greet
        in console.log
      in greet
    not in function
      in console.log
    not in function

Because a function has to jump back to the place that called it when it returns, the computer must remember the context from which the call happened. In one case, console.log has to return to the greet function when it is done. In the other case, it returns to the end of the program.
The place where the computer stores this context is the call stack. Every time a function is called, the current context is stored on top of this stack. When a function returns, it removes the top context from the stack and uses that context to continue execution.

Storing this stack requires space in the computer’s memory. When the stack grows too big, the computer will fail with a message like “out of stack space” or “too much recursion”. The following code illustrates this by asking the computer a really hard question that causes an infinite back-and-forth between two functions. Or rather, it would be infinite, if the computer had an infinite stack. As it is, we will run out of space, or “blow the stack”.

    function chicken() {
      return egg();
    }
    function egg() {
      return chicken();
    }
    console.log(chicken() + " came first.");
    // → ??

Optional Arguments

The following code is allowed and executes without any problem:

    function square(x) { return x * x; }
    console.log(square(4, true, "hedgehog"));
    // → 16

We defined square with only one parameter. Yet when we call it with three, the language doesn’t complain. It ignores the extra arguments and computes the square of the first one.

JavaScript is extremely broad-minded about the number of arguments you can pass to a function. If you pass too many, the extra ones are ignored. If you pass too few, the missing parameters are assigned the value undefined.

The downside of this is that it is possible—likely, even—that you’ll accidentally pass the wrong number of arguments to functions. And no one will tell you about it. The upside is that you can use this behavior to allow a function to be called with different numbers of arguments.
For example, this minus function tries to imitate the - operator by acting on either one or two arguments:

    function minus(a, b) {
      if (b === undefined) return -a;
      else return a - b;
    }

    console.log(minus(10));
    // → -10
    console.log(minus(10, 5));
    // → 5

If you write an = operator after a parameter, followed by an expression, the value of that expression will replace the argument when it is not given. For example, this version of roundTo makes its second argument optional. If you don’t provide it or pass the value undefined, it will default to one:

    function roundTo(n, step = 1) {
      let remainder = n % step;
      return n - remainder + (remainder < step / 2 ? 0 : step);
    }

[...]

    [...] local;
    }

    let wrap1 = wrapValue(1);
    let wrap2 = wrapValue(2);
    console.log(wrap1());
    // → 1
    console.log(wrap2());
    // → 2

This is allowed and works as you’d hope—both instances of the binding can still be accessed. This situation is a good demonstration of the fact that local bindings are created anew for every call, and different calls don’t affect each other’s local bindings.

This feature—being able to reference a specific instance of a local binding in an enclosing scope—is called closure. A function that references bindings from local scopes around it is called a closure. This behavior not only frees you from having to worry about the lifetimes of bindings but also makes it possible to use function values in some creative ways.

With a slight change, we can turn the previous example into a way to create functions that multiply by an arbitrary amount:

    function multiplier(factor) {
      return number => number * factor;
    }

    let twice = multiplier(2);
    console.log(twice(5));
    // → 10

The explicit local binding from the wrapValue example isn’t really needed, since a parameter is itself a local binding.

Thinking about programs like this takes some practice. A good mental model is to think of function values as containing both the code in their body and the environment in which they are created.
When called, the function body sees the environment in which it was created, not the environment in which it is called.

In the previous example, multiplier is called and creates an environment in which its factor parameter is bound to 2. The function value it returns, which is stored in twice, remembers this environment so that when that is called, it multiplies its argument by 2.

Recursion

It is perfectly okay for a function to call itself, as long as it doesn’t do it so often that it overflows the stack. A function that calls itself is called recursive. Recursion allows some functions to be written in a different style. Take, for example, this power function, which does the same as the ** (exponentiation) operator:

    function power(base, exponent) {
      if (exponent == 0) {
        return 1;
      } else {
        return base * power(base, exponent - 1);
      }
    }

    console.log(power(2, 3));
    // → 8

This is rather close to the way mathematicians define exponentiation and arguably describes the concept more clearly than the loop we used in Chapter 2. The function calls itself multiple times with ever smaller exponents to achieve the repeated multiplication.

This implementation has one problem, however: in typical JavaScript implementations, it’s about three times slower than a version using a for loop. Running through a simple loop is generally cheaper than calling a function multiple times.

The dilemma of speed versus elegance is an interesting one. You can see it as a kind of continuum between human-friendliness and machine-friendliness. Almost any program can be made faster by making it bigger and more convoluted. The programmer has to find an appropriate balance.

In the case of the power function, an inelegant (looping) version is still fairly simple and easy to read. It doesn’t make much sense to replace it with a recursive function. Often, though, a program deals with such complex concepts that giving up some efficiency in order to make the program more straightforward is helpful.

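For comparison with the recursive power function, a looping version might look like the sketch below. The name powerLoop and the binding names inside it are made up for this illustration, and the sketch assumes a non-negative whole-number exponent, just as the recursive version does.

```javascript
// Looping counterpart to the recursive power function.
// Assumes exponent is a non-negative integer (hypothetical helper name).
function powerLoop(base, exponent) {
  let result = 1;
  // Multiply by base once for each step counted off by the exponent.
  for (let count = 0; count < exponent; count++) {
    result *= base;
  }
  return result;
}

console.log(powerLoop(2, 3));
// → 8
console.log(powerLoop(2, 10));
// → 1024
```

Neither version is hard to read here, which is why the choice between them matters so little for this particular function.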
Worrying about efficiency can be a distraction. It’s yet another factor that complicates program design, and when you’re doing something that’s already difficult, that extra thing to worry about can be paralyzing.

Therefore, you should generally start by writing something that’s correct and easy to understand. If you’re worried that it’s too slow (which it usually isn’t, since most code simply isn’t executed often enough to take any significant amount of time), you can measure afterward and improve it if necessary.

Recursion is not always just an inefficient alternative to looping. Some problems really are easier to solve with recursion than with loops. Most often these are problems that require exploring or processing several “branches”, each of which might branch out again into even more branches.

Consider this puzzle: by starting from the number 1 and repeatedly either adding 5 or multiplying by 3, an infinite set of numbers can be produced. How would you write a function that, given a number, tries to find a sequence of such additions and multiplications that produces that number? For example, the number 13 could be reached by first multiplying by 3 and then adding 5 twice, whereas the number 15 cannot be reached at all.

Here is a recursive solution:

    function findSolution(target) {
      function find(current, history) {
        if (current == target) {
          return history;
        } else if (current > target) {
          return null;
        } else {
          return find(current + 5, `(${history} + 5)`) ??
                 find(current * 3, `(${history} * 3)`);
        }
      }
      return find(1, "1");
    }

    console.log(findSolution(24));
    // → (((1 * 3) + 5) * 3)

Note that this program doesn’t necessarily find the shortest sequence of operations. It is satisfied when it finds any sequence at all.

It’s okay if you don’t see how this code works right away. Let’s work through it, since it makes for a great exercise in recursive thinking.

The inner function find does the actual recursing.
It takes two arguments: the current number and a string that records how we reached this number. If it finds a solution, it returns a string that shows how to get to the target. If it can find no solution starting from this number, it returns null.

To do this, the function performs one of three actions. If the current number is the target number, the current history is a way to reach that target, so it is returned. If the current number is greater than the target, there’s no sense in further exploring this branch because both adding and multiplying will only make the number bigger, so it returns null. Finally, if we’re still below the target number, the function tries both possible paths that start from the current number by calling itself twice, once for addition and once for multiplication. If the first call returns something that is not null, it is returned. Otherwise, the second call is returned, regardless of whether it produces a string or null.

To better understand how this function produces the effect we’re looking for, let’s look at all the calls to find that are made when searching for a solution for the number 13:

    find(1, "1")
      find(6, "(1 + 5)")
        find(11, "((1 + 5) + 5)")
          find(16, "(((1 + 5) + 5) + 5)")
            too big
          find(33, "(((1 + 5) + 5) * 3)")
            too big
        find(18, "((1 + 5) * 3)")
          too big
      find(3, "(1 * 3)")
        find(8, "((1 * 3) + 5)")
          find(13, "(((1 * 3) + 5) + 5)")
            found!

The indentation indicates the depth of the call stack. The first time find is called, the function starts by calling itself to explore the solution that starts with (1 + 5). That call will further recurse to explore every continued solution that yields a number less than or equal to the target number. Since it doesn’t find one that hits the target, it returns null back to the first call. There the ?? operator causes the call that explores (1 * 3) to happen. This search has more luck: its first recursive call, through yet another recursive call, hits upon the target number.
That innermost call returns a string, and each of the ?? operators in the intermediate calls passes that string along, ultimately returning the solution.

Growing functions

There are two more or less natural ways for functions to be introduced into programs.

The first occurs when you find yourself writing similar code multiple times. You’d prefer not to do that, as having more code means more space for mistakes to hide and more material to read for people trying to understand the program. So you take the repeated functionality, find a good name for it, and put it into a function.

The second way is that you find you need some functionality that you haven’t written yet and that sounds like it deserves its own function. You start by naming the function, then write its body. You might even start writing code that uses the function before you actually define the function itself.

How difficult it is to find a good name for a function is a good indication of how clear a concept it is that you’re trying to wrap. Let’s go through an example.

We want to write a program that prints two numbers: the numbers of cows and chickens on a farm, with the words Cows and Chickens after them and zeros padded before both numbers so that they are always three digits long:

    007 Cows
    011 Chickens

This asks for a function of two arguments: the number of cows and the number of chickens. Let’s get coding.

    function printFarmInventory(cows, chickens) {
      let cowString = String(cows);
      while (cowString.length [...]

    [...] a % 3;

A key part of understanding functions is understanding scopes. Each block creates a new scope. Parameters and bindings declared in a given scope are local and not visible from the outside. Bindings declared with var behave differently: they end up in the nearest function scope or the global scope.

Separating the tasks your program performs into different functions is helpful.
You won’t have to repeat yourself as much, and functions can help organize a program by grouping code into pieces that do specific things.

Exercises

Minimum

The previous chapter introduced the standard function Math.min that returns its smallest argument. We can write a function like that ourselves now. Define the function min that takes two arguments and returns their minimum.

Recursion

We’ve seen that we can use % (the remainder operator) to test whether a number is even or odd by using % 2 to see whether it’s divisible by two. Here’s another way to define whether a positive whole number is even or odd:

• Zero is even.
• One is odd.
• For any other number N, its evenness is the same as N - 2.

Define a recursive function isEven corresponding to this description. The function should accept a single parameter (a positive, whole number) and return a Boolean.

Test it on 50 and 75. See how it behaves on -1. Why? Can you think of a way to fix this?

Bean counting

You can get the Nth character, or letter, from a string by writing [N] after the string (for example, string[2]). The resulting value will be a string containing only one character (for example, "b"). The first character has position 0, which causes the last one to be found at position string.length - 1. In other words, a two-character string has length 2, and its characters have positions 0 and 1.

Write a function countBs that takes a string as its only argument and returns a number that indicates how many uppercase B characters there are in the string.

Next, write a function called countChar that behaves like countBs, except it takes a second argument that indicates the character that is to be counted (rather than counting only uppercase B characters). Rewrite countBs to make use of this new function.

“On two occasions I have been asked, ‘Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?’ [...]
I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.”
-- Charles Babbage, Passages from the Life of a Philosopher (1864)

Chapter 4
Data Structures: Objects and Arrays

Numbers, Booleans, and strings are the atoms from which data structures are built. Many types of information require more than one atom, though. Objects allow us to group values, including other objects, to build more complex structures.

The programs we have built so far have been limited by the fact that they were operating only on simple data types. After learning the basics of data structures in this chapter, you’ll know enough to start writing useful programs.

The chapter will work through a more or less realistic programming example, introducing concepts as they apply to the problem at hand. The example code will often build on functions and bindings introduced earlier in the book.

The online coding sandbox for the book (https://eloquentjavascript.net/code) provides a way to run code in the context of a particular chapter. If you decide to work through the examples in another environment, be sure to first download the full code for this chapter from the sandbox page.

The weresquirrel

Every now and then, usually between 8 p.m. and 10 p.m., Jacques finds himself transforming into a small furry rodent with a bushy tail.

On one hand, Jacques is quite glad that he doesn’t have classic lycanthropy. Turning into a squirrel does cause fewer problems than turning into a wolf. Instead of having to worry about accidentally eating the neighbor (that would be awkward), he worries about being eaten by the neighbor’s cat. After two occasions of waking up on a precariously thin branch in the crown of an oak, naked and disoriented, he has taken to locking the doors and windows of his room at night and putting a few walnuts on the floor to keep himself busy.

But Jacques would prefer to get rid of his condition entirely.
The irregular occurrences of the transformation make him suspect that they might be triggered by something. For a while, he believed that it happened only on days when he had been near oak trees. However, avoiding oak trees did not solve the problem.

Switching to a more scientific approach, Jacques has started keeping a daily log of everything he does on a given day and whether he changed form. With this data he hopes to narrow down the conditions that trigger the transformations.

The first thing he needs is a data structure to store this information.

Data sets

To work with a chunk of digital data, we first have to find a way to represent it in our machine’s memory. Say, for example, that we want to represent a collection of the numbers 2, 3, 5, 7, and 11.

We could get creative with strings (after all, strings can have any length, so we can put a lot of data into them) and use "2 3 5 7 11" as our representation. But this is awkward. We’d have to somehow extract the digits and convert them back to numbers to access them.

Fortunately, JavaScript provides a data type specifically for storing sequences of values. It is called an array and is written as a list of values between square brackets, separated by commas:

    let listOfNumbers = [2, 3, 5, 7, 11];
    console.log(listOfNumbers[2]);
    // → 5
    console.log(listOfNumbers[0]);
    // → 2
    console.log(listOfNumbers[2 - 1]);
    // → 3

The notation for getting at the elements inside an array also uses square brackets. A pair of square brackets immediately after an expression, with another expression inside of them, will look up the element in the left-hand expression that corresponds to the index given by the expression in the brackets.

The first index of an array is zero, not one, so the first element is retrieved with listOfNumbers[0]. Zero-based counting has a long tradition in technology and in certain ways makes a lot of sense, but it takes some getting used to.
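A small sketch of zero-based indexing, reusing the listOfNumbers array from above. Reading the last element through length - 1 follows directly from the indexing rule; the out-of-range read is included here as an extra illustration and is not part of the original example.

```javascript
let listOfNumbers = [2, 3, 5, 7, 11];

// The first element lives at index 0.
console.log(listOfNumbers[0]);
// → 2

// The last element lives at index length - 1.
console.log(listOfNumbers[listOfNumbers.length - 1]);
// → 11

// Reading past the end does not fail; it produces undefined.
console.log(listOfNumbers[listOfNumbers.length]);
// → undefined
```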
Think of the index as the number of items to skip, counting from the start of the array.

Properties

We’ve seen a few expressions like myString.length (to get the length of a string) and Math.max (the maximum function) in past chapters. These expressions access a property of some value. In the first case, we access the length property of the value in myString. In the second, we access the property named max in the Math object (which is a collection of mathematics-related constants and functions).

Almost all JavaScript values have properties. The exceptions are null and undefined. If you try to access a property on one of these nonvalues, you get an error:

    null.length;
    // → TypeError: null has no properties

The two main ways to access properties in JavaScript are with a dot and with square brackets. Both value.x and value[x] access a property on value, but not necessarily the same property. The difference is in how x is interpreted. When using a dot, the word after the dot is the literal name of the property. When using square brackets, the expression between the brackets is evaluated to get the property name. Whereas value.x fetches the property of value named “x”, value[x] takes the value of the variable named x and uses that, converted to a string, as the property name.

If you know that the property in which you are interested is called color, you say value.color. If you want to extract the property named by the value held in the binding i, you say value[i]. Property names are strings. They can be any string, but the dot notation works only with names that look like valid binding names: starting with a letter or underscore, and containing only letters, numbers, and underscores. If you want to access a property named 2 or John Doe, you must use square brackets: value[2] or value["John Doe"].

The elements in an array are stored as the array’s properties, using numbers as property names.
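The difference between the two notations, and the way array elements are stored under numeric property names, can be sketched as follows. The object and binding names (paint, x, letters) are made up for illustration; Object.keys is the standard function that lists an object’s property names.

```javascript
let paint = {color: "red", "John Doe": 42};
let x = "color";

// Dot notation uses the literal name after the dot.
console.log(paint.color);
// → red
console.log(paint.x);
// → undefined (there is no property literally named "x")

// Bracket notation evaluates the expression to find the name.
console.log(paint[x]);
// → red
console.log(paint["John Doe"]);
// → 42

// Array elements are properties whose names are the indices, as strings.
let letters = ["a", "b", "c"];
console.log(Object.keys(letters));
// → ["0", "1", "2"]
console.log(letters[1] === letters["1"]);
// → true
```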
Because you can’t use the dot notation with numbers and usually want to use a binding that holds the index anyway, you have to use the bracket notation to get at them.

Just like strings, arrays have a length property that tells us how many elements the array has.

Methods

Both string and array values contain, in addition to the length property, a number of properties that hold function values.

    let doh = "Doh";
    console.log(typeof doh.toUpperCase);
    // → function
    console.log(doh.toUpperCase());
    // → DOH

Every string has a toUpperCase property. When called, it will return a copy of the string in which all letters have been converted to uppercase. There is also toLowerCase, going the other way.

Interestingly, even though the call to toUpperCase does not pass any arguments, the function somehow has access to the string "Doh", the value whose property we called. You’ll find out how this works in Chapter 6.

Properties that contain functions are generally called methods of the value they belong to, as in “toUpperCase is a method of a string”.

This example demonstrates two methods you can use to manipulate arrays:

    let sequence = [1, 2, 3];
    sequence.push(4);
    sequence.push(5);
    console.log(sequence);
    // → [1, 2, 3, 4, 5]
    console.log(sequence.pop());
    // → 5
    console.log(sequence);
    // → [1, 2, 3, 4]

The push method adds values to the end of an array. The pop method does the opposite, removing the last value in the array and returning it.

These somewhat silly names are the traditional terms for operations on a stack. A stack, in programming, is a data structure that allows you to push values into it and pop them out again in the opposite order so that the thing that was added last is removed first. Stacks are common in programming; you might remember the function call stack from the previous chapter, which is an instance of the same idea.

Objects

Back to the weresquirrel.
A set of daily log entries can be represented as an array, but the entries do not consist of just a number or a string: each entry needs to store a list of activities and a Boolean value that indicates whether Jacques turned into a squirrel or not. Ideally, we would like to group these together into a single value and then put those grouped values into an array of log entries.

Values of the type object are arbitrary collections of properties. One way to create an object is by using braces as an expression:

    let day1 = {
      squirrel: false,
      events: ["work", "touched tree", "pizza", "running"]
    };
    console.log(day1.squirrel);
    // → false
    console.log(day1.wolf);
    // → undefined
    day1.wolf = false;
    console.log(day1.wolf);
    // → false

Inside the braces, you write a list of properties separated by commas. Each property has a name followed by a colon and a value. When an object is written over multiple lines, indenting it as shown in this example helps with readability. Properties whose names aren’t valid binding names or valid numbers must be quoted:

    let descriptions = {
      work: "Went to work",
      "touched tree": "Touched a tree"
    };

This means that braces have two meanings in JavaScript. At the start of a statement, they begin a block of statements. In any other position, they describe an object. Fortunately, it is rarely useful to start a statement with an object in braces, so the ambiguity between these two is not much of a problem. The one case where this does come up is when you want to return an object from a shorthand arrow function: you can’t write n => {prop: n}, since the braces will be interpreted as a function body. Instead, you have to put a set of parentheses around the object to make it clear that it is an expression.

Reading a property that doesn’t exist will give you the value undefined.

It is possible to assign a value to a property expression with the = operator.
This will replace the property’s value if it already existed or create a new property on the object if it didn’t.

To briefly return to our tentacle model of bindings: property bindings are similar. They grasp values, but other bindings and properties might be holding onto those same values. You can think of objects as octopuses with any number of tentacles, each of which has a name written on it.

The delete operator cuts off a tentacle from such an octopus. It is a unary operator that, when applied to an object property, will remove the named property from the object. This is not a common thing to do, but it is possible.

    let anObject = {left: 1, right: 2};
    console.log(anObject.left);
    // → 1
    delete anObject.left;
    console.log(anObject.left);
    // → undefined
    console.log("left" in anObject);
    // → false
    console.log("right" in anObject);
    // → true

The binary in operator, when applied to a string and an object, tells you whether that object has a property with that name. The difference between setting a property to undefined and actually deleting it is that in the first case, the object still has the property (it just doesn’t have a very interesting value), whereas in the second case the property is no longer present and in will return false.

To find out what properties an object has, you can use the Object.keys function. Give the function an object and it will return an array of strings: the object’s property names.

    console.log(Object.keys({x: 0, y: 0, z: 2}));
    // → ["x", "y", "z"]

There’s an Object.assign function that copies all properties from one object into another:

    let objectA = {a: 1, b: 2};
    Object.assign(objectA, {b: 3, c: 4});
    console.log(objectA);
    // → {a: 1, b: 3, c: 4}

Arrays, then, are just a kind of object specialized for storing sequences of things. If you evaluate typeof [], it produces "object". You can visualize arrays as long, flat octopuses with all their tentacles in a neat row, labeled with numbers.
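The distinction between an undefined property and a deleted one, described above, can be seen in a short sketch. The settings object and its property names are hypothetical, chosen only for this illustration; the last part shows that the same operators work on arrays, since arrays are objects.

```javascript
let settings = {volume: 7, theme: "dark"};

// Setting a property to undefined keeps the property on the object.
settings.volume = undefined;
console.log("volume" in settings);
// → true
console.log(Object.keys(settings));
// → ["volume", "theme"]

// Deleting removes the property itself.
delete settings.volume;
console.log("volume" in settings);
// → false
console.log(Object.keys(settings));
// → ["theme"]

// Arrays are objects too, so the same tools apply to them.
let list = ["a", "b"];
console.log(typeof list);
// → object
console.log(1 in list);
// → true
console.log(2 in list);
// → false
```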
We will represent the journal that Jacques keeps as an array of objects:

    let journal = [
      {events: ["work", "touched tree", "pizza",
                "running", "television"],
       squirrel: false},
      {events: ["work", "ice cream", "cauliflower",
                "lasagna", "touched tree", "brushed teeth"],
       squirrel: false},
      {events: ["weekend", "cycling", "break", "peanuts",
                "beer"],
       squirrel: true},
      /* and so on... */
    ];

Mutability

We will get to actual programming soon, but first, there’s one more piece of theory to understand.

We saw that object values can be modified. The types of values discussed in earlier chapters, such as numbers, strings, and Booleans, are all immutable: it is impossible to change values of those types. You can combine them and derive new values from them, but when you take a specific string value, that value will always remain the same. The text inside it cannot be changed. If you have a string that contains "cat", it is not possible for other code to change a character in your string to make it spell "rat".

Objects work differently. You can change their properties, causing a single object value to have different content at different times.

When we have two numbers, 120 and 120, we can consider them precisely the same number, whether or not they refer to the same physical bits. With objects, there is a difference between having two references to the same object and having two different objects that contain the same properties. Consider the following code:

    let object1 = {value: 10};
    let object2 = object1;
    let object3 = {value: 10};

    console.log(object1 == object2);
    // → true
    console.log(object1 == object3);
    // → false

    object1.value = 15;
    console.log(object2.value);
    // → 15
    console.log(object3.value);
    // → 10

The object1 and object2 bindings grasp the same object, which is why changing object1 also changes the value of object2. They are said to have the same identity.
The binding object3 points to a different object, which initially contains the same properties as object1 but lives a separate life.

Bindings can also be changeable or constant, but this is separate from the way their values behave. Even though number values don’t change, you can use a let binding to keep track of a changing number by changing the value at which the binding points. Similarly, though a const binding to an object can itself not be changed and will continue to point at the same object, the contents of that object might change.

    const score = {visitors: 0, home: 0};
    // This is okay
    score.visitors = 1;
    // This isn't allowed
    score = {visitors: 1, home: 1};

When you compare objects with JavaScript’s == operator, it compares by identity: it will produce true only if both objects are precisely the same value. Comparing different objects will return false, even if they have identical properties. There is no “deep” comparison operation built into JavaScript that compares objects by contents, but it is possible to write it yourself (which is one of the exercises at the end of this chapter).

The lycanthrope's log

Jacques starts up his JavaScript interpreter and sets up the environment he needs to keep his journal:

    let journal = [];

    function addEntry(events, squirrel) {
      journal.push({events, squirrel});
    }

Note that the object added to the journal looks a little odd. Instead of declaring properties like events: events, it just gives a property name: events. This is shorthand that means the same thing: if a property name in brace notation isn’t followed by a value, its value is taken from the binding with the same name.
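The points about const bindings, identity, and property shorthand can be combined in one short sketch. The binding names below are illustrative only. The try/catch relies on the fact that assigning to a const binding throws a TypeError at run time, which is what makes it possible to observe the failure without stopping the program.

```javascript
const score = {visitors: 0, home: 0};

// Changing the contents of the object is fine.
score.visitors = 1;
console.log(score.visitors);
// → 1

// Pointing the const binding at another object is not.
let failed = false;
try {
  score = {visitors: 1, home: 1};
} catch (error) {
  failed = error instanceof TypeError;
}
console.log(failed);
// → true

// Shorthand: {events, squirrel} means {events: events, squirrel: squirrel}.
let events = ["work", "pizza"];
let squirrel = false;
let entry = {events, squirrel};
console.log(entry.events === events);
// → true (the same array, so the same identity)
console.log(entry == {events, squirrel});
// → false (a different object with the same contents)
```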
Every evening at 10 p.m. (or sometimes the next morning, after climbing down from the top shelf of his bookcase) Jacques records the day:

    addEntry(["work", "touched tree", "pizza", "running",
              "television"], false);
    addEntry(["work", "ice cream", "cauliflower", "lasagna",
              "touched tree", "brushed teeth"], false);
    addEntry(["weekend", "cycling", "break", "peanuts",
              "beer"], true);

Once he has enough data points, he intends to use statistics to find out which of these events may be related to the squirrelifications.

Correlation is a measure of dependence between statistical variables. A statistical variable is not quite the same as a programming variable. In statistics you typically have a set of measurements, and each variable is measured for every measurement. Correlation between variables is usually expressed as a value that ranges from -1 to 1. Zero correlation means the variables are not related. A correlation of 1 indicates that the two are perfectly related: if you know one, you also know the other. Negative 1 also means that the variables are perfectly related but are opposites: when one is true, the other is false.

To compute the measure of correlation between two Boolean variables, we can use the phi coefficient (φ). This is a formula whose input is a frequency table containing the number of times the different combinations of the variables were observed. The output of the formula is a number between -1 and 1 that describes the correlation.

We could take the event of eating pizza and put that in a frequency table like this, where each number indicates the number of times that combination occurred in our measurements.

                   No pizza   Pizza
    No squirrel       76        9
    Squirrel           4        1

If we call that table n, we can compute φ using the following formula:

\[
\varphi = \frac{n_{11} n_{00} - n_{10} n_{01}}{\sqrt{n_{1\bullet}\, n_{0\bullet}\, n_{\bullet 1}\, n_{\bullet 0}}} \tag{4.1}
\]

(If at this point you’re putting the book down to focus on a terrible flashback to 10th grade math class, hold on!
I do not intend to torture you with endless pages of cryptic notation; it’s just this one formula for now. And even with this one, all we do is turn it into JavaScript.)

The notation $n_{01}$ indicates the number of measurements where the first variable (squirrelness) is false (0) and the second variable (pizza) is true (1). In the pizza table, $n_{01}$ is 9.

The value $n_{1\bullet}$ refers to the sum of all measurements where the first variable is true, which is 5 in the example table. Likewise, $n_{\bullet 0}$ refers to the sum of the measurements where the second variable is false.

So for the pizza table, the part above the division line (the dividend) would be 1 × 76 - 4 × 9 = 40, and the part below it (the divisor) would be the square root of 5 × 85 × 10 × 80, or √340,000. This comes out to φ ≈ 0.069, which is tiny. Eating pizza does not appear to have influence on the transformations.

Computing correlation

We can represent a two-by-two table in JavaScript with a four-element array ([76, 9, 4, 1]). We could also use other representations, such as an array containing two two-element arrays ([[76, 9], [4, 1]]) or an object with property names like "11" and "01", but the flat array is simple and makes the expressions that access the table pleasantly short. We’ll interpret the indices to the array as two-bit binary numbers, where the leftmost (most significant) digit refers to the squirrel variable and the rightmost (least significant) digit refers to the event variable. For example, the binary number 10 refers to the case where Jacques did turn into a squirrel, but the event (say, “pizza”) didn’t occur. This happened four times. And since binary 10 is 2 in decimal notation, we will store this number at index 2 of the array.
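The indexing scheme and the arithmetic for the pizza table can be checked with a short sketch. The helper name tableIndex is hypothetical and introduced only for this illustration; the numbers are the ones from the frequency table above.

```javascript
// Hypothetical helper: squirrel is the most significant bit,
// the event is the least significant bit.
function tableIndex(squirrel, event) {
  return (squirrel ? 2 : 0) + (event ? 1 : 0);
}

let pizzaTable = [76, 9, 4, 1];
console.log(tableIndex(true, false));
// → 2
console.log(pizzaTable[tableIndex(true, false)]);
// → 4

// Dividend and divisor of the phi formula, written out for this table.
let dividend = 1 * 76 - 4 * 9;
let divisor = Math.sqrt(5 * 85 * 10 * 80);
console.log(dividend);
// → 40
console.log((dividend / divisor).toFixed(3));
// → 0.069
```

Writing the index as a sum of bit values keeps the mapping between the table cells and the array positions explicit, at the cost of one extra function call per lookup.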
This is the function that computes the φ coefficient from such an array:

    function phi(table) {
      return (table[3] * table[0] - table[2] * table[1]) /
        Math.sqrt((table[2] + table[3]) *
                  (table[0] + table[1]) *
                  (table[1] + table[3]) *
                  (table[0] + table[2]));
    }

    console.log(phi([76, 9, 4, 1]));
    // → 0.068599434

This is a direct translation of the φ formula into JavaScript. Math.sqrt is the square root function, as provided by the Math object in a standard JavaScript environment. We have to add two fields from the table to get fields like $n_{1\bullet}$ because the sums of rows or columns are not stored directly in our data structure.

Jacques keeps his journal for three months. The resulting data set is available in the coding sandbox for this chapter (https://eloquentjavascript.net/code#4), where it is stored in the JOURNAL binding, and in a downloadable file.

To extract a two-by-two table for a specific event from the journal, we must loop over all the entries and tally how many times the event occurs in relation to squirrel transformations:

    function tableFor(event, journal) {
      let table = [0, 0, 0, 0];
      for (let i = 0; i [...]

[...] To do that, we first need to find every type of event.

    function journalEvents(journal) {
      let events = [];
      for (let entry of journal) {
        for (let event of entry.events) {
          if (!events.includes(event)) {
            events.push(event);
          }
        }
      }
      return events;
    }

    console.log(journalEvents(JOURNAL));
    // → ["carrot", "exercise", "weekend", "bread", …]

By adding any event names that aren’t already in it to the events array, the function collects every type of event.

Using that function, we can see all the correlations:

    for (let event of journalEvents(JOURNAL)) {
      console.log(event + ":", phi(tableFor(event, JOURNAL)));
    }
    // → carrot: 0.0140970969
    // → exercise: 0.0685994341
    // → weekend: 0.1371988681
    // → bread: -0.0757554019
    // → pudding: -0.0648203724
    // and so on...

Most correlations seem to lie close to zero.
Eating carrots, bread, or pudding apparently does not trigger squirrel-lycanthropy. The transformations do seem to occur somewhat more often on weekends. Let’s filter the results to show only correlations greater than 0.1 or less than -0.1:

    for (let event of journalEvents(JOURNAL)) {
      let correlation = phi(tableFor(event, JOURNAL));
      if (correlation > 0.1 || correlation [...]

    [...] result) result = number;
      }
      return result;
    }
    console.log(max(4, 1, 9, -2));
    // → 9

When such a function is called, the rest parameter is bound to an array containing all further arguments. If there are other parameters before it, their values aren’t part of that array. When, as in max, [...]
. . . . . . . . . . . . . . . . . . . . . . . . . . . 181 Async functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 Generators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 A Corvid Art Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 The event loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 Asynchronous bugs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 12 Project: A Programming Language 193 Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 The evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 Special forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 The environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 Compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 Cheating . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 13 JavaScript and the Browser 206 Networks and the Internet . . . . . . . . . . . . . . . . . . . . . . . . . 206 The Web . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 HTML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 HTML and JavaScript . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 vi In the sandbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212 Compatibility and the browser wars . . . . . . . . . . . . . . . . . . . 212 14 The Document Object Model 214 Document structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214 Trees . . . . . . . . . . . . . 
  The standard
  Moving through the tree
  Finding elements
  Changing the document
  Creating nodes
  Attributes
  Layout
  Styling

[…] it is the only parameter, it will hold all arguments.

You can use a similar three-dot notation to call a function with an array of arguments:

  let numbers = [5, 1, 7];
  console.log(max(...numbers));
  // → 7

This “spreads” out the array into the function call, passing its elements as separate arguments. It is possible to include an array like that along with other arguments, as in max(9, ...numbers, 2).

Square bracket array notation similarly allows the triple-dot operator to spread another array into the new array:

  let words = ["never", "fully"];
  console.log(["will", ...words, "understand"]);
  // → ["will", "never", "fully", "understand"]

This works even in curly brace objects, where it adds all properties from another object. If a property is added multiple times, the last value to be added wins:

  let coordinates = {x: 10, y: 0};
  console.log({...coordinates, y: 5, z: 1});
  // → {x: 10, y: 5, z: 1}

The Math object

As we’ve seen, Math is a grab bag of number-related utility functions such as Math.max (maximum), Math.min (minimum), and Math.sqrt (square root).

The Math object is used as a container to group a bunch of related functionality. There is only one Math object, and it is almost never useful as a value.
Rather, it provides a namespace so that all these functions and values do not have to be global bindings.

Having too many global bindings “pollutes” the namespace. The more names have been taken, the more likely you are to accidentally overwrite the value of some existing binding. For example, it’s not unlikely you’ll want to name something max in one of your programs. Since JavaScript’s built-in max function is tucked safely inside the Math object, you don’t have to worry about overwriting it.

Many languages will stop you, or at least warn you, when you are defining a binding with a name that is already taken. JavaScript does this for bindings you declared with let or const but, perversely, not for standard bindings nor for bindings declared with var or function.

Back to the Math object. If you need to do trigonometry, Math can help. It contains cos (cosine), sin (sine), and tan (tangent), as well as their inverse functions, acos, asin, and atan, respectively. The number π (pi), or at least the closest approximation that fits in a JavaScript number, is available as Math.PI. There is an old programming tradition of writing the names of constant values in all caps:

  function randomPointOnCircle(radius) {
    let angle = Math.random() * 2 * Math.PI;
    return {x: radius * Math.cos(angle),
            y: radius * Math.sin(angle)};
  }
  console.log(randomPointOnCircle(2));
  // → {x: 0.3667, y: 1.966}

If you’re not familiar with sines and cosines, don’t worry. I’ll explain them when they are used in this book, in Chapter 14.

The previous example used Math.random.
This is a function that returns a new pseudorandom number between zero (inclusive) and one (exclusive) every time you call it:

  console.log(Math.random());
  // → 0.36993729369714856
  console.log(Math.random());
  // → 0.727367032552138
  console.log(Math.random());
  // → 0.40180766698904335

Though computers are deterministic machines (they always react the same way if given the same input), it is possible to have them produce numbers that appear random. To do that, the machine keeps some hidden value, and whenever you ask for a new random number, it performs complicated computations on this hidden value to create a new value. It stores a new value and returns some number derived from it. That way, it can produce ever new, hard-to-predict numbers in a way that seems random.

If we want a whole random number instead of a fractional one, we can use Math.floor (which rounds down to the nearest whole number) on the result of Math.random:

  console.log(Math.floor(Math.random() * 10));
  // → 2

Multiplying the random number by 10 gives us a number greater than or equal to 0 and below 10. Since Math.floor rounds down, this expression will produce, with equal chance, any number from 0 through 9.

There are also the functions Math.ceil (for “ceiling”, which rounds up to a whole number), Math.round (to the nearest whole number), and Math.abs, which takes the absolute value of a number, meaning it negates negative values but leaves positive ones as they are.

Destructuring

Let’s return to the phi function for a moment.

  function phi(table) {
    return (table[3] * table[0] - table[2] * table[1]) /
      Math.sqrt((table[2] + table[3]) *
                (table[0] + table[1]) *
                (table[1] + table[3]) *
                (table[0] + table[2]));
  }

One reason this function is awkward to read is that we have a binding pointing at our array, but we’d much prefer to have bindings for the elements of the array, that is, let n00 = table[0] and so on.
Fortunately, there is a succinct way to do this in JavaScript:

  function phi([n00, n01, n10, n11]) {
    return (n11 * n00 - n10 * n01) /
      Math.sqrt((n10 + n11) * (n00 + n01) *
                (n01 + n11) * (n00 + n10));
  }

This also works for bindings created with let, var, or const. If you know that the value you are binding is an array, you can use square brackets to “look inside” of the value, binding its contents.

A similar trick works for objects, using braces instead of square brackets:

  let {name} = {name: "Faraji", age: 23};
  console.log(name);
  // → Faraji

Note that if you try to destructure null or undefined, you get an error, much as you would if you directly try to access a property of those values.

Optional property access

When you aren’t sure whether a given value produces an object but still want to read a property from it when it does, you can use a variant of the dot notation: object?.property.

  function city(object) {
    return object.address?.city;
  }
  console.log(city({address: {city: "Toronto"}}));
  // → Toronto
  console.log(city({name: "Vera"}));
  // → undefined

The expression a?.b means the same as a.b when a isn’t null or undefined. When it is, it evaluates to undefined. This can be convenient when, as in the example, you aren’t sure that a given property exists or when a variable might hold an undefined value.

A similar notation can be used with square bracket access, and even with function calls, by putting ?. in front of the parentheses or brackets:

  console.log("string".notAMethod?.());
  // → undefined
  console.log({}.arrayProp?.[0]);
  // → undefined

JSON

Because properties grasp their value rather than contain it, objects and arrays are stored in the computer’s memory as sequences of bits holding the addresses (the place in memory) of their contents.
An array with another array inside of it consists of (at least) one memory region for the inner array and another for the outer array, containing (among other things) a number that represents the address of the inner array.

If you want to save data in a file for later or send it to another computer over the network, you have to somehow convert these tangles of memory addresses to a description that can be stored or sent. You could send over your entire computer memory along with the address of the value you’re interested in, I suppose, but that doesn’t seem like the best approach.

What we can do is serialize the data. That means it is converted into a flat description. A popular serialization format is called JSON (pronounced “Jason”), which stands for JavaScript Object Notation. It is widely used as a data storage and communication format on the Web, even in languages other than JavaScript.

JSON looks similar to JavaScript’s way of writing arrays and objects, with a few restrictions. All property names have to be surrounded by double quotes, and only simple data expressions are allowed: no function calls, bindings, or anything that involves actual computation. Comments are not allowed in JSON.

A journal entry might look like this when represented as JSON data:

  {
    "squirrel": false,
    "events": ["work", "touched tree", "pizza", "running"]
  }

JavaScript gives us the functions JSON.stringify and JSON.parse to convert data to and from this format. The first takes a JavaScript value and returns a JSON-encoded string. The second takes such a string and converts it to the value it encodes:

  let string = JSON.stringify({squirrel: false,
                               events: ["weekend"]});
  console.log(string);
  // → {"squirrel":false,"events":["weekend"]}
  console.log(JSON.parse(string).events);
  // → ["weekend"]

Summary

Objects and arrays provide ways to group several values into a single value.
This allows us to put a bunch of related things in a bag and run around with the bag instead of wrapping our arms around all of the individual things and trying to hold on to them separately.

Most values in JavaScript have properties, with the exceptions being null and undefined. Properties are accessed using value.prop or value["prop"]. Objects tend to use names for their properties and store more or less a fixed set of them. Arrays, on the other hand, usually contain varying amounts of conceptually identical values and use numbers (starting from 0) as the names of their properties.

There are some named properties in arrays, such as length and a number of methods. Methods are functions that live in properties and (usually) act on the value of which they are a property.

You can iterate over arrays using a special kind of for loop: for (let element of array).

Exercises

The sum of a range

The introduction of this book alluded to the following as a nice way to compute the sum of a range of numbers:

  console.log(sum(range(1, 10)));

Write a range function that takes two arguments, start and end, and returns an array containing all the numbers from start up to and including end.

Next, write a sum function that takes an array of numbers and returns the sum of these numbers. Run the example program and see whether it does indeed return 55.

As a bonus assignment, modify your range function to take an optional third argument that indicates the “step” value used when building the array. If no step is given, the elements should go up by increments of one, corresponding to the old behavior. The function call range(1, 10, 2) should return [1, 3, 5, 7, 9]. Make sure this also works with negative step values so that range(5, 2, -1) produces [5, 4, 3, 2].

Reversing an array

Arrays have a reverse method that changes the array by inverting the order in which its elements appear. For this exercise, write two functions, reverseArray and reverseArrayInPlace.
The first, reverseArray, should take an array as argument and produce a new array that has the same elements in the inverse order. The second, reverseArrayInPlace, should do what the reverse method does: modify the array given as argument by reversing its elements. Neither may use the standard reverse method.

Thinking back to the notes about side effects and pure functions in the previous chapter, which variant do you expect to be useful in more situations? Which one runs faster?

A list

As generic blobs of values, objects can be used to build all sorts of data structures. A common data structure is the list (not to be confused with arrays). A list is a nested set of objects, with the first object holding a reference to the second, the second to the third, and so on:

  let list = {
    value: 1,
    rest: {
      value: 2,
      rest: {
        value: 3,
        rest: null
      }
    }
  };

The resulting objects form a chain, as shown in the following diagram:

  [value: 1, rest] → [value: 2, rest] → [value: 3, rest: null]

A nice thing about lists is that they can share parts of their structure. For example, if I create two new values {value: 0, rest: list} and {value: -1, rest: list} (with list referring to the binding defined earlier), they are both independent lists, but they share the structure that makes up their last three elements. The original list is also still a valid three-element list.

Write a function arrayToList that builds up a list structure like the one shown when given [1, 2, 3] as argument. Also write a listToArray function that produces an array from a list. Add the helper functions prepend, which takes an element and a list and creates a new list that adds the element to the front of the input list, and nth, which takes a list and a number and returns the element at the given position in the list (with zero referring to the first element) or undefined when there is no such element.

If you haven’t already, also write a recursive version of nth.
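
The structure sharing described in this exercise can be checked directly. The following is a minimal sketch, not part of the exercise solution: the binding names withZero and withMinusOne and the helper listLength are made up for illustration.

```javascript
// The list from the text: three objects chained through `rest`.
let list = {value: 1, rest: {value: 2, rest: {value: 3, rest: null}}};

// Two new lists that reuse `list` as their tail (illustrative names).
let withZero = {value: 0, rest: list};
let withMinusOne = {value: -1, rest: list};

// Both tails are the very same object, not copies of it.
console.log(withZero.rest === withMinusOne.rest);
// → true

// Hypothetical helper: count elements by following `rest` until null.
function listLength(node) {
  let length = 0;
  for (; node != null; node = node.rest) length++;
  return length;
}

console.log(listLength(withZero), listLength(list));
// → 4 3
```

Because nothing is copied, building a longer list from an existing one costs a single new object, and the original list stays valid.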

Deep comparison

The == operator compares objects by identity, but sometimes you’d prefer to compare the values of their actual properties.

Write a function deepEqual that takes two values and returns true only if they are the same value or are objects with the same properties, where the values of the properties are equal when compared with a recursive call to deepEqual.

To find out whether values should be compared directly (using the === operator for that) or have their properties compared, you can use the typeof operator. If it produces "object" for both values, you should do a deep comparison. But you have to take one silly exception into account: because of a historical accident, typeof null also produces "object".

The Object.keys function will be useful when you need to go over the properties of objects to compare them.

“There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.”
C.A.R. Hoare, 1980 ACM Turing Award Lecture

Chapter 5
Higher-Order Functions

A large program is a costly program, and not just because of the time it takes to build. Size almost always involves complexity, and complexity confuses programmers. Confused programmers, in turn, introduce mistakes (bugs) into programs. A large program then provides a lot of space for these bugs to hide, making them hard to find.

Let’s briefly go back to the final two example programs in the introduction. The first is self-contained and six lines long.

  let total = 0, count = 1;
  while (count <= 10) {
    total += count;
    count += 1;
  }
  console.log(total);

[…]

[…] words such as soak, simmer, chop, and, I guess, vegetable.

When programming, we can’t rely on all the words we need to be waiting for us in the dictionary. Thus, we might fall into the pattern of the first recipe: work out the precise steps the computer has to perform, one by one, blind to the higher-level concepts they express.
It is a useful skill, in programming, to notice when you are working at too low a level of abstraction.

Abstracting repetition

Plain functions, as we’ve seen them so far, are a good way to build abstractions. But sometimes they fall short.

It is common for a program to do something a given number of times. You can write a for loop for that, like this:

  for (let i = 0; i < 10; i++) {
    console.log(i);
  }

[…]

  let labels = [];
  repeat(5, i => {
    labels.push(`Unit ${i + 1}`);
  });
  console.log(labels);
  // → ["Unit 1", "Unit 2", "Unit 3", "Unit 4", "Unit 5"]

This is structured a little like a for loop: it first describes the kind of loop and then provides a body. However, the body is now written as a function value, which is wrapped in the parentheses of the call to repeat. This is why it has to be closed with the closing brace and closing parenthesis. In cases like this example where the body is a single small expression, you could also omit the braces and write the loop on a single line.

Higher-order functions

Functions that operate on other functions, either by taking them as arguments or by returning them, are called higher-order functions. Since we have already seen that functions are regular values, there is nothing particularly remarkable about the fact that such functions exist. The term comes from mathematics, where the distinction between functions and other values is taken more seriously.

Higher-order functions allow us to abstract over actions, not just values. They come in several forms.
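
The examples in this chapter call a repeat function whose definition does not appear in this copy of the text. The following is a minimal sketch, assuming repeat simply calls the given function once for each index from 0 up to n - 1, which is consistent with the outputs shown. The squares array is only there for illustration.

```javascript
// Assumed shape of `repeat`: run `action` with each index 0, 1, ..., n - 1.
function repeat(n, action) {
  for (let i = 0; i < n; i++) {
    action(i);
  }
}

// With a single-expression body, the braces can be dropped
// and the whole loop fits on one line.
let squares = [];
repeat(3, i => squares.push(i * i));
console.log(squares);
// → [0, 1, 4]
```

Passing the body as a function value is what makes repeat a higher-order function: the loop mechanics are written once, and each caller supplies only the action.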
For example, we can have functions that create new functions:

  function greaterThan(n) {
    return m => m > n;
  }
  let greaterThan10 = greaterThan(10);
  console.log(greaterThan10(11));
  // → true

We can also have functions that change other functions:

  function noisy(f) {
    return (...args) => {
      console.log("calling with", args);
      let result = f(...args);
      console.log("called with", args, ", returned", result);
      return result;
    };
  }
  noisy(Math.min)(3, 2, 1);
  // → calling with [3, 2, 1]
  // → called with [3, 2, 1] , returned 1

We can even write functions that provide new types of control flow:

  function unless(test, then) {
    if (!test) then();
  }

  repeat(3, n => {
    unless(n % 2 == 1, () => {
      console.log(n, "is even");
    });
  });
  // → 0 is even
  // → 2 is even

There is a built-in array method, forEach, that provides something like a for/of loop as a higher-order function:

  ["A", "B"].forEach(l => console.log(l));
  // → A
  // → B

Script data set

One area where higher-order functions shine is data processing. To process data, we’ll need some actual example data. This chapter will use a data set about scripts, that is, writing systems such as Latin, Cyrillic, or Arabic.

Remember Unicode from Chapter 1, the system that assigns a number to each character in written language? Most of these characters are associated with a specific script. The standard contains 140 different scripts, of which 81 are still in use today and 59 are historic.

Though I can fluently read only Latin characters, I appreciate the fact that people are writing texts in at least 80 other writing systems, many of which I wouldn’t even recognize. For example, here’s a sample of Tamil handwriting:

  [image: sample of Tamil handwriting]

The example data set contains some pieces of information about the 140 scripts defined in Unicode. It is available in the coding sandbox for this chapter (https://eloquentjavascript.net/code#5) as the SCRIPTS binding.
The binding contains an array of objects, each of which describes a script:

  {
    name: "Coptic",
    ranges: [[994, 1008], [11392, 11508], [11513, 11520]],
    direction: "ltr",
    year: -200,
    living: false,
    link: "https://en.wikipedia.org/wiki/Coptic_alphabet"
  }

Such an object tells us the name of the script, the Unicode ranges assigned to it, the direction in which it is written, the (approximate) origin time, whether it is still in use, and a link to more information. The direction may be "ltr" for left to right, "rtl" for right to left (the way Arabic and Hebrew text are written), or "ttb" for top to bottom (as with Mongolian writing).

The ranges property contains an array of Unicode character ranges, each of which is a two-element array containing a lower bound and an upper bound. Any character codes within these ranges are assigned to the script. The lower bound is inclusive (code 994 is a Coptic character) and the upper bound is non-inclusive (code 1008 isn’t).

Filtering arrays

If we want to find the scripts in the data set that are still in use, the following function might be helpful. It filters out elements in an array that don’t pass a test.

  function filter(array, test) {
    let passed = [];
    for (let element of array) {
      if (test(element)) {
        passed.push(element);
      }
    }
    return passed;
  }

  console.log(filter(SCRIPTS, script => script.living));
  // → [{name: "Adlam", …}, …]

The function uses the argument named test, a function value, to fill a “gap” in the computation: the process of deciding which elements to collect.

Note how the filter function, rather than deleting elements from the existing array, builds up a new array with only the elements that pass the test. This function is pure. It does not modify the array it is given.

Like forEach, filter is a standard array method. The example defined the function only to show what it does internally.
From now on, we’ll use it like this instead:

  console.log(SCRIPTS.filter(s => s.direction == "ttb"));
  // → [{name: "Mongolian", …}, …]

Transforming with map

Say we have an array of objects representing scripts, produced by filtering the SCRIPTS array somehow. We want an array of names instead, which is easier to inspect.

The map method transforms an array by applying a function to all of its elements and building a new array from the returned values. The new array will have the same length as the input array, but its content will have been mapped to a new form by the function:

  function map(array, transform) {
    let mapped = [];
    for (let element of array) {
      mapped.push(transform(element));
    }
    return mapped;
  }

  let rtlScripts = SCRIPTS.filter(s => s.direction == "rtl");
  console.log(map(rtlScripts, s => s.name));
  // → ["Adlam", "Arabic", "Imperial Aramaic", …]

Like forEach and filter, map is a standard array method.

Summarizing with reduce

Another common thing to do with arrays is to compute a single value from them. Our recurring example, summing a collection of numbers, is an instance of this. Another example is finding the script with the most characters.

The higher-order operation that represents this pattern is called reduce (sometimes also called fold). It builds a value by repeatedly taking a single element from the array and combining it with the current value. When summing numbers, you’d start with the number zero and, for each element, add that to the sum.

The parameters to reduce are, apart from the array, a combining function and a start value. This function is a little less straightforward than filter and map, so take a close look at it:

  function reduce(array, combine, start) {
    let current = start;
    for (let element of array) {
      current = combine(current, element);
    }
    return current;
  }

  console.log(reduce([1, 2, 3, 4], (a, b) => a + b, 0));
  // → 10

The standard array method reduce, which of course corresponds to this function, has an added convenience.
If your array contains at least one element, you are allowed to leave off the start argument. The method will take the first element of the array as its start value and start reducing at the second element.

  console.log([1, 2, 3, 4].reduce((a, b) => a + b));
  // → 10

To use reduce (twice) to find the script with the most characters, we can write something like this:

  function characterCount(script) {
    return script.ranges.reduce((count, [from, to]) => {
      return count + (to - from);
    }, 0);
  }

  console.log(SCRIPTS.reduce((a, b) => {
    return characterCount(a) < characterCount(b) ? b : a;
  }));

[…]

  function average(array) {
    return array.reduce((a, b) => a + b) / array.length;
  }

  console.log(Math.round(average(
    SCRIPTS.filter(s => s.living).map(s => s.year))));
  // → 1165
  console.log(Math.round(average(
    SCRIPTS.filter(s => !s.living).map(s => s.year))));
  // → 204

As you can see, the dead scripts in Unicode are, on average, older than the living ones. This is not a terribly meaningful or surprising statistic. But I hope you’ll agree that the code used to compute it isn’t hard to read. You can see it as a pipeline: we start with all scripts, filter out the living (or dead) ones, take the years from those, average them, and round the result.

You could definitely also write this computation as one big loop:

  let total = 0, count = 0;
  for (let script of SCRIPTS) {
    if (script.living) {
      total += script.year;
      count += 1;
    }
  }
  console.log(Math.round(total / count));
  // → 1165

However, it is harder to see what was being computed and how. And because intermediate results aren’t represented as coherent values, it’d be a lot more work to extract something like average into a separate function.

In terms of what the computer is actually doing, these two approaches are also quite different. The first will build up new arrays when running filter and map, whereas the second computes only some numbers, doing less work. You can usually afford the readable approach, but if you’re processing huge arrays and doing so many times, the less abstract style might be worth the extra speed.
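
There is also a middle ground between the pipeline and the hand-written loop: a single reduce call that keeps a running total and count, so no intermediate arrays are built. The following is a rough sketch under stated assumptions. The averageYear function is a hypothetical helper, and sampleScripts is a made-up three-entry stand-in for SCRIPTS so that the block runs on its own.

```javascript
// Made-up stand-in for the SCRIPTS data set (values are illustrative only).
const sampleScripts = [
  {name: "A", year: 100, living: true},
  {name: "B", year: 300, living: true},
  {name: "C", year: -500, living: false}
];

// Hypothetical helper: average the `year` of the scripts passing `test`,
// visiting each element once and keeping only two numbers around.
function averageYear(scripts, test) {
  let {total, count} = scripts.reduce((acc, script) => {
    if (test(script)) {
      acc.total += script.year;
      acc.count += 1;
    }
    return acc;
  }, {total: 0, count: 0});
  // With no matching scripts there is nothing to average.
  return count == 0 ? NaN : total / count;
}

console.log(averageYear(sampleScripts, s => s.living));
// → 200
console.log(averageYear(sampleScripts, s => !s.living));
// → -500
```

This version does roughly the same work as the plain loop while still taking the test as a function value, at the cost of being somewhat less obvious to read than filter followed by map.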

Strings and character codes

One interesting use of this data set would be figuring out what script a piece of text is using. Let’s go through a program that does this.

Remember that each script has an array of character code ranges associated with it. Given a character code, we could use a function like this to find the corresponding script (if any):

  function characterScript(code) {
    for (let script of SCRIPTS) {
      if (script.ranges.some(([from, to]) => {
        return code >= from && code < to;
      })) {
        return script;
      }
    }
    return null;
  }

[…]

  function countBy(items, groupName) {
    let counts = [];
    for (let item of items) {
      let name = groupName(item);
      let known = counts.find(c => c.name == name);
      if (!known) {
        counts.push({name, count: 1});
      } else {
        known.count++;
      }
    }
    return counts;
  }

  console.log(countBy([1, 2, 3, 4, 5], n => n > 2));
  // → [{name: false, count: 2}, {name: true, count: 3}]

The countBy function expects a collection (anything that we can loop over with for/of) and a function that computes a group name for a given element. It returns an array of objects, each of which names a group and tells you the number of elements that were found in that group.

It uses another array method, find, which goes over the elements in the array and returns the first one for which a function returns true. It returns undefined when no such element is found.

Using countBy, we can write the function that tells us which scripts are used in a piece of text:

  function textScripts(text) {
    let scripts = countBy(text, char => {
      let script = characterScript(char.codePointAt(0));
      return script ? script.name : "none";
    }).filter(({name}) => name != "none");

    let total = scripts.reduce((n, {count}) => n + count, 0);
    if (total == 0) return "No scripts found";

    return scripts.map(({name, count}) => {
      return `${Math.round(count * 100 / total)}% ${name}`;
    }).join(", ");
  }

  console.log(textScripts('英国的狗说"woof", 俄罗斯的狗说"тяв"'));
  // → 61% Han, 22% Latin, 17% Cyrillic

The function first counts the characters by name, using characterScript to assign them a name and falling back to the string "none" for characters that aren’t part of any script.
The filter call drops the entry for "none" from the resulting array since we aren’t interested in those characters.

To be able to compute percentages, we first need the total number of characters that belong to a script, which we can compute with reduce. If no such characters are found, the function returns a specific string. Otherwise it transforms the counting entries into readable strings with map and then combines them with join.

Summary

Being able to pass function values to other functions is a deeply useful aspect of JavaScript. It allows us to write functions that model computations with “gaps” in them. The code that calls these functions can fill in the gaps by providing function values.

Arrays provide a number of useful higher-order methods. You can use forEach to loop over the elements in an array. The filter method returns a new array containing only the elements that pass the predicate function. Transforming an array by putting each element through a function is done with map. You can use reduce to combine all the elements in an array into a single value. The some method tests whether any element matches a given predicate function, while find finds the first element that matches a predicate.

Exercises

Flattening

Use the reduce method in combination with the concat method to “flatten” an array of arrays into a single array that has all the elements of the original arrays.

Your own loop

Write a higher-order function loop that provides something like a for loop statement. It should take a value, a test function, an update function, and a body function. Each iteration, it should first run the test function on the current loop value and stop if that returns false. It should then call the body function, giving it the current value, then finally call the update function to create a new value and start over from the beginning.

When defining the function, you can use a regular loop to do the actual looping.
Everything

Arrays also have an every method analogous to the some method. This method returns true when the given function returns true for every element in the array. In a way, some is a version of the || operator that acts on arrays, and every is like the && operator.

Implement every as a function that takes an array and a predicate function as parameters. Write two versions, one using a loop and one using the some method.

Dominant writing direction

Write a function that computes the dominant writing direction in a string of text. Remember that each script object has a direction property that can be "ltr" (left to right), "rtl" (right to left), or "ttb" (top to bottom).

The dominant direction is the direction of a majority of the characters that have a script associated with them. The characterScript and countBy functions defined earlier in the chapter are probably useful here.

“An abstract data type is realized by writing a special kind of program […] which defines the type in terms of the operations which can be performed on it.”
Barbara Liskov, Programming with Abstract Data Types

Chapter 6
The Secret Life of Objects

Chapter 4 introduced JavaScript’s objects as containers that hold other data. In programming culture, object-oriented programming is a set of techniques that use objects as the central principle of program organization. Though no one really agrees on its precise definition, object-oriented programming has shaped the design of many programming languages, including JavaScript. This chapter describes the way these ideas can be applied in JavaScript.

Abstract Data Types

The main idea in object-oriented programming is to use objects, or rather types of objects, as the unit of program organization. Setting up a program as a number of strictly separated object types provides a way to think about its structure and thus to enforce some kind of discipline, preventing everything from becoming entangled.
The way to do this is to think of objects somewhat like you’d think of an electric mixer or other consumer appliance. The people who design and assemble a mixer have to do specialized work requiring material science and understanding of electricity. They cover all that up in a smooth plastic shell so that the people who only want to mix pancake batter don’t have to worry about all that; they only have to understand the few knobs that the mixer can be operated with.

Similarly, an abstract data type, or object class, is a subprogram that may contain arbitrarily complicated code but exposes a limited set of methods and properties that people working with it are supposed to use. This allows large programs to be built up out of a number of appliance types, limiting the degree to which these different parts are entangled by requiring them to only interact with each other in specific ways.

If a problem is found in one such object class, it can often be repaired or even completely rewritten without impacting the rest of the program. Even better, it may be possible to use object classes in multiple different programs, avoiding the need to recreate their functionality from scratch. You can think of JavaScript’s built-in data structures, such as arrays and strings, as such reusable abstract data types.

Each abstract data type has an interface, the collection of operations that external code can perform on it. Even basic things like numbers can be thought of as an abstract data type whose interface allows us to add them, multiply them, compare them, and so on. In fact, the fixation on single objects as the main unit of organization in classical object-oriented programming is somewhat unfortunate, since useful pieces of functionality often involve a group of different object classes working closely together.

Methods

In JavaScript, methods are nothing more than properties that hold function values.
This is a simple method:

  function speak(line) {
    console.log(`The ${this.type} rabbit says '${line}'`);
  }
  let whiteRabbit = {type: "white", speak};
  let hungryRabbit = {type: "hungry", speak};

  whiteRabbit.speak("Oh my fur and whiskers");
  // → The white rabbit says 'Oh my fur and whiskers'
  hungryRabbit.speak("Got any carrots?");
  // → The hungry rabbit says 'Got any carrots?'

Typically a method needs to do something with the object on which it was called. When a function is called as a method (looked up as a property and immediately called, as in object.method()), the binding called this in its body automatically points at the object on which it was called.

You can think of this as an extra parameter that is passed to the function in a different way than regular parameters. If you want to provide it explicitly, you can use a function’s call method, which takes the this value as its first argument and treats further arguments as normal parameters.

  speak.call(whiteRabbit, "Hurry");
  // → The white rabbit says 'Hurry'

Since each function has its own this binding whose value depends on the way it is called, you cannot refer to the this of the wrapping scope in a regular function defined with the function keyword.

Arrow functions are different: they do not bind their own this but can see the this binding of the scope around them. Thus, you can do something like the following code, which references this from inside a local function:

  let finder = {
    find(array) {
      return array.some(v => v == this.value);
    },
    value: 5
  };
  console.log(finder.find([4, 5]));
  // → true

A property like find(array) in an object expression is a shorthand way of defining a method. It creates a property called find and gives it a function as its value.

If I had written the argument to some using the function keyword, this code wouldn’t work.
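To make that last point concrete, here is a minimal sketch that puts the two variants side by side. The name brokenFinder and the "use strict" directive inside the callback are assumptions added for illustration; the directive makes sure that this is undefined in the plain function (rather than the global object), so the failure shows up as an exception.

```javascript
// Arrow function: `this` is taken from the enclosing find method.
let finder = {
  find(array) {
    return array.some(v => v == this.value);
  },
  value: 5
};

// Hypothetical variant: the callback is written with the function
// keyword, so it gets its own `this`. Since `some` calls it as a plain
// function, that `this` is undefined in strict mode.
let brokenFinder = {
  find(array) {
    return array.some(function(v) {
      "use strict";
      return v == this.value;
    });
  },
  value: 5
};

console.log(finder.find([4, 5]));
// → true

let outcome;
try {
  outcome = brokenFinder.find([4, 5]);
} catch (error) {
  outcome = error instanceof TypeError ? "TypeError" : "other error";
}
console.log(outcome);
// → TypeError
```

Without the strict-mode directive the second version would not throw; it would quietly compare against a value property on the global object and most likely return false, which is arguably harder to notice.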
Prototypes

One way to create a rabbit object type with a speak method would be to create a helper function that has a rabbit type as parameter and returns an object holding that as its type property and our speak function in its speak property.

All rabbits share that same method. Especially for types with many methods, it would be nice if there was a way to keep a type’s methods in a single place, rather than adding them to each object individually.

In JavaScript, prototypes are the way to do that. Objects can be linked to other objects, to magically get all the properties that other object has. Plain old objects created with {} notation are linked to an object called Object.prototype.

  let empty = {};
  console.log(empty.toString);
  // → function toString()…{}
  console.log(empty.toString());
  // → [object Object]

It looks like we just pulled a property out of an empty object. But in fact, toString is a method stored in Object.prototype, meaning it is available in most objects.

When an object gets a request for a property that it doesn’t have, its prototype will be searched for the property. If that doesn’t have it, the prototype’s prototype is searched, and so on until an object without prototype is reached (Object.prototype is such an object).

  console.log(Object.getPrototypeOf({}) == Object.prototype);
  // → true
  console.log(Object.getPrototypeOf(Object.prototype));
  // → null

As you’d guess, Object.getPrototypeOf returns the prototype of an object.

Many objects don’t directly have Object.prototype as their prototype but instead have another object that provides a different set of default properties. Functions derive from Function.prototype and arrays derive from Array.prototype.

  console.log(Object.getPrototypeOf(Math.max) == Function.prototype);
  // → true
  console.log(Object.getPrototypeOf([]) == Array.prototype);
  // → true

Such a prototype object will itself have a prototype, often Object.prototype, so that it still indirectly provides methods like toString.
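The lookup described above can be followed by hand. The sketch below uses a hypothetical helper, prototypeChain (not part of the language or of this book's code), that repeatedly calls Object.getPrototypeOf until it reaches null.

```javascript
// Hypothetical helper: collect the prototypes that sit behind a value,
// nearest first, stopping when an object without a prototype is reached.
function prototypeChain(value) {
  let chain = [];
  let proto = Object.getPrototypeOf(value);
  while (proto != null) {
    chain.push(proto);
    proto = Object.getPrototypeOf(proto);
  }
  return chain;
}

let arrayChain = prototypeChain([]);
console.log(arrayChain.length);
// → 2
console.log(arrayChain[0] == Array.prototype);
// → true
console.log(arrayChain[1] == Object.prototype);
// → true

// toString is not stored in the array itself; it is found while
// searching the chain.
console.log(Object.hasOwn([], "toString"));
// → false
console.log("toString" in []);
// → true
```

An array therefore has two objects behind it: Array.prototype first, then Object.prototype, after which the search ends.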
You can use Object.create to create an object with a specific prototype:

  let protoRabbit = {
    speak(line) {
      console.log(`The ${this.type} rabbit says '${line}'`);
    }
  };
  let blackRabbit = Object.create(protoRabbit);
  blackRabbit.type = "black";
  blackRabbit.speak("I am fear and darkness");
  // → The black rabbit says 'I am fear and darkness'

The “proto” rabbit acts as a container for the properties shared by all rabbits. An individual rabbit object, like the black rabbit, contains properties that apply only to itself (in this case its type) and derives shared properties from its prototype.

Classes

JavaScript’s prototype system can be interpreted as a somewhat free-form take on abstract data types or classes. A class defines the shape of a type of object: what methods and properties it has. Such an object is called an instance of the class.

Prototypes are useful for defining properties for which all instances of a class share the same value. Properties that differ per instance, such as our rabbits’ type property, need to be stored directly in the objects themselves.

To create an instance of a given class, you have to make an object that derives from the proper prototype, but you also have to make sure it itself has the properties that instances of this class are supposed to have. This is what a constructor function does.

  function makeRabbit(type) {
    let rabbit = Object.create(protoRabbit);
    rabbit.type = type;
    return rabbit;
  }

JavaScript’s class notation makes it easier to define this type of function, along with a prototype object.

  class Rabbit {
    constructor(type) {
      this.type = type;
    }
    speak(line) {
      console.log(`The ${this.type} rabbit says '${line}'`);
    }
  }

The class keyword starts a class declaration, which allows us to define a constructor and a set of methods together. Any number of methods may be written inside the declaration’s braces.
This code has the effect of defining a binding called Rabbit, which holds a function that runs the code in constructor and has a prototype property which holds the speak method.

This function cannot be called like a normal function. Constructors, in JavaScript, are called by putting the keyword new in front of them. Doing so creates a fresh instance object whose prototype is the object from the function’s prototype property, then runs the function with this bound to the new object, and finally returns the object.

  let killerRabbit = new Rabbit("killer");

In fact, class was only introduced in the 2015 edition of JavaScript. Any function can be used as a constructor, and before 2015, the way to define a class was to write a regular function and then manipulate its prototype property.

  function ArchaicRabbit(type) {
    this.type = type;
  }
  ArchaicRabbit.prototype.speak = function(line) {
    console.log(`The ${this.type} rabbit says '${line}'`);
  };
  let oldSchoolRabbit = new ArchaicRabbit("old school");

For this reason, all non-arrow functions start with a prototype property holding an empty object.

By convention, the names of constructors are capitalized so that they can easily be distinguished from other functions.

It is important to understand the distinction between the way a prototype is associated with a constructor (through its prototype property) and the way objects have a prototype (which can be found with Object.getPrototypeOf). The actual prototype of a constructor is Function.prototype since constructors are functions. The constructor function’s prototype property holds the prototype used for instances created through it.

  console.log(Object.getPrototypeOf(Rabbit) == Function.prototype);
  // → true
  console.log(Object.getPrototypeOf(killerRabbit) == Rabbit.prototype);
  // → true

Constructors will typically add some per-instance properties to this. It is also possible to declare properties directly in the class declaration.
Unlike methods, such properties are added to instance objects, not the prototype.

  class Particle {
    speed = 0;
    constructor(position) {
      this.position = position;
    }
  }

Like function, class can be used both in statements and in expressions. When used as an expression, it doesn’t define a binding but just produces the constructor as a value. You are allowed to omit the class name in a class expression.

  let object = new class { getWord() { return "hello"; } };
  console.log(object.getWord());
  // → hello

Private Properties

It is common for classes to define some properties and methods for internal use that are not part of their interface. These are called private properties, as opposed to public ones, which are part of the object’s external interface.

To declare a private method, put a # sign in front of its name. Such methods can only be called from inside the class declaration that defines them.

  class SecretiveObject {
    #getSecret() {
      return "I ate all the plums";
    }
    interrogate() {
      let shallISayIt = this.#getSecret();
      return "never";
    }
  }

When a class does not declare a constructor, it will automatically get an empty one.

If you try to call #getSecret from outside the class, you get an error. Its existence is entirely hidden inside the class declaration.

To use private instance properties, you must declare them. Regular properties can be created by just assigning to them, but private properties must be declared in the class declaration to be available at all.

This class implements an appliance for getting a random whole number below a given maximum number. It only has one public property: getNumber.

  class RandomSource {
    #max;
    constructor(max) {
      this.#max = max;
    }
    getNumber() {
      return Math.floor(Math.random() * this.#max);
    }
  }

Overriding derived properties

When you add a property to an object, whether it is present in the prototype or not, the property is added to the object itself.
If there was already a property with the same name in the prototype, this property will no longer affect the object, as it is now hidden behind the object’s own property.

  Rabbit.prototype.teeth = "small";
  console.log(killerRabbit.teeth);
  // → small
  killerRabbit.teeth = "long, sharp, and bloody";
  console.log(killerRabbit.teeth);
  // → long, sharp, and bloody
  console.log((new Rabbit("basic")).teeth);
  // → small
  console.log(Rabbit.prototype.teeth);
  // → small

The following diagram sketches the situation after this code has run. The Rabbit and Object prototypes lie behind killerRabbit as a kind of backdrop, where properties that are not found in the object itself can be looked up.

[Diagram: killerRabbit (teeth: "long, sharp, ...", type: "killer") in front of the Rabbit prototype (teeth: "small", speak), which in turn sits in front of the Object prototype (toString, ...)]

Overriding properties that exist in a prototype can be a useful thing to do. As the rabbit teeth example shows, overriding can be used to express exceptional properties in instances of a more generic class of objects while letting the nonexceptional objects take a standard value from their prototype.

Overriding is also used to give the standard function and array prototypes a different toString method than the basic object prototype.

  console.log(Array.prototype.toString == Object.prototype.toString);
  // → false
  console.log([1, 2].toString());
  // → 1,2

Calling toString on an array gives a result similar to calling .join(",") on it: it puts commas between the values in the array. Directly calling Object.prototype.toString with an array produces a different string. That function doesn’t know about arrays, so it simply puts the word object and the name of the type between square brackets.

  console.log(Object.prototype.toString.call([1, 2]));
  // → [object Array]

Maps

We saw the word map used in the previous chapter for an operation that transforms a data structure by applying a function to its elements.
Confusing as it is, in programming the same word is used for a related but rather different thing.

A map (noun) is a data structure that associates values (the keys) with other values. For example, you might want to map names to ages. It is possible to use objects for this.

  let ages = {
    Boris: 39,
    Liang: 22,
    Júlia: 62
  };

  console.log(`Júlia is ${ages["Júlia"]}`);
  // → Júlia is 62
  console.log("Is Jack's age known?", "Jack" in ages);
  // → Is Jack's age known? false
  console.log("Is toString's age known?", "toString" in ages);
  // → Is toString's age known? true

Here, the object’s property names are the people’s names and the property values are their ages. But we certainly didn’t list anybody named toString in our map. Yet, because plain objects derive from Object.prototype, it looks like the property is there.

As such, using plain objects as maps is dangerous. There are several possible ways to avoid this problem. First, you can create objects with no prototype. If you pass null to Object.create, the resulting object will not derive from Object.prototype and can safely be used as a map.

  console.log("toString" in Object.create(null));
  // → false

Object property names must be strings. If you need a map whose keys can’t easily be converted to strings, such as objects, you cannot use an object as your map.

Fortunately, JavaScript comes with a class called Map that is written for this exact purpose. It stores a mapping and allows any type of keys.

  let ages = new Map();
  ages.set("Boris", 39);
  ages.set("Liang", 22);
  ages.set("Júlia", 62);

  console.log(`Júlia is ${ages.get("Júlia")}`);
  // → Júlia is 62
  console.log("Is Jack's age known?", ages.has("Jack"));
  // → Is Jack's age known? false
  console.log(ages.has("toString"));
  // → false

The methods set, get, and has are part of the interface of the Map object. Writing a data structure that can quickly update and search a large set of values isn’t easy, but we don’t have to worry about that.
Someone else did it for us, and we can go through this simple interface to use their work.

If you do have a plain object that you need to treat as a map for some reason, it is useful to know that Object.keys returns only an object’s own keys, not those in the prototype. As an alternative to the in operator, you can use the Object.hasOwn function, which ignores the object’s prototype.

  console.log(Object.hasOwn({x: 1}, "x"));
  // → true
  console.log(Object.hasOwn({x: 1}, "toString"));
  // → false

Polymorphism

When you call the String function (which converts a value to a string) on an object, it will call the toString method on that object to try to create a meaningful string from it. I mentioned that some of the standard prototypes define their own version of toString so they can create a string that contains more useful information than "[object Object]". You can also do that yourself.

  Rabbit.prototype.toString = function() {
    return `a ${this.type} rabbit`;
  };

  console.log(String(killerRabbit));
  // → a killer rabbit

This is a simple instance of a powerful idea. When a piece of code is written to work with objects that have a certain interface (in this case, a toString method), any kind of object that happens to support this interface can be plugged into the code and will be able to work with it.

This technique is called polymorphism. Polymorphic code can work with values of different shapes, as long as they support the interface it expects.

An example of a widely used interface is that of array-like objects which have a length property holding a number, and numbered properties for each of their elements. Both arrays and strings support this interface, as do various other objects, some of which we’ll see later in the chapters about the browser. Our implementation of forEach from Chapter 5 works on anything that provides this interface. In fact, so does Array.prototype.forEach.
  Array.prototype.forEach.call({
    length: 2,
    0: "A",
    1: "B"
  }, elt => console.log(elt));
  // → A
  // → B

Getters, setters, and statics

Interfaces often contain plain properties, not just methods. For example, Map objects have a size property that tells you how many keys are stored in them.

It is not necessary for such an object to compute and store such a property directly in the instance. Even properties that are accessed directly may hide a method call. Such methods are called getters and are defined by writing get in front of the method name in an object expression or class declaration.

  let varyingSize = {
    get size() {
      return Math.floor(Math.random() * 100);
    }
  };

  console.log(varyingSize.size);
  // → 73
  console.log(varyingSize.size);
  // → 49

Whenever someone reads from this object’s size property, the associated method is called. You can do a similar thing when a property is written to, using a setter.

  class Temperature {
    constructor(celsius) {
      this.celsius = celsius;
    }
    get fahrenheit() {
      return this.celsius * 1.8 + 32;
    }
    set fahrenheit(value) {
      this.celsius = (value - 32) / 1.8;
    }

    static fromFahrenheit(value) {
      return new Temperature((value - 32) / 1.8);
    }
  }

  let temp = new Temperature(22);
  console.log(temp.fahrenheit);
  // → 71.6
  temp.fahrenheit = 86;
  console.log(temp.celsius);
  // → 30

The Temperature class allows you to read and write the temperature in either degrees Celsius or degrees Fahrenheit, but internally it stores only Celsius and automatically converts to and from Celsius in the fahrenheit getter and setter.

Sometimes you want to attach some properties directly to your constructor function, rather than to the prototype. Such methods won’t have access to a class instance but can, for example, be used to provide additional ways to create instances.

Inside a class declaration, methods or properties that have static written before their name are stored on the constructor.
For example, the Temperature class allows you to write Temperature.fromFahrenheit(100) to create a temperature using degrees Fahrenheit:

  let boil = Temperature.fromFahrenheit(212);
  console.log(boil.celsius);
  // → 100

Symbols

I mentioned in Chapter 4 that a for/of loop can loop over several kinds of data structures. This is another case of polymorphism: such loops expect the data structure to expose a specific interface, which arrays and strings do. And we can also add this interface to our own objects! But before we can do that, we need to briefly take a look at the symbol type.

It is possible for multiple interfaces to use the same property name for different things. For example, on array-like objects, length refers to the number of elements in the collection. But an object interface describing a hiking route could use length to provide the length of the route in meters. It would not be possible for an object to conform to both these interfaces.

An object trying to be a route and array-like (maybe to enumerate its waypoints) is somewhat far-fetched, and this kind of problem isn’t that common in practice. For things like the iteration protocol, though, the language designers needed a type of property that really doesn’t conflict with any others. So in 2015, symbols were added to the language.

Most properties, including all those we have seen so far, are named with strings. But it is also possible to use symbols as property names. Symbols are values created with the Symbol function. Unlike strings, newly created symbols are unique: you cannot create the same symbol twice.

  let sym = Symbol("name");
  console.log(sym == Symbol("name"));
  // → false
  Rabbit.prototype[sym] = 55;
  console.log(killerRabbit[sym]);
  // → 55

The string you pass to Symbol is included when you convert it to a string and can make it easier to recognize a symbol when, for example, showing it in the console. But it has no meaning beyond that: multiple symbols may have the same name.
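A short sketch of those two points, using hypothetical bindings first, second, and record: the name only affects how a symbol is displayed, and two symbols created with the same name still refer to separate properties.

```javascript
let first = Symbol("name");
let second = Symbol("name");

// Same name, but two distinct symbols.
console.log(first == second);
// → false

// The name shows up when a symbol is converted to a string.
console.log(first.toString());
// → Symbol(name)
console.log(String(second));
// → Symbol(name)

// Used as property names, the two symbols do not collide.
let record = {};
record[first] = "a";
record[second] = "b";
console.log(record[first], record[second]);
// → a b
```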
Being both unique and usable as property names makes symbols suitable for defining interfaces that can peacefully live alongside other properties, no matter what their names are.

  const length = Symbol("length");
  Array.prototype[length] = 0;

  console.log([1, 2].length);
  // → 2
  console.log([1, 2][length]);
  // → 0

It is possible to include symbol properties in object expressions and classes by using square brackets around the property name. That causes the expression between the brackets to be evaluated to produce the property name, analogous to the square bracket property access notation.

  let myTrip = {
    length: 2,
    0: "Lankwitz",
    1: "Babelsberg",
    [length]: 21500
  };
  console.log(myTrip[length], myTrip.length);
  // → 21500 2

The iterator interface

The object given to a for/of loop is expected to be iterable. This means it has a method named with the Symbol.iterator symbol (a symbol value defined by the language, stored as a property of the Symbol function).

When called, that method should return an object that provides a second interface, iterator. This is the actual thing that iterates. It has a next method that returns the next result. That result should be an object with a value property that provides the next value, if there is one, and a done property, which should be true when there are no more results and false otherwise.

Note that the next, value, and done property names are plain strings, not symbols. Only Symbol.iterator, which is likely to be added to a lot of different objects, is an actual symbol.

We can directly use this interface ourselves.

  let okIterator = "OK"[Symbol.iterator]();
  console.log(okIterator.next());
  // → {value: "O", done: false}
  console.log(okIterator.next());
  // → {value: "K", done: false}
  console.log(okIterator.next());
  // → {value: undefined, done: true}

Let’s implement an iterable data structure similar to the linked list from the exercise in Chapter 4. We’ll write the list as a class this time.
  class List {
    constructor(value, rest) {
      this.value = value;
      this.rest = rest;
    }

    get length() {
      return 1 + (this.rest ? this.rest.length : 0);
    }

    static fromArray(array) {
      let result = null;
      for (let i = array.length - 1; i >= 0; i--) {
        result = new this(array[i], result);
      }
      return result;
    }
  }

Note that this, in a static method, points at the constructor of the class, not an instance, since there is no instance around when a static method is called.

Iterating over a list should return all the list’s elements from start to end. We’ll write a separate class for the iterator.

  class ListIterator {
    constructor(list) {
      this.list = list;
    }

    next() {
      if (this.list == null) {
        return {done: true};
      }
      let value = this.list.value;
      this.list = this.list.rest;
      return {value, done: false};
    }
  }

The class tracks the progress of iterating through the list by updating its list property to move to the next list object whenever a value is returned and reports that it is done when that list is empty (null).

Let’s set up the List class to be iterable. Throughout this book, I’ll occasionally use after-the-fact prototype manipulation to add methods to classes so that the individual pieces of code remain small and self-contained. In a regular program, where there is no need to split the code into small pieces, you’d declare these methods directly in the class instead.

  List.prototype[Symbol.iterator] = function() {
    return new ListIterator(this);
  };

We can now loop over a list with for/of.

  let list = List.fromArray([1, 2, 3]);
  for (let element of list) {
    console.log(element);
  }
  // → 1
  // → 2
  // → 3

The ... syntax in array notation and function calls similarly works with any iterable object. For example, you can use [...value] to create an array containing the elements in an arbitrary iterable object.
  console.log([..."PCI"]);
  // → ["P", "C", "I"]

Inheritance

Imagine we need a list type much like the List class we saw before, but because we will be asking for its length all the time, we don’t want it to have to scan through its rest every time. Instead, we want to store the length in every instance for efficient access.

JavaScript’s prototype system makes it possible to create a new class, much like the old class but with new definitions for some of its properties. The prototype for the new class derives from the old prototype but adds a new definition for, say, the length getter.

In object-oriented programming terms, this is called inheritance. The new class inherits properties and behavior from the old class.

  class LengthList extends List {
    #length;

    constructor(value, rest) {
      super(value, rest);
      this.#length = super.length;
    }

    get length() {
      return this.#length;
    }
  }

  console.log(LengthList.fromArray([1, 2, 3]).length);
  // → 3

The use of the word extends indicates that this class shouldn’t be directly based on the default Object prototype but on some other class. This is called the superclass. The derived class is the subclass.

To initialize a LengthList instance, the constructor calls the constructor of its superclass through the super keyword. This is necessary because if this new object is to behave (roughly) like a List, it is going to need the instance properties that lists have.

The constructor then stores the list’s length in a private property. If we had written this.length there, the class’s own getter would have been called, which doesn’t work yet, since #length hasn’t been filled in yet. We can use super.something to call methods and getters on the superclass’s prototype, which is often useful.

Inheritance allows us to build slightly different data types from existing data types with relatively little work. It is a fundamental part of the object-oriented tradition, alongside encapsulation and polymorphism.
But while the latter two are now generally regarded as wonderful ideas, inheritance is more controversial.

Whereas encapsulation and polymorphism can be used to separate pieces of code from one another, reducing the tangledness of the overall program, inheritance fundamentally ties classes together, creating more tangle. When inheriting from a class, you usually have to know more about how it works than when simply using it. Inheritance can be a useful tool to make some types of programs more succinct, but it shouldn’t be the first tool you reach for, and you probably shouldn’t actively go looking for opportunities to construct class hierarchies (family trees of classes).

The instanceof operator

It is occasionally useful to know whether an object was derived from a specific class. For this, JavaScript provides a binary operator called instanceof.

  console.log(
    new LengthList(1, null) instanceof LengthList);
  // → true
  console.log(new LengthList(2, null) instanceof List);
  // → true
  console.log(new List(3, null) instanceof LengthList);
  // → false
  console.log([1] instanceof Array);
  // → true

The operator will see through inherited types, so a LengthList is an instance of List. The operator can also be applied to standard constructors like Array. Almost every object is an instance of Object.

Summary

Objects do more than just hold their own properties. They have prototypes, which are other objects. They’ll act as if they have properties they don’t have as long as their prototype has that property. Simple objects have Object.prototype as their prototype.

Constructors, which are functions whose names usually start with a capital letter, can be used with the new operator to create new objects. The new object’s prototype will be the object found in the prototype property of the constructor. You can make good use of this by putting the properties that all values of a given type share into their prototype.
There’s a class notation that provides a clear way to define a constructor and its prototype.

You can define getters and setters to secretly call methods every time an object’s property is accessed. Static methods are methods stored in a class’s constructor rather than its prototype.

The instanceof operator can, given an object and a constructor, tell you whether that object is an instance of that constructor.

One useful thing to do with objects is to specify an interface for them and tell everybody that they are supposed to talk to your object only through that interface. The rest of the details that make up your object are now encapsulated, hidden behind the interface. You can use private properties to hide a part of your object from the outside world.

More than one type may implement the same interface. Code written to use an interface automatically knows how to work with any number of different objects that provide the interface. This is called polymorphism.

When implementing multiple classes that differ in only some details, it can be helpful to write the new classes as subclasses of an existing class, inheriting part of its behavior.

Exercises

A vector type

Write a class Vec that represents a vector in two-dimensional space. It takes x and y parameters (numbers), that it saves to properties of the same name.

Give the Vec prototype two methods, plus and minus, that take another vector as a parameter and return a new vector that has the sum or difference of the two vectors’ (this and the parameter) x and y values.

Add a getter property length to the prototype that computes the length of the vector, that is, the distance of the point (x, y) from the origin (0, 0).

Groups

The standard JavaScript environment provides another data structure called Set. Like an instance of Map, a set holds a collection of values. Unlike Map, it does not associate other values with those; it just tracks which values are part of the set.
A value can be part of a set only once; adding it again doesn’t have any effect.

Write a class called Group (since Set is already taken). Like Set, it has add, delete, and has methods. Its constructor creates an empty group, add adds a value to the group (but only if it isn’t already a member), delete removes its argument from the group (if it was a member), and has returns a Boolean value indicating whether its argument is a member of the group.

Use the === operator, or something equivalent such as indexOf, to determine whether two values are the same.

Give the class a static from method that takes an iterable object as argument and creates a group that contains all the values produced by iterating over it.

Iterable groups

Make the Group class from the previous exercise iterable. Refer to the section about the iterator interface earlier in the chapter if you aren’t clear on the exact form of the interface anymore.

If you used an array to represent the group’s members, don’t just return the iterator created by calling the Symbol.iterator method on the array. That would work, but it defeats the purpose of this exercise.

It is okay if your iterator behaves strangely when the group is modified during iteration.

“The question of whether Machines Can Think [...] is about as relevant as the question of whether Submarines Can Swim.”
Edsger Dijkstra, The Threats to Computing Science

Chapter 7
Project: A Robot

In “project” chapters, I’ll stop pummeling you with new theory for a brief moment, and instead we’ll work through a program together. Theory is necessary to learn to program, but reading and understanding actual programs is just as important.

Our project in this chapter is to build an automaton, a little program that performs a task in a virtual world. Our automaton will be a mail-delivery robot picking up and dropping off parcels.

Meadowfield

The village of Meadowfield isn’t very big. It consists of 11 places with 14 roads between them.
It can be described with this array of roads:

    const roads = [
      "Alice's House-Bob's House",   "Alice's House-Cabin",
      "Alice's House-Post Office",   "Bob's House-Town Hall",
      "Daria's House-Ernie's House", "Daria's House-Town Hall",
      "Ernie's House-Grete's House", "Grete's House-Farm",
      "Grete's House-Shop",          "Marketplace-Farm",
      "Marketplace-Post Office",     "Marketplace-Shop",
      "Marketplace-Town Hall",       "Shop-Town Hall"
    ];

The network of roads in the village forms a graph. A graph is a collection of points (places in the village) with lines between them (roads). This graph will be the world that our robot moves through.

The array of strings isn’t very easy to work with. What we’re interested in is the destinations that we can reach from a given place. Let’s convert the list of roads to a data structure that, for each place, tells us what can be reached from there.

    function buildGraph(edges) {
      let graph = Object.create(null);
      function addEdge(from, to) {
        if (from in graph) {
          graph[from].push(to);
        } else {
          graph[from] = [to];
        }
      }
      for (let [from, to] of edges.map(r => r.split("-"))) {
        addEdge(from, to);
        addEdge(to, from);
      }
      return graph;
    }

    const roadGraph = buildGraph(roads);

Given an array of edges, buildGraph creates a map object that, for each node, stores an array of connected nodes. It uses the split method to go from the road strings, which have the form "Start-End", to two-element arrays containing the start and end as separate strings.

The task

Our robot will be moving around the village. There are parcels in various places, each addressed to some other place. The robot picks up parcels when it comes across them and delivers them when it arrives at their destinations.

The automaton must decide, at each point, where to go next. It has finished its task when all parcels have been delivered.

To be able to simulate this process, we must define a virtual world that can describe it. This model tells us where the robot is and where the parcels are.
When the robot has decided to move somewhere, we need to update the model to reflect the new situation.

If you’re thinking in terms of object-oriented programming, your first impulse might be to start defining objects for the various elements in the world: a class for the robot, one for a parcel, maybe one for places. These could then hold properties that describe their current state, such as the pile of parcels at a location, which we could change when updating the world.

This is wrong. At least, it usually is. The fact that something sounds like an object does not automatically mean that it should be an object in your program. Reflexively writing classes for every concept in your application tends to leave you with a collection of interconnected objects that each have their own internal, changing state. Such programs are often hard to understand and thus easy to break.

Instead, let’s condense the village’s state down to the minimal set of values that define it. There’s the robot’s current location and the collection of undelivered parcels, each of which has a current location and a destination address. That’s it.

While we’re at it, let’s make it so that we don’t change this state when the robot moves but rather compute a new state for the situation after the move.

    class VillageState {
      constructor(place, parcels) {
        this.place = place;
        this.parcels = parcels;
      }

      move(destination) {
        if (!roadGraph[this.place].includes(destination)) {
          return this;
        } else {
          let parcels = this.parcels.map(p => {
            if (p.place != this.place) return p;
            return {place: destination, address: p.address};
          }).filter(p => p.place != p.address);
          return new VillageState(destination, parcels);
        }
      }
    }

The move method is where the action happens. It first checks whether there is a road going from the current place to the destination, and if not, it returns the old state since this is not a valid move.

Next, the method creates a new state with the destination as the robot’s new place.
It also needs to create a new set of parcels: parcels that the robot is carrying (that are at the robot’s current place) need to be moved along to the new place. And parcels that are addressed to the new place need to be delivered, that is, they need to be removed from the set of undelivered parcels. The call to map takes care of the moving, and the call to filter does the delivering.

Parcel objects aren’t changed when they are moved but re-created. The move method gives us a new village state but leaves the old one entirely intact.

    let first = new VillageState(
      "Post Office",
      [{place: "Post Office", address: "Alice's House"}]
    );
    let next = first.move("Alice's House");

    console.log(next.place);
    // → Alice's House
    console.log(next.parcels);
    // → []
    console.log(first.place);
    // → Post Office

The move causes the parcel to be delivered, which is reflected in the next state. But the initial state still describes the situation where the robot is at the post office and the parcel is undelivered.

Persistent data

Data structures that don’t change are called immutable or persistent. They behave a lot like strings and numbers in that they are who they are and stay that way, rather than containing different things at different times.

In JavaScript, just about everything can be changed, so working with values that are supposed to be persistent requires some restraint. There is a function called Object.freeze that changes an object so that writing to its properties is ignored. You could use that to make sure your objects aren’t changed, if you want to be careful. Freezing does require the computer to do some extra work, and having updates ignored is just about as likely to confuse someone as having them do the wrong thing. I usually prefer to just tell people that a given object shouldn’t be messed with and hope they remember it.
    let object = Object.freeze({value: 5});
    object.value = 10;
    console.log(object.value);
    // → 5

Why am I going out of my way to not change objects when the language is obviously expecting me to? Because it helps me understand my programs. This is about complexity management again. When the objects in my system are fixed, stable things, I can consider operations on them in isolation: moving to Alice’s house from a given start state always produces the same new state. When objects change over time, that adds a whole new dimension of complexity to this kind of reasoning.

For a small system like the one we are building in this chapter, we could handle that bit of extra complexity. But the most important limit on what kind of systems we can build is how much we can understand. Anything that makes your code easier to understand makes it possible to build a more ambitious system.

Unfortunately, although understanding a system built on persistent data structures is easier, designing one, especially when your programming language isn’t helping, can be a little harder. We’ll look for opportunities to use persistent data structures in this book, but we’ll also be using changeable ones.

Simulation

A delivery robot looks at the world and decides in which direction it wants to move. As such, we could say that a robot is a function that takes a VillageState object and returns the name of a nearby place.

Because we want robots to be able to remember things so they can make and execute plans, we also pass them their memory and allow them to return a new memory. Thus, the thing a robot returns is an object containing both the direction it wants to move in and a memory value that will be given back to it the next time it is called.
    function runRobot(state, robot, memory) {
      for (let turn = 0;; turn++) {
        if (state.parcels.length == 0) {
          console.log(`Done in ${turn} turns`);
          break;
        }
        let action = robot(state, memory);
        state = state.move(action.direction);
        memory = action.memory;
        console.log(`Moved to ${action.direction}`);
      }
    }

Consider what a robot has to do to “solve” a given state. It must pick up all parcels by visiting every location that has a parcel and deliver them by visiting every location to which a parcel is addressed, but only after picking up the parcel.

What is the dumbest strategy that could possibly work? The robot could just walk in a random direction every turn. That means, with great likelihood, it will eventually run into all parcels and then also at some point reach the place where they should be delivered.

Here’s what that could look like:

    function randomPick(array) {
      let choice = Math.floor(Math.random() * array.length);
      return array[choice];
    }

    function randomRobot(state) {
      return {direction: randomPick(roadGraph[state.place])};
    }

Remember that Math.random() returns a number between zero and one, but always below one. Multiplying such a number by the length of an array and then applying Math.floor to it gives us a random index for the array.

Since this robot does not need to remember anything, it ignores its second argument (remember that JavaScript functions can be called with extra arguments without ill effects) and omits the memory property in its returned object.

To put this sophisticated robot to work, we’ll first need a way to create a new state with some parcels. A static method (written here by directly adding a property to the constructor) is a good place to put that functionality.

    VillageState.random = function(parcelCount = 5) {
      let parcels = [];
      for (let i = 0; i [...]

[...] it isn’t planning ahead very well. We’ll address that soon.
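The listing for VillageState.random is cut off above, so here is one way it could be completed, offered as a sketch rather than as the original code. It assumes that each parcel gets a random address and a random starting place that differs from that address, and that the robot starts at the post office. The countTurns helper is hypothetical: it does the same work as runRobot but returns the number of turns instead of logging every move.

```javascript
// Road list and graph builder as given earlier in the chapter.
const roads = [
  "Alice's House-Bob's House",   "Alice's House-Cabin",
  "Alice's House-Post Office",   "Bob's House-Town Hall",
  "Daria's House-Ernie's House", "Daria's House-Town Hall",
  "Ernie's House-Grete's House", "Grete's House-Farm",
  "Grete's House-Shop",          "Marketplace-Farm",
  "Marketplace-Post Office",     "Marketplace-Shop",
  "Marketplace-Town Hall",       "Shop-Town Hall"
];

function buildGraph(edges) {
  let graph = Object.create(null);
  function addEdge(from, to) {
    if (from in graph) graph[from].push(to);
    else graph[from] = [to];
  }
  for (let [from, to] of edges.map(r => r.split("-"))) {
    addEdge(from, to);
    addEdge(to, from);
  }
  return graph;
}
const roadGraph = buildGraph(roads);

class VillageState {
  constructor(place, parcels) {
    this.place = place;
    this.parcels = parcels;
  }
  move(destination) {
    if (!roadGraph[this.place].includes(destination)) return this;
    let parcels = this.parcels.map(p => {
      if (p.place != this.place) return p;
      return {place: destination, address: p.address};
    }).filter(p => p.place != p.address);
    return new VillageState(destination, parcels);
  }
}

function randomPick(array) {
  return array[Math.floor(Math.random() * array.length)];
}

// Assumed completion: pick a random address, then keep picking a start
// place until it differs from that address.
VillageState.random = function(parcelCount = 5) {
  let places = Object.keys(roadGraph);
  let parcels = [];
  for (let i = 0; i < parcelCount; i++) {
    let address = randomPick(places);
    let place;
    do {
      place = randomPick(places);
    } while (place == address);
    parcels.push({place, address});
  }
  return new VillageState("Post Office", parcels);
};

function randomRobot(state) {
  return {direction: randomPick(roadGraph[state.place])};
}

// Hypothetical helper: same loop as runRobot, but silent, returning the
// number of turns taken.
function countTurns(state, robot, memory) {
  for (let turn = 0;; turn++) {
    if (state.parcels.length == 0) return turn;
    let action = robot(state, memory);
    state = state.move(action.direction);
    memory = action.memory;
  }
}

console.log(countTurns(VillageState.random(), randomRobot));
```

Because the robot moves at random, the printed turn count differs from run to run.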
The mail truck's route

We should be able to do a lot better than the random robot. An easy improvement would be to take a hint from the way real-world mail delivery works. If we find a route that passes all places in the village, the robot could run that route twice, at which point it is guaranteed to be done. Here is one such route (starting from the post office):

    const mailRoute = [
      "Alice's House", "Cabin", "Alice's House", "Bob's House",
      "Town Hall", "Daria's House", "Ernie's House",
      "Grete's House", "Shop", "Grete's House", "Farm",
      "Marketplace", "Post Office"
    ];

To implement the route-following robot, we’ll need to make use of robot memory. The robot keeps the rest of its route in its memory and drops the first element every turn.

    function routeRobot(state, memory) {
      if (memory.length == 0) {
        memory = mailRoute;
      }
      return {direction: memory[0], memory: memory.slice(1)};
    }

This robot is a lot faster already. It’ll take a maximum of 26 turns (twice the 13-step route) but usually less.

Pathfinding

Still, I wouldn’t really call blindly following a fixed route intelligent behavior. The robot could work more efficiently if it adjusted its behavior to the actual work that needs to be done.

To do that, it has to be able to deliberately move toward a given parcel or toward the location where a parcel has to be delivered. Doing that, even when the goal is more than one move away, will require some kind of route-finding function.

The problem of finding a route through a graph is a typical search problem. We can tell whether a given solution (a route) is valid, but we can’t directly compute the solution the way we could for 2 + 2. Instead, we have to keep creating potential solutions until we find one that works.

The number of possible routes through a graph is infinite. But when searching for a route from A to B, we are interested only in the ones that start at A.
We also don’t care about routes that visit the same place twice; those are definitely not the most efficient route anywhere. So that cuts down on the number of routes that the route finder has to consider.

In fact, since we are mostly interested in the shortest route, we want to make sure we look at short routes before we look at longer ones. A good approach would be to “grow” routes from the starting point, exploring every reachable place that hasn’t been visited yet until a route reaches the goal. That way, we’ll only explore routes that are potentially interesting, and we know that the first route we find is the shortest route (or one of the shortest routes, if there is more than one).

Here is a function that does this:

    function findRoute(graph, from, to) {
      let work = [{at: from, route: []}];
      for (let i = 0; i < work.length; i++) {
        let {at, route} = work[i];
        for (let place of graph[at]) {
          if (place == to) return route.concat(place);
          if (!work.some(w => w.at == place)) {
            work.push({at: place, route: route.concat(place)});
          }
        }
      }
    }

The exploring has to be done in the right order: the places that were reached first have to be explored first. We can’t immediately explore a place as soon as we reach it because that would mean places reached from there would also be explored immediately, and so on, even though there may be other, shorter paths that haven’t yet been explored.

Therefore, the function keeps a work list. This is an array of places that should be explored next, along with the route that got us there. It starts with just the start position and an empty route.

The search then operates by taking the next item in the list and exploring that, which means it looks at all roads going from that place. If one of them is the goal, a finished route can be returned. Otherwise, if we haven’t looked at this place before, a new item is added to the list. If we have looked at it before, since we are looking at short routes first, we’ve found either a longer route to that place or one precisely as long as the existing one, and we don’t need to explore it.
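To see the work list in action, it can help to run the search on something smaller than the village. The sketch below restates findRoute as described above and applies it to a hypothetical five-place graph (smallGraph and its place names are invented for this illustration). There are two equally short routes from A to E, and the search returns whichever one it reaches first.

```javascript
// The work-list search described above: items are explored in the order
// they were added, so shorter routes are always tried first.
function findRoute(graph, from, to) {
  let work = [{at: from, route: []}];
  for (let i = 0; i < work.length; i++) {
    let {at, route} = work[i];
    for (let place of graph[at]) {
      if (place == to) return route.concat(place);
      if (!work.some(w => w.at == place)) {
        work.push({at: place, route: route.concat(place)});
      }
    }
  }
}

// Hypothetical graph with roads A-B, A-C, B-D, C-D, and D-E.
const smallGraph = {
  A: ["B", "C"],
  B: ["A", "D"],
  C: ["A", "D"],
  D: ["B", "C", "E"],
  E: ["D"]
};

console.log(findRoute(smallGraph, "A", "E"));
// → ["B", "D", "E"]
```

Note that the returned route lists the places to move to and does not include the starting place. The route through C is just as short, but D was first reached via B, so that is the route recorded in the work list.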
You can visualize this as a web of known routes crawling out from the start location, growing evenly on all sides (but never tangling back into itself). As soon as the first thread reaches the goal location, that thread is traced back to the start, giving us our route.

Our code doesn’t handle the situation where there are no more work items on the work list because we know that our graph is connected, meaning that every location can be reached from all other locations. We’ll always be able to find a route between two points, and the search can’t fail.

    function goalOrientedRobot({place, parcels}, route) {
      if (route.length == 0) {
        let parcel = parcels[0];
        if (parcel.place != place) {
          route = findRoute(roadGraph, place, parcel.place);
        } else {
          route = findRoute(roadGraph, place, parcel.address);
        }
      }
      return {direction: route[0], memory: route.slice(1)};
    }

This robot uses its memory value as a list of directions to move in, just like the route-following robot. Whenever that list is empty, it has to figure out what to do next. It takes the first undelivered parcel in the set and, if that parcel hasn’t been picked up yet, plots a route toward it. If the parcel has been picked up, it still needs to be delivered, so the robot creates a route toward the delivery address instead.

This robot usually finishes the task of delivering 5 parcels in about 16 turns. That’s slightly better than routeRobot but still definitely not optimal. We’ll continue refining it in the exercises.

Exercises

Measuring a robot

It’s hard to objectively compare robots by just letting them solve a few scenarios. Maybe one robot just happened to get easier tasks or the kind of tasks that it is good at, whereas the other didn’t.

Write a function compareRobots that takes two robots (and their starting memory). It generates 100 tasks and lets each of the robots solve each of these tasks. When done, it outputs the average number of steps each robot took per task.
For the sake of fairness, make sure you give each task to both robots, rather than generating different tasks per robot.

Robot efficiency

Can you write a robot that finishes the delivery task faster than goalOrientedRobot? If you observe that robot’s behavior, what obviously stupid things does it do? How could those be improved?

If you solved the previous exercise, you might want to use your compareRobots function to verify whether you improved the robot.

Persistent group

Most data structures provided in a standard JavaScript environment aren’t very well suited for persistent use. Arrays have slice and concat methods, which allow us to easily create new arrays without damaging the old one. But Set, for example, has no methods for creating a new set with an item added or removed.

Write a new class PGroup, similar to the Group class from Chapter 6, which stores a set of values. Like Group, it has add, delete, and has methods. Its add method, however, returns a new PGroup instance with the given member added and leaves the old one unchanged. Similarly, delete creates a new instance without a given member.

The class should work for values of any type, not just strings. It does not have to be efficient when used with large numbers of values.

The constructor shouldn’t be part of the class’s interface (though you’ll definitely want to use it internally). Instead, there is an empty instance, PGroup.empty, that can be used as a starting value.

Why do you need only one PGroup.empty value, rather than having a function that creates a new, empty map every time?

“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”
Brian Kernighan and P.J. Plauger, The Elements of Programming Style

Chapter 8
Bugs and Errors

Flaws in computer programs are usually called bugs.
It makes programmers feel good to imagine them as little things that just happen to crawl into our work. In reality, of course, we put them there ourselves.

If a program is crystallized thought, we can roughly categorize bugs into those caused by the thoughts being confused and those caused by mistakes introduced while converting a thought to code. The former type is generally harder to diagnose and fix than the latter.

Language

Many mistakes could be pointed out to us automatically by the computer, if it knew enough about what we’re trying to do. But here JavaScript’s looseness is a hindrance. Its concept of bindings and properties is vague enough that it will rarely catch typos before actually running the program. Even then, it allows you to do some clearly nonsensical things without complaint, such as computing true * "monkey".

There are some things that JavaScript does complain about. Writing a program that does not follow the language’s grammar will immediately make the computer complain. Other things, such as calling something that’s not a function or looking up a property on an undefined value, will cause an error to be reported when the program tries to perform the action.

Often, however, your nonsense computation will merely produce NaN (not a number) or an undefined value, while the program happily continues, convinced that it’s doing something meaningful. The mistake will manifest itself only later, after the bogus value has traveled through several functions. It might not trigger an error at all but silently cause the program’s output to be wrong. Finding the source of such problems can be difficult.

The process of finding mistakes (bugs) in programs is called debugging.

Strict mode

JavaScript can be made a little stricter by enabling strict mode. This can be done by putting the string "use strict" at the top of a file or a function body.
Here’s an example:

    function canYouSpotTheProblem() {
      "use strict";
      for (counter = 0; counter [...]

    [...] string[]
    function findRoute(graph, from, to) {
      // ...
    }

There are a number of different conventions for annotating JavaScript programs with types.

One thing about types is that they need to introduce their own complexity to be able to describe enough code to be useful. What do you think would be the type of the randomPick function that returns a random element from an array? You’d need to introduce a type variable, T, which can stand in for any type, so that you can give randomPick a type like (T[])→T (function from an array of Ts to a T).

When the types of a program are known, it is possible for the computer to check them for you, pointing out mistakes before the program is run. There are several JavaScript dialects that add types to the language and check them. The most popular one is called TypeScript (https://www.typescriptlang.org/). If you are interested in adding more rigor to your programs, I recommend you give it a try.

In this book, we’ll continue using raw, dangerous, untyped JavaScript code.

Testing

If the language is not going to do much to help us find mistakes, we’ll have to find them the hard way: by running the program and seeing whether it does the right thing.

Doing this by hand, again and again, is a really bad idea. Not only is it annoying, it also tends to be ineffective since it takes too much time to exhaustively test everything every time you make a change.

Computers are good at repetitive tasks, and testing is the ideal repetitive task. Automated testing is the process of writing a program that tests another program. Writing tests is a bit more work than testing manually, but once you’ve done it, you gain a kind of superpower: it takes you only a few seconds to verify that your program still behaves properly in all the situations you wrote tests for. When you break something, you’ll immediately notice, rather than randomly running into it at some later time.
Tests usually take the form of little labeled programs that verify some aspect of your code. For example, a set of tests for the (standard, probably already tested by someone else) toUpperCase method might look like this:

    function test(label, body) {
      if (!body()) console.log(`Failed: ${label}`);
    }

    test("convert Latin text to uppercase", () => {
      return "hello".toUpperCase() == "HELLO";
    });
    test("convert Greek text to uppercase", () => {
      return "Χαίρετε".toUpperCase() == "ΧΑΊΡΕΤΕ";
    });
    test("don't convert case-less characters", () => {
      return "مرحبا".toUpperCase() == "مرحبا";
    });

Writing tests like this tends to produce rather repetitive, awkward code. Fortunately, there exist pieces of software that help you build and run collections of tests (test suites) by providing a language (in the form of functions and methods) suited to expressing tests and by outputting informative information when a test fails. These are usually called test runners.

Some code is easier to test than other code. Generally, the more external objects that the code interacts with, the harder it is to set up the context in which to test it. The style of programming shown in the previous chapter, which uses self-contained persistent values rather than changing objects, tends to be easy to test.

Debugging

Once you notice there is something wrong with your program because it misbehaves or produces errors, the next step is to figure out what the problem is.

Sometimes it is obvious. The error message will point at a specific line of your program, and if you look at the error description and that line of code, you can often see the problem.

But not always. Sometimes the line that triggered the problem is simply the first place where a flaky value produced elsewhere gets used in an invalid way. If you have been solving the exercises in earlier chapters, you will probably have already experienced such situations.
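As a small illustration of an error surfacing away from its cause, consider the sketch below. The openingHours table and the hoursFor and describe functions are all made up for this example. The actual mistake sits in hoursFor, which quietly hands back undefined for a day it doesn’t know, but the TypeError is raised in describe, the first place that tries to use that value as a string.

```javascript
// Hypothetical data: there is deliberately no entry for "sun".
const openingHours = {mon: "9 to 5", sat: "10 to 2"};

// The real mistake lives here: unknown days silently yield undefined.
function hoursFor(day) {
  return openingHours[day];
}

// The error is reported here, where undefined is first used as a string.
function describe(day) {
  return day + ": " + hoursFor(day).toUpperCase();
}

console.log(describe("mon"));
// → mon: 9 TO 5

let caught = null;
try {
  describe("sun");
} catch (error) {
  caught = error;
}
console.log(caught instanceof TypeError);
// → true
```

The error message would point at the line in describe, even though the fix probably belongs in hoursFor or in the data.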
The following example program tries to convert a whole number to a string in a given base (decimal, binary, and so on) by repeatedly picking out the last digit and then dividing the number to get rid of this digit. But the strange output that it currently produces suggests that it has a bug.

    function numberToString(n, base = 10) { let result = "", sign = ""; if (n 0); return sign + result; }
    console.log(numberToString(13, 10));
    // → 1.5e-3231.3e-3221.3e-3211.3e-3201.3e-3191.3e…-3181.3

Even if you see the problem already, pretend for a moment that you don’t. We know that our program is malfunctioning, and we want to find out why.

This is where you must resist the urge to start making random changes to the code to see whether that makes it better. Instead, think. Analyze what is happening and come up with a theory of why it might be happening. Then make additional observations to test this theory or, if you don’t yet have a theory, make additional observations to help you come up with one.

Putting a few strategic console.log calls into the program is a good way to get additional information about what the program is doing. In this case, we want n to take the values 13, 1, and then 0. Let’s write out its value at the start of the loop.

    13
    1.3
    0.13
    0.013
    …
    1.5e-323

Right. Dividing 13 by 10 does not produce a whole number. Instead of n /= base, what we actually want is n = Math.floor(n / base) so that the number is properly “shifted” to the right.

An alternative to using console.log to peek into the program’s behavior is to use the debugger capabilities of your browser. Browsers come with the ability to set a breakpoint on a specific line of your code. When the execution of the program reaches a line with a breakpoint, it is paused, and you can inspect the values of bindings at that point. I won’t go into details, as debuggers differ from browser to browser, but look in your browser’s developer tools or search the Web for instructions.
Another way to set a breakpoint is to include a debugger statement (consisting simply of that keyword) in your program. If the developer tools of your browser are active, the program will pause whenever it reaches such a statement.

Error propagation

Not all problems can be prevented by the programmer, unfortunately. If your program communicates with the outside world in any way, it is possible to get malformed input, to become overloaded with work, or to have the network fail.

If you’re programming only for yourself, you can afford to just ignore such problems until they occur. But if you build something that is going to be used by anybody else, you usually want the program to do better than just crash. Sometimes the right thing to do is take the bad input in stride and continue running. In other cases, it is better to report to the user what went wrong and then give up. In either situation the program has to actively do something in response to the problem.

Say you have a function promptNumber that asks the user for a number and returns it. What should it return if the user inputs “orange”?

One option is to make it return a special value. Common choices for such values are null, undefined, or -1.

    function promptNumber(question) {
      let result = Number(prompt(question));
      if (Number.isNaN(result)) return null;
      else return result;
    }

    console.log(promptNumber("How many trees do you see?"));

Now any code that calls promptNumber must check whether an actual number was read and, failing that, must somehow recover, maybe by asking again or by filling in a default value. Or it could again return a special value to its caller to indicate that it failed to do what it was asked.

In many situations, mostly when errors are common and the caller should be explicitly taking them into account, returning a special value is a good way to indicate an error. It does, however, have its downsides. First, what if the function can already return every possible kind of value?
In such a function, you’ll have to do something like wrap the result in an object to be able to distinguish success from failure, the way the next method on the iterator interface does.

    function lastElement(array) {
      if (array.length == 0) {
        return {failed: true};
      } else {
        return {value: array[array.length - 1]};
      }
    }

The second issue with returning special values is that it can lead to awkward code. If a piece of code calls promptNumber 10 times, it has to check 10 times whether null was returned. If its response to finding null is to simply return null itself, callers of the function will in turn have to check for it, and so on.

Exceptions

When a function cannot proceed normally, what we would often like to do is just stop what we are doing and immediately jump to a place that knows how to handle the problem. This is what exception handling does.

Exceptions are a mechanism that makes it possible for code that runs into a problem to raise (or throw) an exception. An exception can be any value. Raising one somewhat resembles a super-charged return from a function: it jumps out of not just the current function but also its callers, all the way down to the first call that started the current execution. This is called unwinding the stack. You may remember the stack of function calls mentioned in Chapter 3. An exception zooms down this stack, throwing away all the call contexts it encounters.

If exceptions always zoomed right down to the bottom of the stack, they would not be of much use. They’d just provide a novel way to blow up your program. Their power lies in the fact that you can set “obstacles” along the stack to catch the exception as it is zooming down. Once you’ve caught an exception, you can do something with it to address the problem and then continue to run the program.
Here’s an example:

    function promptDirection(question) {
      let result = prompt(question);
      if (result.toLowerCase() == "left") return "L";
      if (result.toLowerCase() == "right") return "R";
      throw new Error("Invalid direction: " + result);
    }

    function look() {
      if (promptDirection("Which way?") == "L") {
        return "a house";
      } else {
        return "two angry bears";
      }
    }

    try {
      console.log("You see", look());
    } catch (error) {
      console.log("Something went wrong: " + error);
    }

The throw keyword is used to raise an exception. Catching one is done by wrapping a piece of code in a try block, followed by the keyword catch. When the code in the try block causes an exception to be raised, the catch block is evaluated, with the name in parentheses bound to the exception value. After the catch block finishes, or if the try block finishes without problems, the program proceeds beneath the entire try/catch statement.

In this case, we used the Error constructor to create our exception value. This is a standard JavaScript constructor that creates an object with a message property. Instances of Error also gather information about the call stack that existed when the exception was created, a so-called stack trace. This information is stored in the stack property and can be helpful when trying to debug a problem: it tells us the function where the problem occurred and which functions made the failing call.

Note that the look function completely ignores the possibility that promptDirection might go wrong. This is the big advantage of exceptions: error-handling code is necessary only at the point where the error occurs and at the point where it is handled. The functions in between can forget all about it.

Well, almost...

Cleaning up after exceptions

The effect of an exception is another kind of control flow. Every action that might cause an exception, which is pretty much every function call and property access, might cause control to suddenly leave your code.
This means when code has several side effects, even if its “regular” control flow looks like they’ll always all happen, an exception might prevent some of them from taking place.

Here is some really bad banking code:

    const accounts = {
      a: 100,
      b: 0,
      c: 20
    };

    function getAccount() {
      let accountName = prompt("Enter an account name");
      if (!Object.hasOwn(accounts, accountName)) {
        throw new Error(`No such account: ${accountName}`);
      }
      return accountName;
    }

    function transfer(from, amount) { if (accounts[from]if (e instanceof InputError) { console.log("Not a valid direction. Try again."); } else { throw e; } } }

This will catch only instances of InputError and let unrelated exceptions through. If you reintroduce the typo, the undefined binding error will be properly reported.

Assertions

Assertions are checks inside a program that verify that something is the way it is supposed to be. They are used not to handle situations that can come up in normal operation but to find programmer mistakes.

If, for example, firstElement is described as a function that should never be called on empty arrays, we might write it like this:

    function firstElement(array) {
      if (array.length == 0) {
        throw new Error("firstElement called with []");
      }
      return array[0];
    }

Now, instead of silently returning undefined (which you get when reading an array property that does not exist), this will loudly blow up your program as soon as you misuse it. This makes it less likely for such mistakes to go unnoticed and easier to find their cause when they occur.

I do not recommend trying to write assertions for every possible kind of bad input. That’d be a lot of work and would lead to very noisy code. You’ll want to reserve them for mistakes that are easy to make (or that you find yourself making).

Summary

An important part of programming is finding, diagnosing, and fixing bugs. Problems can become easier to notice if you have an automated test suite or add assertions to your programs.
Problems caused by factors outside the program’s control should usually be actively planned for. Sometimes, when the problem can be handled locally, special return values are a good way to track them. Otherwise, exceptions may be preferable.

Throwing an exception causes the call stack to be unwound until the next enclosing try/catch block or until the bottom of the stack. The exception value will be given to the catch block that catches it, which should verify that it is actually the expected kind of exception and then do something with it. To help address the unpredictable control flow caused by exceptions, finally blocks can be used to ensure that a piece of code always runs when a block finishes.

Exercises

Retry

Say you have a function primitiveMultiply that in 20 percent of cases multiplies two numbers and in the other 80 percent of cases raises an exception of type MultiplicatorUnitFailure. Write a function that wraps this clunky function and just keeps trying until a call succeeds, after which it returns the result.

Make sure you handle only the exceptions you are trying to handle.

The locked box

Consider the following (rather contrived) object:

    const box = new class {
      locked = true;
      #content = [];

      unlock() { this.locked = false; }
      lock() { this.locked = true; }
      get content() {
        if (this.locked) throw new Error("Locked!");
        return this.#content;
      }
    };

It is a box with a lock. There is an array in the box, but you can get at it only when the box is unlocked.

Write a function called withBoxUnlocked that takes a function value as argument, unlocks the box, runs the function, and then ensures that the box is locked again before returning, regardless of whether the argument function returned normally or threw an exception.

For extra points, make sure that if you call withBoxUnlocked when the box is already unlocked, the box stays unlocked.
“Some people, when confronted with a problem, think ‘I know, I’ll use regular expressions.’ Now they have two problems.”
- Jamie Zawinski

Chapter 9

Regular Expressions

Programming tools and techniques survive and spread in a chaotic, evolutionary way. It’s not always the best or brilliant ones that win but rather the ones that function well enough within the right niche or that happen to be integrated with another successful piece of technology.

In this chapter, I will discuss one such tool, regular expressions. Regular expressions are a way to describe patterns in string data. They form a small, separate language that is part of JavaScript and many other languages and systems.

Regular expressions are both terribly awkward and extremely useful. Their syntax is cryptic and the programming interface JavaScript provides for them is clumsy. But they are a powerful tool for inspecting and processing strings. Properly understanding regular expressions will make you a more effective programmer.

Creating a regular expression

A regular expression is a type of object. It can be either constructed with the RegExp constructor or written as a literal value by enclosing a pattern in forward slash (/) characters.

    let re1 = new RegExp("abc");
    let re2 = /abc/;

Both of those regular expression objects represent the same pattern: an a character followed by a b followed by a c.

When using the RegExp constructor, the pattern is written as a normal string, so the usual rules apply for backslashes.

The second notation, where the pattern appears between slash characters, treats backslashes somewhat differently. First, since a forward slash ends the pattern, we need to put a backslash before any forward slash that we want to be part of the pattern. In addition, backslashes that aren’t part of special character codes (like \n) will be preserved, rather than ignored as they are in strings, and change the meaning of the pattern.
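The difference in backslash handling between the two notations is easy to check. The following is a small sketch; the binding names (fromString, fromLiteral, accidental, slash) are made up for this illustration and do not come from the chapter.

```javascript
// In a string, the backslash must be doubled to survive string parsing.
const fromString = new RegExp("\\d+");
const fromLiteral = /\d+/;
console.log(fromString.source === fromLiteral.source);
// → true

// With a single backslash, the string parser drops it, so the
// pattern ends up matching one or more letter d characters instead.
const accidental = new RegExp("\d+");
console.log(accidental.source);
// → d+

// Inside a literal, a forward slash needs a backslash in front of it.
const slash = /a\/b/;
console.log(slash.test("a/b"));
// → true
```

The source property holds the pattern text of a regular expression object, which makes it a convenient way to see what the engine actually received.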
Some characters, such as question marks and plus signs, have special meanings in regular expressions and must be preceded by a backslash if they are meant to represent the character itself.

    let aPlus = /A\+/;

Testing for matches

Regular expression objects have a number of methods. The simplest one is test. If you pass it a string, it will return a Boolean telling you whether the string contains a match of the pattern in the expression.

    console.log(/abc/.test("abcde"));
    // → true
    console.log(/abc/.test("abxde"));
    // → false

A regular expression consisting of only nonspecial characters simply represents that sequence of characters. If abc occurs anywhere in the string we are testing against (not just at the start), test will return true.

Sets of characters

Finding out whether a string contains abc could just as well be done with a call to indexOf. Regular expressions are useful because they allow us to describe more complicated patterns.

Say we want to match any number. In a regular expression, putting a set of characters between square brackets makes that part of the expression match any of the characters between the brackets.

Both of the following expressions match all strings that contain a digit:

    console.log(/[0123456789]/.test("in 1992"));
    // → true
    console.log(/[0-9]/.test("in 1992"));
    // → true

Within square brackets, a hyphen (-) between two characters can be used to indicate a range of characters, where the ordering is determined by the character’s Unicode number. Characters 0 to 9 sit right next to each other in this ordering (codes 48 to 57), so [0-9] covers all of them and matches any digit.

A number of common character groups have their own built-in shortcuts. Digits are one of them: \d means the same thing as [0-9].

    \d    Any digit character
    \w    An alphanumeric character (“word character”)
    \s    Any whitespace character (space, tab, newline, and similar)
    \D    A character that is not a digit
    \W    A nonalphanumeric character
    \S    A nonwhitespace character
    .     Any character except for newline

You could match a date and time format like 01-30-2003 15:20 with the following expression:

    let dateTime = /\d\d-\d\d-\d\d\d\d \d\d:\d\d/;
    console.log(dateTime.test("01-30-2003 15:20"));
    // → true
    console.log(dateTime.test("30-jan-2003 15:20"));
    // → false

That regular expression looks completely awful, doesn’t it? Half of it is backslashes, producing a background noise that makes it hard to spot the actual pattern expressed. We’ll see a slightly improved version of this expression later.

These backslash codes can also be used inside square brackets. For example, [\d.] means any digit or a period character. The period itself, between square brackets, loses its special meaning. The same goes for other special characters, such as the plus sign (+).

To invert a set of characters (that is, to express that you want to match any character except the ones in the set), you can write a caret (^) character after the opening bracket.

    let nonBinary = /[^01]/;
    console.log(nonBinary.test("1100100010100110"));
    // → false
    console.log(nonBinary.test("0111010112101001"));
    // → true

International characters

Because of JavaScript’s initial simplistic implementation and the fact that this simplistic approach was later set in stone as standard behavior, JavaScript’s regular expressions are rather dumb about characters that do not appear in the English language. For example, as far as JavaScript’s regular expressions are concerned, a “word character” is only one of the 26 characters in the Latin alphabet (uppercase or lowercase), decimal digits, and, for some reason, the underscore character. Things like é or ß, which most definitely are word characters, will not match \w (and will match uppercase \W, the nonword category).

By a strange historical accident, \s (whitespace) does not have this problem and matches all characters that the Unicode standard considers whitespace, including things like the nonbreaking space and the Mongolian vowel separator.
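The limits of \w and the wider reach of \s can be observed with a few calls to test. This is only a small sketch; the sample characters are the ones named in the text, plus the nonbreaking space written as the escape \u00a0.

```javascript
// \w only covers A-Z, a-z, 0-9, and the underscore.
console.log(/\w/.test("é"));
// → false
console.log(/\W/.test("ß"));
// → true
console.log(/\w/.test("_"));
// → true

// \s does recognize the nonbreaking space (U+00A0) as whitespace.
console.log(/\s/.test("\u00a0"));
// → true
```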
It is possible to use \p in a regular expression to match all characters to which the Unicode standard assigns a given property. This allows us to match things like letters in a more cosmopolitan way. However, again due to compatibility with the original language standards, those are only recognized when you put a u character (for Unicode) after the regular expression.

    \p{L}                Any letter
    \p{N}                Any numeric character
    \p{P}                Any punctuation character
    \P{L}                Any non-letter (uppercase P inverts)
    \p{Script=Hangul}    Any character from the given script (see Chapter 5)

Using \w for text processing that may need to handle non-English text (or even English text with borrowed words like “cliché”) is a liability, since it won’t treat characters like “é” as letters. Though they tend to be a bit more verbose, \p property groups are more robust.

    console.log(/\p{L}/u.test("α"));
    // → true
    console.log(/\p{L}/u.test("!"));
    // → false
    console.log(/\p{Script=Greek}/u.test("α"));
    // → true
    console.log(/\p{Script=Arabic}/u.test("α"));
    // → false

On the other hand, if you are matching numbers in order to do something with them, you often do want \d for digits, since converting arbitrary numeric characters into a JavaScript number is not something that a function like Number can do for you.

Repeating parts of a pattern

We now know how to match a single digit. What if we want to match a whole number, that is, a sequence of one or more digits?

When you put a plus sign (+) after something in a regular expression, it indicates that the element may be repeated more than once. Thus, /\d+/ matches one or more digit characters.

    console.log(/'\d+'/.test("'123'"));
    // → true
    console.log(/'\d+'/.test("''"));
    // → false
    console.log(/'\d*'/.test("'123'"));
    // → true
    console.log(/'\d*'/.test("''"));
    // → true

The star (*) has a similar meaning but also allows the pattern to match zero times.
Something with a star after it never prevents a pattern from matching; it’ll just match zero instances if it can’t find any suitable text to match.

A question mark (?) makes a part of a pattern optional, meaning it may occur zero times or one time. In the following example, the u character is allowed to occur, but the pattern also matches when it is missing:

    let neighbor = /neighbou?r/;
    console.log(neighbor.test("neighbour"));
    // → true
    console.log(neighbor.test("neighbor"));
    // → true

To indicate that a pattern should occur a precise number of times, use braces. Putting {4} after an element, for example, requires it to occur exactly four times. It is also possible to specify a range this way: {2,4} means the element must occur at least twice and at most four times.

Here is another version of the date and time pattern that allows both single- and double-digit days, months, and hours. It is also slightly easier to decipher.

    let dateTime = /\d{1,2}-\d{1,2}-\d{4} \d{1,2}:\d{2}/;
    console.log(dateTime.test("1-30-2003 8:45"));
    // → true

You can also specify open-ended ranges when using braces by omitting the number after the comma. For example, {5,} means five or more times.

Grouping subexpressions

To use an operator like * or + on more than one element at a time, you must use parentheses. A part of a regular expression that is enclosed in parentheses counts as a single element as far as the operators following it are concerned.

    let cartoonCrying = /boo+(hoo+)+/i;
    console.log(cartoonCrying.test("Boohoooohoohooo"));
    // → true

The first and second + characters apply only to the second o in boo and hoo, respectively. The third + applies to the whole group (hoo+), matching one or more sequences like that.

The i at the end of the expression in the example makes this regular expression case-insensitive, allowing it to match the uppercase B in the input string, even though the pattern is itself all lowercase.
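To see how open-ended ranges and grouping change what a repetition operator applies to, here is a short sketch. The binding names and test strings are invented for this illustration, and the quote characters in the patterns serve only to pin down where the repeated part must begin and end.

```javascript
// {5,} requires at least five digits between the quotes.
const fiveOrMore = /'\d{5,}'/;
console.log(fiveOrMore.test("'1234'"));
// → false
console.log(fiveOrMore.test("'1234567'"));
// → true

// Without parentheses, + applies only to the b.
const repeatB = /'ab+'/;
// With parentheses, + applies to the whole group ab.
const repeatAB = /'(ab)+'/;
console.log(repeatB.test("'abbb'"), repeatB.test("'ababab'"));
// → true false
console.log(repeatAB.test("'abbb'"), repeatAB.test("'ababab'"));
// → false true

// The i flag makes the group match regardless of case.
console.log(/'(ab)+'/i.test("'ABab'"));
// → true
```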
Matches and groups

The test method is the absolute simplest way to match a regular expression. It tells you only whether it matched and nothing else. Regular expressions also have an exec (execute) method that will return null if no match was found and return an object with information about the match otherwise.

    let match = /\d+/.exec("one two 100");
    console.log(match);
    // → ["100"]
    console.log(match.index);
    // → 8

An object returned from exec has an index property that tells us where in the string the successful match begins. Other than that, the object looks like (and in fact is) an array of strings, whose first element is the string that was matched. In the previous example, this is the sequence of digits that we were looking for.

String values have a match method that behaves similarly.

    console.log("one two 100".match(/\d+/));
    // → ["100"]

When the regular expression contains subexpressions grouped with parentheses, the text that matched those groups will also show up in the array. The whole match is always the first element. The next element is the part matched by the first group (the one whose opening parenthesis comes first in the expression), then the second group, and so on.

    let quotedText = /'([^']*)'/;
    console.log(quotedText.exec("she said 'hello'"));
    // → ["'hello'", "hello"]

When a group does not end up being matched at all (for example, when followed by a question mark), its position in the output array will hold undefined. When a group is matched multiple times (for example, when followed by a +), only the last match ends up in the array.

    console.log(/bad(ly)?/.exec("bad"));
    // → ["bad", undefined]
    console.log(/(\d)+/.exec("123"));
    // → ["123", "3"]

If you want to use parentheses purely for grouping, without having them show up in the array of matches, you can put ?: after the opening parenthesis.

    console.log(/(?:na)+/.exec("banana"));
    // → ["nana"]

Groups can be useful for extracting parts of a string.
If we don’t just want to verify whether a string contains a date but also extract it and construct an object that represents it, we can wrap parentheses around the digit patterns and directly pick the date out of the result of exec.

But first we’ll take a brief detour to discuss the built-in way to represent date and time values in JavaScript.

The Date class

JavaScript has a standard Date class for representing dates, or rather, points in time. If you simply create a date object using new, you get the current date and time.

    console.log(new Date());
    // → Fri Feb 02 2024 18:03:06 GMT+0100 (CET)

You can also create an object for a specific time.

    console.log(new Date(2009, 11, 9));
    // → Wed Dec 09 2009 00:00:00 GMT+0100 (CET)
    console.log(new Date(2009, 11, 9, 12, 59, 59, 999));
    // → Wed Dec 09 2009 12:59:59 GMT+0100 (CET)

JavaScript uses a convention where month numbers start at zero (so December is 11), yet day numbers start at one. This is confusing and silly. Be careful.

The last four arguments (hours, minutes, seconds, and milliseconds) are optional and taken to be zero when not given.

Timestamps are stored as the number of milliseconds since the start of 1970, in the UTC time zone. This follows a convention set by “Unix time”, which was invented around that time. You can use negative numbers for times before 1970. The getTime method on a date object returns this number. It is big, as you can imagine.

    console.log(new Date(2013, 11, 19).getTime());
    // → 1387407600000
    console.log(new Date(1387407600000));
    // → Thu Dec 19 2013 00:00:00 GMT+0100 (CET)

If you give the Date constructor a single argument, that argument is treated as such a millisecond count. You can get the current millisecond count by creating a new Date object and calling getTime on it or by calling the Date.now function.

Date objects provide methods such as getFullYear, getMonth, getDate, getHours, getMinutes, and getSeconds to extract their components.
Besides getFullYear there’s also getYear, which gives you the year minus 1900 (98 or 119) and is mostly useless.

Putting parentheses around the parts of the expression that we are interested in, we can now create a date object from a string.

    function getDate(string) {
      let [_, month, day, year] =
        /(\d{1,2})-(\d{1,2})-(\d{4})/.exec(string);
      return new Date(year, month - 1, day);
    }
    console.log(getDate("1-30-2003"));
    // → Thu Jan 30 2003 00:00:00 GMT+0100 (CET)

The underscore (_) binding is ignored and used only to skip the full match element in the array returned by exec.

Boundaries and look-ahead

Unfortunately, getDate will also happily extract a date from the string "100-1-30000". A match may happen anywhere in the string, so in this case, it’ll just start at the second character and end at the second-to-last character.

If we want to enforce that the match must span the whole string, we can add the markers ^ and $. The caret matches the start of the input string, whereas the dollar sign matches the end. Thus /^\d+$/ matches a string consisting entirely of one or more digits, /^!/ matches any string that starts with an exclamation mark, and /x^/ does not match any string (there cannot be an x before the start of the string).

There is also a \b marker that matches word boundaries, positions that have a word character on one side and a non-word character on the other. Unfortunately, these use the same simplistic concept of word characters as \w and are therefore not very reliable.

Note that these boundary markers don’t match any actual characters. They just enforce that a given condition holds at the place where they appear in the pattern.

Look-ahead tests do something similar. They provide a pattern and will make the match fail if the input doesn’t match that pattern, but don’t actually move the match position forward. They are written between (?= and ).

    console.log(/a(?=e)/.exec("braeburn"));
    // → ["a"]
    console.log(/a(?! )/.exec("a b"));
    // → null

The e in the first example is necessary to match, but is not part of the matched string. The (?! ) notation expresses a negative look-ahead. This only matches if the pattern in the parentheses doesn’t match, causing the second example to only match a characters that don’t have a space after them.

Choice patterns

Say we want to know whether a piece of text contains not only a number but a number followed by one of the words pig, cow, or chicken, or any of their plural forms.

We could write three regular expressions and test them in turn, but there is a nicer way. The pipe character (|) denotes a choice between the pattern to its left and the pattern to its right. We can use it in expressions like this:

    let animalCount = /\d+ (pig|cow|chicken)s?/;
    console.log(animalCount.test("15 pigs"));
    // → true
    console.log(animalCount.test("15 pugs"));
    // → false

Parentheses can be used to limit the part of the pattern to which the pipe operator applies, and you can put multiple such operators next to each other to express a choice between more than two alternatives.

The mechanics of matching

Conceptually, when you use exec or test, the regular expression engine looks for a match in your string by trying to match the expression first from the start of the string, then from the second character, and so on until it finds a match or reaches the end of the string. It’ll either return the first match that can be found or fail to find any match at all.

To do the actual matching, the engine treats a regular expression something like a flow diagram. This is the diagram for the livestock expression in the previous example:

[Diagram: a repeated digit box, a space, group #1 choosing between “pig”, “cow”, and “chicken”, and an optional “s”]

If we can find a path from the left side of the diagram to the right side, our expression matches. We keep a current position in the string, and every time we move through a box, we verify that the part of the string after our current position matches that box.
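The left-to-right search described above can be observed through exec, which reports where the first successful attempt started. This is a small sketch: the pattern is the livestock expression from the text, while the input string and the found binding are made up for the illustration.

```javascript
const animalCount = /\d+ (pig|cow|chicken)s?/;

// Attempts starting at positions 0 through 5 fail because no digit
// is found there. The attempt starting at position 6 succeeds, so
// the later "12 chickens" is never considered.
const found = animalCount.exec("pens: 3 cows, 12 chickens");
console.log(found[0]);
// → 3 cows
console.log(found[1]);
// → cow
console.log(found.index);
// → 6

// When every starting position fails, exec returns null.
console.log(animalCount.exec("15 pugs"));
// → null
```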
Backtracking

The regular expression /^([01]+b|[\da-f]+h|\d+)$/ matches either a binary number followed by a b, a hexadecimal number (that is, base 16, with the letters a to f standing for the digits 10 to 15) followed by an h, or a regular decimal number with no suffix character. This is the corresponding diagram:

[Diagram: start of line, then group #1 with three branches (one or more of “0” or “1” followed by “b”; one or more of a digit or “a”-“f” followed by “h”; one or more digits), then end of line]

When matching this expression, the top (binary) branch will often be entered even though the input does not actually contain a binary number. When matching the string "103", for example, it becomes clear only at the 3 that we are in the wrong branch. The string does match the expression, just not the branch we are currently in.

So the matcher backtracks. When entering a branch, it remembers its current position (in this case, at the start of the string, just past the first boundary box in the diagram) so that it can go back and try another branch if the current one does not work out. For the string "103", after encountering the 3 character, the matcher starts trying the branch for hexadecimal numbers, which fails again because there is no h after the number. It then tries the decimal number branch. This one fits, and a match is reported after all.

The matcher stops as soon as it finds a full match. This means that if multiple branches could potentially match a string, only the first one (ordered by where the branches appear in the regular expression) is used.

Backtracking also happens for repetition operators like + and *. If you match /^.*x/ against "abcxe", the .* part will first try to consume the whole string. The engine will then realize that it needs an x to match the pattern. Since there is no x past the end of the string, the star operator tries to match one character less. But the matcher doesn’t find an x after abcx either, so it backtracks again, matching the star operator to just abc.
Now it finds an x where it needs it and reports a successful match from positions 0 to 4.

It is possible to write regular expressions that will do a lot of backtracking. This problem occurs when a pattern can match a piece of input in many different ways. For example, if we get confused while writing a binary-number regular expression, we might accidentally write something like /([01]+)+b/.

[Diagram: group #1, an outer loop around an inner loop over one of “0” or “1”, followed by “b”]

If that tries to match some long series of zeros and ones with no trailing b character, the matcher first goes through the inner loop until it runs out of digits. Then it notices there is no b, so it backtracks one position, goes through the outer loop once, and gives up again, trying to backtrack out of the inner loop once more. It will continue to try every possible route through these two loops. This means the amount of work doubles with each additional character. For even just a few dozen characters, the resulting match will take practically forever.

The replace method

String values have a replace method that can be used to replace part of the string with another string.

    console.log("papa".replace("p", "m"));
    // → mapa

The first argument can also be a regular expression, in which case the first match of the regular expression is replaced. When a g option (for global) is added after the regular expression, all matches in the string will be replaced, not just the first.

    console.log("Borobudur".replace(/[ou]/, "a"));
    // → Barobudur
    console.log("Borobudur".replace(/[ou]/g, "a"));
    // → Barabadar

The real power of using regular expressions with replace comes from the fact that we can refer to matched groups in the replacement string. For example, say we have a big string containing the names of people, one name per line, in the format Lastname, Firstname.
If we want to swap these names and remove the comma to get a Firstname Lastname format, we can use the following code:

console.log(
  "Liskov, Barbara\nMcCarthy, John\nMilner, Robin"
    .replace(/(\p{L}+), (\p{L}+)/gu, "$2 $1"));
// → Barbara Liskov
//   John McCarthy
//   Robin Milner

The $1 and $2 in the replacement string refer to the parenthesized groups in the pattern. $1 is replaced by the text that matched against the first group, $2 by the second, and so on, up to $9. The whole match can be referred to with $&.

It is possible to pass a function—rather than a string—as the second argument to replace. For each replacement, the function will be called with the matched groups (as well as the whole match) as arguments, and its return value will be inserted into the new string.

Here’s an example:

let stock = "1 lemon, 2 cabbages, and 101 eggs";
function minusOne(match, amount, unit) {
  amount = Number(amount) - 1;
  if (amount == 1) { // only one left, remove the 's'
    unit = unit.slice(0, unit.length - 1);
  } else if (amount == 0) {
    amount = "no";
  }
  return amount + " " + unit;
}
console.log(stock.replace(/(\d+) (\p{L}+)/gu, minusOne));
// → no lemon, 1 cabbage, and 100 eggs

This code takes a string, finds all occurrences of a number followed by an alphanumeric word, and returns a string that has one less of every such quantity.

The (\d+) group ends up as the amount argument to the function, and the (\p{L}+) group gets bound to unit. The function converts amount to a number—which always works since it matched \d+ earlier—and makes some adjustments in case there is only one or zero left.

Greed

We can use replace to write a function that removes all comments from a piece of JavaScript code.
Here is a first attempt:

function stripComments(code) {
  return code.replace(/\/\/.*|\/\*[^]*\*\//g, "");
}
console.log(stripComments("1 + /* 2 */3"));
// → 1 + 3
console.log(stripComments("x = 10;// ten!"));
// → x = 10;
console.log(stripComments("1 /* a */+/* b */ 1"));
// → 1  1

The part before the | operator matches two slash characters followed by any number of non-newline characters. The part for multiline comments is more involved. We use [^] (any character that is not in the empty set of characters) as a way to match any character. We cannot just use a period here because block comments can continue on a new line, and the period character does not match newline characters.

But the output for the last line appears to have gone wrong. Why?

The [^]* part of the expression, as I described in the section on backtracking, will first match as much as it can. If that causes the next part of the pattern to fail, the matcher moves back one character and tries again from there. In the example, the matcher first tries to match the whole rest of the string and then moves back from there. It will find an occurrence of */ after going back four characters and match that. This is not what we wanted—the intention was to match a single comment, not to go all the way to the end of the code and find the end of the last block comment.

Because of this behavior, we say the repetition operators (+, *, ?, and {}) are greedy, meaning they match as much as they can and backtrack from there. If you put a question mark after them (+?, *?, ??, {}?), they become nongreedy and start by matching as little as possible, matching more only when the remaining pattern does not fit the smaller match.

And that is exactly what we want in this case. By having the star match the smallest stretch of characters that brings us to a */, we consume one block comment and nothing more.
function stripComments(code) {
  return code.replace(/\/\/.*|\/\*[^]*?\*\//g, "");
}
console.log(stripComments("1 /* a */+/* b */ 1"));
// → 1 + 1

A lot of bugs in regular expression programs can be traced to unintentionally using a greedy operator where a nongreedy one would work better. When using a repetition operator, prefer the nongreedy variant.

Dynamically creating RegExp objects

In some cases you may not know the exact pattern you need to match against when you are writing your code. Say you want to test for the user’s name in a piece of text. You can build up a string and use the RegExp constructor on that.

let name = "harry";
let regexp = new RegExp("(^|\\s)" + name + "($|\\s)", "gi");
console.log(regexp.test("Harry is a dodgy character."));
// → true

When creating the \s part of the string, we have to use two backslashes because we are writing them in a normal string, not a slash-enclosed regular expression. The second argument to the RegExp constructor contains the options for the regular expression—in this case, "gi" for global and case-insensitive.

But what if the name is "dea+hl[]rd" because our user is a nerdy teenager? That would result in a nonsensical regular expression that won’t actually match the user’s name.

To work around this, we can add backslashes before any character that has a special meaning.

let name = "dea+hl[]rd";
let escaped = name.replace(/[\\[.+*?(){|^$]/g, "\\$&");
let regexp = new RegExp("(^|\\s)" + escaped + "($|\\s)", "gi");
let text = "This dea+hl[]rd guy is super annoying.";
console.log(regexp.test(text));
// → true

The search method

While the indexOf method on strings cannot be called with a regular expression, there is another method, search, that does expect a regular expression. Like indexOf, it returns the first index on which the expression was found, or -1 when it wasn’t found.
console.log("  word".search(/\S/));
// → 2
console.log("    ".search(/\S/));
// → -1

Unfortunately, there is no way to indicate that the match should start at a given offset (like we can with the second argument to indexOf), which would often be useful.

The lastIndex property

The exec method similarly does not provide a convenient way to start searching from a given position in the string. But it does provide an inconvenient way.

Regular expression objects have properties. One such property is source, which contains the string that expression was created from. Another property is lastIndex, which controls, in some limited circumstances, where the next match will start.

Those circumstances are that the regular expression must have the global (g) or sticky (y) option enabled, and the match must happen through the exec method. Again, a less confusing solution would have been to just allow an extra argument to be passed to exec, but confusion is an essential feature of JavaScript’s regular expression interface.

let pattern = /y/g;
pattern.lastIndex = 3;
let match = pattern.exec("xyzzy");
console.log(match.index);
// → 4
console.log(pattern.lastIndex);
// → 5

If the match was successful, the call to exec automatically updates the lastIndex property to point after the match. If no match was found, lastIndex is set back to zero, which is also the value it has in a newly constructed regular expression object.

The difference between the global and the sticky options is that when sticky is enabled, the match will succeed only if it starts directly at lastIndex, whereas with global, it will search ahead for a position where a match can start.

let global = /abc/g;
console.log(global.exec("xyz abc"));
// → ["abc"]
let sticky = /abc/y;
console.log(sticky.exec("xyz abc"));
// → null

When using a shared regular expression value for multiple exec calls, these automatic updates to the lastIndex property can cause problems.
Your regular expression might be accidentally starting at an index left over from a previous call.

let digit = /\d/g;
console.log(digit.exec("here it is: 1"));
// → ["1"]
console.log(digit.exec("and now: 1"));
// → null

Another interesting effect of the global option is that it changes the way the match method on strings works. When called with a global expression, instead of returning an array similar to that returned by exec, match will find all matches of the pattern in the string and return an array containing the matched strings.

console.log("Banana".match(/an/g));
// → ["an", "an"]

So be cautious with global regular expressions. The cases where they are necessary—calls to replace and places where you want to explicitly use lastIndex—are typically the only situations where you want to use them.

A common thing to do is to find all the matches of a regular expression in a string. We can do this by using the matchAll method.

let input = "A string with 3 numbers in it... 42 and 88.";
let matches = input.matchAll(/\d+/g);
for (let match of matches) {
  console.log("Found", match[0], "at", match.index);
}
// → Found 3 at 14
//   Found 42 at 33
//   Found 88 at 40

This method returns an iterator (not an array) that produces one match array per match. The regular expression given to matchAll must have g enabled.

Parsing an INI file

To conclude the chapter, we’ll look at a problem that calls for regular expressions. Imagine we are writing a program to automatically collect information about our enemies from the Internet. (We will not actually write that program here, just the part that reads the configuration file. Sorry.) The configuration file looks like this:

searchengine=https://duckduckgo.com/?q=$1
spitefulness=9.7

; comments are preceded by a semicolon...
; each section concerns an individual enemy
[larry]
fullname=Larry Doe
type=kindergarten bully
website=http://www.geocities.com/CapeCanaveral/11451

[davaeorn]
fullname=Davaeorn
type=evil wizard
outputdir=/home/marijn/enemies/davaeorn

The exact rules for this format—which is a widely used file format, usually called an INI file—are as follows:

• Blank lines and lines starting with semicolons are ignored.
• Lines wrapped in [ and ] start a new section.
• Lines containing an alphanumeric identifier followed by an = character add a setting to the current section.
• Anything else is invalid.

Our task is to convert a string like this into an object whose properties hold strings for settings written before the first section header and subobjects for sections, with those subobjects holding the section’s settings.

Since the format has to be processed line by line, splitting up the file into separate lines is a good start. We saw the split method in Chapter 4. Some operating systems, however, use not just a newline character to separate lines but a carriage return character followed by a newline ("\r\n"). Given that the split method also allows a regular expression as its argument, we can use a regular expression like /\r?\n/ to split in a way that allows both "\n" and "\r\n" between lines.

function parseINI(string) {
  // Start with an object to hold the top-level fields
  let result = {};
  let section = result;
  for (let line of string.split(/\r?\n/)) {
    let match;
    if (match = line.match(/^(\w+)=(.*)$/)) {
      section[match[1]] = match[2];
    } else if (match = line.match(/^\[(.*)\]$/)) {
      section = result[match[1]] = {};
    } else if (!/^\s*(;|$)/.test(line)) {
      throw new Error("Line '" + line + "' is not valid.");
    }
  }
  return result;
}

console.log(parseINI(`
name=Vasilis
[address]
city=Tessaloniki`));
// → {name: "Vasilis", address: {city: "Tessaloniki"}}

The code goes over the file’s lines and builds up an object.
Properties at the top are stored directly into that object, whereas properties found in sections are stored in a separate section object. The section binding points at the object for the current section.

There are two kinds of significant lines—section headers or property lines. When a line is a regular property, it is stored in the current section. When it is a section header, a new section object is created, and section is set to point at it.

Note the recurring use of ^ and $ to make sure the expression matches the whole line, not just part of it. Leaving these out results in code that mostly works but behaves strangely for some input, which can be a difficult bug to track down.

The pattern if (match = string.match(...)) makes use of the fact that the value of an assignment expression (=) is the assigned value. You often aren’t sure that your call to match will succeed, so you can access the resulting object only inside an if statement that tests for this. To not break the pleasant chain of else if forms, we assign the result of the match to a binding and immediately use that assignment as the test for the if statement.

If a line is not a section header or a property, the function checks whether it is a comment or an empty line using the expression /^\s*(;|$)/ to match lines that either contain only space, or space followed by a semicolon (making the rest of the line a comment). When a line doesn’t match any of the expected forms, the function throws an exception.

Code units and characters

Another design mistake that’s been standardized in JavaScript regular expressions is that by default, operators like . or ? work on code units, as discussed in Chapter 5, not actual characters. This means characters that are composed of two code units behave strangely.
console.log(/🍎{3}/.test("🍎🍎🍎"));
// → false
console.log(/<.>/.test("<🌹>"));
// → false
console.log(/<.>/u.test("<🌹>"));
// → true

The problem is that the 🍎 in the first line is treated as two code units, and {3} is applied only to the second unit. Similarly, the dot matches a single code unit, not the two that make up the rose emoji.

You must add the u (Unicode) option to your regular expression to make it treat such characters properly.

console.log(/🍎{3}/u.test("🍎🍎🍎"));
// → true

Summary

Regular expressions are objects that represent patterns in strings. They use their own language to express these patterns.

/abc/       A sequence of characters
/[abc]/     Any character from a set of characters
/[^abc]/    Any character not in a set of characters
/[0-9]/     Any character in a range of characters
/x+/        One or more occurrences of the pattern x
/x+?/       One or more occurrences, nongreedy
/x*/        Zero or more occurrences
/x?/        Zero or one occurrence
/x{2,4}/    Two to four occurrences
/(abc)/     A group
/a|b|c/     Any one of several patterns
/\d/        Any digit character
/\w/        An alphanumeric character (“word character”)
/\s/        Any whitespace character
/./         Any character except newlines
/\p{L}/u    Any letter character
/^/         Start of input
/$/         End of input
/(?=a)/     A look-ahead test

A regular expression has a method test to test whether a given string matches it. It also has a method exec that, when a match is found, returns an array containing all matched groups. Such an array has an index property that indicates where the match started.

Strings have a match method to match them against a regular expression and a search method to search for one, returning only the starting position of the match. Their replace method can replace matches of a pattern with a replacement string or function.

Regular expressions can have options, which are written after the closing slash. The i option makes the match case insensitive.
The g option makes the expression global, which, among other things, causes the replace method to replace all instances instead of just the first. The y option makes an expression sticky, which means that it will not search ahead and skip part of the string when looking for a match. The u option turns on Unicode mode, which enables \p syntax and fixes a number of problems around the handling of characters that take up two code units.

Regular expressions are a sharp tool with an awkward handle. They simplify some tasks tremendously but can quickly become unmanageable when applied to complex problems. Part of knowing how to use them is resisting the urge to try to shoehorn things into them that they cannot cleanly express.

Exercises

It is almost unavoidable that, in the course of working on these exercises, you will get confused and frustrated by some regular expression’s inexplicable behavior. Sometimes it helps to enter your expression into an online tool like debuggex.com to see whether its visualization corresponds to what you intended and to experiment with the way it responds to various input strings.

Regexp golf

Code golf is a term used for the game of trying to express a particular program in as few characters as possible. Similarly, regexp golf is the practice of writing as tiny a regular expression as possible to match a given pattern and only that pattern.

For each of the following items, write a regular expression to test whether the given pattern occurs in a string. The regular expression should match only strings containing the pattern. When your expression works, see whether you can make it any smaller.

1. car and cat
2. pop and prop
3. ferret, ferry, and ferrari
4. Any word ending in ious
5. A whitespace character followed by a period, comma, colon, or semicolon
6. A word longer than six letters
7. A word without the letter e (or E)

Refer to the table in the chapter summary for help. Test each solution with a few test strings.
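
One way to test a candidate expression against a few strings is a small checking helper. The sketch below is only an illustration: the name verify and its return format are assumptions made for this example, not something the chapter defines, and the sample pattern is deliberately not one of the exercises.

```javascript
// Hypothetical helper (not part of any library): checks a candidate
// regular expression against strings it should and should not match,
// and returns a list of problems (empty when everything is as expected).
function verify(regexp, yes, no) {
  let failures = [];
  for (let str of yes) {
    regexp.lastIndex = 0; // guard against state left over by the g or y option
    if (!regexp.test(str)) failures.push(`Failure to match '${str}'`);
  }
  for (let str of no) {
    regexp.lastIndex = 0;
    if (regexp.test(str)) failures.push(`Unexpected match for '${str}'`);
  }
  return failures;
}

// A sample pattern that is not one of the exercises: "color" or "colour"
console.log(verify(/colou?r/,
                   ["my favorite color", "colourful"],
                   ["collar", "cooler"]));
// → []
console.log(verify(/colour/, ["color"], []));
// → ["Failure to match 'color'"]
```

Resetting lastIndex before each test is a precaution. As the section on lastIndex explains, a global or sticky expression otherwise carries its position over from one call to the next.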
Quoting style

Imagine you have written a story and used single quotation marks throughout to mark pieces of dialogue. Now you want to replace all the dialogue quotes with double quotes, while keeping the single quotes used in contractions like aren’t.

Think of a pattern that distinguishes these two kinds of quote usage and craft a call to the replace method that does the proper replacement.

Numbers again

Write an expression that matches only JavaScript-style numbers. It must support an optional minus or plus sign in front of the number, the decimal dot, and exponent notation—5e-3 or 1E10—again with an optional sign in front of the exponent. Also note that it is not necessary for there to be digits in front of or after the dot, but the number cannot be a dot alone. That is, .5 and 5. are valid JavaScript numbers, but a lone dot isn’t.

“Write code that is easy to delete, not easy to extend.”
—Tef, Programming is Terrible

Chapter 10
Modules

Ideally, a program has a clear, straightforward structure. The way it works is easy to explain, and each part plays a well-defined role.

In practice, programs grow organically. Pieces of functionality are added as the programmer identifies new needs. Keeping such a program well-structured requires constant attention and work. This is work that will pay off only in the future, the next time someone works on the program, so it’s tempting to neglect it and allow the various parts of the program to become deeply entangled.

This causes two practical issues. First, understanding an entangled system is hard. If everything can touch everything else, it is difficult to look at any given piece in isolation. You are forced to build up a holistic understanding of the entire thing. Second, if you want to use any of the functionality from such a program in another situation, rewriting it may be easier than trying to disentangle it from its context.
The phrase “big ball of mud” is often used for such large, structureless programs. Everything sticks together, and when you try to pick out a piece, the whole thing comes apart, and you only succeed in making a mess.

Modular programs

Modules are an attempt to avoid these problems. A module is a piece of program that specifies which other pieces it relies on and which functionality it provides for other modules to use (its interface).

Module interfaces have a lot in common with object interfaces, as we saw them in Chapter 6. They make part of the module available to the outside world and keep the rest private.

But the interface that a module provides for others to use is only half the story. A good module system also requires modules to specify which code they use from other modules. These relations are called dependencies. If module A uses functionality from module B, it is said to depend on that module. When these are clearly specified in the module itself, they can be used to figure out which other modules need to be present to be able to use a given module and to automatically load dependencies.

When the ways in which modules interact with each other are explicit, a system becomes more like LEGO, where pieces interact through well-defined connectors, and less like mud, where everything mixes with everything else.

ES modules

The original JavaScript language did not have any concept of a module. All scripts ran in the same scope, and accessing a function defined in another script was done by referencing the global bindings created by that script. This actively encouraged accidental, hard-to-see entanglement of code and invited problems like unrelated scripts trying to use the same binding name.

Since ECMAScript 2015, JavaScript supports two different types of programs. Scripts behave in the old way: their bindings are defined in the global scope, and they have no way to directly reference other scripts.
Modules get their own separate scope and support the import and export keywords, which aren’t available in scripts, to declare their dependencies and interface. This module system is usually called ES modules (where ES stands for ECMAScript).

A modular program is composed of a number of such modules, wired together via their imports and exports.

The following example module converts between day names and numbers (as returned by Date’s getDay method). It defines a constant which is not part of its interface, and two functions which are. It has no dependencies.

const names = ["Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday"];

export function dayName(number) {
  return names[number];
}
export function dayNumber(name) {
  return names.indexOf(name);
}

The export keyword can be put in front of a function, class, or binding definition to indicate that that binding is part of the module’s interface. This makes it possible for other modules to use that binding by importing it.

import {dayName} from "./dayname.js";
let now = new Date();
console.log(`Today is ${dayName(now.getDay())}`);
// → Today is Monday

The import keyword, followed by a list of binding names in braces, makes bindings from another module available in the current module. Modules are identified by quoted strings.

How such a module name is resolved to an actual program differs by platform. The browser treats them as Web addresses, whereas Node.js resolves them to files. When you run a module, all the other modules it depends on—and the modules those depend on—are loaded, and the exported bindings are made available to the modules that import them.

Import and export declarations cannot appear inside of functions, loops, or other blocks. They are immediately resolved when the module is loaded, regardless of how the code in the module executes. To reflect this, they must appear only in the outer module body.
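
Because dependencies are declared up front, a loader can work out which modules are needed before running any of them. The following is a rough sketch of that idea only, not how browsers or Node.js actually implement loading (real loaders also fetch files asynchronously and handle cyclic dependencies). The dependency table, the file names other than dayname.js, and the loadOrder function are all made up for this illustration.

```javascript
// Hypothetical dependency table: each module name maps to the modules
// it imports. main.js and format.js are invented for this example.
const dependencies = {
  "./main.js": ["./dayname.js", "./format.js"],
  "./format.js": ["./dayname.js"],
  "./dayname.js": []
};

// Return the modules reachable from entry, ordered so that every module
// comes after the modules it depends on (a depth-first traversal).
function loadOrder(entry, deps) {
  let order = [];
  let seen = new Set();
  function visit(name) {
    if (seen.has(name)) return; // each module is handled only once
    seen.add(name);
    for (let dep of deps[name] ?? []) visit(dep);
    order.push(name); // added only after all of its dependencies
  }
  visit(entry);
  return order;
}

console.log(loadOrder("./main.js", dependencies));
// → ["./dayname.js", "./format.js", "./main.js"]
```

Note that dayname.js appears once even though two modules import it, which mirrors the fact that a module shared by several importers is loaded a single time.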
A module’s interface thus consists of a collection of named bindings, which other modules that depend on the module can access. Imported bindings can be renamed to give them a new local name using as after their name.

import {dayName as nomDeJour} from "./dayname.js";
console.log(nomDeJour(3));
// → Wednesday

A module may also have a special export named default, which is often used for modules that only export a single binding. To define a default export, you write export default before an expression, a function declaration, or a class declaration.

export default ["Winter", "Spring", "Summer", "Autumn"];

Such a binding is imported by omitting the braces around the

“We think we are creating the system for our own purposes. We believe we are making it in our own image... But the computer is not really like us.
It is a projection of a very slim part of ourselves: that portion devoted to logic, order, rule, and clarity.”
—Ellen Ullman, Close to the Machine: Technophilia and its Discontents

Introduction

This is a book about instructing computers. Computers are about as common as screwdrivers today, but they are quite a bit more complex, and making them do what you want them to do isn’t always easy.

If the task you have for your computer is a common, well-understood one, such as showing you your email or acting like a calculator, you can open the appropriate application and get to work. But for unique or open-ended tasks, there often is no appropriate application.

That is where programming may come in. Programming is the act of constructing a program—a set of precise instructions telling a computer what to do. Because computers are dumb, pedantic beasts, programming is fundamentally tedious and frustrating.

Fortunately, if you can get over that fact—and maybe even enjoy the rigor of thinking in terms that dumb machines can deal with—programming can be rewarding. It allows you to do things in seconds that would take forever by hand. It is a way to make your computer tool do things that it couldn’t do before. On top of that, it makes for a wonderful game of puzzle solving and abstract thinking.

Most programming is done with programming languages. A programming language is an artificially constructed language used to instruct computers. It is interesting that the most effective way we’ve found to communicate with a computer borrows so heavily from the way we communicate with each other. Like human languages, computer languages allow words and phrases to be combined in new ways, making it possible to express ever new concepts.

At one point, language-based interfaces, such as the BASIC and DOS prompts of the 1980s and 1990s, were the main method of interacting with computers.
For routine computer use, these have largely been replaced with visual interfaces, which are easier to learn but offer less freedom. But if you know where to look, the languages are still there. One of them, JavaScript, is built into every modern web browser—and is thus available on almost every device.

This book will try to make you familiar enough with this language to do useful and amusing things with it.

On programming

Besides explaining JavaScript, I will introduce the basic principles of programming. Programming, it turns out, is hard. The fundamental rules are simple and clear, but programs built on top of these rules tend to become complex enough to introduce their own rules and complexity. You’re building your own maze, in a way, and you can easily get lost in it.

There will be times when reading this book feels terribly frustrating. If you are new to programming, there will be a lot of new material to digest. Much of this material will then be combined in ways that require you to make additional connections.

It is up to you to make the necessary effort. When you are struggling to follow the book, do not jump to any conclusions about your own capabilities. You are fine—you just need to keep at it. Take a break, reread some material, and make sure you read and understand the example programs and exercises. Learning is hard work, but everything you learn is yours and will make further learning easier.

When action grows unprofitable, gather information; when information grows unprofitable, sleep.
—Ursula K. Le Guin, The Left Hand of Darkness

A program is many things. It is a piece of text typed by a programmer, it is the directing force that makes the computer do what it does, it is data in the computer’s memory, and at the same time it controls the actions performed on this memory. Analogies that try to compare programs to familiar objects tend to fall short.
A superficially fitting one is to compare a program to a machine—lots of separate parts tend to be involved, and to make the whole thing tick, we have to consider the ways in which these parts interconnect and contribute to the operation of the whole.

A computer is a physical machine that acts as a host for these immaterial machines. Computers themselves can do only stupidly straightforward things. The reason they are so useful is that they do these things at an incredibly high speed. A program can ingeniously combine an enormous number of these simple actions to do very complicated things.

A program is a building of thought. It is costless to build, it is weightless, and it grows easily under our typing hands. But as a program grows, so does its complexity. The skill of programming is the skill of building programs that don’t confuse yourself. The best programs are those that manage to do something interesting while still being easy to understand.

Some programmers believe that this complexity is best managed by using only a small set of well-understood techniques in their programs. They have composed strict rules (“best practices”) prescribing the form programs should have and carefully stay within their safe little zone.

This is not only boring, it is ineffective. New problems often require new solutions. The field of programming is young and still developing rapidly, and it is varied enough to have room for wildly different approaches. There are many terrible mistakes to make in program design, and you should go ahead and make them at least once so that you understand them. A sense of what a good program looks like is developed with practice, not learned from a list of rules.

Why language matters

In the beginning, at the birth of computing, there were no programming languages.
Programs looked something like this:

00110001 00000000 00000000
00110001 00000001 00000001
00110011 00000001 00000010
01010001 00001011 00000010
00100010 00000010 00001000
01000011 00000001 00000000
01000001 00000001 00000001
00010000 00000010 00000000
01100010 00000000 00000000

This is a program to add the numbers from 1 to 10 together and print the result: 1 + 2 + ... + 10 = 55. It could run on a simple hypothetical machine. To program early computers, it was necessary to set large arrays of switches in the right position or punch holes in strips of cardboard and feed them to the computer. You can imagine how tedious and error-prone this procedure was. Even writing simple programs required much cleverness and discipline. Complex ones were nearly inconceivable.

Of course, manually entering these arcane patterns of bits (the ones and zeros) did give the programmer a profound sense of being a mighty wizard. And that has to be worth something in terms of job satisfaction.

Each line of the previous program contains a single instruction. It could be written in English like this:

1. Store the number 0 in memory location 0.
2. Store the number

name of the import.

import seasonNames from "./seasonname.js";

To import all bindings from a module at the same time, you can use import *. You provide a name, and that name will be bound to an object holding all the module’s exports. This can be useful when you are using a lot of different exports.

import * as dayName from "./dayname.js";
console.log(dayName.dayName(3));
// → Wednesday

Packages

One of the advantages of building a program out of separate pieces and being able to run some of those pieces on their own is that you might be able to use the same piece in different programs.

But how do you set this up? Say I want to use the parseINI function from Chapter 9 in another program. If it is clear what the function depends on (in this case, nothing), I can just copy that module into my new project and use it.
But then, if I find a mistake in the code, I’ll probably fix it in whichever program I’m working with at the time and forget to also fix it in the other program.

Once you start duplicating code, you’ll quickly find yourself wasting time and energy moving copies around and keeping them up to date. That’s where packages come in. A package is a chunk of code that can be distributed (copied and installed). It may contain one or more modules and has information about which other packages it depends on. A package also usually comes with documentation explaining what it does so that people who didn’t write it might still be able to use it.

When a problem is found in a package or a new feature is added, the package is updated. Now the programs that depend on it (which may also be packages) can copy the new version to get the improvements that were made to the code.

Working in this way requires infrastructure. We need a place to store and find packages and a convenient way to install and upgrade them. In the JavaScript world, this infrastructure is provided by NPM (https://npmjs.org).

NPM is two things: an online service where you can download (and upload) packages, and a program (bundled with Node.js) that helps you install and manage them.

At the time of writing, there are more than three million different packages available on NPM. A large portion of those are rubbish, to be fair. But almost every useful, publicly available JavaScript package can be found on NPM. For example, an INI file parser, similar to the one we built in Chapter 9, is available under the package name ini.

Chapter 20 will show how to install such packages locally using the npm command line program.

Having quality packages available for download is extremely valuable. It means that we can often avoid reinventing a program that 100 people have written before and get a solid, well-tested implementation at the press of a few keys.
Software is cheap to copy, so once someone has written it, distributing it to other people is an efficient process. Writing it in the first place is work, though, and responding to people who have found problems in the code or who want to propose new features is even more work.

By default, you own the copyright to the code you write, and other people may use it only with your permission. But because some people are just nice and because publishing good software can help make you a little bit famous among programmers, many packages are published under a license that explicitly allows other people to use it.

Most code on NPM is licensed this way. Some licenses require you to also publish code that you build on top of the package under the same license. Others are less demanding, requiring only that you keep the license with the code as you distribute it. The JavaScript community mostly uses the latter type of license. When using other people’s packages, make sure you are aware of their licenses.

Now, instead of writing our own INI file parser, we can use one from NPM.

  import {parse} from "ini";

  console.log(parse("x = 10\ny = 20"));
  // → {x: "10", y: "20"}

CommonJS modules

Before 2015, when the JavaScript language had no built-in module system, people were already building large systems in JavaScript. To make that workable, they needed modules.

The community designed its own improvised module systems on top of the language. These use functions to create a local scope for the modules and regular objects to represent module interfaces.

Initially, people just manually wrapped their entire module in an “immediately invoked function expression” to create the module’s scope and assigned their interface objects to a single global variable.
  const weekDay = function() {
    const names = ["Sunday", "Monday", "Tuesday", "Wednesday",
                   "Thursday", "Friday", "Saturday"];
    return {
      name(number) { return names[number]; },
      number(name) { return names.indexOf(name); }
    };
  }();

  console.log(weekDay.name(weekDay.number("Sunday")));
  // → Sunday

This style of modules provides isolation, to a certain degree, but it does not declare dependencies. Instead, it just puts its interface into the global scope and expects its dependencies, if any, to do the same. This is not ideal.

If we implement our own module loader, we can do better. The most widely used approach to bolted-on JavaScript modules is called CommonJS modules. Node.js used this module system from the start (though it now also knows how to load ES modules), and it is the module system used by many packages on NPM.

A CommonJS module looks like a regular script, but it has access to two bindings that it uses to interact with other modules. The first is a function called require. When you call this with the module name of your dependency, it makes sure the module is loaded and returns its interface. The second is an object named exports, which is the interface object for the module. It starts out empty and you add properties to it to define exported values.

This CommonJS example module provides a date-formatting function. It uses two packages from NPM—ordinal to convert numbers to strings like "1st" and "2nd", and date-names to get the English names for weekdays and months. It exports a single function, formatDate, which takes a Date object and a template string.

The template string may contain codes that direct the format, such as YYYY for the full year and Do for the ordinal day of the month. You could give it a string like "MMMM Do YYYY" to get output like November 22nd 2017.
  const ordinal = require("ordinal");
  const {days, months} = require("date-names");

  exports.formatDate = function(date, format) {
    return format.replace(/YYYY|M(MMM)?|Do?|dddd/g, tag => {
      if (tag == "YYYY") return date.getFullYear();
      if (tag == "M") return date.getMonth();
      if (tag == "MMMM") return months[date.getMonth()];
      if (tag == "D") return date.getDate();
      if (tag == "Do") return ordinal(date.getDate());
      if (tag == "dddd") return days[date.getDay()];
    });
  };

The interface of ordinal is a single function, whereas date-names exports an object containing multiple things—days and months are arrays of names. Destructuring is very convenient when creating bindings for imported interfaces.

The module adds its interface function to exports so that modules that depend on it get access to it. We could use the module like this:

  const {formatDate} = require("./format-date.js");

  console.log(formatDate(new Date(2017, 9, 13),
                         "dddd the Do"));
  // → Friday the 13th

CommonJS is implemented with a module loader that, when loading a module, wraps its code in a function (giving it its own local scope), and passes the require and exports bindings to that function as arguments.

If we assume we have access to a readFile function that reads a file by name and gives us its content, we can define a simplified form of require like this:

  function require(name) {
    if (!(name in require.cache)) {
      let code = readFile(name);
      let exports = require.cache[name] = {};
      let wrapper = Function("require, exports", code);
      wrapper(require, exports);
    }
    return require.cache[name];
  }
  require.cache = Object.create(null);

Function is a built-in JavaScript function that takes a list of arguments (as a comma-separated string) and a string containing the function body and returns a function value with those arguments and that body.
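To make the loader above concrete, here is a minimal runnable sketch. It assumes a hypothetical in-memory table of module sources (files) in place of real file access, and it names the loader miniRequire so that it does not clash with the require that Node.js already provides. Neither name comes from the text. The sketch first shows Function on its own, then uses it to load one module that depends on another.

```javascript
// Hypothetical in-memory "file system" standing in for real file access.
const files = {
  "./greet.js": 'exports.greet = name => "Hello, " + name;',
  "./main.js": 'const {greet} = require("./greet.js");' +
               'exports.message = greet("world");'
};

// Stand-in for the readFile function that the text assumes exists.
function readFile(name) {
  if (!(name in files)) throw new Error(`No such file: ${name}`);
  return files[name];
}

// Function builds a function from a parameter list and a body string.
const plusOne = Function("n", "return n + 1;");
console.log(plusOne(4));
// → 5

// Same logic as the simplified require in the text, under another name.
function miniRequire(name) {
  if (!(name in miniRequire.cache)) {
    const code = readFile(name);
    const exports = miniRequire.cache[name] = {};
    // Inside the module, "require" refers to miniRequire.
    const wrapper = Function("require, exports", code);
    wrapper(miniRequire, exports);
  }
  return miniRequire.cache[name];
}
miniRequire.cache = Object.create(null);

console.log(miniRequire("./main.js").message);
// → Hello, world

// The cache ensures each module is evaluated only once.
console.log(miniRequire("./greet.js") === miniRequire("./greet.js"));
// → true
```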
This is an interesting concept—it allows a program to create new pieces of program from string data—but also a dangerous one, since if someone can trick your program into putting a string they provide into Function, they can make the program do anything they want.

Standard JavaScript provides no such function as readFile, but different JavaScript environments, such as the browser and Node.js, provide their own ways of accessing files. The example just pretends that readFile exists.

To avoid loading the same module multiple times, require keeps a store (cache) of already loaded modules. When called, it first checks if the requested module has been loaded and, if not, loads it. This involves reading the module’s code, wrapping it in a function, and calling it.

By defining require, exports as parameters for the generated wrapper function (and passing the appropriate values when calling it), the loader makes sure that these bindings are available in the module’s scope.

An important difference between this system and ES modules is that ES module imports happen before a module’s script starts running, whereas require is a normal function, invoked when the module is already running. Unlike import declarations, require calls can appear inside functions, and the name of the dependency can be any expression that evaluates to a string, whereas import only allows plain quoted strings.

The transition of the JavaScript community from CommonJS style to ES modules has been a slow and somewhat rough one. Fortunately we are now at a point where most of the popular packages on NPM provide their code as ES modules, and Node.js allows ES modules to import from CommonJS modules. While CommonJS code is still something you will run across, there is no real reason to write new programs in this style anymore.

Building and bundling

Many JavaScript packages aren’t technically written in JavaScript.
Language extensions such as TypeScript, the type checking dialect mentioned in Chapter 8, are widely used. People also often start using planned new language features long before they have been added to the platforms that actually run JavaScript.

To make this possible, they compile their code, translating it from their chosen JavaScript dialect to plain old JavaScript—or even to a past version of JavaScript—so that browsers can run it.

Including a modular program that consists of 200 different files in a web page produces its own problems. If fetching a single file over the network takes 50 milliseconds, loading the whole program takes 10 seconds, or maybe half that if you can load several files simultaneously. That’s a lot of wasted time. Because fetching a single big file tends to be faster than fetching a lot of tiny ones, web programmers have started using tools that combine their programs (which they painstakingly split into modules) into a single big file before they publish it to the Web. Such tools are called bundlers.

And we can go further. Apart from the number of files, the size of the files also determines how fast they can be transferred over the network. Thus, the JavaScript community has invented minifiers. These are tools that take a JavaScript program and make it smaller by automatically removing comments and whitespace, renaming bindings, and replacing pieces of code with equivalent code that takes up less space.

It is not uncommon for the code that you find in an NPM package or that runs on a web page to have gone through multiple stages of transformation—converting from modern JavaScript to historic JavaScript, combining the modules into a single file, and minifying the code. We won’t go into the details of these tools in this book since there are many of them, and which one is popular changes regularly. Just be aware that such things exist, and look them up when you need them.
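The loading-time estimate in this section can be reproduced with a few lines of arithmetic. This is a rough sketch under a simplifying assumption that is not from the text: files are fetched in batches of a fixed size, and each batch costs one full round trip. The function name estimateLoadTime and its parallel parameter are made up for illustration.

```javascript
// Rough model: files are fetched in batches of `parallel` files,
// and every batch takes `msPerFile` milliseconds.
function estimateLoadTime(fileCount, msPerFile, parallel = 1) {
  const batches = Math.ceil(fileCount / parallel);
  return batches * msPerFile;
}

console.log(estimateLoadTime(200, 50));
// → 10000 (200 files, one at a time: 10 seconds)
console.log(estimateLoadTime(200, 50, 2));
// → 5000 (two at a time: half that)
console.log(estimateLoadTime(1, 50));
// → 50 (a single bundled file, ignoring its larger size)
```

Real networks are less tidy than this model, since a big file takes longer to transfer than a small one, but the numbers show why reducing the count of requests pays off.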
Module design

Structuring programs is one of the subtler aspects of programming. Any nontrivial piece of functionality can be organized in various ways.

Good program design is subjective—there are trade-offs involved, and matters of taste. The best way to learn the value of well-structured design is to read or work on a lot of programs and notice what works and what doesn’t. Don’t assume that a painful mess is “just the way it is”. You can improve the structure of almost everything by putting more thought into it.

One aspect of module design is ease of use. If you are designing something that is intended to be used by multiple people—or even by yourself, in three months when you no longer remember the specifics of what you did—it is helpful if your interface is simple and predictable.

That may mean following existing conventions. A good example is the ini package. This module imitates the standard JSON object by providing parse and stringify (to write an INI file) functions, and, like JSON, converts between strings and plain objects. The interface is small and familiar, and after you’ve worked with it once, you’re likely to remember how to use it.

Even if there’s no standard function or widely used package to imitate, you can keep your modules predictable by using simple data structures and doing a single, focused thing. Many of the INI-file parsing modules on NPM provide a function that directly reads such a file from the hard disk and parses it, for example. This makes it impossible to use such modules in the browser, where we don’t have direct file system access, and adds complexity that would have been better addressed by composing the module with some file-reading function.

This points to another helpful aspect of module design—the ease with which something can be composed with other code. Focused modules that compute values are applicable in a wider range of programs than bigger modules that perform complicated actions with side effects.
An INI file reader that insists on reading the file from disk is useless in a scenario where the file’s content comes from some other source.

Relatedly, stateful objects are sometimes useful or even necessary, but if something can be done with a function, use a function. Several of the INI file readers on NPM provide an interface style that requires you to first create an object, then load the file into your object, and finally use specialized methods to get at the results. This type of thing is common in the object-oriented tradition, and it’s terrible. Instead of making a single function call and moving on, you have to perform the ritual of moving your object through its various states. And because the data is now wrapped in a specialized object type, all code that interacts with it has to know about that type, creating unnecessary interdependencies.

Often, defining new data structures can’t be avoided—only a few basic ones are provided by the language standard, and many types of data have to be more complex than an array or a map. But when an array suffices, use an array.

An example of a slightly more complex data structure is the graph from Chapter 7. There is no single obvious way to represent a graph in JavaScript. In that chapter, we used an object whose properties hold arrays of strings—the other nodes reachable from that node.

There are several different pathfinding packages on NPM, but none of them uses this graph format. They usually allow the graph’s edges to have a weight, which is the cost or distance associated with it. That isn’t possible in our representation.

For example, there’s the dijkstrajs package. A well-known approach to pathfinding, quite similar to our findRoute function, is called Dijkstra’s algorithm, after Edsger Dijkstra, who first wrote it down. The js suffix is often added to package names to indicate the fact that they are written in JavaScript.
This dijkstrajs package uses a graph format similar to ours, but instead of arrays, it uses objects whose property values are numbers—the weights of the edges.

If we wanted to use that package, we’d have to make sure that our graph was stored in the format it expects. All edges get the same weight since our simplified model treats each road as having the same cost (one turn).

  const {find_path} = require("dijkstrajs");

  let graph = {};
  for (let node of Object.keys(roadGraph)) {
    let edges = graph[node] = {};
    for (let dest of roadGraph[node]) {
      edges[dest] = 1;
    }
  }

  console.log(find_path(graph, "Post Office", "Cabin"));
  // → ["Post Office", "Alice's House", "Cabin"]

This can be a barrier to composition—when various packages are using different data structures to describe similar things, combining them is difficult. Therefore, if you want to design for composability, find out what data structures other people are using and, when possible, follow their example.

Designing a fitting module structure for a program can be difficult. In the phase where you are still exploring the problem, trying different things to see what works, you might want to not worry about it too much since keeping everything organized can be a big distraction. Once you have something that feels solid, that’s a good time to take a step back and organize it.

Summary

Modules provide structure to bigger programs by separating the code into pieces with clear interfaces and dependencies. The interface is the part of the module that’s visible to other modules, and the dependencies are the other modules it makes use of.

Because JavaScript historically did not provide a module system, the CommonJS system was built on top of it. Then at some point it did get a built-in system, which now coexists uneasily with the CommonJS system.

A package is a chunk of code that can be distributed on its own. NPM is a repository of JavaScript packages. You can download all kinds of useful (and useless) packages from it.
Exercises

A modular robot

These are the bindings that the project from Chapter 7 creates:

  roads
  buildGraph
  roadGraph
  VillageState
  runRobot
  randomPick
  randomRobot
  mailRoute
  routeRobot
  findRoute
  goalOrientedRobot

If you were to write that project as a modular program, what modules would you create? Which module would depend on which other module, and what would their interfaces look like?

Which pieces are likely to be available prewritten on NPM? Would you prefer to use an NPM package or write them yourself?

Roads module

Write an ES module based on the example from Chapter 7 that contains the array of roads and exports the graph data structure representing them as roadGraph. It depends on a module ./graph.js that exports a function buildGraph, used to build the graph. This function expects an array of two-element arrays (the start and end points of the roads).

Circular dependencies

A circular dependency is a situation where module A depends on B, and B also, directly or indirectly, depends on A. Many module systems simply forbid this because whichever order you choose for loading such modules, you cannot make sure that each module’s dependencies have been loaded before it runs.

CommonJS modules allow a limited form of cyclic dependencies. As long as the modules don’t access each other’s interface until after they finish loading, cyclic dependencies are okay.

The require function given earlier in this chapter supports this type of dependency cycle. Can you see how it handles cycles?

“Who can wait quietly while the mud settles? Who can remain still until the moment of action?”
—Laozi, Tao Te Ching

Chapter 11 Asynchronous Programming

The central part of a computer, the part that carries out the individual steps that make up our programs, is called the processor. The programs we have seen so far will keep the processor busy until they have finished their work.
The speed at which something like a loop that manipulates numbers can be executed depends pretty much entirely on the speed of the computer’s processor and memory.

But many programs interact with things outside of the processor. For example, they may communicate over a computer network or request data from the hard disk—which is a lot slower than getting it from memory.

When such a thing is happening, it would be a shame to let the processor sit idle—there might be some other work it could do in the meantime. In part, this is handled by your operating system, which will switch the processor between multiple running programs. But that doesn’t help when we want a single program to be able to make progress while it is waiting for a network request.

Asynchronicity

In a synchronous programming model, things happen one at a time. When you call a function that performs a long-running action, it returns only when the action has finished and it can return the result. This stops your program for the time the action takes.

An asynchronous model allows multiple things to happen at the same time. When you start an action, your program continues to run. When the action finishes, the program is informed and gets access to the result (for example, the data read from disk).

We can compare synchronous and asynchronous programming using a small example: a program that makes two requests over the network and then combines the results.

In a synchronous environment, where the request function returns only after it has done its work, the easiest way to perform this task is to make the requests one after the other. This has the drawback that the second request will be started only when the first has finished. The total time taken will be at least the sum of the two response times.

The solution to this problem, in a synchronous system, is to start additional threads of control.
A thread is another running program whose execution may be interleaved with other programs by the operating system—since most modern computers contain multiple processors, multiple threads may even run at the same time, on different processors. A second thread could start the second request, and then both threads wait for their results to come back, after which they resynchronize to combine their results.

In the following diagram, the thick lines represent time the program spends running normally, and the thin lines represent time spent waiting for the network. In the synchronous model, the time taken by the network is part of the timeline for a given thread of control. In the asynchronous model, starting a network action allows the program to continue running while the network communication happens alongside it, notifying the program when it is finished.

[Diagram: three timelines, labeled “synchronous, single thread of control”, “synchronous, two threads of control”, and “asynchronous”]

Another way to describe the difference is that waiting for actions to finish is implicit in the synchronous model, while it is explicit—under our control—in the asynchronous one.

Asynchronicity cuts both ways. It makes expressing programs that do not fit the straight-line model of control easier, but it can also make expressing programs that do follow a straight line more awkward. We’ll see some ways to reduce this awkwardness later in the chapter.

Both prominent JavaScript programming platforms—browsers and Node.js—make operations that might take a while asynchronous, rather than relying on threads. Since programming with threads is notoriously hard (understanding what a program does is much more difficult when it’s doing multiple things at once), this is generally considered a good thing.

Callbacks

One approach to asynchronous programming is to make functions that need to wait for something take an extra argument, a callback function.
The asynchronous function starts a process, sets things up so that the callback function is called when the process finishes, and then returns.

As an example, the setTimeout function, available both in Node.js and in browsers, waits a given number of milliseconds, then calls a function.

  setTimeout(() => console.log("Tick"), 500);

Waiting is not generally important work, but it can be very useful when you need to arrange for something to happen at a certain time or check whether some action is taking longer than expected.

Another example of a common asynchronous operation is reading a file from a device’s storage. Imagine you have a function readTextFile that reads a file’s content as a string and passes it to a callback function.

  readTextFile("shopping_list.txt", content => {
    console.log(`Shopping List:\n${content}`);
  });
  // → Shopping List:
  // → Peanut butter
  // → Bananas

The readTextFile function is not part of standard JavaScript. We will see how to read files in the browser and in Node.js in later chapters.

Performing multiple asynchronous actions in a row using callbacks means that you have to keep passing new functions to handle the continuation of the computation after the actions. An asynchronous function that compares two files and produces a boolean indicating whether their content is the same might look like this:

  function compareFiles(fileA, fileB, callback) {
    readTextFile(fileA, contentA => {
      readTextFile(fileB, contentB => {
        callback(contentA == contentB);
      });
    });
  }

This style of programming is workable, but the indentation level increases with each asynchronous action because you end up in another function. Doing more complicated things, such as wrapping asynchronous actions in a loop, can get awkward.

In a way, asynchronicity is contagious. Any function that calls a function that works asynchronously must itself be asynchronous, using a callback or similar mechanism to deliver its result.
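To try compareFiles without real files, the sketch below supplies a hypothetical readTextFile that serves strings from an in-memory table after a short setTimeout delay. The table, the file names, and the 10-millisecond delay are assumptions made for illustration only.

```javascript
// Hypothetical stand-in for file storage.
const fakeFiles = {
  "a.txt": "Peanut butter\nBananas",
  "b.txt": "Peanut butter\nBananas",
  "c.txt": "Bananas"
};

// Delivers the content later, the way a real asynchronous read would.
function readTextFile(name, callback) {
  setTimeout(() => callback(fakeFiles[name]), 10);
}

function compareFiles(fileA, fileB, callback) {
  readTextFile(fileA, contentA => {
    readTextFile(fileB, contentB => {
      callback(contentA == contentB);
    });
  });
}

// Collected here so the results can be inspected afterwards.
const outcomes = {};

compareFiles("a.txt", "b.txt", same => {
  outcomes.ab = same;
  console.log("a.txt and b.txt identical?", same);
});
compareFiles("a.txt", "c.txt", same => {
  outcomes.ac = same;
  console.log("a.txt and c.txt identical?", same);
});

// compareFiles returns right away, so this runs before either callback.
console.log("Comparisons started");
```

The line “Comparisons started” is printed first, because each compareFiles call only schedules work and then returns. The results arrive later, through the callbacks.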
Calling a callback is somewhat more involved and error-prone than simply returning a value, so needing to structure large parts of your program that way is not great.

Promises

A slightly different way to build an asynchronous program is to have asynchronous functions return an object that represents its (future) result instead of passing around callback functions. This way such functions actually return something meaningful, and the shape of the program more closely resembles that of synchronous programs.

This is what the standard class Promise is for. A promise is a receipt representing a value that may not be available yet. It provides a then method that allows you to register a function that should be called when the action for which it is waiting finishes. When the promise is resolved, meaning its value becomes available, such functions (there can be multiple) are called with the result value. It is possible to call then on a promise that has already resolved—your function will still be called.

The easiest way to create a promise is by calling Promise.resolve. This function ensures that the value you give it is wrapped in a promise. If it’s already a promise, it is simply returned. Otherwise, you get a new promise that immediately resolves with your value as its result.

  let fifteen = Promise.resolve(15);
  fifteen.then(value => console.log(`Got ${value}`));
  // → Got 15

To create a promise that does not immediately resolve, you can use Promise as a constructor. It has a somewhat odd interface: the constructor expects a function as argument, which it immediately calls, passing it a function that it can use to resolve the promise.
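As a minimal sketch of that constructor interface, the hypothetical delay function below (not part of the text or of standard JavaScript) wraps setTimeout in a promise that resolves with a given value after a given number of milliseconds.

```javascript
// Hypothetical helper: a promise that resolves with `value` after `ms` ms.
function delay(ms, value) {
  return new Promise(resolve => {
    // The constructor calls this function immediately and passes in
    // `resolve`. The promise stays pending until `resolve` is called.
    setTimeout(() => resolve(value), ms);
  });
}

const tick = delay(50, "Tick");

tick.then(value => console.log(value));
// → Tick

// Several handlers can be registered on the same promise.
tick.then(value => console.log(`Again: ${value}`));
// → Again: Tick
```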
For example, this is how you could create a promise-based interface for the readTextFile function:

  function textFile(filename) {
    return new Promise(resolve => {
      readTextFile(filename, text => resolve(text));
    });
  }

  textFile("plans.txt").then(console.log);

Note how, in contrast to callback-style functions, this asynchronous function returns a meaningful value—a promise to give you the contents of the file at some point in the future.

A useful thing about the then method is that it itself returns another promise. This one resolves to the value returned by the callback function or, if that returned value is a promise, to the value that promise resolves to. Thus, you can “chain” multiple calls to then together to set up a sequence of asynchronous actions.

This function, which reads a file full of filenames, and returns the content of a random file in that list, shows this kind of asynchronous promise pipeline:

  function randomFile(listFile) {
    return textFile(listFile)
      .then(content => content.trim().split("\n"))
      .then(ls => ls[Math.floor(Math.random() * ls.length)])
      .then(filename => textFile(filename));
  }

The function returns the result of this chain of then calls. The initial promise fetches the list of files as a string. The first then call transforms that string into an array of lines, producing a new promise. The second then call picks a random line from that, producing a third promise that yields a single filename. The final then call reads this file, so that the result of the function as a whole is a promise that returns the content of a random file.

In this code, the functions used in the first two then calls return a regular value that will immediately be passed into the promise returned by then when the function returns. The last then call returns a promise (textFile(filename)), making it an actual asynchronous step.

It would also have been possible to perform all these steps inside a single then callback, since only the last step is actually asynchronous.
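A sketch of that single-callback variant follows. To keep it self-contained, textFile is replaced by a hypothetical stand-in that resolves with strings from an in-memory table. The table and its file names are assumptions, not part of the text.

```javascript
// Hypothetical in-memory files standing in for real storage.
const fakeFiles = {
  "list.txt": "one.txt\ntwo.txt\n",
  "one.txt": "Content of one",
  "two.txt": "Content of two"
};

// Stand-in for the promise-based textFile wrapper.
function textFile(filename) {
  return Promise.resolve(fakeFiles[filename]);
}

// All synchronous steps happen inside one then callback. Only the
// returned textFile(filename) promise is a truly asynchronous step.
function randomFile(listFile) {
  return textFile(listFile).then(content => {
    const ls = content.trim().split("\n");
    const filename = ls[Math.floor(Math.random() * ls.length)];
    return textFile(filename);
  });
}

const picked = randomFile("list.txt");
// Prints the content of one of the two listed files, chosen at random.
picked.then(console.log);
```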
But the kind of then wrappers that only do some synchronous data transformation are often useful, such as when you want to return a promise that produces a processed version of some asynchronous result.

  function jsonFile(filename) {
    return textFile(filename).then(JSON.parse);
  }

  jsonFile("package.json").then(console.log);

Generally, it is useful to think of a promise as a device that lets code ignore the question of when a value is going to arrive. A normal value has to actually exist before we can reference it. A promised value is a value that might already be there or might appear at some point in the future. Computations defined in terms of promises, by wiring them together with then calls, are executed asynchronously as their inputs become available.

Failure

Regular JavaScript computations can fail by throwing an exception. Asynchronous computations often need something like that. A network request may fail, a file may not exist, or some code that is part of the asynchronous computation may throw an exception.

One of the most pressing problems with the callback style of asynchronous programming is that it makes it extremely difficult to ensure failures are properly reported to the callbacks.

A common convention is to use the first argument to the callback to indicate that the action failed, and the second to pass the value produced by the action when it was successful.

  someAsyncFunction((error, value) => {
    if (error) handleError(error);
    else processValue(value);
  });

Such callback functions must always check whether they received an exception and make sure that any problems they cause, including exceptions thrown by functions they call, are caught and given to the right function.

Promises make this easier. They can be either resolved (the action finished successfully) or rejected (it failed). Resolve handlers (as registered with then) are called only when the action is successful, and rejections are propagated to the new promise returned by then.
When a handler throws an exception, this automatically causes the promise produced by its then call to be rejected. If any element in a chain of asynchronous actions fails, the outcome of the whole chain is marked as rejected, and no success handlers are called beyond the point where it failed.

Much like resolving a promise provides a value, rejecting one also provides a value, usually called the reason of the rejection. When an exception in a handler function causes the rejection, the exception value is used as the reason. Similarly, when a handler returns a promise that is rejected, that rejection flows into the next promise. There’s a Promise.reject function that creates a new, immediately rejected promise.

To explicitly handle such rejections, promises have a catch method that registers a handler to be called when the promise is rejected, similar to how then handlers handle normal resolution. It’s also very much like then in that it returns a new promise, which resolves to the original promise’s value when that resolves normally and to the result of the catch handler otherwise. If a catch handler throws an error, the new promise is also rejected.

As a shorthand, then also accepts a rejection handler as a second argument, so you can install both types of handlers in a single method call: .then(acceptHandler, rejectHandler).

A function passed to the Promise constructor receives a second argument, alongside the resolve function, which it can use to reject the new promise.

When our readTextFile function encounters a problem, it passes the error to its callback function as a second argument. Our textFile wrapper should actually check that argument, so that a failure causes the promise it returns to be rejected.
function textFile(filename) { return new Promise((resolve, reject) => { readTextFile(filename, (text, error) => { if (error) reject(error); else resolve(text); }); }); } The chains of promise values created by calls to then and catch thus form a pipeline through which asynchronous values or failures move. Since such chains are created by registering handlers, each link has a success handler or a rejection handler (or both) associated with it. Handlers that don’t match the type of outcome (success or failure) are ignored. Handlers that do match are called, and their outcome determines what kind of value comes next—success when they return a non-promise value, rejection when they throw an exception, and the outcome of the promise when they return a promise. new Promise((_, reject) => reject(new Error("Fail"))) .then(value => console.log("Handler 1:", value)) .catch(reason => { console.log("Caught failure " + reason); return "nothing"; }) .then(value => console.log("Handler 2:", value)); // → Caught failure Error: Fail 179 // → Handler 2: nothing The first then handler function isn’t called, because at that point of the pipeline the promise holds a rejection. The catch handler handles that rejection and returns a value, which is given to the second then handler function. Much like an uncaught exception is handled by the environment, JavaScript environments can detect when a promise rejection isn’t handled and will report this as an error. Carla It’s a sunny day in Berlin. The runway of the old, decommissioned airport is teeming with cyclists and inline skaters. In the grass near a garbage container a flock of crows noisily mills about, trying to convince a group of tourists to part with their sandwiches. One of the crows stands out—a large scruffy female with a few white feathers in her right wing. She is baiting people with a skill and confidence that suggest she’s been doing this for a long time. 
When an elderly man is distracted by the antics of another crow, she casually swoops in, snatches his half-eaten bun from his hand, and sails away. Contrary to the rest of the group, who look like they are happy to spend the day goofing around here, the large crow looks purposeful. Carrying her loot, she flies straight towards the roof of the hangar building, disappearing into an air vent. Inside the building, you can hear an odd tapping sound—soft, but persistent. It comes from a narrow space under the roof of an unfinished stairwell. The crow is sitting there, surrounded by her stolen snacks, half a dozen smartphones (several of which are turned on), and a mess of cables. She rapidly taps the screen of one of the phones with her beak. Words are appearing on it. If you didn’t know better, you’d think she was typing. This crow is known to her peers as “cāāw-krö”. But since those sounds are poorly suited for human vocal chords, we’ll refer to her as Carla. Carla is a somewhat peculiar crow. In her youth, she was fascinated by human language, eavesdropping on people until she had a good grasp of what they were saying. Later in life, her interest shifted to human technology, and she started stealing phones to study them. Her current project is learning to program. The text she is typing in her hidden lab is, in fact, a piece of asynchronous JavaScript code. 180 Breaking In Carla loves the Internet. Annoyingly, the phone she is working on is about to run out of prepaid data. The building has a wireless network, but it requires a code to access. Fortunately, the wireless routers in the building are 20 years old and poorly secured. Doing some research, Carla finds out that the network authentication mechanism has a flaw she can use. When joining the network, a device must send along the correct six-digit passcode. The access point will reply with a success or failure message depending on whether the right code is provided. 
However, when sending a partial code (say, only 3 digits), the response is different based on whether those digits are the correct start of the code or not. Sending incorrect numbers immediately returns a failure message. When sending the correct ones, the access point waits for more digits. This makes it possible to greatly speed up the guessing of the number. Carla can find the first digit by trying each number in turn, until she finds one that doesn’t immediately return failure. Having one digit, she can find the second digit in the same way, and so on, until she knows the entire passcode. Assume Carla has a joinWifi function. Given the network name and the passcode (as a string), the function tries to join the network, returning a promise that resolves if successful and rejects if the authentication failed. The first thing she needs is a way to wrap a promise so that it automatically rejects after it takes too much time, to allow the program to quickly move on if the access point doesn’t respond. function withTimeout(promise, time) { return new Promise((resolve, reject) => { promise.then(resolve, reject); setTimeout(() => reject("Timed out"), time); }); } This makes use of the fact that a promise can only be resolved or rejected once. If the promise given as argument resolves or rejects first, that result will be the result of the promise returned by withTimeout. If, on the other hand, the setTimeout fires first, rejecting the promise, any further resolve or reject calls are ignored. To find the whole passcode, the program needs to repeatedly look for the next digit by trying each digit. If authentication succeeds, we know we have found what we are looking for. If it immediately fails, we know that digit was wrong, and must try the next digit. If the request times out, we have found 181 another correct digit, and must continue by adding another digit. 
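The three outcomes described above (success, immediate failure, timeout) can be tried out without a real access point. The following is a minimal sketch: withTimeout is the function from the text, while fakeJoinWifi, SECRET, and probe are made-up stand-ins that only imitate the flaw described earlier.

```javascript
// withTimeout as defined in the text.
function withTimeout(promise, time) {
  return new Promise((resolve, reject) => {
    promise.then(resolve, reject);
    setTimeout(() => reject("Timed out"), time);
  });
}

// Hypothetical stand-in for joinWifi that mimics the flawed access point.
const SECRET = "428391"; // made-up passcode, for illustration only
function fakeJoinWifi(code) {
  return new Promise((resolve, reject) => {
    if (code == SECRET) {
      // Full correct code: succeed after a short delay.
      setTimeout(() => resolve(null), 5);
    } else if (SECRET.startsWith(code)) {
      // Correct prefix: the access point waits for more digits
      // and never answers, so this promise never settles.
    } else {
      // Wrong digits: fail quickly.
      setTimeout(() => reject("Wrong code"), 5);
    }
  });
}

// Hypothetical helper: classify a guess into the three cases.
function probe(code) {
  return withTimeout(fakeJoinWifi(code), 50).then(
    () => "joined",
    reason => (reason == "Timed out" ? "prefix ok" : "wrong")
  );
}

Promise.all([probe("9"), probe("4"), probe(SECRET)])
  .then(results => console.log(results.join(", ")));
// → wrong, prefix ok, joined
```

Note that the timer inside withTimeout still fires after the wrapped promise has settled. Its reject call is then simply ignored, which is the property the text relies on.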
Because you cannot wait for a promise inside a for loop, Carla uses a recursive function to drive this process. On each call, this function gets the code as we know it so far, as well as the next digit to try. Depending on what happens, it may return a finished code or call through to itself, to either start cracking the next position in the code or to try again with another digit.

    function crackPasscode(networkID) {
      function nextDigit(code, digit) {
        let newCode = code + digit;
        return withTimeout(joinWifi(networkID, newCode), 50)
          .then(() => newCode)
          .catch(failure => {
            if (failure == "Timed out") {
              return nextDigit(newCode, 0);
            } else if (digit < 9) {
              return nextDigit(code, digit + 1);
            } else {
              throw failure;
            }
          });
      }
      return nextDigit("", 0);
    }

Promises often need to be tied together in verbose, arbitrary-looking ways. To create an asynchronous loop, Carla was forced to introduce a recursive function.

The thing the cracking function actually does is completely linear: it always waits for the previous action to complete before starting the next one. In a synchronous programming model, it’d be more straightforward to express.

The good news is that JavaScript allows you to write pseudo-synchronous code to describe asynchronous computation. An async function implicitly returns a promise and can, in its body, await other promises in a way that looks synchronous.

We can rewrite crackPasscode like this:

    async function crackPasscode(networkID) {
      for (let code = "";;) {
        for (let digit = 0;; digit++) {
          let newCode = code + digit;
          try {
            await withTimeout(joinWifi(networkID, newCode), 50);
            return newCode;
          } catch (failure) {
            if (failure == "Timed out") {
              code = newCode;
              break;
            } else if (digit == 9) {
              throw failure;
            }
          }
        }
      }
    }

This version more clearly shows the double loop structure of the function (the inner loop tries digit 0 to 9, the outer loop adds digits to the passcode).

An async function is marked by the word async before the function keyword. Methods can also be made async by writing async before their name. When such a function or method is called, it returns a promise.
As soon as the function returns something, that promise is resolved. If the body throws an exception, the promise is rejected. Inside an async function, the word await can be put in front of an expression to wait for a promise to resolve and only then continue the execution of the function. If the promise rejects, an exception is raised at the point of the await. Such a function no longer runs from start to completion in one go like a regular JavaScript function. Instead, it can be frozen at any point that has an await and can be resumed at a later time. For most asynchronous code, this notation is more convenient than directly using promises. You do still need an understanding of promises, since in many cases you’ll still interact with them directly. But when wiring them together, async functions are generally more pleasant to write than chains of then calls. 183 Generators This ability of functions to be paused and then resumed again is not exclusive to async functions. JavaScript also has a feature called generator functions. These are similar, but without the promises. When you define a function with function* (placing an asterisk after the word function), it becomes a generator. When you call a generator, it returns an iterator, which we already saw in Chapter 6. function* powers(n) { for (let current = n;; current *= n) { yield current; } } for (let power of powers(3)) { if (power > 50) break; console.log(power); } // → 3 // → 9 // → 27 Initially, when you call powers, the function is frozen at its start. Every time you call next on the iterator, the function runs until it hits a yield expression, which pauses it and causes the yielded value to become the next value produced by the iterator. When the function returns (the one in the example never does), the iterator is done. Writing iterators is often much easier when you use generator functions. 
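To make the pausing behavior visible, the iterator can also be driven by hand instead of through a for/of loop. This is a small sketch: powers is the generator from the text, while countTo is a made-up generator added only to show what happens when a generator function returns.

```javascript
// The generator from the text.
function* powers(n) {
  for (let current = n;; current *= n) {
    yield current;
  }
}

// Each call to next runs the body up to the next yield.
const iterator = powers(2);
const first = iterator.next();
const second = iterator.next();
console.log(first.value, first.done);   // → 2 false
console.log(second.value, second.done); // → 4 false

// Hypothetical generator that does return, so its iterator finishes.
function* countTo(limit) {
  for (let i = 1; i <= limit; i++) yield i;
}

const collected = [...countTo(3)];
console.log(collected.join(", ")); // → 1, 2, 3

const short = countTo(1);
short.next();             // yields 1
const end = short.next(); // the function body has returned
console.log(end.done);    // → true
```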
The iterator for the Group class (from the exercise in Chapter 6) can be written with this generator: Group.prototype[Symbol.iterator] = function*() { for (let i = 0; i console.log(`Request to ${ip} accepted`)) .catch(() => {}); } Since most of these addresses won’t exist or will not accept such messages, the catch call makes sure network errors don’t crash the program. The requests are all sent out immediately, without waiting for other requests to finish, in order to not waste time when some of the machines don’t answer. Having fired off her network scan, Carla heads back outside to see the result. To her delight, all of the screens are now showing a stripe of light in their top left corners. They are on the local network, and they do accept commands. She quickly notes the numbers shown on each screen. There are 9 screens, arranged three high and three wide. They have the following network addresses: const screenAddresses = [ "10.0.0.44", "10.0.0.45", "10.0.0.41", "10.0.0.31", "10.0.0.40", "10.0.0.42", "10.0.0.48", "10.0.0.47", "10.0.0.46" ]; Now this opens up possibilities for all kinds of shenanigans. She could show “crows rule, humans drool” on the wall in giant letters. But that feels a bit crude. Instead, she plans to show a video of a flying crow covering all of the screens at night. Carla finds a fitting video clip, in which a second and a half of footage can be repeated to create a looping video showing a crow’s wingbeat. To fit the nine screens (each of which can show 50×30 pixels), Carla cutsand resizes the videos to get a series of 150×90 images, ten per second. Those are then each cut into nine rectangles, and processed so that the dark spots on the video (where the crow is) show a bright light, and the light spots (no crow) are left dark, which should create the effect of an amber crow flying against a black background. 
She has set up the clipImages variable to hold an array of frames, where each frame is represented with an array of nine sets of pixels—one for each 186 screen—in the format that the signs expect. To display a single frame of the video, Carla needs to send a request to all the screens at once. But she also needs to wait for the result of these requests, both in order to not start sending the next frame before the current one has been properly sent, and in order to notice when requests are failing. Promise has a static method all that can be used to convert an array of promises into a single promise that resolves to an array of results. This provides a convenient way to have some asynchronous actions happen alongside each other, wait for them all to finish, and then do something with their results (or at least wait for them to make sure they don’t fail). function displayFrame(frame) { return Promise.all(frame.map((data, i) => { return request(screenAddresses[i], { command: "display", data }); })); } This maps over the images in frame (which is an array of display data arrays) to create an array of request promises. It then returns a promise that combines all of those. In order to be able to stop a playing video, the process is wrapped in a class. This class has an asynchronous play method that returns a promise that only resolves when the playback is stopped again via the stop method. function wait(time) { return new Promise(accept => setTimeout(accept, time)); } class VideoPlayer { constructor(frames, frameTime) { this.frames = frames; this.frameTime = frameTime; this.stopped = true; } async play() { this.stopped = false; for (let i = 0; !this.stopped; i++) { let nextFrame = wait(this.frameTime); await displayFrame(this.frames[i % this.frames.length]); await nextFrame; 187 } } stop() { this.stopped = true; } } The wait function wraps setTimeout in a promise that resolves after the given amount of milliseconds. 
This is useful for controlling the speed of the playback. let video = new VideoPlayer(clipImages, 100); video.play().catch(e => { console.log("Playback failed: " + e); }); setTimeout(() => video.stop(), 15000); For the entire week that the screen wall stands, every evening, when it is dark, a huge glowing orange bird mysteriously appears on it. The event loop An asynchronous program starts by running its main script, which will often set up callbacks to be called later. That main script, as well as the callbacks, run to completion in one piece, uninterrupted. But between them, the program may sit idle, waiting for something to happen. So callbacks are not directly called by the code that scheduled them. If I call setTimeout from within a function, that function will have returned by the time the callback function is called. And when the callback returns, control does not go back to the function that scheduled it. Asynchronous behavior happens on its own empty function call stack. This is one of the reasons that, without promises, managing exceptions across asyn- chronous code is so hard. Since each callback starts with a mostly empty stack, your catch handlers won’t be on the stack when they throw an exception. try { setTimeout(() => { throw new Error("Woosh"); }, 20); } catch (e) { // This will not run console.log("Caught", e); } 188 No matter how closely together events—such as timeouts or incoming requests— happen, a JavaScript environment will run only one program at a time. You can think of this as it running a big loop around your program, called the event loop. When there’s nothing to be done, that loop is paused. But as events come in, they are added to a queue, and their code is executed one after the other. Because no two things run at the same time, slow-running code can delay the handling of other events. This example sets a timeout but then dallies until after the timeout’s in- tended point of time, causing the timeout to be late. 
let start = Date.now(); setTimeout(() => { console.log("Timeout ran at", Date.now() - start); }, 20); while (Date.now() { list += fileName + ": " + (await textFile(fileName)).length + "\n"; })); return list; } The async fileName => part shows how arrow functions can also be made async by putting the word async in front of them. The code doesn’t immediately look suspicious... it maps the async arrow function over the array of names, creating an array of promises, and then uses Promise.all to wait for all of these before returning the list they build up. But this program is entirely broken. It’ll always return only a single line of output, listing the file that took the longest to read. Can you work out why? The problem lies in the += operator, which takes the current value of list at the time where the statement starts executing and then, when the await finishes, sets the list binding to be that value plus the added string. But between the time where the statement starts executing and the time where it finishes there’s an asynchronous gap. The map expression runs before anything has been added to the list, so each of the += operators starts from an empty string and ends up, when its storage retrieval finishes, setting list to the result of adding its line to the empty string. This could have easily been avoided by returning the lines from the mapped promises and calling join on the result of Promise.all, instead of building up the list by changing a binding. As usual, computing new values is less error-prone than changing existing values. async function fileSizes(files) { let lines = files.map(async fileName => { return fileName + ": " + (await textFile(fileName)).length; }); return (await Promise.all(lines)).join("\n"); } Mistakes like this are easy to make, especially when using await, and you should be aware of where the gaps in your code occur. 
An advantage of JavaScript’s explicit asynchronicity (whether through callbacks, promises, or await) is that spotting these gaps is relatively easy.

Summary

Asynchronous programming makes it possible to express waiting for long-running actions without freezing the whole program. JavaScript environments typically implement this style of programming using callbacks, functions that are called when the actions complete. An event loop schedules such callbacks to be called when appropriate, one after the other, so that their execution does not overlap.

Programming asynchronously is made easier by promises, objects that represent actions that might complete in the future, and async functions, which allow you to write an asynchronous program as if it were synchronous.

Exercises

Quiet Times

There’s a security camera near Carla’s lab that’s activated by a motion sensor. It is connected to the network and starts sending out a video stream when it is active. Because she’d rather not be discovered, Carla has set up a system that notices this kind of wireless network traffic and turns on a light in her lair whenever there is activity outside, so she knows when to keep quiet.

She’s also been logging the times at which the camera is tripped for a while and wants to use this information to visualize which times, in an average week, tend to be quiet, and which tend to be busy. The log is stored in files holding one time stamp number (as returned by Date.now()) per line.

    1695709940692
    1695701068331
    1695701189163

The "camera_logs.txt" file holds a list of log files. Write an asynchronous function activityTable(day) that for a given day of the week returns an array of 24 numbers, one for each hour of the day, that hold the number of camera network traffic observations seen in that hour of the day. Days are identified by number using the system used by Date.getDay, where Sunday is 0 and Saturday is 6.
The activityGraph function, provided by the sandbox, summarizes such a table into a string. To read the files, use the textFile function defined earlier—given a filename, it returns a promise that resolves to the file’s content. Remember that new Date(timestamp) creates a Date object for that time, which has getDay and 191 getHours methods returning the day of the week and the hour of the day. Both types of files—the list of log files and the log files themselves—have each piece of data on its own line, separated by newline ("\n") characters. Real Promises Rewrite the function from the previous exercise without async/await, using plain Promise methods. In this style, using Promise.all will be more convenient than trying to model a loop over the log files. In the async function, just using await in a loop is simpler. If reading a file takes some time, which of these two approaches will take the least time to run? If one of the files listed in the file list has a typo, and reading it fails, how does that failure end up in the Promise object that your function returns? Building Promise.all As we saw, given an array of promises, Promise.all returns a promise that waits for all of the promises in the array to finish. It then succeeds, yielding an array of result values. If a promise in the array fails, the promise returned by all fails too, passing on the failure reason from the failing promise. Implement something like this yourself as a regular function called Promise_all . Remember that after a promise has succeeded or failed, it can’t succeed or fail again, and further calls to the functions that resolve it are ignored. This can simplify the way you handle failure of your promise. 
192 “The evaluator, which determines the meaning of expressions in a programming language, is just another program.” —Hal Abelson and Gerald Sussman, Structure and Interpretation of Computer Programs Chapter 12 Project: A Programming Language Building your own programming language is surprisingly easy (as long as you do not aim too high) and very enlightening. The main thing I want to show in this chapter is that there’s no magic involved in building a programming language. I’ve often felt that some human inventions were so immensely clever and complicated that I’d never be able to understand them. But with a little reading and experimenting, they often turn out to be quite mundane. We will build a programming language called Egg. It will be a tiny, simple language—but one that is powerful enough to express any computation you can think of. It will allow simple abstraction based on functions. Parsing The most immediately visible part of a programming language is its syntax, or notation. A parser is a program that reads a piece of text and produces a data structure that reflects the structure of the program contained in that text. If the text does not form a valid program, the parser should point out the error. Our language will have a simple and uniform syntax. Everything in Egg is an expression. An expression can be the name of a binding, a number, a string, or an application. Applications are used for function calls but also for constructs such as if or while. To keep the parser simple, strings in Egg do not support anything like back- slash escapes. A string is simply a sequence of characters that are not double quotes, wrapped in double quotes. A number is a sequence of digits. Binding names can consist of any character that is not whitespace and that does not have a special meaning in the syntax. 
Applications are written the way they are in JavaScript, by putting paren- theses after an expression and having any number of arguments between those parentheses, separated by commas. do(define(x, 10), 193 if(>(x, 5), print("large"), print("small"))) The uniformity of the Egg language means that things that are operators in JavaScript (such as >) are normal bindings in this language, applied just like other functions. Since the syntax has no concept of a block, we need a do construct to represent doing multiple things in sequence. The data structure that the parser will use to describe a program consists of expression objects, each of which has a type property indicating the kind of expression it is and other properties to describe its content. Expressions of type "value" represent literal strings or numbers. Their value property contains the string or number value that they represent. Expressions of type "word" are used for identifiers (names). Such objects have a name prop- erty that holds the identifier’s name as a string. Finally, "apply" expressions represent applications. They have an operator property that refers to the ex- pression that is being applied, as well as an args property that holds an array of argument expressions. The >(x, 5) part of the previous program would be represented like this: { type: "apply", operator: {type: "word", name: ">"}, args: [ {type: "word", name: "x"}, {type: "value", value: 5} ] } Such a data structure is called a syntax tree. If you imagine the objects as dots and the links between them as lines between those dots, as shown in the following diagram, the structure has a treelike shape. The fact that expressions contain other expressions, which in turn might contain more expressions, is similar to the way tree branches split and split again. 
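Because a syntax tree is plain data, it can be explored with ordinary recursive functions before any parser exists. The sketch below writes the tree for >(x, 5) out by hand in the format just described. The helpers toSource and countNodes are made up for illustration and are not part of the Egg implementation.

```javascript
// The tree for >(x, 5), written out by hand.
const tree = {
  type: "apply",
  operator: {type: "word", name: ">"},
  args: [
    {type: "word", name: "x"},
    {type: "value", value: 5}
  ]
};

// Hypothetical helper: turn a syntax tree back into Egg source text.
function toSource(expr) {
  if (expr.type == "value") {
    // Strings get their quotes back, numbers are printed as-is.
    return typeof expr.value == "string" ? `"${expr.value}"` : String(expr.value);
  } else if (expr.type == "word") {
    return expr.name;
  } else {
    // "apply": operator followed by a parenthesized argument list.
    return toSource(expr.operator) + "(" + expr.args.map(toSource).join(", ") + ")";
  }
}

// Hypothetical helper: count the expression objects in a tree.
function countNodes(expr) {
  if (expr.type != "apply") return 1;
  return 1 + countNodes(expr.operator) +
    expr.args.reduce((total, arg) => total + countNodes(arg), 0);
}

console.log(toSource(tree));   // → >(x, 5)
console.log(countNodes(tree)); // → 4
```

Both helpers recurse in the same places where expressions contain other expressions, which is the treelike shape the text refers to.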
[Diagram: syntax tree of the example program. The root do has two branches, define (with x and 10) and if (with > applied to x and 5, print "large", and print "small").]

Contrast this to the parser we wrote for the configuration file format in Chapter 9, which had a simple structure: it split the input into lines and handled those lines one at a time. There were only a few simple forms that a line was allowed to have.

Here we must find a different approach. Expressions are not separated into lines, and they have a recursive structure. Application expressions contain other expressions.

Fortunately, this problem can be solved very well by writing a parser function that is recursive in a way that reflects the recursive nature of the language.

We define a function parseExpression that takes a string as input. It returns an object containing the data structure for the expression at the start of the string, along with the part of the string left after parsing this expression. When parsing subexpressions (the argument to an application, for example), this function can be called again, yielding the argument expression as well as the text that remains. This text may in turn contain more arguments or may be the closing parenthesis that ends the list of arguments.

This is the first part of the parser:

    function parseExpression(program) {
      program = skipSpace(program);
      let match, expr;
      if (match = /^"([^"]*)"/.exec(program)) {
        expr = {type: "value", value: match[1]};
      } else if (match = /^\d+\b/.exec(program)) {
        expr = {type: "value", value: Number(match[0])};
      } else if (match = /^[^\s(),#"]+/.exec(program)) {
        expr = {type: "word", name: match[0]};
      } else {
        throw new SyntaxError("Unexpected syntax: " + program);
      }

      return parseApply(expr, program.slice(match[0].length));
    }

    function skipSpace(string) {
      let first = string.search(/\S/);
      if (first == -1) return "";
      return string.slice(first);
    }

Because Egg, like JavaScript, allows any amount of whitespace between its elements, we have to repeatedly cut the whitespace off the start of the program string. The skipSpace function helps with this.
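The three regular expressions in parseExpression can be tried in isolation to see which inputs count as which kind of atom. In this sketch, skipSpace and the patterns are taken from the parser above, while atomKind is a made-up helper that only reports which pattern matched.

```javascript
// skipSpace as defined in the text.
function skipSpace(string) {
  let first = string.search(/\S/);
  if (first == -1) return "";
  return string.slice(first);
}

// Hypothetical helper: report which atomic form starts the input,
// testing the patterns in the same order as parseExpression.
function atomKind(program) {
  program = skipSpace(program);
  if (/^"([^"]*)"/.test(program)) return "string";
  if (/^\d+\b/.test(program)) return "number";
  if (/^[^\s(),#"]+/.test(program)) return "word";
  return "invalid";
}

console.log(atomKind('  "hi" rest')); // → string
console.log(atomKind("42)"));         // → number
console.log(atomKind("42abc"));       // → word
console.log(atomKind(">(x, 5)"));     // → word
console.log(atomKind(")"));           // → invalid
```

The "42abc" case shows why the number pattern ends in \b: digits directly followed by letters are not read as a number followed by a word, but fall through to the word pattern as a whole.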
After skipping any leading space, parseExpression uses three regular expres- sions to spot the three atomic elements that Egg supports: strings, numbers, and words. The parser constructs a different kind of data structure depending on which expression matches. If the input does not match one of these three forms, it is not a valid expression, and the parser throws an error. We use the SyntaxError constructor here. This is an exception class defined by the standard, like Error, but more specific. We then cut off the part that was matched from the program string and pass that, along with the object for the expression, to parseApply, which checks whether the expression is an application. If so, it parses a parenthesized list of arguments. function parseApply(expr, program) { program = skipSpace(program); if (program[0] != "(") { return {expr: expr, rest: program}; } program = skipSpace(program.slice(1)); expr = {type: "apply", operator: expr, args: []}; while (program[0] != ")") { let arg = parseExpression(program); expr.args.push(arg.expr); program = skipSpace(arg.rest); if (program[0] == ",") { program = skipSpace(program.slice(1)); } else if (program[0] != ")") { throw new SyntaxError("Expected ',' or ')'"); } 196 } return parseApply(expr, program.slice(1)); } If the next character in the program is not an opening parenthesis, this is not an application, and parseApply returns the expression it was given. Otherwise, it skips the opening parenthesis and creates the syntax tree object for this application expression. It then recursively calls parseExpression to parse each argument until a closing parenthesis is found. The recursion is indirect, through parseApply and parseExpression calling each other. Because an application expression can itself be applied (such as in multiplier (2)(1)), parseApply must, after it has parsed an application, call itself again to check whether another pair of parentheses follows. This is all we need to parse Egg. 
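To check how the two functions cooperate on a chained application, the parser can be run on multiplier(2)(1) directly. The sketch below repeats parseExpression, skipSpace, and parseApply from the text so that it runs on its own. The sample inputs are made up for illustration.

```javascript
function skipSpace(string) {
  let first = string.search(/\S/);
  if (first == -1) return "";
  return string.slice(first);
}

function parseExpression(program) {
  program = skipSpace(program);
  let match, expr;
  if (match = /^"([^"]*)"/.exec(program)) {
    expr = {type: "value", value: match[1]};
  } else if (match = /^\d+\b/.exec(program)) {
    expr = {type: "value", value: Number(match[0])};
  } else if (match = /^[^\s(),#"]+/.exec(program)) {
    expr = {type: "word", name: match[0]};
  } else {
    throw new SyntaxError("Unexpected syntax: " + program);
  }
  return parseApply(expr, program.slice(match[0].length));
}

function parseApply(expr, program) {
  program = skipSpace(program);
  if (program[0] != "(") {
    return {expr: expr, rest: program};
  }
  program = skipSpace(program.slice(1));
  expr = {type: "apply", operator: expr, args: []};
  while (program[0] != ")") {
    let arg = parseExpression(program);
    expr.args.push(arg.expr);
    program = skipSpace(arg.rest);
    if (program[0] == ",") {
      program = skipSpace(program.slice(1));
    } else if (program[0] != ")") {
      throw new SyntaxError("Expected ',' or ')'");
    }
  }
  return parseApply(expr, program.slice(1));
}

// The outer application's operator is itself an application.
const {expr, rest} = parseExpression("multiplier(2)(1)");
console.log(expr.type);                   // → apply
console.log(expr.operator.type);          // → apply
console.log(expr.operator.operator.name); // → multiplier
console.log(expr.args[0].value);          // → 1

// A missing comma between arguments is reported as a syntax error.
let message = null;
try {
  parseExpression("f(1 2)");
} catch (error) {
  message = error.message;
}
console.log(message); // → Expected ',' or ')'
```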
We wrap it in a convenient parse func- tion that verifies that it has reached the end of the input string after parsing the expression (an Egg program is a single expression), and that gives us the program’s data structure. function parse(program) { let {expr, rest} = parseExpression(program); if (skipSpace(rest).length > 0) { throw new SyntaxError("Unexpected text after program"); } return expr; } console.log(parse("+(a, 10)")); // → {type: "apply", // operator: {type: "word", name: "+"}, // args: [{type: "word", name: "a"}, // {type: "value", value: 10}]} It works! It doesn’t give us very helpful information when it fails and doesn’t store the line and column on which each expression starts, which might be helpful when reporting errors later, but it’s good enough for our purposes. The evaluator What can we do with the syntax tree for a program? Run it, of course! And that is what the evaluator does. You give it a syntax tree and a scope object that associates names with values, and it will evaluate the expression that the tree represents and return the value that this produces. 197 const specialForms = Object.create(null); function evaluate(expr, scope) { if (expr.type == "value") { return expr.value; } else if (expr.type == "word") { if (expr.name in scope) { return scope[expr.name]; } else { throw new ReferenceError( `Undefined binding: ${expr.name}`); } } else if (expr.type == "apply") { let {operator, args} = expr; if (operator.type == "word" && operator.name in specialForms) { return specialForms[operator.name](expr.args, scope); } else { let op = evaluate(operator, scope); if (typeof op == "function") { return op(...args.map(arg => evaluate(arg, scope))); } else { throw new TypeError("Applying a non-function."); } } } } The evaluator has code for each of the expression types. A literal value ex- pression produces its value. (For example, the expression 100 evaluates to the number 100.) 
For a binding, we must check whether it is actually defined in the scope and, if it is, fetch the binding’s value.

Applications are more involved. If they are a special form, like if, we do not evaluate anything: we just pass the argument expressions, along with the scope, to the function that handles this form. If it is a normal call, we evaluate the operator, verify that it is a function, and call it with the evaluated arguments.

We use plain JavaScript function values to represent Egg’s function values. We will come back to this later, when the special form fun is defined.

The recursive structure of evaluate resembles the structure of the parser, and both mirror the structure of the language itself. It would also be possible to combine the parser and the evaluator into one function and evaluate during parsing, but splitting them up this way makes the program clearer and more flexible.

This is really all that’s needed to interpret Egg. It’s that simple. But without defining a few special forms and adding some useful values to the environment, you can’t do much with this language yet.

Special forms

The specialForms object is used to define special syntax in Egg. It associates words with functions that evaluate such forms. It is currently empty. Let’s add if.

    specialForms.if = (args, scope) => {
      if (args.length != 3) {
        throw new SyntaxError("Wrong number of args to if");
      } else if (evaluate(args[0], scope) !== false) {
        return evaluate(args[1], scope);
      } else {
        return evaluate(args[2], scope);
      }
    };

Egg’s if construct expects exactly three arguments. It will evaluate the first, and if the result isn’t the value false, it will evaluate the second. Otherwise, the third gets evaluated. This if form is more similar to JavaScript’s ternary ?: operator than to JavaScript’s if. It is an expression, not a statement, and it produces a value, namely, the result of the second or third argument.

Egg also differs from JavaScript in how it handles the condition value to if.
Egg’s if will treat only the value false as false, not things like zero or the empty string.

The reason we need to represent if as a special form rather than a regular function is that all arguments to functions are evaluated before the function is called, whereas if should evaluate only either its second or its third argument, depending on the value of the first.

The while form is similar.

specialForms.while = (args, scope) => {
  if (args.length != 2) {
    throw new SyntaxError("Wrong number of args to while");
  }
  while (evaluate(args[0], scope) !== false) {
    evaluate(args[1], scope);
  }

  // Since undefined does not exist in Egg, we return false,
  // for lack of a meaningful result.
  return false;
};

Another basic building block is do, which executes all its arguments from top to bottom. Its value is the value produced by the last argument.

specialForms.do = (args, scope) => {
  let value = false;
  for (let arg of args) {
    value = evaluate(arg, scope);
  }
  return value;
};

To be able to create bindings and give them new values, we also create a form called define. It expects a word as its first argument and an expression producing the value to assign to that word as its second argument. Since define, like everything, is an expression, it must return a value. We’ll make it return the value that was assigned (just like JavaScript’s = operator).

specialForms.define = (args, scope) => {
  if (args.length != 2 || args[0].type != "word") {
    throw new SyntaxError("Incorrect use of define");
  }
  let value = evaluate(args[1], scope);
  scope[args[0].name] = value;
  return value;
};

The environment

The scope accepted by evaluate is an object with properties whose names correspond to binding names and whose values correspond to the values those bindings are bound to. Let’s define an object to represent the global scope.

To be able to use the if construct we just defined, we must have access to Boolean values. Since there are only two Boolean values, we do not need special syntax for them.
We simply bind two names to the values true and false and use them.

const topScope = Object.create(null);

topScope.true = true;
topScope.false = false;

We can now evaluate a simple expression that negates a Boolean value.

let prog = parse(`if(true, false, true)`);
console.log(evaluate(prog, topScope));
// → false

To supply basic arithmetic and comparison operators, we will also add some function values to the scope. In the interest of keeping the code short, we’ll use Function to synthesize a bunch of operator functions in a loop instead of defining them individually.

for (let op of ["+", "-", "*", "/", "==", "<", ">"]) {
  topScope[op] = Function("a, b", `return a ${op} b;`);
}

It is also useful to have a way to output values, so we’ll wrap console.log in a function and call it print.

topScope.print = value => {
  console.log(value);
  return value;
};

That gives us enough elementary tools to write simple programs. The following function provides a convenient way to parse a program and run it in a fresh scope:

function run(program) {
  return evaluate(parse(program), Object.create(topScope));
}

We’ll use object prototype chains to represent nested scopes so that the program can add bindings to its local scope without changing the top-level scope.

run(`
do(define(total, 0),
   define(count, 1),
   while(

{
  if (!args.length) {
    throw new SyntaxError("Functions need a body");
  }
  let body = args[args.length - 1];
  let params = args.slice(0, args.length - 1).map(expr => {
    if (expr.type != "word") {
      throw new SyntaxError("Parameter names must be words");
    }
    return expr.name;
  });

  return function(...args) {
    if (args.length != params.length) {
      throw new TypeError("Wrong number of arguments");
    }
    let localScope = Object.create(scope);
    for (let i = 0; i

1 in memory location 1.
3. Store the value of memory location 1 in memory location 2.
4. Subtract the number 11 from the value in memory location 2.
5. If the value in memory location 2 is the number 0, continue with instruction 9.
6. Add the value of memory location 1 to memory location 0.
7. Add the number 1 to the value of memory location 1.
8. Continue with instruction 3.
9. Output the value of memory location 0.

Although that is already more readable than the soup of bits, it is still rather obscure. Using names instead of numbers for the instructions and memory locations helps:

Set “total” to 0.
Set “count” to 1.
[loop]
Set “compare” to “count”.
Subtract 11 from “compare”.
If “compare” is zero, continue at [end].
Add “count” to “total”.
Add 1 to “count”.
Continue at [loop].
[end]
Output “total”.

Can you see how the program works at this point? The first two lines give two memory locations their starting values: total will be used to build up the result of the computation, and count will keep track of the number that we are currently looking at. The lines using compare are probably the most confusing ones. The program wants to see whether count is equal to 11 to decide whether it can stop running. Because our hypothetical machine is rather primitive, it can only test whether a number is zero and make a decision based on that. It therefore uses the memory location labeled compare to compute the value of count - 11 and makes a decision based on that value. The next two lines add the value of count to the result and increment count by 1 every time the program decides that count is not 11 yet.

Here is the same program in JavaScript:

let total = 0, count = 1;
while (count

Add a special form set, similar to define, which gives a binding a new value, updating the binding in an outer scope if it doesn’t already exist in the inner scope. If the binding is not defined at all, throw a ReferenceError (another standard error type).

The technique of representing scopes as simple objects, which has made things convenient so far, will get in your way a little at this point. You might want to use the Object.getPrototypeOf function, which returns the prototype of an object.
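As a rough illustration of walking a prototype chain (not a solution to the exercise), the sketch below reports which object in a chain of scope-like objects owns a given property. It combines Object.getPrototypeOf with Object.hasOwn; the function name findOwner and the sample objects outer and inner are invented for this example.

```javascript
// Hypothetical helper: returns the object in the prototype chain
// that has `name` as its own property, or null if none does.
function findOwner(object, name) {
  for (let current = object; current; current = Object.getPrototypeOf(current)) {
    if (Object.hasOwn(current, name)) return current;
  }
  return null;
}

// Two nested scope-like objects, built the way run builds its scopes.
const outer = Object.create(null);
outer.x = 1;
const inner = Object.create(outer);
inner.y = 2;

console.log(findOwner(inner, "y") === inner);
// → true
console.log(findOwner(inner, "x") === outer);
// → true
console.log(findOwner(inner, "z"));
// → null
```

Reading inner.x finds the value through the chain, but an assignment such as inner.x = 5 would create a new property on inner rather than change outer, which is why finding the owning object matters.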
Also remember that you can use Object.hasOwn to find out if a given object has a property.

“The dream behind the Web is of a common information space in which we communicate by sharing information. Its universality is essential: the fact that a hypertext link can point to anything, be it personal, local or global, be it draft or highly polished.”
Tim Berners-Lee, The World Wide Web: A very short personal history

Chapter 13
JavaScript and the Browser

The next chapters of this book will discuss web browsers. Without browsers, there would be no JavaScript, or if there were, no one would ever have paid any attention to it.

Web technology has been decentralized from the start, not just technically but also in terms of the way it evolved. Various browser vendors have added new functionality in ad hoc and sometimes poorly thought-out ways, which were then, sometimes, adopted by others and finally set down in standards.

This is both a blessing and a curse. On the one hand, it is empowering to not have a central party control a system but have it be improved by various parties working in loose collaboration (or occasionally, open hostility). On the other hand, the haphazard way in which the Web was developed means that the resulting system is not exactly a shining example of internal consistency. Some parts of it are downright confusing and badly designed.

Networks and the Internet

Computer networks have been around since the 1950s. If you put cables between two or more computers and allow them to send data back and forth through these cables, you can do all kinds of wonderful things.

If connecting two machines in the same building allows us to do wonderful things, connecting machines all over the planet should be even better. The technology to start implementing this vision was developed in the 1980s, and the resulting network is called the Internet. It has lived up to its promise.

A computer can use this network to shoot bits at another computer.
For any effective communication to arise out of this bit-shooting, the computers on both ends must know what the bits are supposed to represent. The meaning of any given sequence of bits depends entirely on the kind of thing that it is trying to express and on the encoding mechanism used.

A network protocol describes a style of communication over a network. There are protocols for sending email, for fetching email, for sharing files, and even for controlling computers that happen to be infected by malicious software.

The Hypertext Transfer Protocol (HTTP) is a protocol for retrieving named resources (chunks of information, such as web pages or pictures). It specifies that the side making the request should start with a line like this, naming the resource and the version of the protocol that it is trying to use:

GET /index.html HTTP/1.1

There are many more rules about the way the requester can include more information in the request and the way the other side, which returns the resource, packages up its content. We’ll look at HTTP in a little more detail in Chapter 18.

Most protocols are built on top of other protocols. HTTP treats the network as a streamlike device into which you can put bits and have them arrive at the correct destination in the correct order. Providing those guarantees on top of the primitive data-sending that the network gives you is already a rather tricky problem.

The Transmission Control Protocol (TCP) is a protocol that addresses this problem. All Internet-connected devices “speak” it, and most communication on the Internet is built on top of it.

A TCP connection works as follows: one computer must be waiting, or listening, for other computers to start talking to it. To be able to listen for different kinds of communication at the same time on a single machine, each listener has a number (called a port) associated with it. Most protocols specify which port should be used by default.
For example, when we want to send an email using the SMTP protocol, the machine through which we send it is expected to be listening on port 25.

Another computer can then establish a connection by connecting to the target machine using the correct port number. If the target machine can be reached and is listening on that port, the connection is successfully created. The listening computer is called the server, and the connecting computer is called the client.

Such a connection acts as a two-way pipe through which bits can flow: the machines on both ends can put data into it. Once the bits are successfully transmitted, they can be read out again by the machine on the other side. This is a convenient model. You could say that TCP provides an abstraction of the network.

The Web

The World Wide Web (not to be confused with the Internet as a whole) is a set of protocols and formats that allow us to visit web pages in a browser. “Web” refers to the fact that such pages can easily link to each other, thus connecting into a huge mesh that users can move through.

To become part of the Web, all you need to do is connect a machine to the Internet and have it listen on port 80 with the HTTP protocol so that other computers can ask it for documents.

Each document on the Web is named by a Uniform Resource Locator (URL), which looks something like this:

 http://eloquentjavascript.net/13_browser.html
 |      |                      |               |
 protocol       server               path

The first part tells us that this URL uses the HTTP protocol (as opposed to, for example, encrypted HTTP, which would be https://). Then comes the part that identifies which server we are requesting the document from. Last is a path string that identifies the document (or resource) we are interested in.

Machines connected to the Internet get an IP address, a number that can be used to send messages to that machine, and looks something like 149.210.142.219 or 2001:4860:4860::8888.
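The three parts of a URL described above can also be picked apart from JavaScript. This small sketch uses the standard URL class, which this chapter does not introduce but which is available in both browsers and Node.js. Note that its protocol property includes the colon but not the slashes.

```javascript
// Split the example URL into the parts labeled in the diagram.
const url = new URL("http://eloquentjavascript.net/13_browser.html");

console.log(url.protocol);
// → http:
console.log(url.hostname);
// → eloquentjavascript.net
console.log(url.pathname);
// → /13_browser.html
```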
Since lists of more or less random numbers are hard to remember and awkward to type, you can instead register a domain name for an address or set of addresses. I registered eloquentjavascript.net to point at the IP address of a machine I control and can thus use that domain name to serve web pages.

If you type this URL into your browser’s address bar, the browser will try to retrieve and display the document at that URL. First, your browser has to find out what address eloquentjavascript.net refers to. Then, using the HTTP protocol, it will make a connection to the server at that address and ask for the resource /13_browser.html. If all goes well, the server sends back a document, which your browser then displays on your screen.

HTML

HTML, which stands for Hypertext Markup Language, is the document format used for web pages. An HTML document contains text, as well as tags that give structure to the text, describing things such as links, paragraphs, and headings.

A short HTML document might look like this:

My home page
My home page
Hello, I am Marijn and this is my home page.
I also wrote a book! Read it here.

This is what such a document would look like in the browser:

The tags, wrapped in angle brackets (< and >, the symbols for less than and greater than), provide information about the structure of the document. The other text is just plain text.

The document starts with <!doctype html>, which tells the browser to interpret the page as modern HTML,

go beyond the exercises.

The easiest way to run the example code in the book (and to experiment with it) is to look it up in the online version of the book at https://eloquentjavascript.net. There, you can click any code example to edit and run it and to see the output it produces. To work on the exercises, go to https://eloquentjavascript.net/code, which provides starting code for each coding exercise and allows you to look at the solutions.

Running the programs defined in this book outside of the book’s website requires some care.
Many examples stand on their own and should work in any JavaScript environment. But code in later chapters is often written for a specific environment (the browser or Node.js) and can run only there. In addition, many chapters define bigger programs, and the pieces of code that appear in them depend on each other or on external files. The sandbox on the website provides links to ZIP files containing all the scripts and data files necessary to run the code for a given chapter.

Overview of this book

This book contains roughly three parts. The first 12 chapters discuss the JavaScript language. The next seven chapters are about web browsers and the way JavaScript is used to program them. Finally, two chapters are devoted to Node.js, another environment to program JavaScript in. There are five project chapters in the book that describe larger example programs to give you a taste of actual programming.

The language part of the book starts with four chapters that introduce the basic structure of the JavaScript language. They discuss control structures (such as the while word you saw in this introduction), functions (writing your own building blocks), and data structures. After these, you will be able to write basic programs. Next, Chapters 5 and 6 introduce techniques to use functions and objects to write more abstract code and keep complexity under control.

After a first project chapter that builds a crude delivery robot, the language part of the book continues with chapters on error handling and bug fixing, regular expressions (an important tool for working with text), modularity (another defense against complexity), and asynchronous programming (dealing with events that take time). The second project chapter, where we implement a programming language, concludes the first part of the book.
The second part of the book, Chapters 13 to 19, describes the tools that browser JavaScript has access to. You’ll learn to display things on the screen (Chapters 14 and 17), respond to user input (Chapter 15), and communicate over the network (Chapter 18). There are again two project chapters in this part, building a platform game and a pixel paint program.

Chapter 20 describes Node.js, and Chapter 21 builds a small website using that tool.

Typographic conventions

In this book, text written in a monospaced font will represent elements of programs. Sometimes these are self-sufficient fragments, and sometimes they just refer to part of a nearby program. Programs (of which you have already seen a few) are written as follows:

function factorial(n) {
  if (n == 0) {
    return 1;
  } else {
    return factorial(n - 1) * n;
  }
}

Sometimes, to show the output that a program produces, the expected output is written after it, with two slashes and an arrow in front.

console.log(factorial(8));
// → 40320

Good luck!

“Below the surface of the machine, the program moves. Without effort, it expands and contracts. In great harmony, electrons scatter and regroup. The forms on the monitor are but ripples on the water. The essence stays invisibly below.”
Master Yuan-Ma, The Book of Programming

Chapter 1
Values, Types, and Operators

In the computer’s world, there is only data. You can read data, modify data, create new data, but that which isn’t data cannot be mentioned. All this data is stored as long sequences of bits and is thus fundamentally alike.

Bits are any kind of two-valued things, usually described as zeros and ones. Inside the computer, they take forms such as a high or low electrical charge, a strong or weak signal, or a shiny or dull spot on the surface of a CD. Any piece of discrete information can be reduced to a sequence of zeros and ones and thus represented in bits.

For example, we can express the number 13 in bits.
This works the same way as a decimal number, but instead of 10 different digits, we have only 2, and the weight of each increases by a factor of 2 from right to left. Here are the bits that make up the number 13, with the weights of the digits shown below them:

   0   0   0   0   1   1   0   1
 128  64  32  16   8   4   2   1

That’s the binary number 00001101. Its non-zero digits stand for 8, 4, and 1, and add up to 13.

Values

Imagine a sea of bits, an ocean of them. A typical modern computer has more than 100 billion bits in its volatile data storage (working memory). Nonvolatile storage (the hard disk or equivalent) tends to have yet a few orders of magnitude more.

To be able to work with such quantities of bits without getting lost, we separate them into chunks that represent pieces of information. In a JavaScript environment, those chunks are called values. Though all values are made of bits, they play different roles. Every value has a type that determines its role. Some values are numbers, some values are pieces of text, some values are functions, and so on.

To create a value, you must merely invoke its name. This is convenient. You don’t have to gather building material for your values or pay for them. You just call for one, and whoosh, you have it. Of course, values are not really created from thin air. Each one has to be stored somewhere, and if you want to use a gigantic number of them at the same time, you might run out of computer memory. Fortunately, this is a problem only if you need them all simultaneously. As soon as you no longer use a value, it will dissipate, leaving behind its bits to be recycled as building material for the next generation of values.

The remainder of this chapter introduces the atomic elements of JavaScript programs, that is, the simple value types and the operators that can act on such values.

Numbers

Values of the number type are, unsurprisingly, numeric values.
In a JavaScript program, they are written as follows:

13

Using that in a program will cause the bit pattern for the number 13 to come into existence inside the computer’s memory.

JavaScript uses a fixed number of bits, 64 of them, to store a single number value. There are only so many patterns you can make with 64 bits, which limits the number of different numbers that can be represented. With N decimal digits, you can represent 10^N numbers. Similarly, given 64 binary digits, you can represent 2^64 different numbers, which is about 18 quintillion (an 18 with 18 zeros after it). That’s a lot.

Computer memory used to be much smaller, and people tended to use groups of 8 or 16 bits to represent their numbers. It was easy to accidentally overflow such small numbers, that is, to end up with a number that did not fit into the given number of bits. Today, even computers that fit in your pocket have plenty of memory, so you are free to use 64-bit chunks, and you need to worry about overflow only when dealing with truly astronomical numbers.

Not all whole numbers less than 18 quintillion fit in a JavaScript number, though. Those bits also store negative numbers, so one bit indicates the sign of the number. A bigger issue is representing nonwhole numbers. To do this, some of the bits are used to store the position of the decimal point. The actual maximum whole number that can be stored is more in the range of 9 quadrillion (15 zeros), which is still pleasantly huge.

Fractional numbers are written using a dot:

9.81

For very big or very small numbers, you may also use scientific notation by adding an e (for exponent), followed by the exponent of the number:

2.998e8

That’s 2.998 × 10^8 = 299,800,000.

Calculations with whole numbers (also called integers) that are smaller than the aforementioned 9 quadrillion are guaranteed to always be precise. Unfortunately, calculations with fractional numbers are generally not.
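Both limits can be observed directly. The following sketch uses console.log, which the book introduces a little later, to print values, and Number.MAX_SAFE_INTEGER, a standard constant this chapter does not mention, which holds the largest whole number below which integer arithmetic stays exact. The binding name sum is chosen just for this example.

```javascript
// A fractional calculation whose result is slightly off.
const sum = 0.1 + 0.2;
console.log(sum);
// → 0.30000000000000004
console.log(sum == 0.3);
// → false

// The "9 quadrillion" limit for exact whole numbers.
console.log(Number.MAX_SAFE_INTEGER);
// → 9007199254740991

// Just beyond that limit, adding 1 no longer changes the value.
console.log(9007199254740992 + 1 == 9007199254740992);
// → true
```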
Just as π (pi) cannot be precisely expressed by a finite number of decimal digits, many numbers lose some precision when only 64 bits are available to store them. This is a shame, but it causes practical problems only in specific situations. The important thing is to be aware of it and treat fractional digital numbers as approximations, not as precise values.

Arithmetic

The main thing to do with numbers is arithmetic. Arithmetic operations such as addition or multiplication take two number values and produce a new number from them. Here is what they look like in JavaScript:

100 + 4 * 11

The + and * symbols are called operators. The first stands for addition and the second stands for multiplication. Putting an operator between two values will apply it to those values and produce a new value.

Does this example mean “Add 4 and 100, and multiply the result by 11”, or is the multiplication done before the adding? As you might have guessed, the multiplication happens first. As in mathematics, you can change this by wrapping the addition in parentheses:

(100 + 4) * 11

For subtraction, there is the - operator. Division can be done with the / operator.

When operators appear together without parentheses, the order in which they are applied is determined by the precedence of the operators. The example shows that multiplication comes before addition. The / operator has the same precedence as *. Likewise, + and - have the same precedence. When multiple operators with the same precedence appear next to each other, as in 1 - 2 + 1, they are applied left to right: (1 - 2) + 1.

Don’t worry too much about these precedence rules. When in doubt, just add parentheses.

There is one more arithmetic operator, which you might not immediately recognize. The % symbol is used to represent the remainder operation. X % Y is the remainder of dividing X by Y. For example, 314 % 100 produces 14, and 144 % 12 gives 0.
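The expressions discussed in this section can be checked by printing their values. This sketch relies on console.log, which the book introduces a little later.

```javascript
// Multiplication binds more tightly than addition.
console.log(100 + 4 * 11);
// → 144
// Parentheses force the addition to happen first.
console.log((100 + 4) * 11);
// → 1144
// Operators of equal precedence apply left to right.
console.log(1 - 2 + 1);
// → 0
// The remainder operator.
console.log(314 % 100);
// → 14
console.log(144 % 12);
// → 0
```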
The remainder operator’s precedence is the same as that of multiplication and division. You’ll also often see this operator referred to as modulo.

Special numbers

There are three special values in JavaScript that are considered numbers but don’t behave like normal numbers. The first two are Infinity and -Infinity, which represent the positive and negative infinities. Infinity - 1 is still Infinity, and so on. Don’t put too much trust in infinity-based computation, though. It isn’t mathematically sound, and it will quickly lead to the next special number: NaN.

NaN stands for “not a number”, even though it is a value of the number type. You’ll get this result when you, for example, try to calculate 0 / 0 (zero divided by zero), Infinity - Infinity, or any number of other numeric operations that don’t yield a meaningful result.

Strings

The next basic data type is the string. Strings are used to represent text. They are written by enclosing their content in quotes.

`Down on the sea`
"Lie on the ocean"
'Float on the ocean'

You can use single quotes, double quotes, or backticks to mark strings, as long as the quotes at the start and the end of the string match.

You can put almost anything between quotes to have JavaScript make a string value out of it. But a few characters are more difficult. You can imagine how putting quotes between quotes might be hard, since they will look like the end of the string. Newlines (the characters you get when you press enter) can be included only when the string is quoted with backticks (`).

To make it possible to include such characters in a string, the following notation is used: a backslash (\) inside quoted text indicates that the character after it has a special meaning. This is called escaping the character. A quote that is preceded by a backslash will not end the string but be part of it. When an n character occurs after a backslash, it is interpreted as a newline. Similarly, a t after a backslash means a tab character.
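As a small check of those escaping rules, the sketch below builds one string containing escaped quotes and one containing a tab. The binding names quoted and tabbed are chosen for this example, and it uses console.log and the length property, both of which the book covers later; length counts the elements in a string, so each escape sequence counts as one.

```javascript
// Escaped double quotes become part of the string.
const quoted = "She said \"hi\"";
// \t is a single tab character.
const tabbed = "a\tb";

console.log(quoted);
// → She said "hi"
console.log(quoted.length);
// → 13
console.log(tabbed.length);
// → 3
```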
Take the following string:

"This is the first line\nAnd this is the second"

This is the actual text in that string:

This is the first line
And this is the second

There are, of course, situations where you want a backslash in a string to be just a backslash, not a special code. If two backslashes follow each other, they will collapse together, and only one will be left in the resulting string value. This is how the string “A newline character is written like "\n".” can be expressed:

"A newline character is written like \"\\n\"."

Strings, too, have to be modeled as a series of bits to be able to exist inside the computer. The way JavaScript does this is based on the Unicode standard. This standard assigns a number to virtually every character you would ever need, including characters from Greek, Arabic, Japanese, Armenian, and so on. If we have a number for every character, a string can be described by a sequence of numbers. And that’s what JavaScript does.

There’s a complication though: JavaScript’s representation uses 16 bits per string element, which can describe up to 2^16 different characters. However, Unicode defines more characters than that, about twice as many at this point. So some characters, such as many emoji, take up two “character positions” in JavaScript strings. We’ll come back to this in Chapter 5.

Strings cannot be divided, multiplied, or subtracted. The + operator can be used on them, not to add, but to concatenate, that is, to glue two strings together. The following line will produce the string "concatenate":

"con" + "cat" + "e" + "nate"

String values have a number of associated functions (methods) that can be used to perform other operations on them. I’ll say more about these in Chapter 4.

Strings written with single or double quotes behave very much the same; the only difference lies in which type of quote you need to escape inside of them. Backtick-quoted strings, usually called template literals, can do a few more tricks.
Apart from being able to span lines, they can also embed other values.

`half of 100 is ${100 / 2}`

When you write something inside ${} in a template literal, its result will be computed, converted to a string, and included at that position. This example produces “half of 100 is 50”.

Unary operators

Not all operators are symbols. Some are written as words. One example is the typeof operator, which produces a string value naming the type of the value you give it.

console.log(typeof 4.5)
// → number
console.log(typeof "x")
// → string

We will use console.log in example code to indicate that we want to see the result of evaluating something. More about that in the next chapter.

The other operators shown so far in this chapter all operated on two values, but typeof takes only one. Operators that use two values are called binary operators, while those that take one are called unary operators. The minus operator can be used both as a binary operator and as a unary operator.

console.log(- (10 - 2))
// → -8

Boolean values

It is often useful to have a value that distinguishes between only two possibilities, like “yes” and “no” or “on” and “off”. For this purpose, JavaScript has a Boolean type, which has just two values, true and false, written as those words.

Comparison

Here is one way to produce Boolean values:

console.log(3 > 2)
// → true
console.log(3 andare always “less” than lowercase ones, so "Z" = (greater than or equal to), , ==, and so on), and then the rest.

This order has been chosen such that, in typical expressions like the following one, as few parentheses as possible are necessary:

1 + 1 == 2 && 10 * 10 > 50

The last logical operator we will look at is not unary, not binary, but ternary, operating on three values. It is written with a question mark and a colon, like this:

console.log(true ? 1 : 2);
// → 1
console.log(false ?
1 : 2);
// → 2

This one is called the conditional operator (or sometimes just the ternary operator since it is the only such operator in the language). The operator uses the value to the left of the question mark to decide which of the two other values to “pick”. If you write a ? b : c, the result will be b when a is true and c otherwise.

Empty values

There are two special values, written null and undefined, that are used to denote the absence of a meaningful value. They are themselves values, but they carry no information.

Many operations in the language that don’t produce a meaningful value yield undefined simply because they have to yield some value.

The difference in meaning between undefined and null is an accident of JavaScript’s design, and it doesn’t matter most of the time. In cases where you actually have to concern yourself with these values, I recommend treating them as mostly interchangeable.

Automatic type conversion

In the Introduction, I mentioned that JavaScript goes out of its way to accept almost any program you give it, even programs that do odd things. This is nicely demonstrated by the following expressions:

console.log(8 * null)
// → 0
console.log("5" - 1)
// → 4
console.log("5" + 1)
// → 51
console.log("five" * 2)
// → NaN
console.log(false == 0)
// → true

When an operator is applied to the “wrong” type of value, JavaScript will quietly convert that value to the type it needs, using a set of rules that often aren’t what you want or expect. This is called type coercion. The null in the first expression becomes 0 and the "5" in the second expression becomes 5 (from string to number). Yet in the third expression, + tries string concatenation before numeric addition, so the 1 is converted to "1" (from number to string).

When something that doesn’t map to a number in an obvious way (such as "five" or undefined) is converted to a number, you get the value NaN.
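These conversions can be made visible with the standard Number function, which is not introduced in this chapter. When called on a string, null, or undefined, it performs the same conversion to a number that the arithmetic operators apply implicitly.

```javascript
// Explicit versions of the implicit conversions shown above.
console.log(Number("5"));
// → 5
console.log(Number(null));
// → 0
console.log(Number("five"));
// → NaN
console.log(Number(undefined));
// → NaN
```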
Further arithmetic operations on NaN keep producing NaN, so if you find yourself getting one of those in an unexpected place, look for accidental type conversions.

When comparing values of the same type using the == operator, the outcome is easy to predict: you should get true when both values are the same, except in the case of NaN. But when the types differ, JavaScript uses a complicated and confusing set of rules to determine what to do. In most cases, it just tries to convert one of the values to the other value’s type. However, when null or undefined occurs on either side of the operator, it produces true only if both sides are one of null or undefined.

console.log(null == undefined);
// → true
console.log(null == 0);
// → false

That behavior is often useful. When you want to test whether a value has a real value instead of null or undefined, you can compare it to null with the == or != operator.

What if you want to test whether something refers to the precise value false? Expressions like 0 == false and "" == false are also true because of automatic type conversion. When you do not want any type conversions to happen, there are two additional operators: === and !==. The first tests whether a value is precisely equal to the other, and the second tests whether it is not precisely equal. Thus "" === false is false as expected.

I recommend using the three-character comparison operators defensively to prevent unexpected type conversions from tripping you up. But when you’re certain the types on both sides will be the same, there is no problem with using the shorter operators.

Short-circuiting of logical operators

The logical operators && and || handle values of different types in a peculiar way. They will convert the value on their left side to Boolean type in order to decide what to do, but depending on the operator and the result of that conversion, they will return either the original left-hand value or the right-hand value.
The || operator, for example, will return the value to its left when that value can be converted to true and will return the value on its right otherwise. This has the expected effect when the values are Boolean and does something analogous for values of other types.

console.log(null || "user")
// → user
console.log("Agnes" || "user")
// → Agnes

We can use this functionality as a way to fall back on a default value. If you have a value that might be empty, you can put || after it with a replacement value. If the initial value can be converted to false, you’ll get the replacement instead. The rules for converting strings and numbers to Boolean values state that 0, NaN, and the empty string ("") count as false, while all the other values count as true. That means 0 || -1 produces -1, and "" || "!?" yields "!?".

The ?? operator resembles ||, but returns the value on the right only if the one on the left is null or undefined, not if it is some other value that can be converted to false. Often, this is preferable to the behavior of ||.

console.log(0 || 100);
// → 100
console.log(0 ?? 100);
// → 0
console.log(null ?? 100);
// → 100

The && operator works similarly but the other way around. When the value to its left is something that converts to false, it returns that value, and otherwise it returns the value on its right.

Another important property of these two operators is that the part to their right is evaluated only when necessary. In the case of true || X, no matter what X is (even if it’s a piece of program that does something terrible), the result will be true, and X is never evaluated. The same goes for false && X, which is false and will ignore X. This is called short-circuit evaluation.

The conditional operator works in a similar way. Of the second and third values, only the one that is selected is evaluated.

Summary

We looked at four types of JavaScript values in this chapter: numbers, strings, Booleans, and undefined values.
Such values are created by typing in their name (true, null) or value (13, "abc").

You can combine and transform values with operators. We saw binary operators for arithmetic (+, -, *, /, and %), string concatenation (+), comparison (==, !=, ===, !==, <, >, <=, >=), and logic (&&, ||, ??), as well as several unary operators (- to negate a number, ! to negate logically, and typeof to find a value’s type) and a ternary operator (?:) to pick one of two values based on a third value.

This gives you enough information to use JavaScript as a pocket calculator but not much more. The next chapter will start tying these expressions together into basic programs.

“And my heart glows bright red under my filmy, translucent skin and they have to administer 10cc of JavaScript to get me to come back. (I respond well to toxins in the blood.) Man, that stuff will kick the peaches right out your gills!”
_why, Why’s (Poignant) Guide to Ruby

Chapter 2
Program Structure

In this chapter, we will start to do things that can actually be called programming. We will expand our command of the JavaScript language beyond the nouns and sentence fragments we’ve seen so far to the point where we can express meaningful prose.

Expressions and statements

In Chapter 1, we made values and applied operators to them to get new values. Creating values like this is the main substance of any JavaScript program. But that substance has to be framed in a larger structure to be useful. That’s what we’ll cover in this chapter.

A fragment of code that produces a value is called an expression. Every value that is written literally (such as 22 or "psychoanalysis") is an expression. An expression between parentheses is also an expression, as is a binary operator applied to two expressions or a unary operator applied to one.

This shows part of the beauty of a language-based interface.
Expressions can contain other expressions in a way similar to how subsentences in human languages are nested: a subsentence can contain its own subsentences, and so on. This allows us to build expressions that describe arbitrarily complex computations.

If an expression corresponds to a sentence fragment, a JavaScript statement corresponds to a full sentence. A program is a list of statements.

The simplest kind of statement is an expression with a semicolon after it. This is a program:

1;
!false;

It is a useless program, though. An expression can be content to just produce a value, which can then be used by the enclosing code. However, a statement stands on its own, so if it doesn’t affect the world, it’s useless. It may display something on the screen, as with console.log, or change the state of the machine in a way that will affect the statements that come after it. These changes are called side effects. The statements in the previous example just produce the values 1 and true and then immediately throw them away. This leaves no impression on the world at all. When you run this program, nothing observable happens.

In some cases, JavaScript allows you to omit the semicolon at the end of a statement. In other cases, it has to be there, or the next line will be treated as part of the same statement. The rules for when it can be safely omitted are somewhat complex and error-prone. So in this book, every statement that needs a semicolon will always get one. I recommend you do the same, at least until you’ve learned more about the subtleties of missing semicolons.

Bindings

How does a program keep an internal state? How does it remember things? We have seen how to produce new values from old values, but this does not change the old values, and the new value must be used immediately or it will dissipate again. To catch and hold values, JavaScript provides a thing called a binding, or variable:

let caught = 5 * 5;

That gives us a second kind of statement.
The special word (keyword) let indicates that this sentence is going to define a binding. It is followed by the name of the binding and, if we want to immediately give it a value, by an = operator and an expression.

The example creates a binding called caught and uses it to grab hold of the number that is produced by multiplying 5 by 5.

After a binding has been defined, its name can be used as an expression. The value of such an expression is the value the binding currently holds. Here’s an example:

let ten = 10;
console.log(ten * ten);
// → 100

When a binding points at a value, that does not mean it is tied to that value forever. The = operator can be used at any time on existing bindings to disconnect them from their current value and have them point to a new one:

let mood = "light";
console.log(mood);
// → light
mood = "dark";
console.log(mood);
// → dark

You should imagine bindings as tentacles rather than boxes. They do not contain values; they grasp them, and two bindings can refer to the same value. A program can access only the values to which it still has a reference. When you need to remember something, you either grow a tentacle to hold on to it or reattach one of your existing tentacles to it.

Let’s look at another example. To remember the number of dollars that Luigi still owes you, you create a binding. When he pays back $35, you give this binding a new value:

let luigisDebt = 140;
luigisDebt = luigisDebt - 35;
console.log(luigisDebt);
// → 105

When you define a binding without giving it a value, the tentacle has nothing to grasp, so it ends in thin air. If you ask for the value of an empty binding, you’ll get the value undefined.

A single let statement may define multiple bindings.
The definitions must be separated by commas:

let one = 1, two = 2;
console.log(one + two);
// → 3

The words var and const can also be used to create bindings, in a similar fashion to let:

var name = "Ayda";
const greeting = "Hello ";
console.log(greeting + name);
// → Hello Ayda

The first of these, var (short for “variable”), is the way bindings were declared in pre-2015 JavaScript, when let didn’t exist yet. I’ll get back to the precise way it differs from let in the next chapter. For now, remember that it mostly does the same thing, but we’ll rarely use it in this book because it behaves oddly in some situations.

The word const stands for constant. It defines a constant binding, which points at the same value for as long as it lives. This is useful for bindings that just give a name to a value so that you can easily refer to it later.

Binding names

Binding names can be any sequence of one or more letters. Digits can be part of binding names (catch22 is a valid name, for example), but the name must not start with a digit. A binding name may include dollar signs ($) or underscores (_) but no other punctuation or special characters.

Words with a special meaning, such as let, are keywords, and may not be used as binding names. There are also a number of words that are “reserved for use” in future versions of JavaScript, which also can’t be used as binding names. The full list of keywords and reserved words is rather long:

break case catch class const continue debugger default
delete do else enum export extends false finally for
function if implements import interface in instanceof let
new package private protected public return static super
switch this throw true try typeof var void while with yield

Don’t worry about memorizing this list. When creating a binding produces an unexpected syntax error, check whether you’re trying to define a reserved word.

The environment

The collection of bindings and their values that exist at a given time is called the environment.
When a program starts up, this environment is not empty. It always contains bindings that are part of the language standard, and most of the time, it also has bindings that provide ways to interact with the surrounding system. For example, in a browser, there are functions to interact with the currently loaded website and to read mouse and keyboard