1 00:00:00,000 --> 00:00:00,530 2 00:00:00,530 --> 00:00:02,960 The following content is provided under a Creative 3 00:00:02,960 --> 00:00:04,370 Commons license. 4 00:00:04,370 --> 00:00:07,410 Your support will help MIT OpenCourseWare continue to 5 00:00:07,410 --> 00:00:11,060 offer high quality educational resources for free. 6 00:00:11,060 --> 00:00:13,960 To make a donation or view additional materials from 7 00:00:13,960 --> 00:00:19,790 hundreds of MIT courses, visit MIT OpenCourseWare at 8 00:00:19,790 --> 00:00:21,040 ocw.mit.edu. 9 00:00:21,040 --> 00:00:22,925 10 00:00:22,925 --> 00:00:25,470 PROFESSOR: So we're going to pick up where we left off last 11 00:00:25,470 --> 00:00:26,720 Friday on recursion. 12 00:00:26,720 --> 00:00:30,420 13 00:00:30,420 --> 00:00:34,240 Well first of all, can anyone tell me what recursion is or 14 00:00:34,240 --> 00:00:35,640 what a recursive function is? 15 00:00:35,640 --> 00:00:41,615 16 00:00:41,615 --> 00:00:43,180 No one knows. 17 00:00:43,180 --> 00:00:43,652 OK. 18 00:00:43,652 --> 00:00:45,540 AUDIENCE: To divide and conquer. 19 00:00:45,540 --> 00:00:46,490 PROFESSOR: OK. 20 00:00:46,490 --> 00:00:49,710 It's a divide-and-conquer technique. 21 00:00:49,710 --> 00:00:53,350 How does it do it? 22 00:00:53,350 --> 00:00:56,228 It's a recursive function. 23 00:00:56,228 --> 00:00:58,618 AUDIENCE: You have a base case. 24 00:00:58,618 --> 00:01:01,804 And usually a return function, you return 25 00:01:01,804 --> 00:01:03,398 the value of something. 26 00:01:03,398 --> 00:01:05,788 But because you keep returning the function back. 27 00:01:05,788 --> 00:01:06,640 PROFESSOR: Right. 28 00:01:06,640 --> 00:01:10,590 So it's a function that calls itself. 29 00:01:10,590 --> 00:01:14,510 And it works by one, identifying a base case, which 30 00:01:14,510 --> 00:01:18,010 is the smallest sub-problem possible. 
31 00:01:18,010 --> 00:01:21,610 And then in the other case, the recursive case, it tries 32 00:01:21,610 --> 00:01:25,910 to chunk the problem down into a smaller sub-problem that it 33 00:01:25,910 --> 00:01:27,575 then solves by calling itself. 34 00:01:27,575 --> 00:01:31,300 35 00:01:31,300 --> 00:01:33,630 So we went over a few examples. 36 00:01:33,630 --> 00:01:41,500 And one of the things that we wanted to talk about is that 37 00:01:41,500 --> 00:01:44,310 for many problems, a recursive function can be written 38 00:01:44,310 --> 00:01:49,360 iteratively, or actually for most problems. 39 00:01:49,360 --> 00:01:56,070 So there usually is some sort of a subjective choice in how 40 00:01:56,070 --> 00:01:57,540 to write a function. 41 00:01:57,540 --> 00:02:00,440 And it really comes down to ease of understanding. 42 00:02:00,440 --> 00:02:03,840 So what we've done is we've taken a couple of the 43 00:02:03,840 --> 00:02:06,810 algorithms that we showed last week. 44 00:02:06,810 --> 00:02:09,130 And we've written them iteratively. 45 00:02:09,130 --> 00:02:10,440 We also have the recursive version. 46 00:02:10,440 --> 00:02:17,230 We'll compare and contrast and see where we would want to use 47 00:02:17,230 --> 00:02:18,930 a recursive function. 48 00:02:18,930 --> 00:02:24,900 So on the screen on the left here, we have the 49 00:02:24,900 --> 00:02:28,670 multiplication version, the recursive multiplication, that 50 00:02:28,670 --> 00:02:30,440 we showed you last week. 51 00:02:30,440 --> 00:02:33,310 It's a little different because it turns out there's a 52 00:02:33,310 --> 00:02:35,150 couple of simplifications you can make. 53 00:02:35,150 --> 00:02:40,930 54 00:02:40,930 --> 00:02:42,870 Can someone walk me through how this works? 55 00:02:42,870 --> 00:02:46,470 So what's my base case first? 56 00:02:46,470 --> 00:02:50,646 57 00:02:50,646 --> 00:02:51,574 AUDIENCE: 0 58 00:02:51,574 --> 00:02:53,060 PROFESSOR: 0, right? 
59 00:02:53,060 --> 00:02:58,010 And so obviously when n is 0, if we multiply by 0, then our 60 00:02:58,010 --> 00:02:59,540 result is 0. 61 00:02:59,540 --> 00:03:03,130 Now there are two recursive cases. 62 00:03:03,130 --> 00:03:09,600 And I'm not really sure how to explain this intuitively. 63 00:03:09,600 --> 00:03:12,840 But let's say that my n is positive. 64 00:03:12,840 --> 00:03:15,210 So I'm multiplying by a positive n. 65 00:03:15,210 --> 00:03:21,630 Well, then all I'm going to do is take m and just add it to 66 00:03:21,630 --> 00:03:26,280 the recursive version of itself n minus 1 times. 67 00:03:26,280 --> 00:03:29,610 That's how to read that. 68 00:03:29,610 --> 00:03:36,120 And then analogously, if n is less than or equal to negative 1, then 69 00:03:36,120 --> 00:03:39,210 I'm going to take negative m and add the recursive result 70 00:03:39,210 --> 00:03:43,060 of n plus 1 times m. 71 00:03:43,060 --> 00:03:45,560 72 00:03:45,560 --> 00:03:48,100 It is not too intuitive, right? 73 00:03:48,100 --> 00:03:53,506 So if we implement it iteratively though, I think 74 00:03:53,506 --> 00:03:55,830 it's a little easier to understand. 75 00:03:55,830 --> 00:03:58,490 Now this is also a subjective judgment, 76 00:03:58,490 --> 00:03:59,810 so you might disagree. 77 00:03:59,810 --> 00:04:01,060 You're free to. 78 00:04:01,060 --> 00:04:02,940 79 00:04:02,940 --> 00:04:05,940 Here's our base case again. n is equal to 0 or m is equal to 80 00:04:05,940 --> 00:04:08,240 0, return 0. 81 00:04:08,240 --> 00:04:12,750 In this case though, if we don't have the base case, then 82 00:04:12,750 --> 00:04:15,560 we're going to initialize the result variable. 83 00:04:15,560 --> 00:04:20,480 And then for n is greater than or equal to 1, we're just 84 00:04:20,480 --> 00:04:23,000 going to enter a while loop and keep adding 85 00:04:23,000 --> 00:04:25,535 m to result n times.
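The code on screen is not captured in the transcript. As a rough sketch of the two multiplication versions being described, assuming an integer n and the hypothetical names rec_mult and iter_mult, it might look something like this in Python 3:

```python
def rec_mult(m, n):
    """Multiply m by an integer n using only addition and recursion."""
    if n == 0:                        # base case: anything times 0 is 0
        return 0
    if n >= 1:                        # positive n: m + m * (n - 1)
        return m + rec_mult(m, n - 1)
    return -m + rec_mult(m, n + 1)    # negative n: -m + m * (n + 1)


def iter_mult(m, n):
    """Same result, built up with while loops instead of recursion."""
    if n == 0 or m == 0:              # base case, as in the lecture
        return 0
    result = 0
    while n >= 1:                     # add m once per unit of positive n
        result += m
        n -= 1
    while n <= -1:                    # subtract m once per unit of negative n
        result -= m
        n += 1
    return result
```

Only one of the two while loops ever runs for a given n, which mirrors the two recursive cases.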
86 00:04:25,535 --> 00:04:28,670 87 00:04:28,670 --> 00:04:31,630 It's a little bit easier to understand I think than the 88 00:04:31,630 --> 00:04:33,620 recursive version. 89 00:04:33,620 --> 00:04:36,200 And then same thing for-- 90 00:04:36,200 --> 00:04:37,950 Oh I have a bug here. 91 00:04:37,950 --> 00:04:44,790 92 00:04:44,790 --> 00:04:49,020 Same thing for n is less than equal to negative 1. 93 00:04:49,020 --> 00:04:55,020 So I'm going to run the two versions of this function. 94 00:04:55,020 --> 00:05:00,250 So here's the recursive version and here's the 95 00:05:00,250 --> 00:05:01,330 iterative function. 96 00:05:01,330 --> 00:05:03,870 They both return the same exact thing. 97 00:05:03,870 --> 00:05:07,000 They both work in generally the same manner. 98 00:05:07,000 --> 00:05:10,260 It's just that in one case we're using recursion to solve 99 00:05:10,260 --> 00:05:15,350 it, which I don't find too intuitive. 100 00:05:15,350 --> 00:05:19,430 And the other case, we're using WHILE loops. 101 00:05:19,430 --> 00:05:20,580 All right. 102 00:05:20,580 --> 00:05:26,040 So in this case in my opinion, writing this iteratively, it 103 00:05:26,040 --> 00:05:28,650 makes a little bit more intuitive sense. 104 00:05:28,650 --> 00:05:47,260 But in other cases, let's say good old Fibonacci, we can 105 00:05:47,260 --> 00:05:49,460 write the recursive version and then 106 00:05:49,460 --> 00:05:50,710 the iterative version. 107 00:05:50,710 --> 00:05:53,480 108 00:05:53,480 --> 00:05:55,260 So here's recursive Fibonacci. 109 00:05:55,260 --> 00:05:58,975 We have our base case or cases. 110 00:05:58,975 --> 00:06:02,360 And then we have our recursive case. 111 00:06:02,360 --> 00:06:05,990 You can almost rewrite the mathematical 112 00:06:05,990 --> 00:06:07,430 formula from this directly. 113 00:06:07,430 --> 00:06:18,342 114 00:06:18,342 --> 00:06:19,592 All right now. 115 00:06:19,592 --> 00:06:22,438 116 00:06:22,438 --> 00:06:23,765 Now what was it? 
117 00:06:23,765 --> 00:06:34,080 118 00:06:34,080 --> 00:06:36,250 So here's the iterative version of Fibonacci. 119 00:06:36,250 --> 00:06:41,480 120 00:06:41,480 --> 00:06:45,070 We still have our base case. 121 00:06:45,070 --> 00:06:50,210 But when we get to what was the recursive case, we have to 122 00:06:50,210 --> 00:06:53,310 do a lot of bookkeeping. 123 00:06:53,310 --> 00:06:58,000 We have to save off the previous Fibonacci what we're 124 00:06:58,000 --> 00:07:00,340 currently computing. 125 00:07:00,340 --> 00:07:07,360 And then we have to iterate, get the next Fibonacci and 126 00:07:07,360 --> 00:07:13,200 then save off the prior versions of it. 127 00:07:13,200 --> 00:07:16,760 This is all stuff that in the recursive version gets done 128 00:07:16,760 --> 00:07:19,820 for us by virtue of just calling another function. 129 00:07:19,820 --> 00:07:22,470 130 00:07:22,470 --> 00:07:26,730 So this is an example of a case where your recursive 131 00:07:26,730 --> 00:07:31,370 version is actually a little bit easier to understand. 132 00:07:31,370 --> 00:07:33,400 Doesn't mean that it's more efficient. 133 00:07:33,400 --> 00:07:36,250 And later on in the class we'll actually use this to 134 00:07:36,250 --> 00:07:40,150 talk about complexity. 135 00:07:40,150 --> 00:07:44,090 But the left version I think is easier to understand than 136 00:07:44,090 --> 00:07:45,170 the right version. 137 00:07:45,170 --> 00:07:46,575 Are there any disagreements? 138 00:07:46,575 --> 00:07:50,870 139 00:07:50,870 --> 00:07:53,310 If you disagree, I'm not going to bite you. 140 00:07:53,310 --> 00:07:56,260 So anyway, we can run this. 141 00:07:56,260 --> 00:07:58,860 And we can see that the output's identical. 142 00:07:58,860 --> 00:08:01,845 143 00:08:01,845 --> 00:08:03,673 AUDIENCE: What's x-range? 144 00:08:03,673 --> 00:08:04,923 PROFESSOR: x-range? 145 00:08:04,923 --> 00:08:07,445 146 00:08:07,445 --> 00:08:09,510 Probably something I shouldn't have put in there. 
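The Fibonacci code on the slides is not in the transcript either. A possible reconstruction of the two versions being compared is sketched here; the names rec_fib and iter_fib and the convention fib(0) = 0, fib(1) = 1 are assumptions, and the lecture's code may have used different base values. It is written for Python 3, where range is already lazy and the x-range function no longer exists.

```python
def rec_fib(n):
    """Recursive version: reads almost like the mathematical definition."""
    if n == 0 or n == 1:          # base cases (assumed convention)
        return n
    return rec_fib(n - 1) + rec_fib(n - 2)


def iter_fib(n):
    """Iterative version: we do the bookkeeping ourselves."""
    if n == 0 or n == 1:
        return n
    previous, current = 0, 1      # fib(0) and fib(1)
    for _ in range(2, n + 1):
        # save off the prior value while computing the next one
        previous, current = current, previous + current
    return current
```

The recursive version is shorter to read, but it recomputes the same values many times, which is the efficiency point the lecture defers to later.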
147 00:08:09,510 --> 00:08:15,470 x-range is like range, except that it returns what's known 148 00:08:15,470 --> 00:08:21,330 as a generator object that you can iterate over. 149 00:08:21,330 --> 00:08:26,110 So that I don't have to explain that right now-- 150 00:08:26,110 --> 00:08:27,870 Well actually, we'll probably talk about it 151 00:08:27,870 --> 00:08:30,760 later in the semester. 152 00:08:30,760 --> 00:08:35,669 153 00:08:35,669 --> 00:08:38,580 The difference is efficiency. 154 00:08:38,580 --> 00:08:41,970 Range will return an entire list to you. 155 00:08:41,970 --> 00:08:45,260 Whereas x-range is a little bit more conservative in how 156 00:08:45,260 --> 00:08:49,680 it manages its memory for these purposes. 157 00:08:49,680 --> 00:08:53,630 But changing it won't make a difference in the program. 158 00:08:53,630 --> 00:08:59,680 And for a program as simple as this, range is perfectly fine. 159 00:08:59,680 --> 00:09:03,350 I just used x-range out of habit. 160 00:09:03,350 --> 00:09:05,150 So we'll do one last example. 161 00:09:05,150 --> 00:09:06,795 And then we'll move on to a different topic. 162 00:09:06,795 --> 00:09:10,500 163 00:09:10,500 --> 00:09:15,420 If I didn't mention it before, in problem set 4, recursion is 164 00:09:15,420 --> 00:09:18,760 highly recommended for the final portion of it. 165 00:09:18,760 --> 00:09:25,690 So it's kind of important you understand what's going on. 166 00:09:25,690 --> 00:09:33,220 Anyway, so remember we looked at bisection 167 00:09:33,220 --> 00:09:35,500 early on in the semester. 168 00:09:35,500 --> 00:09:39,960 And we showed you an iterative version of bisection. 169 00:09:39,960 --> 00:09:42,520 This shouldn't really be unfamiliar to 170 00:09:42,520 --> 00:09:44,490 anyone at this point. 171 00:09:44,490 --> 00:09:51,530 172 00:09:51,530 --> 00:09:55,120 So all this is doing is finding the square root of a 173 00:09:55,120 --> 00:09:58,590 number using bisection search.
174 00:09:58,590 --> 00:10:04,900 And we set our low and our high, get our midpoint. 175 00:10:04,900 --> 00:10:10,800 And we just keep looping until we get a value that when we 176 00:10:10,800 --> 00:10:12,420 square it is close enough to x. 177 00:10:12,420 --> 00:10:16,770 178 00:10:16,770 --> 00:10:21,600 And on each iteration we set our lows and our highs, 179 00:10:21,600 --> 00:10:24,330 depending on how good our guess was. 180 00:10:24,330 --> 00:10:26,940 181 00:10:26,940 --> 00:10:31,890 Now the recursive version looks like this. 182 00:10:31,890 --> 00:10:35,570 It has a few more lines of code. 183 00:10:35,570 --> 00:10:42,820 And before I launch into it, did we explain default 184 00:10:42,820 --> 00:10:46,920 parameters to you, for functions? 185 00:10:46,920 --> 00:10:57,640 So Python has this feature where if you have a function 186 00:10:57,640 --> 00:11:03,400 such as rec bisection search, you can specify that certain 187 00:11:03,400 --> 00:11:08,490 parameters are optional or you can give 188 00:11:08,490 --> 00:11:11,170 default values to them. 189 00:11:11,170 --> 00:11:16,480 So let's just show a simple example so I 190 00:11:16,480 --> 00:11:17,980 can get past this. 191 00:11:17,980 --> 00:11:40,610 192 00:11:40,610 --> 00:11:42,790 So if I define this function, this one's really easy. 193 00:11:42,790 --> 00:11:45,310 All it's going to do is print out x. 194 00:11:45,310 --> 00:11:53,540 I can call it like this, in which case it's going to pass 195 00:11:53,540 --> 00:11:57,015 150 in and x will be 150 when the function's executing. 196 00:11:57,015 --> 00:12:01,130 197 00:12:01,130 --> 00:12:03,530 See I'm not lying. 198 00:12:03,530 --> 00:12:04,780 Or I can call it like this. 199 00:12:04,780 --> 00:12:09,900 200 00:12:09,900 --> 00:12:12,750 And it'll be 100. 201 00:12:12,750 --> 00:12:15,720 So that's, in a nutshell, what default parameters do for you. 
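The small function typed during this demonstration is not in the transcript. A minimal sketch of the idea, assuming the hypothetical name show and a default of 100 as described, could be:

```python
def show(x=100):
    """Print x; 100 is used when the caller passes nothing.

    Returning x is an addition for this sketch so the value can be checked.
    """
    print(x)
    return x


show(150)   # x is 150 while the function executes
show()      # no argument given, so x takes the default value 100
```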
202 00:12:15,720 --> 00:12:19,260 They're useful in some instances, as in this example. 203 00:12:19,260 --> 00:12:33,740 So in this recursive version of bisection square root, we 204 00:12:33,740 --> 00:12:36,790 have a low and a high parameter that we specify. 205 00:12:36,790 --> 00:12:39,360 It's exactly equivalent to the low and the high parameter in 206 00:12:39,360 --> 00:12:40,610 this iterative version. 207 00:12:40,610 --> 00:12:46,880 208 00:12:46,880 --> 00:12:51,110 This is a common idiom for recursive functions in Python. 209 00:12:51,110 --> 00:12:52,840 If we're calling it for the first time, we're not going to 210 00:12:52,840 --> 00:12:55,210 specify in a low and a high. 211 00:12:55,210 --> 00:12:58,090 So low and high will be none coming into this function. 212 00:12:58,090 --> 00:13:00,610 And then we just set them as we did in 213 00:13:00,610 --> 00:13:02,610 this iterative version. 214 00:13:02,610 --> 00:13:03,920 And then we set the midpoint. 215 00:13:03,920 --> 00:13:07,960 And then we have slightly different structure here. 216 00:13:07,960 --> 00:13:13,850 If the midpoint that we guess is close enough to the square 217 00:13:13,850 --> 00:13:20,230 root of x, then we just return the midpoint. 218 00:13:20,230 --> 00:13:24,960 On the other hand, if it's too low of a guess, then we're 219 00:13:24,960 --> 00:13:30,230 going to recursively call ourselves with the same x, 220 00:13:30,230 --> 00:13:32,490 same epsilon, but we're going to use 221 00:13:32,490 --> 00:13:34,380 midpoint for the low parameter. 222 00:13:34,380 --> 00:13:39,520 So midpoint, in this case, is here and 223 00:13:39,520 --> 00:13:42,340 the same high parameter. 224 00:13:42,340 --> 00:13:46,970 And then if we've guessed too high, then our low 225 00:13:46,970 --> 00:13:47,820 parameter is low. 226 00:13:47,820 --> 00:13:49,570 And then our high parameter's the midpoint. 
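For readers without the slides, here is one way the two bisection versions just described might be written in Python 3. The function names, the default epsilon of 0.01, and the choice of max(x, 1.0) as the starting high bound are assumptions of this sketch, and x is assumed to be non-negative.

```python
def iter_bisection_sqrt(x, epsilon=0.01):
    """Iterative bisection search for the square root of x (x >= 0)."""
    low = 0.0
    high = max(x, 1.0)
    mid = (low + high) / 2.0
    while abs(mid ** 2 - x) >= epsilon:   # loop until mid squared is close to x
        if mid ** 2 < x:
            low = mid                     # guess too low: raise the floor
        else:
            high = mid                    # guess too high: lower the ceiling
        mid = (low + high) / 2.0
    return mid


def rec_bisection_sqrt(x, epsilon=0.01, low=None, high=None):
    """Recursive version; low and high default to None on the first call."""
    if low is None:
        low = 0.0
    if high is None:
        high = max(x, 1.0)
    mid = (low + high) / 2.0
    if abs(mid ** 2 - x) < epsilon:       # close enough: return the midpoint
        return mid
    if mid ** 2 < x:                      # too low: midpoint becomes new low
        return rec_bisection_sqrt(x, epsilon, mid, high)
    return rec_bisection_sqrt(x, epsilon, low, mid)   # too high: new high
```

Both functions visit the same sequence of midpoints, so they return the same value; the recursion simply carries low and high as arguments instead of updating them in a loop.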
227 00:13:49,570 --> 00:13:51,320 So it's doing the exact same thing as 228 00:13:51,320 --> 00:13:53,390 the iterative version. 229 00:13:53,390 --> 00:13:58,870 We have recursive, iterative, recursive, iterative. 230 00:13:58,870 --> 00:14:00,120 Same thing, just different forms. 231 00:14:00,120 --> 00:14:02,790 232 00:14:02,790 --> 00:14:03,420 All right. 233 00:14:03,420 --> 00:14:09,310 Before I leave recursion, does anyone have any questions, or 234 00:14:09,310 --> 00:14:15,048 want to ask anything, or complain? 235 00:14:15,048 --> 00:14:15,544 No? 236 00:14:15,544 --> 00:14:16,040 All right. 237 00:14:16,040 --> 00:14:19,512 AUDIENCE: Do you use a lot of recursion in your work? 238 00:14:19,512 --> 00:14:22,984 Do you normally use iterative or recursion? 239 00:14:22,984 --> 00:14:25,040 Or is it just case by case? 240 00:14:25,040 --> 00:14:26,140 PROFESSOR: It's case by case. 241 00:14:26,140 --> 00:14:28,570 It depends on the problem. 242 00:14:28,570 --> 00:14:36,250 And what we are trying to show here is that there are some 243 00:14:36,250 --> 00:14:38,420 problems that are better expressed recursively and 244 00:14:38,420 --> 00:14:41,640 others that are better expressed iteratively. 245 00:14:41,640 --> 00:14:46,650 And by better, it's a very subjective term. 246 00:14:46,650 --> 00:14:48,620 In my mind, it means more intuitive, easier to 247 00:14:48,620 --> 00:14:50,450 understand. 248 00:14:50,450 --> 00:14:54,560 It allows you to focus on solving the problem rather 249 00:14:54,560 --> 00:14:55,850 than fiddling with code. 250 00:14:55,850 --> 00:14:59,630 251 00:14:59,630 --> 00:15:02,230 On the other hand, sometimes efficiency comes into play. 252 00:15:02,230 --> 00:15:06,060 And we're going to be talking about that pretty shortly. 253 00:15:06,060 --> 00:15:09,650 And in that case, you might want to do a recursive version 254 00:15:09,650 --> 00:15:10,960 because it's easier to understand. 
255 00:15:10,960 --> 00:15:13,220 But it takes too long to run, so you write 256 00:15:13,220 --> 00:15:14,470 an iterative version. 257 00:15:14,470 --> 00:15:17,710 258 00:15:17,710 --> 00:15:20,370 Computer programming, in a lot of cases, actually in all 259 00:15:20,370 --> 00:15:24,990 cases, is a bunch of trade-offs. 260 00:15:24,990 --> 00:15:29,880 Oftentimes you'll trade off speed for memory, elegance for 261 00:15:29,880 --> 00:15:32,270 efficiency, that sort of thing. 262 00:15:32,270 --> 00:15:35,950 And part of the skill of becoming good computer 263 00:15:35,950 --> 00:15:38,420 programmers is figuring out where those 264 00:15:38,420 --> 00:15:40,830 balance points are. 265 00:15:40,830 --> 00:15:43,700 And it's something that I think only comes with 266 00:15:43,700 --> 00:15:45,430 experience. 267 00:15:45,430 --> 00:15:50,700 All right, so we've talked about floating point to death. 268 00:15:50,700 --> 00:15:53,260 But we just want to really emphasize it. 269 00:15:53,260 --> 00:15:55,540 Because it's something that even for experienced 270 00:15:55,540 --> 00:15:59,680 programmers, still trips us up. 271 00:15:59,680 --> 00:16:04,310 So the thing that we want you to understand is that floating 272 00:16:04,310 --> 00:16:05,730 point is inexact. 273 00:16:05,730 --> 00:16:12,430 So you shouldn't compare for exact equality. 274 00:16:12,430 --> 00:16:16,250 So looking at the code here, I have defined a variable 275 00:16:16,250 --> 00:16:20,610 10/100, which is just 10 over 100, and 1/100, which is just 276 00:16:20,610 --> 00:16:25,370 1 over 100, and then 9/100, which is 9 over 100. 277 00:16:25,370 --> 00:16:31,330 And so in real math, this condition would be true. 278 00:16:31,330 --> 00:16:34,910 I add 1/100 and 9/100. 279 00:16:34,910 --> 00:16:36,810 I should get 10/100. 280 00:16:36,810 --> 00:16:40,260 So if we were not dealing in computer-land, 281 00:16:40,260 --> 00:16:43,310 this would print out.
282 00:16:43,310 --> 00:16:49,580 But because we are dealing in computer-land, we get that. 283 00:16:49,580 --> 00:16:52,190 284 00:16:52,190 --> 00:16:58,460 And the reason is because of Python's representation. 285 00:16:58,460 --> 00:17:02,510 Now when you write print x, if x is a float variable, Python 286 00:17:02,510 --> 00:17:05,194 does a little bit of nice formatting for you. 287 00:17:05,194 --> 00:17:07,490 It kind of saves you from its internal representation. 288 00:17:07,490 --> 00:17:11,740 289 00:17:11,740 --> 00:17:16,780 So here is 10/100, as you just printed out. 290 00:17:16,780 --> 00:17:19,069 It's what you would expect it to be. 291 00:17:19,069 --> 00:17:21,898 But this is what Python sees when it does its math. 292 00:17:21,898 --> 00:17:25,119 293 00:17:25,119 --> 00:17:26,290 And it's not just Python. 294 00:17:26,290 --> 00:17:29,650 This applies for anything on a binary computer. 295 00:17:29,650 --> 00:17:32,290 It's an inherent limitation. 296 00:17:32,290 --> 00:17:35,660 And, you know, we can get arbitrarily 297 00:17:35,660 --> 00:17:36,910 close, but never exact. 298 00:17:36,910 --> 00:17:41,300 299 00:17:41,300 --> 00:17:48,050 And then again, we have 1/100 and 9/100, Python 300 00:17:48,050 --> 00:17:49,950 will show us 0.1. 301 00:17:49,950 --> 00:17:53,420 So when you print these out, they'll look fine. 302 00:17:53,420 --> 00:17:56,480 If you were writing debugging code and you were wondering 303 00:17:56,480 --> 00:18:01,040 why if you compared x to y, it wasn't exactly equal, you 304 00:18:01,040 --> 00:18:03,610 would naturally print out x and then print out y. 305 00:18:03,610 --> 00:18:04,780 But it would look equal to you. 306 00:18:04,780 --> 00:18:08,100 But the code wouldn't be working properly. 307 00:18:08,100 --> 00:18:11,370 Well the reason is that the internal representation that 308 00:18:11,370 --> 00:18:14,890 Python is using to compare them is that.
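The values shown on screen are not in the transcript. The following sketch, with hypothetical variable names, reproduces the comparison in Python 3 and prints 20 decimal places so the stored values become visible. Note one difference from the lecture's Python 2 session: Python 3's print already shows enough digits to tell the two floats apart.

```python
ten_hundredths = 10 / 100      # true division in Python 3
one_hundredth = 1 / 100
nine_hundredths = 9 / 100

total = one_hundredth + nine_hundredths

# In exact arithmetic this would be True; with binary floats it is not.
print(total == ten_hundredths)

# Asking for 20 decimal places exposes the stored approximations.
print(format(ten_hundredths, ".20f"))
print(format(total, ".20f"))
```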
309 00:18:14,890 --> 00:18:17,410 310 00:18:17,410 --> 00:18:20,050 So what's the solution? 311 00:18:20,050 --> 00:18:24,620 312 00:18:24,620 --> 00:18:25,490 It's a natural question. 313 00:18:25,490 --> 00:18:26,740 I don't know the answer. 314 00:18:26,740 --> 00:18:30,220 315 00:18:30,220 --> 00:18:33,545 AUDIENCE: If they're close enough, then it would be 316 00:18:33,545 --> 00:18:34,970 inside variance-- 317 00:18:34,970 --> 00:18:37,440 PROFESSOR: Right. 318 00:18:37,440 --> 00:18:39,781 We're going to say good enough. 319 00:18:39,781 --> 00:18:43,470 And the traditional way of representing that is epsilon. 320 00:18:43,470 --> 00:18:47,130 Epsilon, you've seen in your problem sets. 321 00:18:47,130 --> 00:18:48,480 And you've seen it in code before. 322 00:18:48,480 --> 00:18:50,600 And if you've come to office hours, someone's probably 323 00:18:50,600 --> 00:18:52,320 explained it to you. 324 00:18:52,320 --> 00:18:56,740 Epsilon is the amount of error we're willing to tolerate in 325 00:18:56,740 --> 00:18:58,360 our calculations. 326 00:18:58,360 --> 00:19:07,000 So in Python-land, you can have arbitrary precision. 327 00:19:07,000 --> 00:19:08,910 Don't quote me on that though. 328 00:19:08,910 --> 00:19:13,290 But for purposes of this class, if you're using an 329 00:19:13,290 --> 00:19:17,390 epsilon it's like 0.0001, we're not 330 00:19:17,390 --> 00:19:18,640 going to get too upset. 331 00:19:18,640 --> 00:19:21,750 332 00:19:21,750 --> 00:19:24,530 All this function does, and this is a handy function to 333 00:19:24,530 --> 00:19:31,220 keep around, is it just tests to see if the distance between 334 00:19:31,220 --> 00:19:33,800 x and y are less than epsilon. 335 00:19:33,800 --> 00:19:35,970 If it is, then we say they're close enough to each other to 336 00:19:35,970 --> 00:19:37,220 be considered equal. 337 00:19:37,220 --> 00:19:40,350 338 00:19:40,350 --> 00:19:42,740 So I don't like the function named compare. 
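A minimal sketch of the comparison function described here, assuming the name close_enough (the name the speaker goes on to prefer) and a default epsilon of 0.0001:

```python
def close_enough(x, y, epsilon=0.0001):
    """Return True when x and y differ by less than epsilon."""
    return abs(x - y) < epsilon


print(close_enough(10 / 100, 1 / 100 + 9 / 100))           # within the default epsilon
print(close_enough(10 / 100, 1 / 100 + 9 / 100, 1e-20))    # epsilon too tiny to pass
```

Python 3.5 and later also provide math.isclose in the standard library, which uses a relative tolerance by default; whether that fits depends on the application.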
339 00:19:42,740 --> 00:19:43,990 I don't think it's intuitive. 340 00:19:43,990 --> 00:19:49,810 341 00:19:49,810 --> 00:19:51,060 Close enough is probably better. 342 00:19:51,060 --> 00:19:55,450 343 00:19:55,450 --> 00:19:57,250 But it's also going to break my code. 344 00:19:57,250 --> 00:20:07,590 345 00:20:07,590 --> 00:20:09,030 Uh oh. 346 00:20:09,030 --> 00:20:10,280 This is an actual bug. 347 00:20:10,280 --> 00:20:14,390 348 00:20:14,390 --> 00:20:16,670 Line 203. 349 00:20:16,670 --> 00:20:17,920 What did I do to myself? 350 00:20:17,920 --> 00:20:23,130 351 00:20:23,130 --> 00:20:25,253 I commented out my definition is what I did. 352 00:20:25,253 --> 00:20:32,020 353 00:20:32,020 --> 00:20:32,990 All right. 354 00:20:32,990 --> 00:20:37,000 So if we compare the two values, 10/100 and 1/100 plus 355 00:20:37,000 --> 00:20:43,720 9/100 and we use our close enough, our compare function, 356 00:20:43,720 --> 00:20:45,360 then yeah it's within epsilon. 357 00:20:45,360 --> 00:20:49,060 358 00:20:49,060 --> 00:20:51,780 Again, notice here that we're using a default parameter. 359 00:20:51,780 --> 00:20:54,480 So if we don't pass in something explicitly. 360 00:20:54,480 --> 00:20:56,653 So I can say something like this. 361 00:20:56,653 --> 00:21:01,560 362 00:21:01,560 --> 00:21:02,980 Let's make epsilon really tiny. 363 00:21:02,980 --> 00:21:23,140 364 00:21:23,140 --> 00:21:26,770 So if I make epsilon really, really tiny, then it's 365 00:21:26,770 --> 00:21:27,540 going to say no. 366 00:21:27,540 --> 00:21:34,470 So how you determine epsilon really depends on your 367 00:21:34,470 --> 00:21:37,520 specific application. 368 00:21:37,520 --> 00:21:42,590 If you're doing high precision mathematics, you're modeling 369 00:21:42,590 --> 00:21:45,400 faults on a bridge or something, probably I want to 370 00:21:45,400 --> 00:21:47,850 be pretty precise. 
371 00:21:47,850 --> 00:21:50,790 Because if you have the wrong epsilon, then you might have 372 00:21:50,790 --> 00:21:53,630 cars falling off the bridge or the bridge collapsing. 373 00:21:53,630 --> 00:21:56,970 And it would just be a bad day. 374 00:21:56,970 --> 00:22:03,080 So are there any questions so far about floating point? 375 00:22:03,080 --> 00:22:03,520 No. 376 00:22:03,520 --> 00:22:06,075 AUDIENCE: You normally go as close as the 377 00:22:06,075 --> 00:22:09,130 errors will let you. 378 00:22:09,130 --> 00:22:10,291 PROFESSOR: What's that? 379 00:22:10,291 --> 00:22:12,215 AUDIENCE: You can't get as close anymore. 380 00:22:12,215 --> 00:22:16,063 How can you make the error that small? 381 00:22:16,063 --> 00:22:17,987 Because it's not going to get that close. 382 00:22:17,987 --> 00:22:19,580 PROFESSOR: Well yeah, that's what I mean. 383 00:22:19,580 --> 00:22:23,700 So there is a limit to how close you can get. 384 00:22:23,700 --> 00:22:26,110 And it depends on the language and it also 385 00:22:26,110 --> 00:22:27,360 depends on the hardware. 386 00:22:27,360 --> 00:22:29,390 387 00:22:29,390 --> 00:22:33,030 There are, and this is getting a little bit more 388 00:22:33,030 --> 00:22:38,470 technical than I want, but you can define pretty precisely 389 00:22:38,470 --> 00:22:43,580 the smallest value that epsilon can be. 390 00:22:43,580 --> 00:22:52,570 In a language like C, it's defined as the minimum 391 00:22:52,570 --> 00:22:56,440 difference between two floating point variables 392 00:22:56,440 --> 00:23:00,640 that's representable on the host machine's hardware. 393 00:23:00,640 --> 00:23:05,530 So yeah, there is a limit. 394 00:23:05,530 --> 00:23:09,550 There are some math packages though, and we'll be using 395 00:23:09,550 --> 00:23:13,930 something called NumPy later on in the semester, that allow 396 00:23:13,930 --> 00:23:16,033 you to do pretty high precision mathematics.
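To make the limit concrete, Python exposes the machine epsilon as sys.float_info.epsilon, the gap between 1.0 and the next representable float (the same quantity C calls DBL_EPSILON). The sketch below also uses the standard-library decimal module as a convenient stand-in for a higher-precision package; that is a substitution for illustration, not the NumPy package mentioned in the lecture.

```python
import sys
from decimal import Decimal

eps = sys.float_info.epsilon          # gap between 1.0 and the next float
print(eps)

bigger = 1.0 + eps > 1.0              # True: eps is large enough to register
unchanged = 1.0 + eps / 2 == 1.0      # True: half of eps is rounded away
print(bigger, unchanged)

# Decimal stores base-10 digits exactly, so this sum compares equal.
exact = Decimal("0.01") + Decimal("0.09") == Decimal("0.10")
print(exact)
```

An epsilon for comparisons near 1.0 cannot usefully be smaller than this machine epsilon, which is the limit the audience question is getting at.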
397 00:23:16,033 --> 00:23:19,450 398 00:23:19,450 --> 00:23:20,730 Keep that in the back of your mind. 399 00:23:20,730 --> 00:23:22,180 But yeah you're right. 400 00:23:22,180 --> 00:23:23,620 You do eventually hit a limit. 401 00:23:23,620 --> 00:23:27,380 402 00:23:27,380 --> 00:23:31,790 OK so the last thing that I want to cover on floating 403 00:23:31,790 --> 00:23:40,690 point is that even though it's inexact, it's consistent. 404 00:23:40,690 --> 00:23:47,870 So let's say I define a variable 9/100 plus 1/100. 405 00:23:47,870 --> 00:23:51,310 And it's exactly what it says, 9/100 plus 1/100. 406 00:23:51,310 --> 00:23:54,570 Now we know that this is not going to equal 10/100, right? 407 00:23:54,570 --> 00:23:58,020 We just demonstrated that ad nauseam. 408 00:23:58,020 --> 00:24:02,230 And also, yeah, still defined. 409 00:24:02,230 --> 00:24:07,630 The question though is, if I subtract 1/100 from this 410 00:24:07,630 --> 00:24:14,380 variable that I've defined, this 9/100 plus 1/100, will 411 00:24:14,380 --> 00:24:19,760 9/100 now be equal to 9/100 plus 1/100 minus 1/100? 412 00:24:19,760 --> 00:24:24,390 So in other words, will this be true? 413 00:24:24,390 --> 00:24:29,100 And the answer is yes. 414 00:24:29,100 --> 00:24:36,510 And the reason is that if I'm adding or subtracting, even 415 00:24:36,510 --> 00:24:40,080 though 1/100 we know is an inexact representation, it's 416 00:24:40,080 --> 00:24:42,490 still the same. 417 00:24:42,490 --> 00:24:45,020 And so when we do the subtraction, we're subtracting 418 00:24:45,020 --> 00:24:47,160 the same inexact value. 419 00:24:47,160 --> 00:24:57,605 So this appeared as a quiz question at one point. 420 00:24:57,605 --> 00:24:59,360 It probably won't this semester. 421 00:24:59,360 --> 00:25:04,890 But it's something to keep in mind. 422 00:25:04,890 --> 00:25:11,780 So any questions on floating point? 423 00:25:11,780 --> 00:25:21,680 If you're a mathy type and want to, look up IEEE 754.
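The consistency example can be checked directly. The sketch below uses hypothetical variable names and adds one caveat that is not from the lecture: each addition and subtraction rounds its result, so the round trip works for these particular values but is not guaranteed for every choice of numbers.

```python
total = 9 / 100 + 1 / 100

print(total == 10 / 100)             # False: the sum is slightly off
print(total - 1 / 100 == 9 / 100)    # True for these particular values

# Caveat: with other values the rounding in each step can leave a residue.
other = 0.1 + 0.2
print(other - 0.2 == 0.1)            # False
```

Which way each intermediate result rounds is governed by the floating-point standard, IEEE 754.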
424 00:25:21,680 --> 00:25:24,060 And this will give you all the gory details about 425 00:25:24,060 --> 00:25:27,100 representation and mathematical operations on 426 00:25:27,100 --> 00:25:29,230 floating point. 427 00:25:29,230 --> 00:25:31,670 And if you don't, then don't worry about it. 428 00:25:31,670 --> 00:25:34,450 It's not required for the class. 429 00:25:34,450 --> 00:25:35,460 OK. 430 00:25:35,460 --> 00:25:41,730 So the next topic we want to cover is pseudocode. 431 00:25:41,730 --> 00:25:45,801 So can someone take a stab at defining pseudocode for me? 432 00:25:45,801 --> 00:25:50,220 AUDIENCE: From what I gathered, it's basically 433 00:25:50,220 --> 00:25:54,148 you're writing out what you're planning on doing in just 434 00:25:54,148 --> 00:25:55,621 normal English. 435 00:25:55,621 --> 00:25:57,890 PROFESSOR: I wouldn't just say normal English. 436 00:25:57,890 --> 00:25:59,895 But it's an English of sorts. 437 00:25:59,895 --> 00:26:04,000 438 00:26:04,000 --> 00:26:07,190 And a lot of the difficulty that programmers have with 439 00:26:07,190 --> 00:26:12,310 writing programs or new programs is that we don't 440 00:26:12,310 --> 00:26:14,023 naturally think in computer languages. 441 00:26:14,023 --> 00:26:16,540 442 00:26:16,540 --> 00:26:18,350 You think in English. 443 00:26:18,350 --> 00:26:22,400 Or well, you think in a human language. 444 00:26:22,400 --> 00:26:28,080 So what pseudocode allows us to do is to kind of be in the 445 00:26:28,080 --> 00:26:29,040 intermediate. 446 00:26:29,040 --> 00:26:32,160 We still want to develop a step by step process for 447 00:26:32,160 --> 00:26:35,280 solving a problem, but we want to be able to describe it in 448 00:26:35,280 --> 00:26:42,670 words and not variables and syntax. 
449 00:26:42,670 --> 00:26:45,000 Sometimes what'll happen is programmers will get so 450 00:26:45,000 --> 00:26:47,680 wrapped around kind of getting the syntax right that they'll 451 00:26:47,680 --> 00:26:49,710 forget the problem that they're 452 00:26:49,710 --> 00:26:51,730 actually trying to solve. 453 00:26:51,730 --> 00:26:58,150 So let's walk through an example. 454 00:26:58,150 --> 00:27:00,050 Let's talk about pseudocode for Hangman. 455 00:27:00,050 --> 00:27:03,590 456 00:27:03,590 --> 00:27:06,500 And because you've all done this on the problem set, I 457 00:27:06,500 --> 00:27:08,430 don't have to explain the rules right. 458 00:27:08,430 --> 00:27:10,930 So what would be a good kind of English 459 00:27:10,930 --> 00:27:12,300 first step for Hangman? 460 00:27:12,300 --> 00:27:19,715 461 00:27:19,715 --> 00:27:22,100 AUDIENCE: You have to choose a word. 462 00:27:22,100 --> 00:27:22,670 PROFESSOR: Right. 463 00:27:22,670 --> 00:27:26,175 So let's not be too specific. 464 00:27:26,175 --> 00:27:30,410 We'll just say, select random word. 465 00:27:30,410 --> 00:27:35,170 466 00:27:35,170 --> 00:27:36,180 OK. 467 00:27:36,180 --> 00:27:38,993 Now what would be another good step, next step. 468 00:27:38,993 --> 00:27:42,065 469 00:27:42,065 --> 00:27:46,871 AUDIENCE: Display the amount of spaces maybe? 470 00:27:46,871 --> 00:27:50,352 PROFESSOR: So display a masked version of the word. 471 00:27:50,352 --> 00:27:51,013 AUDIENCE: Exactly. 472 00:27:51,013 --> 00:27:53,328 Hide the word but display it. 473 00:27:53,328 --> 00:27:56,422 PROFESSOR: Hide the word but display it. 474 00:27:56,422 --> 00:27:58,912 AUDIENCE: Well, display the amount of spaces. 475 00:27:58,912 --> 00:28:03,892 476 00:28:03,892 --> 00:28:07,378 You probably want to state how many letters are in the word 477 00:28:07,378 --> 00:28:07,890 at some point. 478 00:28:07,890 --> 00:28:09,450 PROFESSOR: Ah, that's a good point. 479 00:28:09,450 --> 00:28:10,380 Where should that go? 
480 00:28:10,380 --> 00:28:12,675 AUDIENCE: That should probably be before the display. 481 00:28:12,675 --> 00:28:14,052 PROFESSOR: OK. 482 00:28:14,052 --> 00:28:17,630 So tell how many letters. 483 00:28:17,630 --> 00:28:23,430 484 00:28:23,430 --> 00:28:24,080 All right. 485 00:28:24,080 --> 00:28:25,977 Now what would come after this? 486 00:28:25,977 --> 00:28:26,955 AUDIENCE: After display? 487 00:28:26,955 --> 00:28:27,933 PROFESSOR: Yeah. 488 00:28:27,933 --> 00:28:30,378 AUDIENCE: See how many letters you have to choose from. 489 00:28:30,378 --> 00:28:31,628 PROFESSOR: OK. 490 00:28:31,628 --> 00:28:37,713 491 00:28:37,713 --> 00:28:40,647 AUDIENCE: First time you don't. 492 00:28:40,647 --> 00:28:41,930 PROFESSOR: At first time you don't. 493 00:28:41,930 --> 00:28:45,840 But the nice thing about pseudocode is that we can barf 494 00:28:45,840 --> 00:28:48,895 things onto paper and then rearrange them as we-- 495 00:28:48,895 --> 00:28:51,070 It's sort of like brainstorming in a sense. 496 00:28:51,070 --> 00:28:54,820 497 00:28:54,820 --> 00:28:57,580 You're trying to derive the structure. 498 00:28:57,580 --> 00:29:01,480 And it's easier to do like this than to try 499 00:29:01,480 --> 00:29:03,660 and do it in code. 500 00:29:03,660 --> 00:29:04,292 But yeah, you're right. 501 00:29:04,292 --> 00:29:06,580 You don't have to. 502 00:29:06,580 --> 00:29:10,540 AUDIENCE: You would ask the person to put a letter. 503 00:29:10,540 --> 00:29:12,520 PROFESSOR: OK. 504 00:29:12,520 --> 00:29:14,995 For a letter. 505 00:29:14,995 --> 00:29:17,965 And then what would come after that? 506 00:29:17,965 --> 00:29:23,410 AUDIENCE: So then you want to check if it's the -- 507 00:29:23,410 --> 00:29:24,895 PROFESSOR: Check if it's in the word. 508 00:29:24,895 --> 00:29:32,320 509 00:29:32,320 --> 00:29:33,310 All right. 510 00:29:33,310 --> 00:29:33,850 And? 
511 00:29:33,850 --> 00:29:34,590 AUDIENCE: If it is-- 512 00:29:34,590 --> 00:29:37,721 PROFESSOR: If it is? 513 00:29:37,721 --> 00:29:39,589 AUDIENCE: Add it to the word. 514 00:29:39,589 --> 00:29:43,030 PROFESSOR: Add, let's say, to correct letters guessed. 515 00:29:43,030 --> 00:29:44,280 AUDIENCE: Yeah. 516 00:29:44,280 --> 00:29:50,825 517 00:29:50,825 --> 00:29:51,811 PROFESSOR: OK. 518 00:29:51,811 --> 00:29:52,797 And if it isn't? 519 00:29:52,797 --> 00:29:55,262 AUDIENCE: If it isn't, reject it. 520 00:29:55,262 --> 00:29:58,940 521 00:29:58,940 --> 00:30:00,305 PROFESSOR: Let's say-- 522 00:30:00,305 --> 00:30:04,462 AUDIENCE: You want to remove it from the options. 523 00:30:04,462 --> 00:30:06,960 PROFESSOR: So if it's not, then we're going to remove 524 00:30:06,960 --> 00:30:08,210 from options. 525 00:30:08,210 --> 00:30:11,219 526 00:30:11,219 --> 00:30:12,650 So letters remaining. 527 00:30:12,650 --> 00:30:16,470 528 00:30:16,470 --> 00:30:20,370 Probably want to tell the user they're wrong too. 529 00:30:20,370 --> 00:30:22,080 AUDIENCE: And use up a turn. 530 00:30:22,080 --> 00:30:22,970 PROFESSOR: What's that? 531 00:30:22,970 --> 00:30:25,492 AUDIENCE: You want to use up a turn. 532 00:30:25,492 --> 00:30:26,500 PROFESSOR: I'm sorry. 533 00:30:26,500 --> 00:30:29,218 AUDIENCE: Then you use up a turn. 534 00:30:29,218 --> 00:30:31,030 If you had a set amount of turns. 535 00:30:31,030 --> 00:30:32,284 PROFESSOR: OK, so we're actually going to get 536 00:30:32,284 --> 00:30:32,934 to that in a second. 537 00:30:32,934 --> 00:30:34,395 AUDIENCE: I sent last week. 538 00:30:34,395 --> 00:30:36,343 [INAUDIBLE] game. 539 00:30:36,343 --> 00:30:38,930 PROFESSOR: I actually played all of your Hangman games. 540 00:30:38,930 --> 00:30:40,412 It was quite fun. 541 00:30:40,412 --> 00:30:45,350 542 00:30:45,350 --> 00:30:46,760 So again, yeah you're right. 543 00:30:46,760 --> 00:30:49,440 So we have a number of guesses that are remaining.
544 00:30:49,440 --> 00:30:56,400 And the thing is that we know that the user has a certain 545 00:30:56,400 --> 00:30:57,010 number of turns. 546 00:30:57,010 --> 00:30:59,760 So we're probably going to repeat a lot of this. 547 00:30:59,760 --> 00:31:06,890 So at some point, we probably want to have a WHILE. 548 00:31:06,890 --> 00:31:17,455 It'll just say, WHILE we have guesses left remaining. 549 00:31:17,455 --> 00:31:21,190 550 00:31:21,190 --> 00:31:24,260 By the way, the reason why I program computers is because 551 00:31:24,260 --> 00:31:26,090 my handwriting is horrible. 552 00:31:26,090 --> 00:31:29,560 So WHILE we have guesses remaining, we're going to keep 553 00:31:29,560 --> 00:31:33,620 doing all this. 554 00:31:33,620 --> 00:31:34,430 All right? 555 00:31:34,430 --> 00:31:35,715 And then we're going to remove-- 556 00:31:35,715 --> 00:31:43,590 557 00:31:43,590 --> 00:31:47,410 But is this the only stopping criteria? 558 00:31:47,410 --> 00:31:50,140 What if they win? 559 00:31:50,140 --> 00:31:55,630 So WHILE they have guesses remaining and 560 00:31:55,630 --> 00:31:59,840 the word is not guessed. 561 00:31:59,840 --> 00:32:02,700 562 00:32:02,700 --> 00:32:07,570 So this is in essence your Hangman program. 563 00:32:07,570 --> 00:32:09,560 It's English language. 564 00:32:09,560 --> 00:32:11,540 It's not easy to read because that's my handwriting. 565 00:32:11,540 --> 00:32:13,370 But it's kind of easy to understand it at 566 00:32:13,370 --> 00:32:14,620 an intuitive level. 567 00:32:14,620 --> 00:32:16,990 568 00:32:16,990 --> 00:32:19,990 And the reason we're talking about this is because we're 569 00:32:19,990 --> 00:32:24,160 going to get to some more complicated programs as we 570 00:32:24,160 --> 00:32:26,670 move through the semester.
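The Hangman pseudocode built up on the board can be sketched in Python roughly as follows. This is only an illustration, not the problem set solution: the names choose_word and play_hangman, the max_wrong parameter, and the decision to charge a turn only for wrong guesses are all assumptions, and the guesses are passed in as a sequence (instead of asking the user) so the sketch can be run repeatably.

```python
import random
import string


def choose_word(words):
    """Select random word (first step of the pseudocode)."""
    return random.choice(words)


def play_hangman(secret_word, guesses, max_wrong=6):
    """Run the pseudocode loop on a fixed sequence of guesses.

    Returns True if the word was guessed, False otherwise.
    Assumption: only wrong guesses use up a turn.
    """
    correct_letters = set()                       # correct letters guessed
    letters_remaining = set(string.ascii_lowercase)
    wrong_left = max_wrong
    guesses = iter(guesses)

    # Tell how many letters.
    print("The word has", len(secret_word), "letters.")

    # WHILE guesses remaining AND the word is not guessed.
    while wrong_left > 0 and not set(secret_word) <= correct_letters:
        # Display a masked version of the word.
        print(" ".join(c if c in correct_letters else "_" for c in secret_word))

        # Ask for a letter (here: take the next scripted guess).
        letter = next(guesses, None)
        if letter is None:
            break                                 # no more scripted guesses

        letters_remaining.discard(letter)         # remove from options
        if letter in secret_word:                 # check if it's in the word
            correct_letters.add(letter)
        else:
            wrong_left -= 1                       # use up a turn
            print("Sorry,", letter, "is not in the word.")

    return set(secret_word) <= correct_letters
```

For example, play_hangman(choose_word(["tar", "python"]), "etaoinshrdlu") would play one scripted game against a randomly chosen word; an interactive version would replace the scripted guesses with a call to input().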
571 00:32:26,670 --> 00:32:30,140 And a good starting off point for a lot of you, when you're 572 00:32:30,140 --> 00:32:33,190 trying to do your problem sets, is instead of trying to 573 00:32:33,190 --> 00:32:36,200 jump right into the coding portion of it, to sit down 574 00:32:36,200 --> 00:32:39,540 with a piece of paper, index cards, or a whiteboard and 575 00:32:39,540 --> 00:32:41,460 kind of sketch out a high level view of the algorithm. 576 00:32:41,460 --> 00:32:46,600 577 00:32:46,600 --> 00:32:52,250 So that we can see this in code form. 578 00:32:52,250 --> 00:32:57,760 So let's say that I want to write a function that tests a 579 00:32:57,760 --> 00:33:01,490 number to see if it's prime. 580 00:33:01,490 --> 00:33:07,225 First question is, what is a prime number? 581 00:33:07,225 --> 00:33:10,720 AUDIENCE: One where the only factors are 1 and itself. 582 00:33:10,720 --> 00:33:11,370 PROFESSOR: Right. 583 00:33:11,370 --> 00:33:16,390 So a number that is only divisible by itself and 1. 584 00:33:16,390 --> 00:33:20,040 585 00:33:20,040 --> 00:33:21,990 Are even numbers prime? 586 00:33:21,990 --> 00:33:24,040 Can they ever be prime? 587 00:33:24,040 --> 00:33:25,100 Really? 588 00:33:25,100 --> 00:33:26,220 What about 2? 589 00:33:26,220 --> 00:33:27,250 Right. 590 00:33:27,250 --> 00:33:29,630 So 2 is one of our special cases. 591 00:33:29,630 --> 00:33:30,540 All right. 592 00:33:30,540 --> 00:33:36,550 So what would be maybe a good starting off point for 593 00:33:36,550 --> 00:33:41,935 pseudocode to test primality, knowing those facts. 594 00:33:41,935 --> 00:33:44,645 595 00:33:44,645 --> 00:33:47,020 AUDIENCE: [INAUDIBLE] 596 00:33:47,020 --> 00:33:49,460 PROFESSOR: All right. 597 00:33:49,460 --> 00:33:51,930 Can I erase this or should I leave it up? 598 00:33:51,930 --> 00:33:53,180 Because I can go over there. 599 00:33:53,180 --> 00:33:59,570 600 00:33:59,570 --> 00:34:03,276 It's not like I'm erasing any deep dark secrets. 
601 00:34:03,276 --> 00:34:06,190 There's no magic here. 602 00:34:06,190 --> 00:34:19,510 All right so test number, if, what, equal to, say what? 603 00:34:19,510 --> 00:34:21,215 2? 604 00:34:21,215 --> 00:34:21,560 Yeah. 605 00:34:21,560 --> 00:34:23,500 Why not? 606 00:34:23,500 --> 00:34:24,750 And maybe 3. 607 00:34:24,750 --> 00:34:29,030 608 00:34:29,030 --> 00:34:31,679 Now what do I do if it is? 609 00:34:31,679 --> 00:34:37,595 610 00:34:37,595 --> 00:34:38,845 Are they prime? 611 00:34:38,845 --> 00:34:40,920 612 00:34:40,920 --> 00:34:42,210 So I'm done, right? 613 00:34:42,210 --> 00:34:46,095 So I'm going to return true. 614 00:34:46,095 --> 00:34:48,630 615 00:34:48,630 --> 00:34:54,774 Now what do I do if the number given is not 2 or 3? 616 00:34:54,774 --> 00:34:57,204 AUDIENCE: [INAUDIBLE] 617 00:34:57,204 --> 00:34:59,710 PROFESSOR: You're talking about the modulo operator. 618 00:34:59,710 --> 00:35:00,090 Right. 619 00:35:00,090 --> 00:35:02,510 So we will use that. 620 00:35:02,510 --> 00:35:06,200 That will tell us whether or not an integer divides evenly 621 00:35:06,200 --> 00:35:09,232 into another integer or the remainder after an integer is 622 00:35:09,232 --> 00:35:10,594 divided into another integer. 623 00:35:10,594 --> 00:35:13,290 624 00:35:13,290 --> 00:35:14,810 Let me ask you another question. 625 00:35:14,810 --> 00:35:19,165 What is the maximum value of an integer divisor for a 626 00:35:19,165 --> 00:35:20,320 non-prime number? 627 00:35:20,320 --> 00:35:22,122 So for a composite number. 628 00:35:22,122 --> 00:35:23,550 AUDIENCE: So itself. 629 00:35:23,550 --> 00:35:24,555 PROFESSOR: What's that? 630 00:35:24,555 --> 00:35:27,861 AUDIENCE: Number itself. 631 00:35:27,861 --> 00:35:30,060 PROFESSOR: OK, excluding the number itself.
632 00:35:30,060 --> 00:35:33,740 633 00:35:33,740 --> 00:35:39,110 Well let's say that I have n as the number I'm testing, 634 00:35:39,110 --> 00:35:43,300 square root of n, because I'm not going to have a factor 635 00:35:43,300 --> 00:35:46,100 that's larger than that. 636 00:35:46,100 --> 00:35:50,130 And I ask that because there's a loop involved. 637 00:35:50,130 --> 00:35:54,945 So how would I go about this systematically? 638 00:35:54,945 --> 00:36:02,000 I'd probably start at, let's say 5. 639 00:36:02,000 --> 00:36:05,030 640 00:36:05,030 --> 00:36:05,930 OK. 641 00:36:05,930 --> 00:36:18,100 And then test if n modulo, let's say 5, is equal to 0. 642 00:36:18,100 --> 00:36:23,250 Now if n is evenly divisible by 5, then that must mean that 643 00:36:23,250 --> 00:36:27,010 n is composite, because 5 is a factor. 644 00:36:27,010 --> 00:36:36,065 So if it is then return false. 645 00:36:36,065 --> 00:36:39,130 646 00:36:39,130 --> 00:36:45,690 Now what if it isn't? 647 00:36:45,690 --> 00:36:48,490 So that means that n is not evenly divisible by 5. 648 00:36:48,490 --> 00:36:52,760 Does that mean that the number's automatically prime? 649 00:36:52,760 --> 00:36:55,527 So after 5, what would be a good number to 650 00:36:55,527 --> 00:36:56,777 test, to move to? 651 00:36:56,777 --> 00:37:02,450 652 00:37:02,450 --> 00:37:04,170 All right. 653 00:37:04,170 --> 00:37:05,840 6? 654 00:37:05,840 --> 00:37:06,900 No. 655 00:37:06,900 --> 00:37:08,090 That wouldn't be it. 656 00:37:08,090 --> 00:37:15,790 Because if 6 is a factor, then obviously it's not. 657 00:37:15,790 --> 00:37:17,840 Whatever. 658 00:37:17,840 --> 00:37:19,250 So we're going to move on to 7. 659 00:37:19,250 --> 00:37:25,370 660 00:37:25,370 --> 00:37:29,210 So basically we're going to test all the odd numbers. 661 00:37:29,210 --> 00:37:31,540 And this is going to be the same as that.
662 00:37:31,540 --> 00:37:34,520 So this repetition indicates here that I 663 00:37:34,520 --> 00:37:36,730 probably need a loop. 664 00:37:36,730 --> 00:37:50,390 So instead of doing this, I want to say x is equal to 5 665 00:37:50,390 --> 00:37:57,075 while x is less than-- 666 00:37:57,075 --> 00:38:00,600 667 00:38:00,600 --> 00:38:10,340 We're going to test if x evenly divides n. 668 00:38:10,340 --> 00:38:16,100 669 00:38:16,100 --> 00:38:23,280 And if it does, return false. 670 00:38:23,280 --> 00:38:36,740 And if it doesn't, then we just increment x and repeat. 671 00:38:36,740 --> 00:38:40,440 And what happens when x becomes greater than 672 00:38:40,440 --> 00:38:43,420 square root of n? 673 00:38:43,420 --> 00:38:45,330 Well the WHILE loop's going to stop. 674 00:38:45,330 --> 00:38:48,510 And that also means that if I've made it to that point, 675 00:38:48,510 --> 00:38:52,360 then I've not found any numbers between 5 and square 676 00:38:52,360 --> 00:38:55,960 root of n that will evenly divide n. 677 00:38:55,960 --> 00:39:03,160 So that means that n is prime. 678 00:39:03,160 --> 00:39:28,020 So if I translate this into code, it would look 679 00:39:28,020 --> 00:39:29,270 something like this. 680 00:39:29,270 --> 00:39:33,350 681 00:39:33,350 --> 00:39:38,700 Now I see. 682 00:39:38,700 --> 00:39:42,420 683 00:39:42,420 --> 00:39:44,490 So first we're going to check if n is less 684 00:39:44,490 --> 00:39:46,800 than or equal to 3. 685 00:39:46,800 --> 00:39:50,370 If it's 2 or 3, then we'll return true. 686 00:39:50,370 --> 00:39:54,940 If it's not 2 or 3, then that means it's 1 or 0. 687 00:39:54,940 --> 00:39:57,040 So return false. 688 00:39:57,040 --> 00:40:00,030 So we've got those cases. 
689 00:40:00,030 --> 00:40:01,440 And then we're going to iterate-- 690 00:40:01,440 --> 00:40:04,950 or if n is greater than 3, we're going to iterate-- 691 00:40:04,950 --> 00:40:07,554 692 00:40:07,554 --> 00:40:11,450 now why would you go from 2-- 693 00:40:11,450 --> 00:40:13,410 we're going to iterate through all the possible 694 00:40:13,410 --> 00:40:18,120 divisors and check for divisibility. 695 00:40:18,120 --> 00:40:20,790 696 00:40:20,790 --> 00:40:23,760 And if we evenly divide it, return false. 697 00:40:23,760 --> 00:40:25,920 And if we make it through the loop, we'd return true. 698 00:40:25,920 --> 00:40:29,399 699 00:40:29,399 --> 00:40:33,240 AUDIENCE: Does that RETURN stop the loop? 700 00:40:33,240 --> 00:40:36,120 PROFESSOR: Yes. 701 00:40:36,120 --> 00:40:39,000 Well think about what RETURN is doing. 702 00:40:39,000 --> 00:40:42,800 You're in this function, test primality, and as soon as 703 00:40:42,800 --> 00:40:47,440 Python sees return, that's telling Python to kick out of 704 00:40:47,440 --> 00:40:49,990 the function and return whatever is 705 00:40:49,990 --> 00:40:51,080 after the return statement. 706 00:40:51,080 --> 00:40:55,250 So this false here, it says return false, that means that 707 00:40:55,250 --> 00:40:57,220 it doesn't matter where you are, it's just going to kick 708 00:40:57,220 --> 00:40:59,830 out of the innermost function or the function that encloses 709 00:40:59,830 --> 00:41:01,290 that return and return that value. 710 00:41:01,290 --> 00:41:06,130 711 00:41:06,130 --> 00:41:07,380 Any questions? 712 00:41:07,380 --> 00:41:09,255 713 00:41:09,255 --> 00:41:10,505 All right. 714 00:41:10,505 --> 00:41:21,630 715 00:41:21,630 --> 00:41:27,220 So testing primality, 1 is false, 2 is true, 3 is true, 4 716 00:41:27,220 --> 00:41:30,760 is false, and 5 is true. 717 00:41:30,760 --> 00:41:34,730 So it looks like the program works.
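The code shown on screen is not captured in the transcript, so the following is a reconstruction from the spoken description, not the exact function from the recitation. The name is_prime is an assumption, and so is the loop bound: the sketch stops at the integer square root of n, which is safe because any composite n has at least one divisor no larger than its square root (larger factors can exist, but each one is paired with a smaller one). The loop starts at 2, as the walkthrough suggests, so that even numbers and multiples of 3 are also caught.

```python
import math


def is_prime(n):
    """Return True if n is prime, False otherwise (trial division sketch)."""
    # Special cases: 2 and 3 are prime; anything else at or below 3 is not.
    if n <= 3:
        return n == 2 or n == 3

    # Iterate through all possible divisors from 2 up to floor(sqrt(n)).
    for divisor in range(2, math.isqrt(n) + 1):
        if n % divisor == 0:
            # Returning here leaves the function at once, which also
            # ends the loop: a divisor was found, so n is composite.
            return False

    # Made it through the loop without finding a divisor, so n is prime.
    return True
```

Running it on 1 through 5 reproduces the results read out above: false, true, true, false, true.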
718 00:41:34,730 --> 00:41:37,440 And if no one has any questions, I'm going to move 719 00:41:37,440 --> 00:41:38,690 on to the last major topic. 720 00:41:38,690 --> 00:41:43,720 721 00:41:43,720 --> 00:41:45,879 Everyone's good on pseudocode? 722 00:41:45,879 --> 00:41:47,875 All right. 723 00:41:47,875 --> 00:41:51,867 AUDIENCE: Would the main purpose of pseudocode be for 724 00:41:51,867 --> 00:41:53,863 yourself when you're writing a program or when you want to 725 00:41:53,863 --> 00:41:55,859 explain it to other people? 726 00:41:55,859 --> 00:41:58,770 PROFESSOR: Both. 727 00:41:58,770 --> 00:42:01,670 So the question was, is writing pseudocode useful for 728 00:42:01,670 --> 00:42:03,750 just understanding a program yourself or for explaining it 729 00:42:03,750 --> 00:42:05,390 to other people? 730 00:42:05,390 --> 00:42:06,640 The answer is both. 731 00:42:06,640 --> 00:42:11,000 732 00:42:11,000 --> 00:42:11,640 I don't know. 733 00:42:11,640 --> 00:42:15,430 It's the difference between showing someone the derivative 734 00:42:15,430 --> 00:42:18,060 of the function and then explaining that what you're 735 00:42:18,060 --> 00:42:21,580 doing is finding a function that gives you the slope of a 736 00:42:21,580 --> 00:42:23,620 function at that point. 737 00:42:23,620 --> 00:42:27,590 So it's one is more intuitive for some 738 00:42:27,590 --> 00:42:30,415 people than the other. 739 00:42:30,415 --> 00:42:31,920 A mathematician would understand the 740 00:42:31,920 --> 00:42:33,910 former pretty quickly. 741 00:42:33,910 --> 00:42:36,720 An English major would understand the latter maybe. 742 00:42:36,720 --> 00:42:42,440 743 00:42:42,440 --> 00:42:47,850 So when I explain my research to people, I don't tell them 744 00:42:47,850 --> 00:42:52,530 that I mess around with Gaussian Mixture models and 745 00:42:52,530 --> 00:42:54,310 Hidden Markov models.
746 00:42:54,310 --> 00:42:56,980 I tell them that I'm trying to figure out how people 747 00:42:56,980 --> 00:43:00,870 mispronounce words when they speak foreign languages. 748 00:43:00,870 --> 00:43:04,350 A lot easier for people to digest. 749 00:43:04,350 --> 00:43:07,500 With debugging, what are bugs? 750 00:43:07,500 --> 00:43:09,000 AUDIENCE: Mistakes. 751 00:43:09,000 --> 00:43:10,340 PROFESSOR: Mistakes. 752 00:43:10,340 --> 00:43:15,615 And if you see one bug, there are probably many more. 753 00:43:15,615 --> 00:43:18,310 754 00:43:18,310 --> 00:43:24,520 So when you're debugging, your goal is not to move quickly. 755 00:43:24,520 --> 00:43:29,320 This is an instance where the maxim fast is slow and slow is 756 00:43:29,320 --> 00:43:31,890 fast comes into play. 757 00:43:31,890 --> 00:43:34,560 You want to be very deliberate and systematic when you're 758 00:43:34,560 --> 00:43:37,380 trying to debug code. 759 00:43:37,380 --> 00:43:39,360 You want to ask the question why your code is 760 00:43:39,360 --> 00:43:41,610 doing what it does. 761 00:43:41,610 --> 00:43:46,720 And remember, the first recitation I said, that your 762 00:43:46,720 --> 00:43:48,110 computer's not going to do anything that you do 763 00:43:48,110 --> 00:43:49,360 not tell it to do. 764 00:43:49,360 --> 00:43:55,420 765 00:43:55,420 --> 00:43:59,770 It's not something that people do naturally. 766 00:43:59,770 --> 00:44:04,370 If you watch some of the TAs and sometimes a student will 767 00:44:04,370 --> 00:44:06,860 say, how do you find the bug so quickly? 768 00:44:06,860 --> 00:44:11,590 Well it's because I've been programming for 18 years. 769 00:44:11,590 --> 00:44:12,860 Professor Guttag's been programming 770 00:44:12,860 --> 00:44:14,030 for longer than that. 771 00:44:14,030 --> 00:44:16,430 So a lot of it is experience. 
772 00:44:16,430 --> 00:44:21,050 And it's just when we've debugged our own programs and when we 773 00:44:21,050 --> 00:44:24,100 were learning to program, it was as painful for us as it 774 00:44:24,100 --> 00:44:25,620 was for you. 775 00:44:25,620 --> 00:44:31,990 So that said, you want to start with asking, how could 776 00:44:31,990 --> 00:44:36,070 your code have produced the output that it did? 777 00:44:36,070 --> 00:44:40,200 Then you want to figure out some experiments that are 778 00:44:40,200 --> 00:44:45,000 repeatable and that you have an idea of what the 779 00:44:45,000 --> 00:44:47,020 output should be. 780 00:44:47,020 --> 00:44:54,920 So after you do that, then you want to test your code one by 781 00:44:54,920 --> 00:45:00,510 one on these different test cases and see what it does. 782 00:45:00,510 --> 00:45:02,020 And in order to see what it does, you 783 00:45:02,020 --> 00:45:04,390 can use a print statement. 784 00:45:04,390 --> 00:45:10,460 So when you think you found a bug and you think you have a 785 00:45:10,460 --> 00:45:15,220 solution to your code, you want to make as few changes as 786 00:45:15,220 --> 00:45:20,500 possible at a time. It's because as you're making 787 00:45:20,500 --> 00:45:22,190 corrections, you can still introduce bugs. 788 00:45:22,190 --> 00:45:28,230 789 00:45:28,230 --> 00:45:29,870 Let's see. 790 00:45:29,870 --> 00:45:32,460 So a useful way to do this is to use a test harness. 791 00:45:32,460 --> 00:45:35,340 So when we actually grade your problem sets, a lot of the 792 00:45:35,340 --> 00:45:39,270 time the TAs will put together a set of test 793 00:45:39,270 --> 00:45:41,840 cases for your code. 794 00:45:41,840 --> 00:45:46,020 So one of the things is a lot of the times when you get one 795 00:45:46,020 --> 00:45:48,130 of the problems or when you look at the problems, it'll 796 00:45:48,130 --> 00:45:51,430 have some example input and output.
797 00:45:51,430 --> 00:45:53,250 But that doesn't necessarily mean that we 798 00:45:53,250 --> 00:45:55,060 only test on that. 799 00:45:55,060 --> 00:45:57,250 There's additional test cases that we use. 800 00:45:57,250 --> 00:45:58,630 And it's not to trip you up. 801 00:45:58,630 --> 00:46:00,615 It's because there's a lot of different variations. 802 00:46:00,615 --> 00:46:03,400 803 00:46:03,400 --> 00:46:07,470 And it's also, if you read the specification, you follow the 804 00:46:07,470 --> 00:46:12,250 specification, then you'll be fine. 805 00:46:12,250 --> 00:46:13,420 Moving on. 806 00:46:13,420 --> 00:46:15,100 So let's look at an example. 807 00:46:15,100 --> 00:46:18,570 I have a function here, is_palindrome. 808 00:46:18,570 --> 00:46:21,580 You've seen this before, right? 809 00:46:21,580 --> 00:46:21,880 Yes? 810 00:46:21,880 --> 00:46:22,750 Yeah. 811 00:46:22,750 --> 00:46:23,940 OK. 812 00:46:23,940 --> 00:46:28,750 So it's supposed to return true if string s is a 813 00:46:28,750 --> 00:46:30,340 palindrome. 814 00:46:30,340 --> 00:46:32,370 And so I've written this function. 815 00:46:32,370 --> 00:46:34,920 And I've also written a test harness. 816 00:46:34,920 --> 00:46:38,130 Now there's a lot more code in the test harness, but it's 817 00:46:38,130 --> 00:46:39,380 pretty simple code. 818 00:46:39,380 --> 00:46:44,040 819 00:46:44,040 --> 00:46:48,300 When you're writing functions, you want to think of the type 820 00:46:48,300 --> 00:46:49,970 of input you could receive. 821 00:46:49,970 --> 00:46:54,680 And you want to think of, what are the 822 00:46:54,680 --> 00:46:55,520 kind of boundary cases? 823 00:46:55,520 --> 00:46:57,940 So the extremes of input that you can get. 824 00:46:57,940 --> 00:47:01,000 We call these boundary cases, edge cases. 825 00:47:01,000 --> 00:47:04,150 For the is_palindrome function, it would be like the 826 00:47:04,150 --> 00:47:07,960 empty string would be one.
827 00:47:07,960 --> 00:47:11,850 Or just a single character. 828 00:47:11,850 --> 00:47:15,600 These are the kind of minimum we can have or 829 00:47:15,600 --> 00:47:16,880 we could think of. 830 00:47:16,880 --> 00:47:21,270 On the opposite end of the spectrum, theoretically we 831 00:47:21,270 --> 00:47:23,920 could have an infinitely long string. 832 00:47:23,920 --> 00:47:26,350 So we're not going to actually test for an 833 00:47:26,350 --> 00:47:29,160 infinitely long string. 834 00:47:29,160 --> 00:47:32,710 Anyway, all we're going to do is in our test harness, we're 835 00:47:32,710 --> 00:47:35,320 just going to run the function on these inputs. 836 00:47:35,320 --> 00:47:38,740 And we know that an empty string should be true. 837 00:47:38,740 --> 00:47:40,010 We know that a single character 838 00:47:40,010 --> 00:47:41,260 string should be true. 839 00:47:41,260 --> 00:47:43,640 840 00:47:43,640 --> 00:47:47,630 We know that if I have a string that's two characters 841 00:47:47,630 --> 00:47:49,150 long and they're the same character, 842 00:47:49,150 --> 00:47:51,140 that should be true. 843 00:47:51,140 --> 00:47:53,100 If they're two different characters, then it should be false. 844 00:47:53,100 --> 00:47:55,620 845 00:47:55,620 --> 00:47:59,694 And what I'm going to do now is look at 846 00:47:59,694 --> 00:48:01,410 what we call expected input. 847 00:48:01,410 --> 00:48:05,710 So after I've hit my edge cases, I'm going to look at 848 00:48:05,710 --> 00:48:10,250 all the strings of an even length and make sure that the 849 00:48:10,250 --> 00:48:11,890 function works properly. 850 00:48:11,890 --> 00:48:15,120 And then I'm going to look at strings with an odd length.
851 00:48:15,120 --> 00:48:19,490 And then once I get to this point where I've tested a 852 00:48:19,490 --> 00:48:22,140 number of different lengths, and in this case, it's just 2 853 00:48:22,140 --> 00:48:24,470 through 5 or 0 through 5, if you want to 854 00:48:24,470 --> 00:48:26,630 include the edge cases. 855 00:48:26,630 --> 00:48:29,050 Then I'm going to say, well it looks like all tests passed. 856 00:48:29,050 --> 00:48:32,210 And I think that this function works pretty good for anything 857 00:48:32,210 --> 00:48:35,520 we can expect it to encounter reasonably. 858 00:48:35,520 --> 00:48:41,540 So the way that you use test harnesses, is every time you 859 00:48:41,540 --> 00:48:44,060 make a change to your program, you want to run the test 860 00:48:44,060 --> 00:48:47,950 harness, because that'll catch any bugs you may have 861 00:48:47,950 --> 00:48:48,870 introduced. 862 00:48:48,870 --> 00:48:50,700 And so I'm going to finish up with this really quickly 863 00:48:50,700 --> 00:48:53,120 because I know my time's up. 864 00:48:53,120 --> 00:48:56,000 865 00:48:56,000 --> 00:48:57,730 So I got a bug. 866 00:48:57,730 --> 00:49:01,910 It's telling me that one of my test cases failed. 867 00:49:01,910 --> 00:49:08,340 So line 299, which is this test case. 868 00:49:08,340 --> 00:49:16,430 So what we can do is now that we know that it fails, we can 869 00:49:16,430 --> 00:49:26,270 say, maybe print out our input and see what we have. 870 00:49:26,270 --> 00:49:33,150 And instead of just running, I'm just going to run that one 871 00:49:33,150 --> 00:49:34,400 test case that failed. 872 00:49:34,400 --> 00:49:46,450 873 00:49:46,450 --> 00:49:49,330 So obviously this should be true. 874 00:49:49,330 --> 00:49:51,760 And what we're seeing is that on the first call to 875 00:49:51,760 --> 00:49:55,410 is_palindrome, s is abba. 876 00:49:55,410 --> 00:50:02,120 And then on the recursive call to it, we only get bba.
877 00:50:02,120 --> 00:50:03,330 That means that we've only chopped 878 00:50:03,330 --> 00:50:05,920 off the front character. 879 00:50:05,920 --> 00:50:07,170 So you see the bug? 880 00:50:07,170 --> 00:50:11,350 881 00:50:11,350 --> 00:50:12,600 Well here. 882 00:50:12,600 --> 00:50:15,010 883 00:50:15,010 --> 00:50:18,330 So there we go. 884 00:50:18,330 --> 00:50:23,273
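The function and test harness on screen are not in the transcript, so what follows is a hedged reconstruction rather than the recitation's actual code. It assumes a recursive is_palindrome that compares the first and last characters and recurses on the rest; under that assumption, slicing with s[1:] instead of s[1:-1] would produce exactly the behavior described (abba on the first call, bba on the recursive call). The harness name run_harness and its list of cases are hypothetical, chosen to mirror the edge cases and the even and odd lengths mentioned above.

```python
def is_palindrome(s):
    """Return True if string s reads the same forwards and backwards."""
    # Edge cases: the empty string and single characters are palindromes.
    if len(s) <= 1:
        return True
    # The outer characters must match.
    if s[0] != s[-1]:
        return False
    # Chop off BOTH the front and the back character before recursing.
    # A version that passes s[1:] here only drops the front character,
    # so "abba" would recurse on "bba" and wrongly come back False.
    return is_palindrome(s[1:-1])


def run_harness():
    """Run is_palindrome on known cases and return the ones that failed."""
    cases = [
        ("", True), ("a", True),            # edge cases
        ("aa", True), ("ab", False),        # length 2
        ("aba", True), ("abc", False),      # length 3
        ("abba", True), ("abca", False),    # length 4
        ("abcba", True), ("abcda", False),  # length 5
    ]
    failures = []
    for text, expected in cases:
        if is_palindrome(text) != expected:
            failures.append((text, expected))
    return failures
```

An empty list from run_harness() means every case passed; rerunning it after each change is the habit the test harness is meant to support.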