The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: I wanted to give everybody a more conceptual idea of what big O notation is, as well as, hopefully, answer any lingering questions you might have about object-oriented programming. So I have these notes, and I typed them up, and they're pretty detailed, so I'm just going to go through some points kind of quickly. Who's still kind of unclear about why we even do big O notation?

So who can explain quickly why we do big O notation? What is it? Yeah?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Right. Exactly. So big O notation gives us an upper bound on how long something is going to take. Now, something that's important to remember is that it's not a time bound.
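One way to make the steps-not-seconds point concrete is to count comparisons explicitly. This is an editor's sketch, not code from the lecture: the step count a function performs is the same on any machine, even though the wall-clock time differs.

```python
def linear_search(items, target):
    """Return (index, steps) if target is found, (-1, steps) otherwise.

    The number of comparison steps is at most len(items) on any
    machine -- that machine-independent count is what big O describes.
    """
    steps = 0
    for i, item in enumerate(items):
        steps += 1              # one comparison per element examined
        if item == target:
            return i, steps
    return -1, steps

# Worst case (target absent): exactly len(items) steps, i.e. O(n).
idx, steps = linear_search([3, 1, 4, 1, 5], 9)
```

A faster computer finishes those five steps sooner, but it still performs five steps.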
So something that's often confusing is that people say, oh, this is something that'll tell us how long our program's going to run. That's actually not the case. Big O notation tells us how many steps something's going to take.

And so why is that important? Well, I look at all you guys, and a couple of you have laptops out. Everybody's computer runs things at a different speed, right? But if we say something is big O of n, what we're saying is that the worst-case number of steps your program takes is going to be linear with respect to the size of the input. So if my computer is five times faster than your computer, my computer will probably run it five times faster. But on either machine, as the size of the input grows, I expect the amount of time it takes to grow linearly.

So why is that important? At the bottom of page one: with big O notation, we are particularly concerned with the scalability of our functions.
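The scalability concern can be made concrete with a back-of-the-envelope step count (the numbers are illustrative and ignore constant factors; this is not from the handout): at input size n, an O(n) algorithm takes about n steps while an O(n^3) algorithm takes about n^3, so the gap between them widens by a factor of about n^2 as n grows.

```python
def step_ratio(n):
    """How many times more steps an O(n^3) algorithm takes than an
    O(n) algorithm at input size n (toy model, constants ignored)."""
    linear_steps = n
    cubic_steps = n ** 3
    return cubic_steps // linear_steps

# The ratio grows like n ** 2 -- the cubic algorithm scales badly.
for n in (10, 100, 1000):
    print(n, step_ratio(n))
```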
So big O notation might not predict what's going to be the fastest for really small inputs, say an array of size 10. You guys know a little bit about graphs, right? Suppose we have a graph of x squared and a graph of x cubed. There's a region where the graph of x squared is actually bigger than x cubed. But then all of a sudden, there's a point where x cubed just goes, whoosh, way bigger than x squared. So for some really small inputs, big O notation might not tell us which is the best function. But with big O notation, we're not concerned about small inputs. We're concerned about really big inputs. We're concerned about filtering the genome. We're concerned about analyzing data from Hubble, really huge blocks of data.

So suppose we're looking at a program that analyzes a segment of the human genome, say three million base pairs, and we have two algorithms. One runs in order n time, and one runs in order n cubed time.
What this means is that regardless of the machine we're running on (this is algorithm 1, this is algorithm 2), we'd expect algorithm 2 to run approximately n cubed over n, so approximately n squared, times slower. So with big O notation, you can compare two algorithms just by looking at the ratio of their big O running times. So if I'm looking at something that has an array of size two million as its input, is it clear that the order n algorithm is going to be a much better choice?

OK. So you'll run into that. Especially since a lot of you are taking this for the purposes of scientific computing, you'll run into big O notation a lot. It's important to have a grasp of what it means.

On the second page of the handout, I have some common ones that you'll see. The first one is constant time. We denote constant time as order 1. But you'll notice that what I have here is that order 1 is equal to order 10 is equal to order 2 to the 100th. That's unexpected to a lot of people who are learning about big O notation.
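In code terms, saying order 1 equals order 10 just means the step count is some fixed constant that never depends on the input size. A hypothetical pair of functions (the names are illustrative, not from the handout):

```python
def first_element(items):
    # one step, O(1)
    return items[0]

def first_element_checked(items):
    # several steps of bookkeeping, but still O(1): the amount of
    # work never depends on len(items)
    if not items:
        raise ValueError("empty sequence")
    x = items[0]
    return x
```

Whether the constant is 1 step or 10 steps, both are order 1: feeding either function a list of a million elements takes no more steps than a list of two.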
Why is this true? That seems kind of ridiculous. This is a really big number; this is a really small number. Yeah?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. Exactly. So we look at a graph of 1 and a graph of 2 to the 100th. We'll see that even though 2 to the 100th is much higher, much bigger than 1, as the size of our input grows, do we see any change in these two graphs? No. They're completely constant.

When you're doing big O notation, if you run across an algorithm that does not depend on the size of the input, it's always going to be order 1. Even if it takes 2 to the 100th steps, if it's a constant number of steps regardless of the size of the input, it's constant time.

Other ones you'll see are logarithmic time. Any base for logarithmic time is the same order, so order log base 2 of n is order log base 10 of n. This is the fastest time bound for search.
Does anybody know what type of search we'd be doing in logarithmic time? Something maybe we--

AUDIENCE: Bisection search.

PROFESSOR: Yeah. Exactly. Bisection search is logarithmic time, because we take our input, and at every step we cut it in half, cut it in half, cut it in half. That's the fastest search we can do.

Then order n is linear time. Order n log n is the fastest time bound we have for sort; we'll be talking about sort in a couple of weeks. And order n squared is quadratic time. Anything that is order n raised to some fixed power, so order n squared, order n cubed, order n to the fourth, all of that is going to be less than order something to the power of n. So if we have something that's order 2 to the n, that's ridiculous. That's a computationally very intensive algorithm.

So on page two, I have some questions for you: (1), (2), (3). Does order 100 n squared equal order n squared? Who says yes? All right. Very good.
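The bisection search mentioned a moment ago can be sketched as follows. The exact code from lecture isn't in the transcript, so this is one standard way to write it: each pass halves the remaining range, so a sorted list of n elements needs only about log base 2 of n iterations in the worst case.

```python
def bisection_search(sorted_items, target):
    """Return True if target is in sorted_items (which must be sorted).

    Each iteration halves the search range, so the worst case is
    about log2(len(sorted_items)) iterations: O(log n).
    """
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2        # cut the remaining range in half
        if sorted_items[mid] == target:
            return True
        elif sorted_items[mid] < target:
            lo = mid + 1            # discard the lower half
        else:
            hi = mid - 1            # discard the upper half
    return False
```

For a list of 1,000,000 elements, that's roughly 20 iterations instead of up to 1,000,000 for a linear scan.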
How about: does order one-quarter n cubed equal order n cubed? Does order n plus order n equal order n? The answer is yes to all of those. And the intuitive sense behind this is that big O notation deals with the limiting behavior of a function. So I made some nifty graphs for you guys to look at.

When we're comparing order 100 n squared to n squared, n cubed, and 1/4 n cubed, what people often think of is what I have here in the first figure. These are the four functions I just mentioned; there's a legend in the top left-hand corner, and the scale goes up to x equals 80. So you'll see at this scale, this line right here is 100 x squared. This is, I think, often a tripping point: when people are conceptualizing these functions, they say, well, yeah, 100 x squared is much bigger than x cubed, which is a lot bigger than 1/4 x cubed. For very small inputs, yes, that's true. But what we're concerned about is the behavior as the input gets very, very large.

So now we're looking at a size of up to 1,000.
So now we see here that x cubed, even though it's a little bit smaller than 100 x squared in the beginning, shoots off. x cubed is much bigger than either of the two x squared terms. And even 1/4 x cubed is becoming bigger than 100 x squared out at 1,000. So that's an intuitive sense of why x cubed, no matter what coefficient is in front of it, is going to dominate any term with x squared in it: x cubed is just going to go, whoosh, real big like that.

And if we go out even further, to an input size of 50,000, we see that even 100 x squared versus just x squared, all right? They're about the same. The x cubed terms are now way above x squared. So the two x squared terms, coefficient 100 versus coefficient 1, are about the same.

So this is the scale we're concerned about when we're talking about big O notation: the limiting behavior as your input size grows very large. And 50,000 is not even that large, if you think about the size of the genome. I mean, does anybody here do bio? What's the size of the human genome? How many base pairs?
Or even one gene or one chromosome?

AUDIENCE: [INAUDIBLE]

PROFESSOR: What's the biggest?

AUDIENCE: It's over 50,000.

PROFESSOR: Yeah, over 50,000. And think about the amount of data that we get back from the Hubble Space Telescope. I mean, the resolution on those things is absolutely ridiculous. And you run all sorts of algorithms on those images to try and see if there's life in the universe. So we're very concerned about the big, long-term behavior of these functions.

How about page three? One last question. Does order 100 n squared plus 1/4 n cubed equal order n cubed? Who says yes? And so I have one more graph.

Down here, these red dots are 100 x squared. These blue circles are 1/4 x cubed. And this line is the sum. We can see that this line is a little bit bigger than the 1/4 x cubed term. But really, the 100 x squared term has no effect this far out.
So that's why we're just going to drop any lower order terms. Whenever you're approached with a big O expression that has a bunch of constant factors and all sorts of different powers of n, you're always just going to drop all the constant factors and pick the biggest term. So this line right here is order n cubed. Is that clear to everybody?

So now I've gotten through the basics of how we analyze this and why we're looking at it. Let's look at some code.

So in the first example, for all of these statements right here, in Python we make the assumption that statements like x plus 1 and x times y, all these basic mathematical operations, are constant time. That's something you can just assume. So for this function down here, we have a constant time, constant time, constant time, constant time operation. So we'd say this function bar is what? What's its complexity?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. Constant time. So the complexity of all these functions is just order 1, because it doesn't matter how big the input is.
It's all going to run in constant time.

For this multiplication function here, we use a for loop. Oftentimes, when we see a for loop that's going through the input, that's a signal to us that it's probably going to contribute a factor of O(n). Why is that? What do we do in this for loop? We say for i in range(y). What does that mean? How many times do we execute that for loop? Yeah, y times. So if y is really small, we execute that for loop just a small number of times. But if y is really large, we execute that for loop a whole bunch of times. So when we're analyzing this, we see this for loop and we say, ah, that for loop must be O(y). Does that make sense to everybody? OK, good.

Let's look at factorial. Can anybody tell me what the complexity of factorial is?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. Order n. Why is it order n?

AUDIENCE: Because it has the same for loop.

PROFESSOR: Yeah. It's the exact same structure.
We have a for loop that's going through range(1, n + 1). So that's dependent on the size of n, so this for loop is order n. And inside the for loop, we just do a constant time operation. That's the other thing: just because we have this for loop doesn't mean that what's inside the for loop is going to be constant time. But in this case, order n times, we do a constant time operation, so this whole chunk, the for loop, is order n. Everything else is just constant time. So we have constant time, plus order n times constant time, plus constant time, and that's going to be order n.

How about this one? Factorial 2.

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. Exactly. This is also order n. The only thing that's different in this code is that we initialize this count variable, and inside the for loop we also increment this count variable. But both result *= num and count += 1 are constant time operations.
So if we do n times two constant time operations, that's still going to be order n.

So the takeaway I'm trying to demonstrate with these two examples is that a single line of code can generate a pretty complex thing, but a collection of lines of code might still be constant time. So you have to look at every line of code and consider it.

Now I've thrown in some conditionals here. What's the complexity of this guy?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. This is also linear. What's going on here? We initialize a variable count; that's constant time. We go through each character in a_string; this is linear in the size of a_string. Now we say if character == 't'. That check is also a constant time operation; it's just asking whether this one thing equals this other thing. When we're comparing two characters or two numbers, == or != is generally a constant time operation.
The exception to this might be equality on certain types. If you define a class and you define an equality method in your class, and that equality method is not constant time, then this == check might not be constant time. But on two characters, == is constant time, and the increment is constant time as well. So the whole thing is linear in the size of a_string.

Something that's important when you're doing this on exams: it's a good idea to define what n is before you give the complexity bound. So here I'm saying n is equal to the size of a_string, and now I can say this function is order n. What I'm saying is that it's linear with respect to the size, the length, of a_string. Because sometimes, like in the one where the inputs were x and y, the running time was only linear in the size of y, so you'd want to define n as the size of y before saying it was order n. So always be clear. If it's not clear, be sure to explicitly state what n is equal to.

This code's a little more tricky. What's going on here?

AUDIENCE: [INAUDIBLE]
PROFESSOR: Yeah. That was perfect. So just to reiterate: the for loop, we know, is linear with respect to the size of a_string; we have to go through every character in a_string. Now, the second part is if char in b_string. When we're looking at big O notation, we're worried about the worst case complexity, an upper bound. What's the worst case?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. If the character is not in b_string, we have to look at every single character in b_string before we can return False. So that is linear. This one single line, if char in b_string, is linear with respect to the size of b_string.

So how do we analyze the complexity of this? I want to be able to touch the screen. We have this for loop. Let's say n is the length of a_string; this for loop is executed n times. Every time we execute this for loop, we execute this inner body. And what's the time bound on the inner body?
Well, if we let m equal the length of b_string, then this check is order m every time we run it. So we run an order m operation order n times. So the complexity is-- we do something of size m, n times.

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. Just order n times m. If we execute an order m check order n times, we say this function is order n times m. Does that make sense to everybody? You'll see this with nested for loops; nested for loops are very similar to this.

While loops combine the best of conditionals with the best of for loops, because a while loop has a chance to act like a for loop, but a while loop can also have a conditional. It's actually possible to write a while loop with a complex conditional that also executes many times, and so you could have one single line of code generating something like order n squared complexity.

Let's look at factorial 3. Who can tell me the complexity of factorial 3?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah.
It's also linear. It's interesting that factorial is always linear despite its name. We have constant time operations. How many times does the while loop execute?

AUDIENCE: n times.

PROFESSOR: Yeah, n times. And what's inside the body of the while loop? Constant time operations. So we execute a bunch of constant time operations n times: order n.

How about this char_split example? This one's a little tricky, because you're like, well, what's the complexity of len? In Python, len is actually a constant time operation. This example is carefully crafted so that all of the operations in it are constant time: appending to a list is constant time, and indexing a string is constant time. So what's the complexity of char_split? Constant time?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Who would agree with constant time? And who would say it's linear time? OK, yeah. Very good. It is linear time.
468 00:22:06,090 --> 00:22:08,150 That's a correct intuition. 469 00:22:08,150 --> 00:22:10,990 We say while the length of the a string is not equal to the 470 00:22:10,990 --> 00:22:13,550 length of the result, these are two constant time 471 00:22:13,550 --> 00:22:14,560 operations. 472 00:22:14,560 --> 00:22:15,360 Well, what do we do? 473 00:22:15,360 --> 00:22:18,850 We append a value to the result, and then we add up 474 00:22:18,850 --> 00:22:19,920 this index. 475 00:22:19,920 --> 00:22:23,190 So when is this check going to be equal? 476 00:22:23,190 --> 00:22:25,490 This check's going to be equal when the length of the result 477 00:22:25,490 --> 00:22:27,060 is equal to the length of a string. 478 00:22:27,060 --> 00:22:28,980 And that's only going to happen after we've gone 479 00:22:28,980 --> 00:22:31,730 through the entire a string, and we've added each of its 480 00:22:31,730 --> 00:22:33,650 characters to result. 481 00:22:33,650 --> 00:22:38,520 So this is linear with respect to the size of a string. 482 00:22:38,520 --> 00:22:42,200 Something that's important to recognize is that not all 483 00:22:42,200 --> 00:22:45,390 string and list operations are constant time. 484 00:22:45,390 --> 00:22:50,110 There's a website here that, first off, says CPython if 485 00:22:50,110 --> 00:22:50,780 you go to it. 486 00:22:50,780 --> 00:22:53,710 CPython just means Python implemented in C, which is 487 00:22:53,710 --> 00:22:56,860 actually what you're running: CPython. 488 00:22:56,860 --> 00:22:59,200 So don't worry about that. 489 00:22:59,200 --> 00:23:01,920 There are often two time bound complexities. 490 00:23:01,920 --> 00:23:05,380 It says the amortized time and the worst case time. 491 00:23:05,380 --> 00:23:07,990 And so if you're looking for big O notation, you don't want 492 00:23:07,990 --> 00:23:09,010 to use the amortized time. 493 00:23:09,010 --> 00:23:12,150 You want to use the worst case time.
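Both linear examples just walked through can be sketched in Python. This is my reconstruction from the discussion, not the handout's exact code:

```python
def iterative_factorial(n):
    # The while loop runs n times, and each pass does only
    # constant-time work, so this is order n -- linear,
    # despite the name "factorial".
    result = 1
    while n > 0:
        result *= n
        n -= 1
    return result

def char_split(a_string):
    # len(), string indexing, and list append are all constant
    # time in Python.  The loop adds one character per pass, so
    # the lengths only become equal after len(a_string) passes:
    # linear in the size of a_string.
    result = []
    index = 0
    while len(a_string) != len(result):
        result.append(a_string[index])
        index += 1
    return result
```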
494 00:23:12,150 --> 00:23:14,870 And it's important to note that operations like slicing 495 00:23:14,870 --> 00:23:18,250 and copying actually aren't constant time. 496 00:23:18,250 --> 00:23:22,770 If you slice a list or a string, the complexity of that 497 00:23:22,770 --> 00:23:25,870 operation is going to depend on how big your slice is. 498 00:23:25,870 --> 00:23:27,670 Does that make sense? 499 00:23:27,670 --> 00:23:29,690 The way that a slice works is that it walks through 500 00:23:29,690 --> 00:23:33,750 the list until it gets to the index, and then keeps walking 501 00:23:33,750 --> 00:23:36,330 until the final index, and then copies that and 502 00:23:36,330 --> 00:23:37,870 returns it to you. 503 00:23:37,870 --> 00:23:40,990 So slicing is not constant time. 504 00:23:40,990 --> 00:23:43,150 Copying is similarly not constant time. 505 00:23:43,150 --> 00:23:47,070 506 00:23:47,070 --> 00:23:50,630 For this little snippet of code, this is just 507 00:23:50,630 --> 00:23:51,600 similar to what we-- 508 00:23:51,600 --> 00:23:52,850 yeah? 509 00:23:52,850 --> 00:23:54,100 AUDIENCE: [INAUDIBLE]. 510 00:23:54,100 --> 00:24:06,215 511 00:24:06,215 --> 00:24:09,140 PROFESSOR: So this is what I was saying. 512 00:24:09,140 --> 00:24:12,420 You want to define what n is. 513 00:24:12,420 --> 00:24:15,800 So we say something like n equals the length of a string. 514 00:24:15,800 --> 00:24:19,400 And then you can say it's order n. 515 00:24:19,400 --> 00:24:23,180 It's important to define what you're saying the complexity 516 00:24:23,180 --> 00:24:24,430 is related to. 517 00:24:24,430 --> 00:24:27,070 518 00:24:27,070 --> 00:24:31,280 So here, I'm saying if we let n equal the size of z, can 519 00:24:31,280 --> 00:24:33,280 anybody tell me what the complexity of this 520 00:24:33,280 --> 00:24:36,870 snippet of code is? 521 00:24:36,870 --> 00:24:37,285 [UNINTELLIGIBLE]. 522 00:24:37,285 --> 00:24:37,960 AUDIENCE: [INAUDIBLE].
523 00:24:37,960 --> 00:24:38,840 PROFESSOR: Yeah, precisely. 524 00:24:38,840 --> 00:24:39,750 Order n-squared. 525 00:24:39,750 --> 00:24:40,740 Why? 526 00:24:40,740 --> 00:24:44,360 Well, because we execute this for i for loop 527 00:24:44,360 --> 00:24:47,870 here order n times. 528 00:24:47,870 --> 00:24:50,860 Each time through this for loop, the body of this for 529 00:24:50,860 --> 00:24:53,880 loop is, in fact, another for loop. 530 00:24:53,880 --> 00:24:58,290 So my approach to problems like this is just step back a 531 00:24:58,290 --> 00:25:01,150 minute and ignore the outer loop. 532 00:25:01,150 --> 00:25:02,380 Just concentrate on the inner loop. 533 00:25:02,380 --> 00:25:04,010 What's the runtime of this inner loop? 534 00:25:04,010 --> 00:25:06,510 535 00:25:06,510 --> 00:25:06,740 Yeah. 536 00:25:06,740 --> 00:25:07,620 This is order n. 537 00:25:07,620 --> 00:25:08,830 We go through this. 538 00:25:08,830 --> 00:25:10,760 Now, go to the outer loop. 539 00:25:10,760 --> 00:25:12,630 Just ignore the body since we've already 540 00:25:12,630 --> 00:25:13,550 analyzed the body. 541 00:25:13,550 --> 00:25:14,500 Ignore it. 542 00:25:14,500 --> 00:25:17,640 What's the complexity of the outer loop? 543 00:25:17,640 --> 00:25:19,270 Also order n. 544 00:25:19,270 --> 00:25:21,200 So now you can combine the analysis. 545 00:25:21,200 --> 00:25:26,190 You can say for order n times, I execute this body. 546 00:25:26,190 --> 00:25:28,950 This body takes order n time. 547 00:25:28,950 --> 00:25:34,160 So if we execute something that's order n, order n times, that is 548 00:25:34,160 --> 00:25:36,040 order n-squared complexity. 549 00:25:36,040 --> 00:25:39,040 So we just multiply how long the outer loop takes by how 550 00:25:39,040 --> 00:25:40,550 long the inner body of the loop takes.
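That multiply-the-loops analysis can be sketched with a step counter. The function and variable names are my own (the lecture's snippet over z isn't shown in full here):

```python
def nested_steps(z):
    # Outer loop: order n.  Inner loop: order n on each pass.
    # Multiplying the two gives n * n = order n-squared work.
    n = len(z)
    steps = 0
    for i in range(n):        # outer loop runs n times
        for j in range(n):    # inner loop runs n times per pass
            steps += 1        # constant-time body
    return steps
```

For an input of size 3 this counts 9 steps, for size 10 it counts 100, matching the n-squared growth described above.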
551 00:25:40,550 --> 00:25:44,370 And so in this fashion, I could now give you probably a 552 00:25:44,370 --> 00:25:46,240 four- or five-level nested for loop, and you could tell me the 553 00:25:46,240 --> 00:25:47,490 complexity of it. 554 00:25:47,490 --> 00:25:52,900 555 00:25:52,900 --> 00:25:57,550 Harder sometimes to understand is recursion. 556 00:25:57,550 --> 00:26:00,180 I don't know how important it is to understand this because 557 00:26:00,180 --> 00:26:01,820 I've never actually taught this class before. 558 00:26:01,820 --> 00:26:03,440 But Mitch did tell me to go over this. 559 00:26:03,440 --> 00:26:06,500 So I'd like to touch on it. 560 00:26:06,500 --> 00:26:09,080 So consider recursive factorial. 561 00:26:09,080 --> 00:26:10,630 What's the time complexity of this? 562 00:26:10,630 --> 00:26:13,740 How can we figure out the time complexity 563 00:26:13,740 --> 00:26:14,990 of a recursive function? 564 00:26:14,990 --> 00:26:23,020 565 00:26:23,020 --> 00:26:24,950 The way we want to figure out the time complexity of a 566 00:26:24,950 --> 00:26:27,530 recursive function is just to figure out how many times 567 00:26:27,530 --> 00:26:30,570 we're executing said recursive function. 568 00:26:30,570 --> 00:26:35,290 So here I have recursive factorial of n. 569 00:26:35,290 --> 00:26:39,750 When I make a call to this, what do I do? 570 00:26:39,750 --> 00:26:45,850 I make a call to recursive factorial of n minus 1. 571 00:26:45,850 --> 00:26:47,330 And then what does this do? 572 00:26:47,330 --> 00:26:51,430 This calls recursive factorial on a sub problem of 573 00:26:51,430 --> 00:26:54,340 size n minus 2. 574 00:26:54,340 --> 00:27:00,180 So oftentimes, when you're dealing with recursive 575 00:27:00,180 --> 00:27:02,620 problems to figure out the complexity, what you need to 576 00:27:02,620 --> 00:27:06,190 do is you need to figure out how many times you're going to 577 00:27:06,190 --> 00:27:10,040 make a recursive call before a result is returned.
578 00:27:10,040 --> 00:27:12,590 Intuitively, we can start to see a pattern. 579 00:27:12,590 --> 00:27:16,630 We can say, I called on n, and then n minus 1, and then n 580 00:27:16,630 --> 00:27:22,850 minus 2, and I keep calling recursive factorial until n is 581 00:27:22,850 --> 00:27:24,740 less than or equal to 0. 582 00:27:24,740 --> 00:27:27,180 When is n going to be less than or equal to 0? 583 00:27:27,180 --> 00:27:28,740 Well, when I get to n minus n. 584 00:27:28,740 --> 00:27:31,360 585 00:27:31,360 --> 00:27:34,626 So how many calls is that? 586 00:27:34,626 --> 00:27:35,594 AUDIENCE: [INAUDIBLE]. 587 00:27:35,594 --> 00:27:36,195 PROFESSOR: Yeah. 588 00:27:36,195 --> 00:27:38,030 This is n calls. 589 00:27:38,030 --> 00:27:43,530 So it's a good practice to get into being able to draw this 590 00:27:43,530 --> 00:27:46,430 out and work yourself through how many times you're running 591 00:27:46,430 --> 00:27:47,680 the recursion. 592 00:27:47,680 --> 00:27:51,000 And when we see we're making n calls, we can say, oh, this 593 00:27:51,000 --> 00:27:52,250 must be linear in time. 594 00:27:52,250 --> 00:27:56,720 595 00:27:56,720 --> 00:27:58,815 How about this one, this foo function? 596 00:27:58,815 --> 00:28:06,410 597 00:28:06,410 --> 00:28:09,720 This one's a little harder to see. 598 00:28:09,720 --> 00:28:12,330 But what are we doing? 599 00:28:12,330 --> 00:28:20,480 We call foo on input of size n, which then makes a call to a 600 00:28:20,480 --> 00:28:24,370 sub problem of size n/2, which makes a call to a sub 601 00:28:24,370 --> 00:28:34,200 problem of size n/4 and so on until I make a call to a sub 602 00:28:34,200 --> 00:28:35,760 problem of some size. 603 00:28:35,760 --> 00:28:38,260 So this is n. 604 00:28:38,260 --> 00:28:40,840 This is 2 to the 1st. 605 00:28:40,840 --> 00:28:43,150 This is 2-squared. 606 00:28:43,150 --> 00:28:44,350 We start to see a pattern-- 607 00:28:44,350 --> 00:28:47,610 2-squared, 2-cubed, 2 to the fourth.
608 00:28:47,610 --> 00:28:50,570 So we're going to keep making calls on a smaller, and 609 00:28:50,570 --> 00:28:52,820 smaller, and smaller sub problem size. 610 00:28:52,820 --> 00:28:56,770 But instead of decreasing linearly like before, we're decreasing 611 00:28:56,770 --> 00:28:58,100 at an exponential rate. 612 00:28:58,100 --> 00:29:01,630 613 00:29:01,630 --> 00:29:03,470 There's a bunch of different ways to try and work this out 614 00:29:03,470 --> 00:29:04,140 in your head. 615 00:29:04,140 --> 00:29:06,160 I wrote up one possible description. 616 00:29:06,160 --> 00:29:10,330 But when we're decreasing at this exponential rate, what's 617 00:29:10,330 --> 00:29:15,360 going to end up happening is that this recursive problem, 618 00:29:15,360 --> 00:29:21,900 where we make a recursive call to a sub problem of 619 00:29:21,900 --> 00:29:28,310 size n/b, the complexity of that is always going to be log 620 00:29:28,310 --> 00:29:30,450 base b of n. 621 00:29:30,450 --> 00:29:33,840 So this is just like bisection search, where in 622 00:29:33,840 --> 00:29:36,620 bisection search, we essentially 623 00:29:36,620 --> 00:29:39,450 restrict the problem size by half every time. 624 00:29:39,450 --> 00:29:41,950 And that leads to logarithmic time, actually 625 00:29:41,950 --> 00:29:43,610 log base 2 of n. 626 00:29:43,610 --> 00:29:46,540 This problem is also log base 2 of n. 627 00:29:46,540 --> 00:29:54,520 If we change this recursive call from n/2 to n/6, we get a 628 00:29:54,520 --> 00:29:58,590 time complexity of log base 6 of n. 629 00:29:58,590 --> 00:30:00,120 So try and work that through. 630 00:30:00,120 --> 00:30:02,610 You can read this closer later. 631 00:30:02,610 --> 00:30:06,280 Definitely ask me if you need more help on that one. 632 00:30:06,280 --> 00:30:09,460 The last one is how do we deal with the time complexity of something 633 00:30:09,460 --> 00:30:10,710 like Fibonacci?
634 00:30:10,710 --> 00:30:13,250 635 00:30:13,250 --> 00:30:19,260 Fibonacci, fib n minus 1 plus fib n minus 2, initially, that 636 00:30:19,260 --> 00:30:20,360 kind of looks linear. 637 00:30:20,360 --> 00:30:20,700 Right? 638 00:30:20,700 --> 00:30:24,960 We just went over the recursive factorial, and it 639 00:30:24,960 --> 00:30:28,520 made the call to a sub problem of size n minus 1. 640 00:30:28,520 --> 00:30:31,280 And that was linear. 641 00:30:31,280 --> 00:30:33,170 Fibonacci's a little bit different. 642 00:30:33,170 --> 00:30:36,870 If you actually draw it out in a tree, you start to see that at 643 00:30:36,870 --> 00:30:44,090 every level of the tree, we expand the calls by 2. 644 00:30:44,090 --> 00:30:47,690 Now imagine this is just for Fibonacci of 6. 645 00:30:47,690 --> 00:30:49,610 Whenever you're doing big O complexity, you want to 646 00:30:49,610 --> 00:30:53,516 imagine it and put in 100,000, 50,000. 647 00:30:53,516 --> 00:30:55,585 And you could imagine how big that tree grows. 648 00:30:55,585 --> 00:30:58,710 649 00:30:58,710 --> 00:31:03,640 Intuitively, the point to see here is that there are going to 650 00:31:03,640 --> 00:31:10,340 be about n levels to get down to 1 from your 651 00:31:10,340 --> 00:31:12,500 initial input of 6. 652 00:31:12,500 --> 00:31:15,150 So to get down to 1 from an initial input of size n is 653 00:31:15,150 --> 00:31:17,260 going to take about n levels. 654 00:31:17,260 --> 00:31:21,790 The branching factor of this tree at each level is 2. 655 00:31:21,790 --> 00:31:26,460 So if we have n levels, and at each level, the number of 656 00:31:26,460 --> 00:31:30,780 calls doubles again, we can say that a loose bound 657 00:31:30,780 --> 00:31:32,840 on the complexity of this is actually 2 to the n. 658 00:31:32,840 --> 00:31:35,700 659 00:31:35,700 --> 00:31:39,450 This is something that's even less intuitive, I think, than 660 00:31:39,450 --> 00:31:41,430 what we did before with the logarithms.
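One way to check all three recursion intuitions is to instrument each shape with a call counter. These are sketches of my own (each function returns the number of calls made, not the usual return value):

```python
def factorial_calls(n):
    # Recurse on n - 1 until n <= 0: about n calls total,
    # so recursive factorial is linear.
    if n <= 0:
        return 1
    return 1 + factorial_calls(n - 1)

def halving_calls(n):
    # Recurse on n // 2 (the foo shape): the input shrinks by a
    # factor of 2 per call, so only about log base 2 of n calls.
    if n <= 1:
        return 1
    return 1 + halving_calls(n // 2)

def fib_calls(n):
    # Two recursive calls per call, a tree about n levels deep:
    # the call count grows exponentially, loosely order 2 to the n.
    if n <= 1:
        return 1
    return 1 + fib_calls(n - 1) + fib_calls(n - 2)
```

Running these for growing n shows the three growth rates side by side: linear, logarithmic, and exponential.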
661 00:31:41,430 --> 00:31:43,640 So try and work through it again. 662 00:31:43,640 --> 00:31:45,020 Play with it a little bit. 663 00:31:45,020 --> 00:31:47,370 There's actually a tighter bound on this, which is like 664 00:31:47,370 --> 00:31:51,170 1.62 to the n, which involves a lot more complicated math that you 665 00:31:51,170 --> 00:31:53,130 could look up. 666 00:31:53,130 --> 00:31:56,530 But for the purposes of this class, it's sufficient to say 667 00:31:56,530 --> 00:31:58,610 that Fibonacci is order 2 to the n. 668 00:31:58,610 --> 00:32:02,900 669 00:32:02,900 --> 00:32:07,870 So does that roughly clear up some time complexity stuff 670 00:32:07,870 --> 00:32:08,910 for you guys? 671 00:32:08,910 --> 00:32:09,400 OK, awesome. 672 00:32:09,400 --> 00:32:10,320 Does anybody have the time? 673 00:32:10,320 --> 00:32:11,700 I forgot my watch today. 674 00:32:11,700 --> 00:32:12,660 AUDIENCE: 12:42. 675 00:32:12,660 --> 00:32:15,350 PROFESSOR: OK, excellent. 676 00:32:15,350 --> 00:32:17,590 That gives us a little bit of time to talk about 677 00:32:17,590 --> 00:32:18,555 object-oriented programming. 678 00:32:18,555 --> 00:32:21,650 Does anybody have any specific questions about object-oriented 679 00:32:21,650 --> 00:32:24,500 programming? 680 00:32:24,500 --> 00:32:24,990 How about this? 681 00:32:24,990 --> 00:32:27,106 How many of you guys finished the problem set and turned it 682 00:32:27,106 --> 00:32:28,330 in already? 683 00:32:28,330 --> 00:32:32,890 Or did any of you guys not turn in the problem set yet? 684 00:32:32,890 --> 00:32:36,640 I'll talk loosely about it then, not too specifically. 685 00:32:36,640 --> 00:32:39,120 Does anybody have any questions from, I guess, at 686 00:32:39,120 --> 00:32:40,210 least the first part? 687 00:32:40,210 --> 00:32:43,720 We're making some classes, making some trigger classes. 688 00:32:43,720 --> 00:32:44,593 Yeah? 689 00:32:44,593 --> 00:32:45,843 AUDIENCE: [INAUDIBLE]?
690 00:32:45,843 --> 00:32:50,251 691 00:32:50,251 --> 00:32:51,535 PROFESSOR: Self dot what? 692 00:32:51,535 --> 00:32:52,785 AUDIENCE: [INAUDIBLE]. 693 00:32:52,785 --> 00:32:56,415 694 00:32:56,415 --> 00:32:58,050 PROFESSOR: When we have like self-- 695 00:32:58,050 --> 00:33:00,120 we have like the getter methods. 696 00:33:00,120 --> 00:33:02,505 So what's important about that? 697 00:33:02,505 --> 00:33:04,870 I'll tell you what's important about that. 698 00:33:04,870 --> 00:33:07,600 So we have a class. 699 00:33:07,600 --> 00:33:08,930 Let's say we have a class person. 700 00:33:08,930 --> 00:33:16,490 701 00:33:16,490 --> 00:33:21,900 So we define our __init__ method to just take a name. 702 00:33:21,900 --> 00:33:32,230 703 00:33:32,230 --> 00:33:34,600 And so now, what the problem set asked you to do was to 704 00:33:34,600 --> 00:33:36,310 define a getter method. 705 00:33:36,310 --> 00:33:44,410 Define a getter method called get_name that 706 00:33:44,410 --> 00:33:45,660 just returns the attribute. 707 00:33:45,660 --> 00:33:50,690 708 00:33:50,690 --> 00:33:52,500 So what's the point of this? 709 00:33:52,500 --> 00:33:58,333 Because I can just say Sally equals person. 710 00:33:58,333 --> 00:34:07,550 711 00:34:07,550 --> 00:34:09,639 So here, I defined a person named Sally. 712 00:34:09,639 --> 00:34:15,580 And I initialized a person with the string Sally. 713 00:34:15,580 --> 00:34:19,730 If I just look at sally.name, that's going to just directly 714 00:34:19,730 --> 00:34:21,630 print the attribute. 715 00:34:21,630 --> 00:34:25,010 So why do we need this get name function? 716 00:34:25,010 --> 00:34:27,989 What's the point of this additional getter method? 717 00:34:27,989 --> 00:34:30,590 Does anybody know why that is? 718 00:34:30,590 --> 00:34:31,840 AUDIENCE: [INAUDIBLE]. 719 00:34:31,840 --> 00:34:34,510 720 00:34:34,510 --> 00:34:35,239 PROFESSOR: Right. 721 00:34:35,239 --> 00:34:36,070 So that's what it does.
722 00:34:36,070 --> 00:34:38,380 This get_name does return the attribute name. 723 00:34:38,380 --> 00:34:42,130 But we don't need this method to just look at 724 00:34:42,130 --> 00:34:43,380 the attribute name. 725 00:34:43,380 --> 00:34:46,070 726 00:34:46,070 --> 00:34:47,320 Let's actually code this up. 727 00:34:47,320 --> 00:34:58,970 728 00:34:58,970 --> 00:35:00,220 So we have class person. 729 00:35:00,220 --> 00:35:18,760 730 00:35:18,760 --> 00:35:22,470 So if we run this code, and over here in the shell, we 731 00:35:22,470 --> 00:35:28,210 define Sally equals person with the name Sally. 732 00:35:28,210 --> 00:35:31,110 733 00:35:31,110 --> 00:35:37,200 If I just print sally.name, it prints the attribute. 734 00:35:37,200 --> 00:35:42,970 So why did I need to provide this getter method called 735 00:35:42,970 --> 00:35:46,660 get_name that does the same thing? 736 00:35:46,660 --> 00:35:47,890 That's the question. 737 00:35:47,890 --> 00:35:51,200 That seems sort of redundant. 738 00:35:51,200 --> 00:35:54,610 But there's actually a pretty big important reason for it. 739 00:35:54,610 --> 00:35:59,200 Let's say we set s name equal to the attribute sally.name. 740 00:35:59,200 --> 00:36:02,930 741 00:36:02,930 --> 00:36:07,300 If we look at s name, we see Sally. 742 00:36:07,300 --> 00:36:09,202 Now if I say-- 743 00:36:09,202 --> 00:36:11,830 actually, I'm not sure if this is the correct reasoning. 744 00:36:11,830 --> 00:36:41,770 745 00:36:41,770 --> 00:36:43,970 This is going to be better. 746 00:36:43,970 --> 00:36:49,910 Let's say Sally equals a person Sally 747 00:36:49,910 --> 00:36:51,600 who's taking what? 748 00:36:51,600 --> 00:36:59,560 1803, 605, 11.1. 749 00:36:59,560 --> 00:37:04,460 So now I can look at the attribute classes to show 750 00:37:04,460 --> 00:37:08,020 Sally's classes, which are weird flows. 751 00:37:08,020 --> 00:37:12,610 And I can also use sally.getclasses to look at 752 00:37:12,610 --> 00:37:15,300 Sally's classes. 
753 00:37:15,300 --> 00:37:21,740 If I set a variable s classes equal to sally.classes, this 754 00:37:21,740 --> 00:37:25,240 binds this variable s classes to the attribute 755 00:37:25,240 --> 00:37:26,960 sally.classes. 756 00:37:26,960 --> 00:37:38,290 Now if I say sclasses.append 1401, if I now look at the 757 00:37:38,290 --> 00:37:46,990 attribute sally.classes, it now has 1401 in it. 758 00:37:46,990 --> 00:37:48,140 This is not safe. 759 00:37:48,140 --> 00:37:49,500 This is not type safe. 760 00:37:49,500 --> 00:37:52,970 The reason for that is if you define a class, and you 761 00:37:52,970 --> 00:37:56,310 access the class's attributes directly instead of through a 762 00:37:56,310 --> 00:37:58,910 getter method, you can then do this. 763 00:37:58,910 --> 00:38:01,250 And sometimes, it's accidental. 764 00:38:01,250 --> 00:38:05,080 You'll set some variable equal to some attribute of a class. 765 00:38:05,080 --> 00:38:10,130 Then later on in your code, you'll alter that variable. 766 00:38:10,130 --> 00:38:14,010 But that variable is not a copy of the attribute. 767 00:38:14,010 --> 00:38:17,250 Yes, you can make copies of that attribute and stuff, but 768 00:38:17,250 --> 00:38:20,680 the overall takeaway is that in programming, we try to do 769 00:38:20,680 --> 00:38:22,610 something called defensive programming. 770 00:38:22,610 --> 00:38:23,900 This isn't defensive. 771 00:38:23,900 --> 00:38:29,770 Because it is possible if you code it incorrectly to alter 772 00:38:29,770 --> 00:38:33,570 the attribute of the instance of the class. 773 00:38:33,570 --> 00:38:36,030 But if we use the getter method, if instead of 774 00:38:36,030 --> 00:38:38,470 sally.classes, instead of directly accessing the 775 00:38:38,470 --> 00:38:41,160 attribute here, we had set s classes equal to 776 00:38:41,160 --> 00:38:42,930 sally.getclasses. 777 00:38:42,930 --> 00:38:45,780 And then, we had changed s classes around.
778 00:38:45,780 --> 00:38:50,180 That wouldn't have happened, because the getter method, 779 00:38:50,180 --> 00:38:51,900 while it does return self.classes, 780 00:38:51,900 --> 00:38:55,830 can be written to return a copy of the list instead of the 781 00:38:55,830 --> 00:39:00,630 list itself, so the caller never gets a reference to the 782 00:39:00,630 --> 00:39:02,850 actual attribute and can't change it. 783 00:39:02,850 --> 00:39:04,075 Does that make sense? 784 00:39:04,075 --> 00:39:04,670 All right. 785 00:39:04,670 --> 00:39:05,730 Cool. 786 00:39:05,730 --> 00:39:07,360 Other questions about classes? 787 00:39:07,360 --> 00:39:09,770 We have a little class up here if there's like some basic 788 00:39:09,770 --> 00:39:12,820 stuff that you'd like explained again. 789 00:39:12,820 --> 00:39:14,050 Now's the time. 790 00:39:14,050 --> 00:39:15,300 AUDIENCE: [INAUDIBLE]. 791 00:39:15,300 --> 00:39:20,290 792 00:39:20,290 --> 00:39:23,900 PROFESSOR: So here, I'm setting just some variable s 793 00:39:23,900 --> 00:39:28,620 classes equal to the attribute sally.classes. 794 00:39:28,620 --> 00:39:31,800 It's just like setting any sort of variable equal to some 795 00:39:31,800 --> 00:39:32,370 other quantity. 796 00:39:32,370 --> 00:39:35,230 AUDIENCE: So you appended to the variable, but it also appended 797 00:39:35,230 --> 00:39:38,282 to like the attribute of Sally? 798 00:39:38,282 --> 00:39:42,540 PROFESSOR: So what I did here was I set the variable s 799 00:39:42,540 --> 00:39:46,710 classes equal to this attribute sally.classes. 800 00:39:46,710 --> 00:39:49,730 And then, because I know this is a list, I appended another 801 00:39:49,730 --> 00:39:51,550 value to it. 802 00:39:51,550 --> 00:39:54,440 But this is the same as when we have two lists. 803 00:39:54,440 --> 00:39:58,500 If we have a list called a, and we say a is equal to 1, 2, 804 00:39:58,500 --> 00:40:02,440 3, then I say b is equal to a. 805 00:40:02,440 --> 00:40:04,894 What is b?
806 00:40:04,894 --> 00:40:12,910 Now if I say b.append 1401, what does b look like? 807 00:40:12,910 --> 00:40:15,500 What does a look like? 808 00:40:15,500 --> 00:40:17,430 Because they're aliases of each other. 809 00:40:17,430 --> 00:40:20,900 So what I did here, when I set s classes directly equal to 810 00:40:20,900 --> 00:40:24,170 the attribute sally.classes, I made s classes an 811 00:40:24,170 --> 00:40:26,180 alias of the attribute. 812 00:40:26,180 --> 00:40:30,440 But the problem with that is that then I can change them. 813 00:40:30,440 --> 00:40:31,770 And because they're aliases, the 814 00:40:31,770 --> 00:40:33,720 attribute itself has changed. 815 00:40:33,720 --> 00:40:36,100 And we don't want to do that in object-oriented 816 00:40:36,100 --> 00:40:36,680 programming. 817 00:40:36,680 --> 00:40:38,300 When we define an object, 818 00:40:38,300 --> 00:40:41,520 the only way you should be able to change an attribute is 819 00:40:41,520 --> 00:40:44,330 through some method of the class that allows you to 820 00:40:44,330 --> 00:40:46,760 change that attribute. 821 00:40:46,760 --> 00:40:49,970 So if I want to be able to add a class to Sally's class 822 00:40:49,970 --> 00:40:59,700 list, I should define a method called add_class 823 00:40:59,700 --> 00:41:04,260 that does self.classes.append(new_class). 824 00:41:04,260 --> 00:41:08,280 825 00:41:08,280 --> 00:41:11,740 While technically, it's possible to directly access an 826 00:41:11,740 --> 00:41:15,540 attribute, it's really bad practice to do so simply 827 00:41:15,540 --> 00:41:18,260 because this unexpected behavior can result. 828 00:41:18,260 --> 00:41:21,340 And also because if you say, oh, well, it's not going to 829 00:41:21,340 --> 00:41:23,240 matter for this one time, I'll remember how to 830 00:41:23,240 --> 00:41:24,420 do the right thing.
831 00:41:24,420 --> 00:41:26,750 The problem with that is it's often the case that you're not 832 00:41:26,750 --> 00:41:29,110 the only person using your code. 833 00:41:29,110 --> 00:41:32,320 So it's a better practice to provide all the 834 00:41:32,320 --> 00:41:35,230 methods that you would need in order to 835 00:41:35,230 --> 00:41:38,570 access and change attributes as 836 00:41:38,570 --> 00:41:41,930 methods within the class. 837 00:41:41,930 --> 00:41:43,180 Does that make sense? 838 00:41:43,180 --> 00:41:46,060 839 00:41:46,060 --> 00:41:48,380 So yeah, this is maybe our one violation, if you guys have 840 00:41:48,380 --> 00:41:50,990 been attending my recitation, of 841 00:41:50,990 --> 00:41:53,910 our mantra that programmers are lazy. 842 00:41:53,910 --> 00:41:56,320 This is less lazy than just directly accessing the 843 00:41:56,320 --> 00:41:57,240 attributes. 844 00:41:57,240 --> 00:41:59,690 But even though we know that programmers are super, super 845 00:41:59,690 --> 00:42:02,950 lazy, programmers also like to be super, super safe. 846 00:42:02,950 --> 00:42:05,580 So when there's a trade off between defensive programming 847 00:42:05,580 --> 00:42:07,780 and being lazy, always pick defensive programming. 848 00:42:07,780 --> 00:42:11,882
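The whole aliasing discussion and the defensive fix condense into one runnable sketch. The class layout and course names here are illustrative, modeled on the examples above rather than copied from the problem set:

```python
class Person(object):
    def __init__(self, name, classes):
        self.name = name
        self.classes = classes

    def get_classes(self):
        # Defensive getter: hand back a copy, not the list itself,
        # so callers can't mutate the attribute by accident.
        return self.classes[:]

    def add_class(self, new_class):
        # The one sanctioned way to change the attribute.
        self.classes.append(new_class)

sally = Person("Sally", ["18.03"])

alias = sally.classes        # direct access: an alias, not a copy
alias.append("6.00")         # ...so this silently changes sally too

safe = sally.get_classes()   # the getter returns a copy
safe.append("11.1")          # ...so this change stays local

sally.add_class("14.01")     # mutate only through the class's method
```

After running this, the appended "6.00" shows up on the instance (the alias leaked), while "11.1" does not (the copy protected it).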