1 00:00:00,000 --> 00:00:00,530 2 00:00:00,530 --> 00:00:02,960 The following content is provided under a Creative 3 00:00:02,960 --> 00:00:04,370 Commons license. 4 00:00:04,370 --> 00:00:07,410 Your support will help MIT OpenCourseWare continue to 5 00:00:07,410 --> 00:00:11,060 offer high quality educational resources for free. 6 00:00:11,060 --> 00:00:13,960 To make a donation or view additional materials from 7 00:00:13,960 --> 00:00:19,790 hundreds of MIT courses, visit MIT OpenCourseWare at 8 00:00:19,790 --> 00:00:22,456 ocw.mit.edu. 9 00:00:22,456 --> 00:00:25,760 PROFESSOR: Today's focus is probability and statistics. 10 00:00:25,760 --> 00:00:29,180 So let's start with probability. 11 00:00:29,180 --> 00:00:33,246 Let's look at probability for binary variables. 12 00:00:33,246 --> 00:00:39,160 13 00:00:39,160 --> 00:00:43,070 What do we mean by a binary variable? 14 00:00:43,070 --> 00:00:45,650 It can have only two outcomes. 15 00:00:45,650 --> 00:00:48,920 So it can take only two values. 16 00:00:48,920 --> 00:00:55,320 For example, it could be 0 or 1, head or tail, on or off. 17 00:00:55,320 --> 00:00:58,460 18 00:00:58,460 --> 00:01:03,670 So we are going to call this variable A, for instance. 19 00:01:03,670 --> 00:01:11,900 So A could be H, or A is equal to T. But that could happen. 20 00:01:11,900 --> 00:01:16,110 That event could happen with a certain probability. 21 00:01:16,110 --> 00:01:18,920 So by that, I mean the probability, like we are 22 00:01:18,920 --> 00:01:21,520 expressing the belief that the 23 00:01:21,520 --> 00:01:24,170 particular event could happen. 24 00:01:24,170 --> 00:01:28,190 So we could assign a value to that. 25 00:01:28,190 --> 00:01:36,380 That is the probability of A taking value H. 26 00:01:36,380 --> 00:01:41,170 So here, the values of A and B-- 27 00:01:41,170 --> 00:01:45,410 sorry, here, the value of A can be either H or T, which 28 00:01:45,410 --> 00:01:49,370 means it has only two possible outcomes. 
29 00:01:49,370 --> 00:01:51,190 That's why we call it a binary variable. 30 00:01:51,190 --> 00:02:06,470 However, P of A is equal to H can lie anywhere from 0 to 1, 31 00:02:06,470 --> 00:02:08,878 including 0 and 1. 32 00:02:08,878 --> 00:02:10,190 AUDIENCE: They don't have to be even? 33 00:02:10,190 --> 00:02:10,876 PROFESSOR: Sorry? 34 00:02:10,876 --> 00:02:12,750 AUDIENCE: They don't have to be even? 35 00:02:12,750 --> 00:02:13,090 PROFESSOR: Even? 36 00:02:13,090 --> 00:02:17,190 AUDIENCE: Even chance, even probability, like the same. 37 00:02:17,190 --> 00:02:19,900 PROFESSOR: Sorry, I didn't get your question. 38 00:02:19,900 --> 00:02:23,050 AUDIENCE: Even though they're binary, don't they need 39 00:02:23,050 --> 00:02:27,780 to have the same probability? 40 00:02:27,780 --> 00:02:29,450 PROFESSOR: OK, we'll look at that later. 41 00:02:29,450 --> 00:02:32,630 Like, this particular event can take a particular 42 00:02:32,630 --> 00:02:33,600 probability. 43 00:02:33,600 --> 00:02:36,500 And we'll look at that particular case later. 44 00:02:36,500 --> 00:02:39,130 But in general, a probability will always lie 45 00:02:39,130 --> 00:02:40,740 between 0 and 1. 46 00:02:40,740 --> 00:02:43,770 47 00:02:43,770 --> 00:02:48,740 And it can take any value between 0 and 1 since the 48 00:02:48,740 --> 00:02:51,395 range it can take is continuous. 49 00:02:51,395 --> 00:02:57,870 50 00:02:57,870 --> 00:03:02,330 However, the value the variable can take is going to 51 00:03:02,330 --> 00:03:03,330 be discrete. 52 00:03:03,330 --> 00:03:08,190 It can take only H or T. So that's why you call it a 53 00:03:08,190 --> 00:03:09,520 binary variable. 54 00:03:09,520 --> 00:03:13,110 For example, take a deck of cards. 55 00:03:13,110 --> 00:03:17,910 Here, the value could be, for example if you consider only 56 00:03:17,910 --> 00:03:20,420 one particular suit, then it can be any 57 00:03:20,420 --> 00:03:21,810 one of those 13 values. 
58 00:03:21,810 --> 00:03:24,350 59 00:03:24,350 --> 00:03:27,560 So there, this variable is not binary. 60 00:03:27,560 --> 00:03:30,380 However, the probability of a particular event happening is 61 00:03:30,380 --> 00:03:33,830 always between 0 and 1. 62 00:03:33,830 --> 00:03:37,690 Now, let's look at some probability, like what you 63 00:03:37,690 --> 00:03:42,470 asked earlier is whether they will be equal, whether the 64 00:03:42,470 --> 00:03:46,430 probability of head and tail can be equal. 65 00:03:46,430 --> 00:03:51,730 So let's represent the probability of A being H. This 66 00:03:51,730 --> 00:03:54,570 can be between 0 and 1. 67 00:03:54,570 --> 00:04:00,480 What is the probability of A not happening? 68 00:04:00,480 --> 00:04:01,810 So we call it A bar. 69 00:04:01,810 --> 00:04:06,000 70 00:04:06,000 --> 00:04:09,700 Given P of A, can you give me P of A bar? 71 00:04:09,700 --> 00:04:10,660 AUDIENCE: 1 minus P of A. 72 00:04:10,660 --> 00:04:16,070 PROFESSOR: 1 minus P of A. If there are two events 73 00:04:16,070 --> 00:04:22,190 happening, for example, you're throwing two coins, then we 74 00:04:22,190 --> 00:04:23,730 can consider their joint probabilities. 75 00:04:23,730 --> 00:04:26,520 76 00:04:26,520 --> 00:04:33,030 So let's say we have a coin, A, and this coin, B. So this 77 00:04:33,030 --> 00:04:34,940 coin can take two values. 78 00:04:34,940 --> 00:04:39,070 And so this coin can take another two values. 79 00:04:39,070 --> 00:04:40,320 Sorry. 80 00:04:40,320 --> 00:04:51,618 81 00:04:51,618 --> 00:04:57,410 We know A can take H with probability, say I assume it's 82 00:04:57,410 --> 00:04:59,380 unbiased, so it'll be 1/2. 83 00:04:59,380 --> 00:05:02,050 84 00:05:02,050 --> 00:05:03,330 All these are going to be 1/2. 85 00:05:03,330 --> 00:05:09,050 86 00:05:09,050 --> 00:05:11,415 What's the probability of HT? 
87 00:05:11,415 --> 00:05:14,110 88 00:05:14,110 --> 00:05:20,020 So now, we are considering a joint event, A is equal 89 00:05:20,020 --> 00:05:32,900 to H and B is equal to T. So in probability, we 90 00:05:32,900 --> 00:05:35,160 represent it by something like this. 91 00:05:35,160 --> 00:05:40,200 P A-- do you know what that is? 92 00:05:40,200 --> 00:05:45,450 P A intersection B, you want both events to happen. 93 00:05:45,450 --> 00:05:48,000 94 00:05:48,000 --> 00:05:57,350 That will be P of A times, in this case, P of B. So we 95 00:05:57,350 --> 00:06:00,160 could simply say it's 1/4. 96 00:06:00,160 --> 00:06:01,570 Why is this possible? 97 00:06:01,570 --> 00:06:04,200 98 00:06:04,200 --> 00:06:06,380 It's because these two events are independent. 99 00:06:06,380 --> 00:06:09,760 100 00:06:09,760 --> 00:06:14,100 The coin A getting head doesn't affect 101 00:06:14,100 --> 00:06:17,810 coin B getting a tail. 102 00:06:17,810 --> 00:06:20,790 So it doesn't have any influence. 103 00:06:20,790 --> 00:06:23,510 That's why these two events are independent. 104 00:06:23,510 --> 00:06:27,360 The dependent events are a bit complex to analyze. 105 00:06:27,360 --> 00:06:30,860 Let's skip them at the moment. 106 00:06:30,860 --> 00:06:33,190 So we know all these probabilities 107 00:06:33,190 --> 00:06:34,440 are going to be 1/4. 108 00:06:34,440 --> 00:06:36,770 109 00:06:36,770 --> 00:06:41,380 So we looked at a particular condition here. 110 00:06:41,380 --> 00:06:44,780 That is, A taking head and B taking tail. 111 00:06:44,780 --> 00:06:53,640 What about the condition, what about the case where either A 112 00:06:53,640 --> 00:06:57,290 or B takes a head? 113 00:06:57,290 --> 00:06:59,950 How can we represent that? 114 00:06:59,950 --> 00:07:05,560 So it will be something like A is equal to H or B is equal to 115 00:07:05,560 --> 00:07:13,600 H. 
Or, at least one head, so by that, I can also 116 00:07:13,600 --> 00:07:14,850 represent something like this. 117 00:07:14,850 --> 00:07:18,370 118 00:07:18,370 --> 00:07:20,100 OK, here, this is sufficient anyway. 119 00:07:20,100 --> 00:07:27,900 120 00:07:27,900 --> 00:07:29,400 So what are the possible events? 121 00:07:29,400 --> 00:07:54,000 122 00:07:54,000 --> 00:08:00,660 So these three events could give rise to this probability. 123 00:08:00,660 --> 00:08:02,840 It's better if you can represent this in a diagram. 124 00:08:02,840 --> 00:08:06,760 So let's go and represent this in a diagram. 125 00:08:06,760 --> 00:08:12,175 This is A and this is B getting, say, head. 126 00:08:12,175 --> 00:08:15,200 127 00:08:15,200 --> 00:08:18,250 In one case, both can take head. 128 00:08:18,250 --> 00:08:22,410 That is, this particular condition, intersection we 129 00:08:22,410 --> 00:08:23,660 earlier looked at. 130 00:08:23,660 --> 00:08:32,309 131 00:08:32,309 --> 00:08:35,789 So what is this whole thing? 132 00:08:35,789 --> 00:08:41,150 133 00:08:41,150 --> 00:08:46,010 That is, either A gets H or B gets H, 134 00:08:46,010 --> 00:08:47,720 which is this condition. 135 00:08:47,720 --> 00:08:51,200 136 00:08:51,200 --> 00:08:54,450 We call it P of A union B. OK. 137 00:08:54,450 --> 00:09:02,360 138 00:09:02,360 --> 00:09:07,720 Is there an efficient way of finding this rather than 139 00:09:07,720 --> 00:09:09,450 writing down all possible cases? 140 00:09:09,450 --> 00:09:11,950 141 00:09:11,950 --> 00:09:17,760 Is there an efficient way of finding P of A union B? 142 00:09:17,760 --> 00:09:19,916 From high school maths, probably? 143 00:09:19,916 --> 00:09:21,730 No idea? 144 00:09:21,730 --> 00:09:22,670 OK. 145 00:09:22,670 --> 00:09:29,350 P of A union B is equal to P of A plus P of B minus P of A 146 00:09:29,350 --> 00:09:32,530 intersection B. 
Because if you consider P of A, you would 147 00:09:32,530 --> 00:09:34,800 have taken this full circle. 148 00:09:34,800 --> 00:09:35,940 When you take P of B, you would have 149 00:09:35,940 --> 00:09:38,010 taken this full circle. 150 00:09:38,010 --> 00:09:41,490 So which means you're counting this area twice. 151 00:09:41,490 --> 00:09:42,740 So here, we deduct it once. 152 00:09:42,740 --> 00:09:47,130 153 00:09:47,130 --> 00:09:48,283 OK? 154 00:09:48,283 --> 00:09:49,533 Great. 155 00:09:49,533 --> 00:09:51,600 156 00:09:51,600 --> 00:09:54,720 So this is the basics of the probability. 157 00:09:54,720 --> 00:10:00,900 Now, actually we looked at two events, two joint events here. 158 00:10:00,900 --> 00:10:04,040 But we should have a formal way of 159 00:10:04,040 --> 00:10:07,350 looking at multiple events. 160 00:10:07,350 --> 00:10:09,720 So how can we do that? 161 00:10:09,720 --> 00:10:11,640 The first way is doing it by trees. 162 00:10:11,640 --> 00:10:16,020 163 00:10:16,020 --> 00:10:20,120 Let's say we represent the outcome of the 164 00:10:20,120 --> 00:10:22,540 first trial by a branch. 165 00:10:22,540 --> 00:10:28,240 166 00:10:28,240 --> 00:10:32,240 We can represent the outcome of the second trial by another 167 00:10:32,240 --> 00:10:37,160 branch from these two previous branches. 168 00:10:37,160 --> 00:10:38,710 So this would be H HH HT TH TT. 169 00:10:38,710 --> 00:10:51,840 170 00:10:51,840 --> 00:10:56,370 And we know this could happen with probability 1/2. 171 00:10:56,370 --> 00:11:00,430 So we know it's, again, 1/2, 1/2, 1/2, 1/2. 172 00:11:00,430 --> 00:11:01,680 So this is 1/4. 173 00:11:01,680 --> 00:11:11,440 174 00:11:11,440 --> 00:11:18,955 Suppose we want to do this for an outcome of throwing dice. 175 00:11:18,955 --> 00:11:22,060 176 00:11:22,060 --> 00:11:25,540 Then, probably we would have 6 branches here. 177 00:11:25,540 --> 00:11:28,330 178 00:11:28,330 --> 00:11:33,270 Which, again, forks into another 36 branches. 
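The addition rule stated above for the two-coin example can be checked by listing the four outcomes of the tree. The following is only a minimal sketch, assuming two fair coins; the helper name prob is hypothetical and not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coins A and B: ("H","H"), ("H","T"), ("T","H"), ("T","T").
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Favourable outcomes over all outcomes (hypothetical helper; fair coins assumed)."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

p_a = prob(lambda o: o[0] == "H")                       # coin A shows a head
p_b = prob(lambda o: o[1] == "H")                       # coin B shows a head
p_both = prob(lambda o: o[0] == "H" and o[1] == "H")    # A intersection B
p_either = prob(lambda o: o[0] == "H" or o[1] == "H")   # A union B

# Addition rule: P(A union B) = P(A) + P(B) - P(A intersection B)
print(p_either, p_a + p_b - p_both)
```

Both printed values agree, and the product p_a * p_b equals p_both here only because the two coins are independent.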
179 00:11:33,270 --> 00:11:35,730 So there should be another easier way. 180 00:11:35,730 --> 00:11:38,730 For that, we could use a second method called a grid. 181 00:11:38,730 --> 00:11:42,410 182 00:11:42,410 --> 00:11:44,130 We could simply put that in a diagram. 183 00:11:44,130 --> 00:11:52,470 184 00:11:52,470 --> 00:11:54,310 So this is the first trial. 185 00:11:54,310 --> 00:11:57,040 186 00:11:57,040 --> 00:11:59,165 And this will be our second trial. 187 00:11:59,165 --> 00:12:11,580 188 00:12:11,580 --> 00:12:17,710 So now, we can represent any possible outcome on this grid. 189 00:12:17,710 --> 00:12:22,090 For example, can you give me an example where you throw the 190 00:12:22,090 --> 00:12:27,430 same number in both the trials? 191 00:12:27,430 --> 00:12:30,710 Then, what would be the layout of it in this grid? 192 00:12:30,710 --> 00:12:34,150 193 00:12:34,150 --> 00:12:37,790 Throwing the same number in both the trials. 194 00:12:37,790 --> 00:12:39,025 Here's the first trial. 195 00:12:39,025 --> 00:12:40,976 This, the second. 196 00:12:40,976 --> 00:12:42,720 Then it would be the diagonal. 197 00:12:42,720 --> 00:12:48,000 198 00:12:48,000 --> 00:12:51,940 If you want to calculate the probability, do you know the 199 00:12:51,940 --> 00:13:01,580 probability is the ratio of the outcomes we expect 200 00:13:01,580 --> 00:13:04,170 to all possible outcomes? 201 00:13:04,170 --> 00:13:09,170 So here, we know there will be 6 instances in this 202 00:13:09,170 --> 00:13:11,060 highlighted area. 203 00:13:11,060 --> 00:13:15,840 Compare that to all 36 possibilities. 204 00:13:15,840 --> 00:13:17,266 So it'll be simply 6/36. 205 00:13:17,266 --> 00:13:22,620 206 00:13:22,620 --> 00:13:26,840 How can you find the probability of getting a 207 00:13:26,840 --> 00:13:28,670 cumulative total of, say, 6? 208 00:13:28,670 --> 00:13:33,880 209 00:13:33,880 --> 00:13:36,480 Then again, it would be very simple. 
210 00:13:36,480 --> 00:13:43,930 It could be 1, 5; 2, 4; 3, 3; 4, 2; 5, 1. 211 00:13:43,930 --> 00:13:46,000 All right? 212 00:13:46,000 --> 00:13:49,360 So it'll be 5 by 36. 213 00:13:49,360 --> 00:13:52,550 214 00:13:52,550 --> 00:13:53,480 OK? 215 00:13:53,480 --> 00:13:56,550 So either by using trees or grids, you can easily find the 216 00:13:56,550 --> 00:13:57,800 probabilities. 217 00:13:57,800 --> 00:14:04,060 218 00:14:04,060 --> 00:14:06,806 Now, let's look at a few concrete examples. 219 00:14:06,806 --> 00:14:22,440 220 00:14:22,440 --> 00:14:24,030 Let's see. 221 00:14:24,030 --> 00:14:27,710 Suppose we are throwing three coins. 222 00:14:27,710 --> 00:14:33,610 Then, what is the probability of one particular outcome in 223 00:14:33,610 --> 00:14:36,590 that trial, in all three trials? 224 00:14:36,590 --> 00:14:38,980 What is the probability, assuming that these are 225 00:14:38,980 --> 00:14:41,020 unbiased coins? 226 00:14:41,020 --> 00:14:44,170 What is the probability of one particular outcome? 227 00:14:44,170 --> 00:14:46,915 Because how many possible outcomes are there if you are 228 00:14:46,915 --> 00:14:48,165 throwing three coins? 229 00:14:48,165 --> 00:14:50,440 230 00:14:50,440 --> 00:14:53,410 Consider this tree. 231 00:14:53,410 --> 00:14:54,760 First, it splits into 2. 232 00:14:54,760 --> 00:14:56,040 Then, it splits into 4. 233 00:14:56,040 --> 00:14:57,940 Then? 234 00:14:57,940 --> 00:15:00,860 8, all right? 235 00:15:00,860 --> 00:15:04,990 OK, so there are 8 possible outcomes. 236 00:15:04,990 --> 00:15:08,110 So each outcome will have the probability 1/8. 237 00:15:08,110 --> 00:15:12,230 238 00:15:12,230 --> 00:15:16,310 So what is the probability of heads appearing exactly twice? 239 00:15:16,310 --> 00:15:19,370 240 00:15:19,370 --> 00:15:21,760 How can you do that? 241 00:15:21,760 --> 00:15:24,620 Of course, you can write the tree and count. 242 00:15:24,620 --> 00:15:26,480 What is the easier way of doing that? 
243 00:15:26,480 --> 00:15:30,450 Since we know this count, since we know this probability 244 00:15:30,450 --> 00:15:32,310 of a particular event happening? 245 00:15:32,310 --> 00:15:34,590 How can we come up with the probability of 246 00:15:34,590 --> 00:15:36,510 getting exactly 2 heads? 247 00:15:36,510 --> 00:15:41,960 248 00:15:41,960 --> 00:15:44,880 It could be head, head, or tail-- so this is by 249 00:15:44,880 --> 00:15:47,790 enumerating all the possible outcomes. 250 00:15:47,790 --> 00:15:51,700 So it could have been head, head, tail, where we put the 251 00:15:51,700 --> 00:15:54,080 tail only at the end. 252 00:15:54,080 --> 00:15:56,545 It could have been head, tail, head. 253 00:15:56,545 --> 00:16:01,100 Or it could have been tail, head, head. 254 00:16:01,100 --> 00:16:06,670 In these three cases, you're getting exactly 2 heads. 255 00:16:06,670 --> 00:16:10,150 So we are enumerating all possible outcomes. 256 00:16:10,150 --> 00:16:12,560 And we know each possible outcome will take the 257 00:16:12,560 --> 00:16:14,400 probability 1/8. 258 00:16:14,400 --> 00:16:18,360 So the total probability here is 3/8. 259 00:16:18,360 --> 00:16:18,950 OK? 260 00:16:18,950 --> 00:16:21,540 So this is one way of handling a probability question. 261 00:16:21,540 --> 00:16:25,600 262 00:16:25,600 --> 00:16:28,670 You can do that only because these are independent events. 263 00:16:28,670 --> 00:16:31,240 And you can sum them. 264 00:16:31,240 --> 00:16:32,490 We'll come to that later. 265 00:16:32,490 --> 00:16:43,070 266 00:16:43,070 --> 00:16:47,500 Suppose you are rolling two four-sided dice. 267 00:16:47,500 --> 00:16:52,000 And assuming they're fair, how many possible 268 00:16:52,000 --> 00:16:53,900 outcomes are there? 269 00:16:53,900 --> 00:16:59,585 Two four-sided dice, and assuming that each of them are 270 00:16:59,585 --> 00:17:02,890 fair-- that means unbiased-- 271 00:17:02,890 --> 00:17:05,740 how many possible outcomes are there? 
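The three-coin count worked above (8 equally likely outcomes, 3 of them with exactly two heads) can be reproduced by enumeration. This is a small sketch assuming fair coins; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Every leaf of the three-level tree: 2 * 2 * 2 = 8 outcomes such as ("H", "T", "H").
outcomes = list(product("HT", repeat=3))

# Keep the outcomes in which heads appears exactly twice.
two_heads = [o for o in outcomes if o.count("H") == 2]

# Each outcome has probability 1/8, so the answer is (number of matches) / 8.
p_two_heads = Fraction(len(two_heads), len(outcomes))
print(two_heads, p_two_heads)
```

The three matching outcomes are disjoint (only one of them can occur in a given throw), which is why their probabilities of 1/8 can simply be added.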
272 00:17:05,740 --> 00:17:08,839 Consider this tree. 273 00:17:08,839 --> 00:17:13,040 First, it branches into 4, OK? 274 00:17:13,040 --> 00:17:15,849 In the first trial, it's a four-sided dice, so there are 275 00:17:15,849 --> 00:17:17,710 4 possible outcomes. 276 00:17:17,710 --> 00:17:18,960 So it branches into 4. 277 00:17:18,960 --> 00:17:22,770 278 00:17:22,770 --> 00:17:25,329 Then, each branch will, in turn, fork 279 00:17:25,329 --> 00:17:27,280 into another 4 branches. 280 00:17:27,280 --> 00:17:31,100 So there are totally 16 outcomes. 281 00:17:31,100 --> 00:17:35,540 So what is the probability of rolling a 2 and a 3? 282 00:17:35,540 --> 00:17:39,900 What is the probability of rolling a 2 and a 3? 283 00:17:39,900 --> 00:17:44,950 Not in a given order, not in the given order. 284 00:17:44,950 --> 00:17:46,770 Can anyone give the answer? 285 00:17:46,770 --> 00:17:49,870 286 00:17:49,870 --> 00:17:51,130 OK, let's see. 287 00:17:51,130 --> 00:17:54,510 So we have to roll a 2 and a 3. 288 00:17:54,510 --> 00:17:57,250 So which means it could have been 2, 3, or 3, 2. 289 00:17:57,250 --> 00:18:00,350 290 00:18:00,350 --> 00:18:05,730 And we know the probability of each event is 1/16. 291 00:18:05,730 --> 00:18:07,750 So this will be 1/16. 292 00:18:07,750 --> 00:18:12,230 And this will be 1/16. 293 00:18:12,230 --> 00:18:14,810 So the total probability is 1/8. 294 00:18:14,810 --> 00:18:17,560 295 00:18:17,560 --> 00:18:23,820 What is the probability of getting the sum of the rolls 296 00:18:23,820 --> 00:18:25,960 an odd number? 297 00:18:25,960 --> 00:18:28,090 What is the probability of getting an odd number as sum 298 00:18:28,090 --> 00:18:30,110 of the rolls? 299 00:18:30,110 --> 00:18:33,790 Now, this is getting a bit tricky because now it's maybe 300 00:18:33,790 --> 00:18:37,880 a bit harder to enumerate all possible cases. 301 00:18:37,880 --> 00:18:39,130 So how can we do that? 
302 00:18:39,130 --> 00:18:46,250 303 00:18:46,250 --> 00:18:47,190 There should be a shortcut. 304 00:18:47,190 --> 00:18:48,681 AUDIENCE: It can either be odd or even. 305 00:18:48,681 --> 00:18:49,540 PROFESSOR: Sorry? 306 00:18:49,540 --> 00:18:51,640 AUDIENCE: You can either get odd or even. 307 00:18:51,640 --> 00:18:53,620 PROFESSOR: It can be either odd or even, right? 308 00:18:53,620 --> 00:18:55,230 So it will be 1/2. 309 00:18:55,230 --> 00:18:59,000 OK, there's another trick we might be able to use to get 310 00:18:59,000 --> 00:19:00,250 the answers quickly. 311 00:19:00,250 --> 00:19:02,640 312 00:19:02,640 --> 00:19:07,060 What is the probability of the first roll being equal to the 313 00:19:07,060 --> 00:19:08,310 second roll? 314 00:19:08,310 --> 00:19:13,200 315 00:19:13,200 --> 00:19:16,240 You can think along the same lines. 316 00:19:16,240 --> 00:19:19,840 What is the probability of getting the first roll equal 317 00:19:19,840 --> 00:19:21,320 to the second roll? 318 00:19:21,320 --> 00:19:22,580 It's quite similar to this. 319 00:19:22,580 --> 00:19:25,870 320 00:19:25,870 --> 00:19:27,120 Any ideas? 321 00:19:27,120 --> 00:19:30,550 322 00:19:30,550 --> 00:19:31,840 It's a four-sided dice. 323 00:19:31,840 --> 00:19:34,380 There are 4 possible outcomes. 324 00:19:34,380 --> 00:19:37,510 This is one case where it could be 1, 1, or it could be 325 00:19:37,510 --> 00:19:39,900 2, 2, or 3, 3, or 4, 4. 326 00:19:39,900 --> 00:19:45,320 And if it's an n-sided dice, it would be n, right? 327 00:19:45,320 --> 00:19:50,690 So if it's n-sided dice, there are n possible outcomes 328 00:19:50,690 --> 00:19:56,550 desired, and totally n by n outcomes. 329 00:19:56,550 --> 00:19:58,960 So you get 1/n probability. 330 00:19:58,960 --> 00:20:03,240 331 00:20:03,240 --> 00:20:08,340 What is the probability of at least 1 roll equal to 4? 332 00:20:08,340 --> 00:20:10,340 At least 1 roll equal to 4? 
333 00:20:10,340 --> 00:20:14,490 334 00:20:14,490 --> 00:20:15,750 This is very interesting. 335 00:20:15,750 --> 00:20:17,040 These types of questions, you'll get in 336 00:20:17,040 --> 00:20:19,770 the Psets, I know. 337 00:20:19,770 --> 00:20:21,890 Probably in the quiz, too. 338 00:20:21,890 --> 00:20:25,060 What is the probability of getting at least 1 339 00:20:25,060 --> 00:20:26,310 roll equal to 4? 340 00:20:26,310 --> 00:20:28,690 341 00:20:28,690 --> 00:20:30,780 OK, so what are the possible outcomes? 342 00:20:30,780 --> 00:20:35,330 First roll, could be a 4. 343 00:20:35,330 --> 00:20:39,300 And the second roll could be anything. 344 00:20:39,300 --> 00:20:42,565 345 00:20:42,565 --> 00:20:44,480 Or the second roll could be 4, and the first roll 346 00:20:44,480 --> 00:20:46,650 could have been anything. 347 00:20:46,650 --> 00:20:49,740 Or both could have been 4, but we would have considered that 348 00:20:49,740 --> 00:20:50,990 here, as well. 349 00:20:50,990 --> 00:20:56,640 350 00:20:56,640 --> 00:20:59,340 So what we had to do is we had to calculate this probability 351 00:20:59,340 --> 00:21:02,030 and this probability, add them, and deduct this, because 352 00:21:02,030 --> 00:21:04,760 this would have been double counted. 353 00:21:04,760 --> 00:21:08,230 It's quite like this intersection. 354 00:21:08,230 --> 00:21:12,490 We want to remove that, and we want to find the union, OK? 355 00:21:12,490 --> 00:21:15,560 So what is this probability? 356 00:21:15,560 --> 00:21:18,455 Since we don't care about the second roll, we have to care 357 00:21:18,455 --> 00:21:21,300 only about the first roll, our first roll 358 00:21:21,300 --> 00:21:24,590 getting 4, which is 1/4. 359 00:21:24,590 --> 00:21:28,450 And this is 1/4 similarly. 360 00:21:28,450 --> 00:21:32,770 And this is 1/4 by 1/4, so 1/16. 361 00:21:32,770 --> 00:21:35,280 So it'll be 1/2 minus 1/16. 
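The two-dice answers worked so far (a 2 and a 3 in any order, an odd sum, equal rolls, at least one 4) can all be read off the grid of ordered pairs. Below is a minimal sketch assuming fair dice; the helper names two_rolls and prob are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from itertools import product

def two_rolls(sides):
    """All ordered (first, second) pairs for two fair dice with the given number of sides."""
    return list(product(range(1, sides + 1), repeat=2))

def prob(sides, event):
    """Count the grid cells where the event holds and divide by sides * sides."""
    rolls = two_rolls(sides)
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

# Two four-sided dice: 16 cells in the grid.
p_two_and_three = prob(4, lambda r: sorted(r) == [2, 3])    # (2,3) or (3,2)
p_odd_sum = prob(4, lambda r: (r[0] + r[1]) % 2 == 1)
p_equal = prob(4, lambda r: r[0] == r[1])                    # the diagonal, 1/n
p_at_least_one_four = prob(4, lambda r: 4 in r)

# Two six-sided dice, as in the earlier grid example: 36 cells.
p_equal_six_sided = prob(6, lambda r: r[0] == r[1])
p_total_six = prob(6, lambda r: r[0] + r[1] == 6)

print(p_two_and_three, p_odd_sum, p_equal, p_at_least_one_four)
```

The value of p_at_least_one_four matches 1/4 + 1/4 - 1/16, the union computed by adding the two single-roll cases and removing the double-counted cell where both rolls are 4.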
362 00:21:35,280 --> 00:21:37,890 363 00:21:37,890 --> 00:21:41,460 And when you give the answers, if it's hard, you can just 364 00:21:41,460 --> 00:21:43,050 leave it like this. 365 00:21:43,050 --> 00:21:46,190 So this is what we call giving the answers as formulas instead 366 00:21:46,190 --> 00:21:47,990 of giving exact fractions. 367 00:21:47,990 --> 00:21:50,280 Because sometimes it might be hard to find the fraction. 368 00:21:50,280 --> 00:21:53,500 Suppose it's something like 1 over, say, 2 to the power 5 369 00:21:53,500 --> 00:21:55,370 and a 3 to the 2, something like this. 370 00:21:55,370 --> 00:21:56,800 Or we'll say 5. 371 00:21:56,800 --> 00:22:00,095 You're not required to give the exact value of this amount 372 00:22:00,095 --> 00:22:01,100 or even the fractions. 373 00:22:01,100 --> 00:22:03,180 You can give such formulas. 374 00:22:03,180 --> 00:22:07,205 You can give something like this, too, to give the inverse 375 00:22:07,205 --> 00:22:10,930 probability of that not happening. 376 00:22:10,930 --> 00:22:11,400 Let's see. 377 00:22:11,400 --> 00:22:16,310 Let's move into a little bit more complicated example. 378 00:22:16,310 --> 00:22:18,710 A pack of cards-- 379 00:22:18,710 --> 00:22:21,750 what is the probability of getting an ace? 380 00:22:21,750 --> 00:22:23,000 Anyone? 381 00:22:23,000 --> 00:22:25,442 382 00:22:25,442 --> 00:22:26,354 AUDIENCE: 1 out of 2? 383 00:22:26,354 --> 00:22:28,180 PROFESSOR: 1 out of 2? 384 00:22:28,180 --> 00:22:30,880 AUDIENCE: out of 52. 385 00:22:30,880 --> 00:22:32,920 PROFESSOR: Not a particular-- 386 00:22:32,920 --> 00:22:37,055 an ace, yes, just ace. 387 00:22:37,055 --> 00:22:38,438 AUDIENCE: Is it 4 out of 52? 388 00:22:38,438 --> 00:22:42,030 PROFESSOR: 4/52, yes. 389 00:22:42,030 --> 00:22:44,400 Or if you consider one suit, it would have 390 00:22:44,400 --> 00:22:46,100 been like 1/13, right? 
391 00:22:46,100 --> 00:22:48,480 You could have considered one suit, and out of-- 392 00:22:48,480 --> 00:22:50,690 OK. 393 00:22:50,690 --> 00:22:52,612 It's the same analysis, right? 394 00:22:52,612 --> 00:22:54,220 OK. 395 00:22:54,220 --> 00:22:57,630 What is the probability of getting a specific card, which 396 00:22:57,630 --> 00:22:59,795 means, say, the ace of hearts? 397 00:22:59,795 --> 00:23:04,690 398 00:23:04,690 --> 00:23:08,560 It's what she said, yeah, 1/52. 399 00:23:08,560 --> 00:23:10,990 What is the probability of not getting an ace? 400 00:23:10,990 --> 00:23:14,190 401 00:23:14,190 --> 00:23:15,170 AUDIENCE: [INAUDIBLE]? 402 00:23:15,170 --> 00:23:16,750 PROFESSOR: Sorry? 403 00:23:16,750 --> 00:23:18,220 AUDIENCE: 1 minus-- 404 00:23:18,220 --> 00:23:19,470 PROFESSOR: 1/13. 405 00:23:19,470 --> 00:23:22,060 406 00:23:22,060 --> 00:23:25,950 OK, this is where we make you solve the inverse probability. 407 00:23:25,950 --> 00:23:29,480 OK, so that will come into play very often. 408 00:23:29,480 --> 00:23:33,980 OK, now let's get into two decks of playing cards. 409 00:23:33,980 --> 00:23:39,160 OK, what is the size of the sample space? 410 00:23:39,160 --> 00:23:42,930 What is the size of the sample space of drawing cards from 411 00:23:42,930 --> 00:23:44,630 two decks of cards? 412 00:23:44,630 --> 00:23:45,420 Two cards, actually. 413 00:23:45,420 --> 00:23:48,110 You're going to draw two cards from two different decks. 414 00:23:48,110 --> 00:23:51,930 415 00:23:51,930 --> 00:23:53,530 Sorry? 416 00:23:53,530 --> 00:23:54,470 OK. 417 00:23:54,470 --> 00:23:59,850 What is the size of the sample space of drawing a card from one deck? 418 00:23:59,850 --> 00:24:03,530 There are 52 possible outcomes. 419 00:24:03,530 --> 00:24:07,890 So for each outcome here, we have 52 outcomes there, right? 420 00:24:07,890 --> 00:24:09,500 So it's 52 by 52. 421 00:24:09,500 --> 00:24:11,810 It's like the tree, but here, we have 52 branches. 
422 00:24:11,810 --> 00:24:15,060 423 00:24:15,060 --> 00:24:17,830 So eventually, you will have 52 by 52. 424 00:24:17,830 --> 00:24:19,220 This is where you can't enumerate all 425 00:24:19,220 --> 00:24:20,650 the possible cases. 426 00:24:20,650 --> 00:24:24,420 So you should have a way to find the final 427 00:24:24,420 --> 00:24:26,317 probability, OK? 428 00:24:26,317 --> 00:24:29,810 429 00:24:29,810 --> 00:24:33,440 So in this case, what is the probability of getting at 430 00:24:33,440 --> 00:24:34,775 least one ace? 431 00:24:34,775 --> 00:24:37,820 432 00:24:37,820 --> 00:24:42,280 What's the probability of getting at least one ace? 433 00:24:42,280 --> 00:24:46,150 This is, again, similar to this case. 434 00:24:46,150 --> 00:24:47,000 Remember this diagram. 435 00:24:47,000 --> 00:24:48,250 It's called Venn diagram. 436 00:24:48,250 --> 00:24:53,250 437 00:24:53,250 --> 00:24:54,720 Remember this. 438 00:24:54,720 --> 00:24:58,260 So what is the probability of getting at least one ace, 439 00:24:58,260 --> 00:25:01,170 which means you could have got the ace from the first deck, 440 00:25:01,170 --> 00:25:03,940 or the second deck, or both. 441 00:25:03,940 --> 00:25:05,950 But if you're getting from both, you have to deduct it 442 00:25:05,950 --> 00:25:11,510 because otherwise, you would have double counted it. 443 00:25:11,510 --> 00:25:16,220 So getting an ace from the first deck is 1/13. 444 00:25:16,220 --> 00:25:18,130 Second deck, 1/13. 445 00:25:18,130 --> 00:25:22,240 Getting from both is 1/52 by 52. 446 00:25:22,240 --> 00:25:27,460 Sorry, 1/13 by 1/13. 447 00:25:27,460 --> 00:25:39,900 448 00:25:39,900 --> 00:25:40,375 Sorry. 449 00:25:40,375 --> 00:25:41,784 AUDIENCE: Are you adding them? 450 00:25:41,784 --> 00:25:46,010 PROFESSOR: Yeah, that's what I explained earlier. 451 00:25:46,010 --> 00:25:47,310 You're doing two trials. 452 00:25:47,310 --> 00:25:50,400 453 00:25:50,400 --> 00:25:52,290 You could have got the ace from here. 
454 00:25:52,290 --> 00:25:54,310 And this could have been anything. 455 00:25:54,310 --> 00:25:56,270 You could have got the ace from here, and this could have 456 00:25:56,270 --> 00:25:57,150 been anything. 457 00:25:57,150 --> 00:26:00,400 You could have got an ace from both. 458 00:26:00,400 --> 00:26:03,150 So you should add these two probabilities because we need 459 00:26:03,150 --> 00:26:07,590 a case where at least one card is ace. 460 00:26:07,590 --> 00:26:10,775 But the problem is, this could have happened here and here. 461 00:26:10,775 --> 00:26:12,025 And so you will deduct it. 462 00:26:12,025 --> 00:26:16,330 463 00:26:16,330 --> 00:26:20,690 What is the probability of getting neither card-- 464 00:26:20,690 --> 00:26:22,690 what is the probability of neither card being an ace? 465 00:26:22,690 --> 00:26:26,394 466 00:26:26,394 --> 00:26:27,320 AUDIENCE: 1 minus that? 467 00:26:27,320 --> 00:26:31,890 PROFESSOR: 1 minus this, exactly. 468 00:26:31,890 --> 00:26:33,040 OK, you're getting comfortable with the 469 00:26:33,040 --> 00:26:35,550 inverse probability now. 470 00:26:35,550 --> 00:26:42,320 What's the probability of two cards from the same suit? 471 00:26:42,320 --> 00:26:44,060 What is the probability of getting two cards 472 00:26:44,060 --> 00:26:45,310 from the same suit? 473 00:26:45,310 --> 00:26:50,290 474 00:26:50,290 --> 00:26:52,810 Now, it's getting interesting. 475 00:26:52,810 --> 00:26:55,390 Two cards from the same suit. 476 00:26:55,390 --> 00:26:58,600 So how can we think about this? 477 00:26:58,600 --> 00:27:02,910 Of course, you can enumerate all possible cases and count. 478 00:27:02,910 --> 00:27:04,160 We don't want to do that. 479 00:27:04,160 --> 00:27:08,970 480 00:27:08,970 --> 00:27:13,315 OK, you're going to use the grid here to visualize this. 481 00:27:13,315 --> 00:27:18,100 482 00:27:18,100 --> 00:27:19,110 OK? 
483 00:27:19,110 --> 00:27:21,240 It could have been a spade, or heart, 484 00:27:21,240 --> 00:27:22,890 or club, or a diamond. 485 00:27:22,890 --> 00:27:29,270 486 00:27:29,270 --> 00:27:32,270 So we want two cards of the same suit, right? 487 00:27:32,270 --> 00:27:38,440 488 00:27:38,440 --> 00:27:42,280 So it's 4/16 possible outcomes. 489 00:27:42,280 --> 00:27:45,480 490 00:27:45,480 --> 00:27:47,310 Do you see that? 491 00:27:47,310 --> 00:27:50,270 So see, we are using all the tools 492 00:27:50,270 --> 00:27:51,650 available at our disposal-- 493 00:27:51,650 --> 00:27:58,340 trees, grids, counting, Venn diagrams, inverse probability. 494 00:27:58,340 --> 00:28:01,000 Yeah, you should be able to do that to get the answers 495 00:28:01,000 --> 00:28:04,270 quickly because you could have actually done-- you could have 496 00:28:04,270 --> 00:28:06,130 done something like this, too. 497 00:28:06,130 --> 00:28:08,180 But it will take more time, right? 498 00:28:08,180 --> 00:28:14,240 So this will be a simpler way of visualizing things. 499 00:28:14,240 --> 00:28:18,170 What is the probability of getting neither card a diamond 500 00:28:18,170 --> 00:28:19,420 nor a club? 501 00:28:19,420 --> 00:28:25,300 502 00:28:25,300 --> 00:28:27,615 Neither card is diamond nor club. 503 00:28:27,615 --> 00:28:28,865 That is tricky. 504 00:28:28,865 --> 00:28:31,000 505 00:28:31,000 --> 00:28:36,080 But since we have this grid, we can easily visualize that. 506 00:28:36,080 --> 00:28:39,360 507 00:28:39,360 --> 00:28:43,590 So if neither card is diamond nor club, then it could have 508 00:28:43,590 --> 00:28:45,130 been only these two values, right? 509 00:28:45,130 --> 00:28:48,740 510 00:28:48,740 --> 00:28:52,530 Which is, again, 4/16. 511 00:28:52,530 --> 00:28:54,200 So there are 4 possible cases. 512 00:28:54,200 --> 00:28:57,490 513 00:28:57,490 --> 00:28:58,740 OK? 514 00:28:58,740 --> 00:29:06,930 515 00:29:06,930 --> 00:29:09,860 So what is the summary? 
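The two-deck answers above can be confirmed by brute force over all 52 by 52 ordered pairs, which is exactly the enumeration the lecture says is too tedious by hand. This is a sketch only, assuming one card is drawn uniformly at random from each of two standard 52-card decks; the rank and suit labels and the helper prob are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]   # 13 ranks
suits = ["spades", "hearts", "clubs", "diamonds"]                  # 4 suits
deck = list(product(ranks, suits))                                 # 52 (rank, suit) cards
pairs = list(product(deck, repeat=2))                              # one card from each deck

def prob(event):
    """Fraction of the 52 * 52 pairs for which event(first_card, second_card) holds."""
    return Fraction(sum(1 for a, b in pairs if event(a, b)), len(pairs))

p_at_least_one_ace = prob(lambda a, b: a[0] == "A" or b[0] == "A")
p_no_ace = prob(lambda a, b: a[0] != "A" and b[0] != "A")
p_same_suit = prob(lambda a, b: a[1] == b[1])
red_or_black_only = {"spades", "hearts"}   # i.e. neither a diamond nor a club
p_no_diamond_no_club = prob(lambda a, b: a[1] in red_or_black_only and b[1] in red_or_black_only)

print(p_at_least_one_ace, p_no_ace, p_same_suit, p_no_diamond_no_club)
```

Here p_at_least_one_ace agrees with 1/13 + 1/13 - 1/13 * 1/13, p_no_ace is its complement, and the two suit questions both come out to 4/16, matching the 4-by-4 suit grid.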
516 00:29:09,860 --> 00:29:11,870 What is the take home message here? 517 00:29:11,870 --> 00:29:18,680 518 00:29:18,680 --> 00:29:22,940 Probability is the belief, or 519 00:29:22,940 --> 00:29:26,635 the way of expressing the belief, of a 520 00:29:26,635 --> 00:29:29,320 particular event happening. 521 00:29:29,320 --> 00:29:32,990 Now, there could be several possible outcomes. 522 00:29:32,990 --> 00:29:35,390 Out of those possible outcomes, you have a certain 523 00:29:35,390 --> 00:29:37,400 number of desired outcomes. 524 00:29:37,400 --> 00:29:39,900 How can you find that? 525 00:29:39,900 --> 00:29:41,600 You can either enumerate all of them. 526 00:29:41,600 --> 00:29:44,610 You can put them in a tree, or you can put them in a grid. 527 00:29:44,610 --> 00:29:48,430 Or you can use some sort of Venn diagram and come up with 528 00:29:48,430 --> 00:29:50,470 some sort of analysis. 529 00:29:50,470 --> 00:29:57,310 Here, we start with our belief that the coin is unbiased, or 530 00:29:57,310 --> 00:29:59,350 we have a fair chance of drawing any card 531 00:29:59,350 --> 00:30:00,820 from the deck of cards. 532 00:30:00,820 --> 00:30:06,650 So we have all these unbiased beliefs, or beliefs about the 533 00:30:06,650 --> 00:30:09,680 characteristics of each trial. 534 00:30:09,680 --> 00:30:11,140 So we start from that. 535 00:30:11,140 --> 00:30:13,690 536 00:30:13,690 --> 00:30:19,440 Then, we find the probability of a particular event 537 00:30:19,440 --> 00:30:22,540 happening in a certain number of trials. 538 00:30:22,540 --> 00:30:30,230 But what if you don't have the knowledge about the coin? 539 00:30:30,230 --> 00:30:32,250 What if you don't know whether it's fair or not? 540 00:30:32,250 --> 00:30:37,920 What if you don't know that P of A equal to H is 1/2? 541 00:30:37,920 --> 00:30:38,910 Suppose you don't know that. 542 00:30:38,910 --> 00:30:42,380 Suppose it's P. How can you find it?
543 00:30:42,380 --> 00:30:47,250 544 00:30:47,250 --> 00:30:50,800 What you could do is you could simulate this. 545 00:30:50,800 --> 00:30:55,040 You can throw a coin several times and count the total 546 00:30:55,040 --> 00:30:58,910 number of heads you get, OK? 547 00:30:58,910 --> 00:31:04,140 So the number of heads over the number of trials will 548 00:31:04,140 --> 00:31:05,550 give you the P, right? 549 00:31:05,550 --> 00:31:10,860 550 00:31:10,860 --> 00:31:14,950 This is a way of finding the probabilities through a 551 00:31:14,950 --> 00:31:16,290 certain number of trials. 552 00:31:16,290 --> 00:31:20,150 It's like simulating the experiments. 553 00:31:20,150 --> 00:31:21,515 It's called Monte Carlo simulation. 554 00:31:21,515 --> 00:31:24,400 555 00:31:24,400 --> 00:31:27,950 And using that, we try to find a particular 556 00:31:27,950 --> 00:31:29,986 parameter of the model. 557 00:31:29,986 --> 00:31:33,750 You know how they actually found the value of pi at the 558 00:31:33,750 --> 00:31:35,240 beginning, pi? 559 00:31:35,240 --> 00:31:38,210 560 00:31:38,210 --> 00:31:40,290 It's again using a Monte Carlo simulation. 561 00:31:40,290 --> 00:31:49,980 What you could do is for a given radius, you can actually 562 00:31:49,980 --> 00:31:51,920 check whether it lies within a circle or not. 563 00:31:51,920 --> 00:31:53,680 You can run the Monte Carlo simulation. 564 00:31:53,680 --> 00:31:59,070 And given this radius, you can come up with a particular 565 00:31:59,070 --> 00:32:04,520 location at random and check whether it's within this 566 00:32:04,520 --> 00:32:07,700 boundary or not, OK? 567 00:32:07,700 --> 00:32:10,080 So then, you know the outcome. 568 00:32:10,080 --> 00:32:11,170 You know the outcomes, right? 569 00:32:11,170 --> 00:32:22,280 So suppose this is n_a, and the total outcome is n_t. 570 00:32:22,280 --> 00:32:24,250 This gives you the area, right? 571 00:32:24,250 --> 00:32:29,212 We know this is r-squared, and this is pi r-squared.
572 00:32:29,212 --> 00:32:30,462 Sorry. 573 00:32:30,462 --> 00:32:36,920 574 00:32:36,920 --> 00:32:39,621 When this is 4 r-squared, this is 2r, right? 575 00:32:39,621 --> 00:32:44,700 576 00:32:44,700 --> 00:32:47,135 So using this, you can easily calculate pi. 577 00:32:47,135 --> 00:32:55,020 578 00:32:55,020 --> 00:32:59,270 So now, since we are going to come up with these parameters 579 00:32:59,270 --> 00:33:05,800 through repeating the trials, we need to have a standardized 580 00:33:05,800 --> 00:33:08,740 way of finding these parameters. 581 00:33:08,740 --> 00:33:11,470 We can't simply say this, right? 582 00:33:11,470 --> 00:33:13,615 Take this example. 583 00:33:13,615 --> 00:33:17,640 You know this MIT shuttle, right? 584 00:33:17,640 --> 00:33:21,380 A shuttle arriving at the right time, or the time 585 00:33:21,380 --> 00:33:24,870 difference between the arrival and the actual quoted time can 586 00:33:24,870 --> 00:33:27,380 be plotted in a graph. 587 00:33:27,380 --> 00:33:31,220 So if you plot that, it is spread around 0, right? 588 00:33:31,220 --> 00:33:35,010 Probably, or we hope so. 589 00:33:35,010 --> 00:33:36,370 OK? 590 00:33:36,370 --> 00:33:47,840 Now, from this, we can see that actually the mean of this 591 00:33:47,840 --> 00:33:52,950 simulation will give you the expected difference in the 592 00:33:52,950 --> 00:33:58,330 time, the expected difference in the arrival time from the 593 00:33:58,330 --> 00:33:59,580 actual quoted time. 594 00:33:59,580 --> 00:34:01,890 595 00:34:01,890 --> 00:34:06,830 And we hope this expectation to be 0. 596 00:34:06,830 --> 00:34:09,150 We call that mean. 597 00:34:09,150 --> 00:34:10,400 Mean is taking the average. 598 00:34:10,400 --> 00:34:20,550 599 00:34:20,550 --> 00:34:26,389 But this distribution might actually give you some 600 00:34:26,389 --> 00:34:29,650 information, some extra information, as well.
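The pi estimate described above (random points in a square of side 2r, counting how many fall inside the inscribed circle of radius r) can be sketched as follows. This is a hedged illustration rather than the code from the recitation: the function name `estimate_pi`, the choice r = 1, and the `seed` parameter are assumptions. Since the circle's area is pi r-squared and the square's is 4 r-squared, the fraction n_a / n_t approaches pi/4.

```python
import random


def estimate_pi(num_points, seed=None):
    """Estimate pi from the fraction of random points inside a unit circle.

    Points are drawn uniformly from the square [-1, 1] x [-1, 1]
    (side 2r with r = 1, an assumption made for simplicity).
    """
    rng = random.Random(seed)
    inside = 0  # n_a: points that land within the circle
    for _ in range(num_points):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            inside += 1
    # n_a / n_t approximates (pi r^2) / (4 r^2) = pi / 4.
    return 4.0 * inside / num_points


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, estimate_pi(n, seed=0))
```

The estimate is noisy for small `num_points` and settles toward pi as more points are used, which is the same large-number-of-trials behavior the coin examples rely on.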
601 00:34:29,650 --> 00:34:34,150 That is, how well we can actually believe this, how 602 00:34:34,150 --> 00:34:35,700 much we can rely on this. 603 00:34:35,700 --> 00:34:41,340 If the spread is greater, something like this, then 604 00:34:41,340 --> 00:34:44,449 probably you might actually not trust the system, right? 605 00:34:44,449 --> 00:34:47,340 Although the mean is 0, it's going to come 606 00:34:47,340 --> 00:34:48,449 early or late, right? 607 00:34:48,449 --> 00:34:49,699 Which means it's useless. 608 00:34:49,699 --> 00:34:52,750 609 00:34:52,750 --> 00:35:00,090 Similarly, in this case, we have a spread around mean 0. 610 00:35:00,090 --> 00:35:10,280 But if you take the score, the marks you get for 6.00, it 611 00:35:10,280 --> 00:35:11,290 could be something like this. 612 00:35:11,290 --> 00:35:13,730 It's not centered around 0, right? 613 00:35:13,730 --> 00:35:14,790 Hopefully. 614 00:35:14,790 --> 00:35:18,570 It's probably, say, 50. 615 00:35:18,570 --> 00:35:22,850 Then, we actually want the spread to be small or large? 616 00:35:22,850 --> 00:35:25,420 617 00:35:25,420 --> 00:35:31,290 We want the spread to be large because we want to distinguish 618 00:35:31,290 --> 00:35:32,580 the levels, right? 619 00:35:32,580 --> 00:35:34,360 The students' level of understanding. 620 00:35:34,360 --> 00:35:40,270 6.00. 621 00:35:40,270 --> 00:35:44,930 Anyway, so the spread determines what is the 622 00:35:44,930 --> 00:35:50,650 variation present in the distribution of the scores. 623 00:35:50,650 --> 00:35:53,305 We measure that by a variable called standard deviation. 624 00:35:53,305 --> 00:35:59,340 625 00:35:59,340 --> 00:36:05,630 In this case, this particular sample will be different from 626 00:36:05,630 --> 00:36:10,770 its mean by a particular value, right? 627 00:36:10,770 --> 00:36:17,980 We can express that as x_i minus its mean. 628 00:36:17,980 --> 00:36:19,318 Let's call the mean mu.
629 00:36:19,318 --> 00:36:22,070 630 00:36:22,070 --> 00:36:24,440 So this would be the difference. 631 00:36:24,440 --> 00:36:29,400 Standard deviation is summing up all the differences. 632 00:36:29,400 --> 00:36:32,210 But the problem is, when you sum up the differences, it'll 633 00:36:32,210 --> 00:36:34,210 be 0, right? 634 00:36:34,210 --> 00:36:36,890 The total summation of the differences will be 0 if 635 00:36:36,890 --> 00:36:42,380 that's how you get the mean because if you expand this, 636 00:36:42,380 --> 00:36:43,760 it'll be something like this, right? 637 00:36:43,760 --> 00:36:49,000 638 00:36:49,000 --> 00:36:50,250 Which will be n mu. 639 00:36:50,250 --> 00:37:03,030 640 00:37:03,030 --> 00:37:05,160 Should be equal to 0. 641 00:37:05,160 --> 00:37:08,690 So we have to sum, or actually take the 642 00:37:08,690 --> 00:37:10,490 differences into account. 643 00:37:10,490 --> 00:37:12,540 So, let's square this. 644 00:37:12,540 --> 00:37:17,350 So now, it will no longer be 0. 645 00:37:17,350 --> 00:37:21,142 Now, this gives 0, the differences. 646 00:37:21,142 --> 00:37:24,330 It's the squared sum of the differences averaged across 647 00:37:24,330 --> 00:37:25,580 all the samples. 648 00:37:25,580 --> 00:37:27,820 649 00:37:27,820 --> 00:37:29,315 We call this variance. 650 00:37:29,315 --> 00:37:32,280 651 00:37:32,280 --> 00:37:33,370 And the square root of 652 00:37:33,370 --> 00:37:36,555 variance is standard deviation. 653 00:37:36,555 --> 00:37:45,530 654 00:37:45,530 --> 00:37:47,320 OK? 655 00:37:47,320 --> 00:37:51,930 Now, having a standard deviation-- 656 00:37:51,930 --> 00:37:54,650 657 00:37:54,650 --> 00:37:57,050 so we know the standard deviation tells you how spread 658 00:37:57,050 --> 00:37:59,910 the distribution is. 659 00:37:59,910 --> 00:38:04,280 But can we actually rely only on the standard deviation to 660 00:38:04,280 --> 00:38:09,230 determine the consistency of some event? 661 00:38:09,230 --> 00:38:11,390 Can we? 
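The steps above (raw differences from the mean sum to zero, so square them, average them to get the variance, and take the square root for the standard deviation) can be checked numerically. This is a small sketch with made-up sample values; the helper names `mean`, `variance`, and `std_dev` are illustrative, and the variance divides by n, following the "averaged across all the samples" description.

```python
import math


def mean(xs):
    """Arithmetic mean (mu) of a non-empty list of numbers."""
    return sum(xs) / len(xs)


def variance(xs):
    """Average of the squared differences from the mean (divides by n)."""
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def std_dev(xs):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(xs))


# Illustrative sample values, not data from the recitation.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mu = mean(samples)
raw_differences = [x - mu for x in samples]

print("mean:", mu)
print("sum of raw differences:", sum(raw_differences))  # cancels to 0
print("variance:", variance(samples))
print("standard deviation:", std_dev(samples))
```

The raw differences cancel because the sum of x_i equals n times mu by the definition of the mean, which is why the squared version is used instead.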
662 00:38:11,390 --> 00:38:12,070 Probably not. 663 00:38:12,070 --> 00:38:19,050 Suppose we take two examples, one is the scores, 50. 664 00:38:19,050 --> 00:38:20,920 And suppose the standard deviation is minus 665 00:38:20,920 --> 00:38:24,270 10, plus 10, OK? 666 00:38:24,270 --> 00:38:26,290 So the standard deviation is 10 here. 667 00:38:26,290 --> 00:38:29,720 Suppose it lies in this form. 668 00:38:29,720 --> 00:38:34,420 Consider another example, the weight, the weight of the 669 00:38:34,420 --> 00:38:38,360 people, like say at MIT. 670 00:38:38,360 --> 00:38:44,850 And suppose it's centered around 150. 671 00:38:44,850 --> 00:38:50,120 Now, if the standard deviation is, say, 10, then the standard 672 00:38:50,120 --> 00:38:53,780 deviation 10 here and the standard deviation 10 here 673 00:38:53,780 --> 00:38:59,640 don't convey the same message, OK? 674 00:38:59,640 --> 00:39:07,110 So we need to have a different way of expressing the 675 00:39:07,110 --> 00:39:10,650 consistency of a distribution. 676 00:39:10,650 --> 00:39:29,110 So we represent it by coefficient of variation, 677 00:39:29,110 --> 00:39:37,690 which is equal to the standard deviation divided by mean. 678 00:39:37,690 --> 00:39:42,810 679 00:39:42,810 --> 00:39:47,240 Now here, it will be 10/150. 680 00:39:47,240 --> 00:39:50,810 Here, it will be 10/50. 681 00:39:50,810 --> 00:39:55,100 So we know this is more consistent than this. 682 00:39:55,100 --> 00:40:01,110 The weights of the students at MIT, it's more consistent than 683 00:40:01,110 --> 00:40:05,215 the marks you might get, or you get, for 6.00. 684 00:40:05,215 --> 00:40:06,465 It might be true. 685 00:40:06,465 --> 00:40:11,120 686 00:40:11,120 --> 00:40:16,530 Now, what is the use of the standard deviation? 687 00:40:16,530 --> 00:40:17,780 How can we use that?
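The comparison above can be written out directly. This is a minimal sketch using the numbers from the two examples (scores with mean 50 and weights with mean 150, both with standard deviation 10); the function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std, mean):
    """Standard deviation divided by the mean.

    Undefined when the mean is 0, so that case raises an error.
    """
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for mean 0")
    return std / mean


cv_scores = coefficient_of_variation(10, 50)    # 10/50
cv_weights = coefficient_of_variation(10, 150)  # 10/150

print("scores:", cv_scores)
print("weights:", cv_weights)
print("weights more consistent:", cv_weights < cv_scores)
```

The same standard deviation of 10 is a fifth of the mean for the scores but only a fifteenth of the mean for the weights, so the weights are the more consistent of the two. Note that the ratio is meaningless when the mean is 0, as in the shuttle example, which is why the sketch rejects that case.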
688 00:40:17,780 --> 00:40:20,610 689 00:40:20,610 --> 00:40:28,220 Let's look at this graph where suppose the mean is 0 and the 690 00:40:28,220 --> 00:40:31,150 standard deviation is, say, 5. 691 00:40:31,150 --> 00:40:34,370 692 00:40:34,370 --> 00:40:37,460 Consider another example where standard deviation is 10. 693 00:40:37,460 --> 00:40:43,890 694 00:40:43,890 --> 00:40:46,510 It might have been like this, OK? 695 00:40:46,510 --> 00:40:58,680 Now, before that, let me sort of digress a little bit so I 696 00:40:58,680 --> 00:41:00,030 can explain this better. 697 00:41:00,030 --> 00:41:03,120 698 00:41:03,120 --> 00:41:08,106 We can take the outcome of a particular event as a sample 699 00:41:08,106 --> 00:41:10,440 in our distribution. 700 00:41:10,440 --> 00:41:12,920 So suppose you're throwing a die. 701 00:41:12,920 --> 00:41:15,560 So you get an outcome. 702 00:41:15,560 --> 00:41:21,650 You can represent that outcome as a distribution, OK? 703 00:41:21,650 --> 00:41:29,810 So here, there's x, which can take 1 to, say, 6. 704 00:41:29,810 --> 00:41:33,450 And we can represent x_i as a sample point in our 705 00:41:33,450 --> 00:41:36,110 distribution. 706 00:41:36,110 --> 00:41:43,060 So I don't know, it might be uniform, probably, we hope. 707 00:41:43,060 --> 00:41:46,750 So it's with 1/6 probability, we always take 708 00:41:46,750 --> 00:41:47,402 one of these values. 709 00:41:47,402 --> 00:41:48,652 OK. 710 00:41:48,652 --> 00:41:50,120 711 00:41:50,120 --> 00:41:53,020 But this might not be the case with all events. 712 00:41:53,020 --> 00:41:57,250 713 00:41:57,250 --> 00:42:02,090 OK, so what I'm trying to say here is you can actually 714 00:42:02,090 --> 00:42:07,040 represent the outcome of the trial in the distribution. 715 00:42:07,040 --> 00:42:10,410 Or you can also represent the probability of something 716 00:42:10,410 --> 00:42:12,115 happening in a distribution. 
717 00:42:12,115 --> 00:42:15,650 718 00:42:15,650 --> 00:42:16,700 How does it work? 719 00:42:16,700 --> 00:42:19,840 OK, in this case, we throw our dice. 720 00:42:19,840 --> 00:42:20,930 We get an outcome. 721 00:42:20,930 --> 00:42:23,170 We go and put it in the x-axis. 722 00:42:23,170 --> 00:42:26,140 It could be between 1 and 6. 723 00:42:26,140 --> 00:42:27,575 And it takes this distribution. 724 00:42:27,575 --> 00:42:30,220 725 00:42:30,220 --> 00:42:34,550 In addition, what you could do is you could 726 00:42:34,550 --> 00:42:36,780 have, say, 100 trials. 727 00:42:36,780 --> 00:42:38,810 So you throw a coin. 728 00:42:38,810 --> 00:42:40,270 You take 100 trials. 729 00:42:40,270 --> 00:42:44,570 You get the mean, you get the probability of getting a head. 730 00:42:44,570 --> 00:42:46,850 And you have that mean, right? 731 00:42:46,850 --> 00:42:49,230 So probability of getting a head for 100 732 00:42:49,230 --> 00:42:53,690 trials, say, 0.51. 733 00:42:53,690 --> 00:42:58,610 You do another 100 trials, you got another one. 734 00:42:58,610 --> 00:43:00,900 So you have now another distribution. 735 00:43:00,900 --> 00:43:03,600 So there's a distribution of probabilities. 736 00:43:03,600 --> 00:43:05,910 So you can have a distribution of probabilities, or you can 737 00:43:05,910 --> 00:43:08,600 have a distribution for the events. 738 00:43:08,600 --> 00:43:12,280 We handle these two cases in the p-set. 739 00:43:12,280 --> 00:43:16,150 So probably you should be able to distinguish those two. 740 00:43:16,150 --> 00:43:21,410 Anyway, so here in this particular example, let's take 741 00:43:21,410 --> 00:43:23,710 this as our mu. 742 00:43:23,710 --> 00:43:25,700 Let's take this as our standard deviation. 743 00:43:25,700 --> 00:43:29,090 And for the first distribution, let's take the 744 00:43:29,090 --> 00:43:30,870 standard deviation to be 5. 
745 00:43:30,870 --> 00:43:32,920 When the standard deviation is greater, it's 746 00:43:32,920 --> 00:43:36,130 going to be more spread. 747 00:43:36,130 --> 00:43:39,320 It's going to be more distributed than the former. 748 00:43:39,320 --> 00:43:41,960 So here, say the standard deviation is 10. 749 00:43:41,960 --> 00:43:45,330 750 00:43:45,330 --> 00:43:49,200 The standard deviation is a way of expressing how many 751 00:43:49,200 --> 00:43:53,790 items, how many samples are going to lie between those 752 00:43:53,790 --> 00:43:56,610 particular boundaries. 753 00:43:56,610 --> 00:44:02,080 So for a normal distribution, we know the exact area, exact 754 00:44:02,080 --> 00:44:03,950 probability of things happening. 755 00:44:03,950 --> 00:44:07,670 756 00:44:07,670 --> 00:44:12,320 If it's normal, we know within the first standard 757 00:44:12,320 --> 00:44:26,710 deviation, 68% of events will lie in that area. 758 00:44:26,710 --> 00:44:28,135 Within two standard deviations-- 759 00:44:28,135 --> 00:44:34,150 760 00:44:34,150 --> 00:44:39,465 OK, one standard deviation, 68%. 761 00:44:39,465 --> 00:44:42,580 762 00:44:42,580 --> 00:44:45,520 Two standard deviations on either side, 763 00:44:45,520 --> 00:44:47,910 it's going to be 95%. 764 00:44:47,910 --> 00:44:53,110 Three standard deviations, it's going to be about 99.7%. 765 00:44:53,110 --> 00:44:59,760 So suppose you conducted so many trials. 766 00:44:59,760 --> 00:45:02,260 And you get the values. 767 00:45:02,260 --> 00:45:08,500 And in the distribution, suppose mu, mean, is 10, and 768 00:45:08,500 --> 00:45:09,750 the standard deviation is, say, 1. 769 00:45:09,750 --> 00:45:12,510 770 00:45:12,510 --> 00:45:19,430 So now, with about 99.7% confidence, we can say then the outcome of 771 00:45:19,430 --> 00:45:22,710 the next trial is going to be between what? 772 00:45:22,710 --> 00:45:26,200 773 00:45:26,200 --> 00:45:31,250 7 and 13, right?
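The coverage figures for a normal distribution can be computed rather than memorized. This is a small sketch using the standard identity that the probability of falling within k standard deviations of the mean is erf(k / sqrt(2)); the function names `fraction_within` and `interval` are illustrative. The exact values are roughly 68.3%, 95.4%, and 99.7% for one, two, and three standard deviations.

```python
import math


def fraction_within(k):
    """Probability that a normal sample lies within k standard deviations
    of its mean, using P(|Z| < k) = erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))


def interval(mu, sigma, k):
    """The range mu - k*sigma to mu + k*sigma."""
    return (mu - k * sigma, mu + k * sigma)


for k in (1, 2, 3):
    print(k, "standard deviation(s):", round(fraction_within(k), 4))

# The example from the discussion: mean 10, standard deviation 1.
print("three-sigma interval:", interval(10, 1, 3))
```

For mean 10 and standard deviation 1, the three-standard-deviation interval is 7 to 13. These percentages hold only when the distribution really is normal.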
774 00:45:31,250 --> 00:45:34,480 So this is where finding the distribution and standard 775 00:45:34,480 --> 00:45:40,540 deviation helps us give a confidence interval, 776 00:45:40,540 --> 00:45:43,340 expressing our belief of that particular event happening. 777 00:45:43,340 --> 00:45:47,050 778 00:45:47,050 --> 00:45:51,290 We will look at a few examples because you might need this in 779 00:45:51,290 --> 00:45:52,540 your p-set. 780 00:45:52,540 --> 00:46:18,600 781 00:46:18,600 --> 00:46:20,156 So this particular function you have 782 00:46:20,156 --> 00:46:22,930 already seen in the lecture. 783 00:46:22,930 --> 00:46:27,310 784 00:46:27,310 --> 00:46:31,870 But we need to understand this particular part. 785 00:46:31,870 --> 00:46:35,160 786 00:46:35,160 --> 00:46:39,150 Suppose you have a probability of something happening. 787 00:46:39,150 --> 00:46:40,620 Suppose you estimated the probability 788 00:46:40,620 --> 00:46:41,300 of something happening. 789 00:46:41,300 --> 00:46:46,880 Suppose you're given the coin is biased, OK? 790 00:46:46,880 --> 00:46:47,940 Sorry, unbiased. 791 00:46:47,940 --> 00:46:51,840 So we know p of H is equal to 1/2. 792 00:46:51,840 --> 00:46:55,210 How can we simulate an outcome? 793 00:46:55,210 --> 00:46:57,740 How can you simulate an outcome and see whether it's a 794 00:46:57,740 --> 00:47:01,030 head or a tail with this particular probability? 795 00:47:01,030 --> 00:47:08,090 We do that by calling this function, random.random(), 796 00:47:08,090 --> 00:47:12,160 which is going to give you a random value between 0 and 1. 797 00:47:12,160 --> 00:47:14,160 And you're going to check whether it's 798 00:47:14,160 --> 00:47:16,300 below this or not. 799 00:47:16,300 --> 00:47:19,710 If it's below this, we can take it as head. 800 00:47:19,710 --> 00:47:21,620 If it's not, it's tail.
801 00:47:21,620 --> 00:47:25,740 And this will happen with probability 1/2, because the 802 00:47:25,740 --> 00:47:29,780 random function is going to return a value between 0 and 1 803 00:47:29,780 --> 00:47:31,180 with equal probabilities. 804 00:47:31,180 --> 00:47:34,270 It's uniform probabilities. 805 00:47:34,270 --> 00:47:37,970 So to simulate a head or tail, you call that function. 806 00:47:37,970 --> 00:47:41,930 You write the expression like that, OK? 807 00:47:41,930 --> 00:47:48,180 808 00:47:48,180 --> 00:47:53,110 Then, if you consider this example, for a certain number 809 00:47:53,110 --> 00:47:56,890 of flips, we simulate the event. 810 00:47:56,890 --> 00:47:58,970 And we count the number of heads we obtain. 811 00:47:58,970 --> 00:48:03,950 812 00:48:03,950 --> 00:48:06,240 And also from that, you can calculate the 813 00:48:06,240 --> 00:48:07,590 number of tails as well. 814 00:48:07,590 --> 00:48:11,580 If you know the total flips, you know the number of tails. 815 00:48:11,580 --> 00:48:14,765 Using that, we are taking two ratios. 816 00:48:14,765 --> 00:48:16,890 Now, the ratio between the heads and tails, and the 817 00:48:16,890 --> 00:48:19,690 difference between heads and tails. 818 00:48:19,690 --> 00:48:24,160 We are doing this for certain number of trials. 819 00:48:24,160 --> 00:48:27,530 And we're going to take the mean and standard deviation of 820 00:48:27,530 --> 00:48:32,220 these trials, OK? 821 00:48:32,220 --> 00:48:38,170 So here in our distribution, what are we considering? 822 00:48:38,170 --> 00:48:42,220 823 00:48:42,220 --> 00:48:46,000 What is going to build our distribution here? 824 00:48:46,000 --> 00:48:49,010 825 00:48:49,010 --> 00:48:50,310 The ratios, right? 826 00:48:50,310 --> 00:48:53,560 The ratios of the events. 827 00:48:53,560 --> 00:48:58,130 And we simulated certain number of trials to get those 828 00:48:58,130 --> 00:49:01,570 events, OK? 
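The simulation described above can be sketched as follows. This is not the code shown on screen in the recitation, which is not reproduced in the transcript; it is a hedged reconstruction of the idea, and the names `flip_trial`, `mean_and_std`, and `simulate`, as well as the `seed` parameter, are assumptions. A flip counts as a head when `random.random()` returns a value below 0.5, each trial records the heads/tails ratio and the absolute heads-tails difference, and the mean and standard deviation are taken across trials.

```python
import math
import random


def flip_trial(num_flips, rng, p_heads=0.5):
    """Simulate num_flips coin flips; return (heads, tails)."""
    heads = 0
    for _ in range(num_flips):
        # rng.random() is uniform on [0, 1), so this is true with
        # probability p_heads.
        if rng.random() < p_heads:
            heads += 1
    return heads, num_flips - heads


def mean_and_std(values):
    """Mean and (population) standard deviation of a non-empty list."""
    if not values:
        raise ValueError("need at least one value")
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(var)


def simulate(num_flips, num_trials, seed=None):
    """Return ((ratio mean, ratio std), (difference mean, difference std))."""
    rng = random.Random(seed)
    ratios, diffs = [], []
    for _ in range(num_trials):
        heads, tails = flip_trial(num_flips, rng)
        diffs.append(abs(heads - tails))
        if tails > 0:  # the ratio is undefined when there are no tails
            ratios.append(heads / tails)
    return mean_and_std(ratios), mean_and_std(diffs)


if __name__ == "__main__":
    for flips in (10, 100, 1000, 10000):
        (r_mean, r_std), (d_mean, d_std) = simulate(flips, 20, seed=0)
        print(flips, round(r_mean, 3), round(r_std, 3),
              round(d_mean, 1), round(d_std, 1))
```

Here the distribution being summarized is built from the per-trial ratios and differences, not from individual flips, which is the distinction the discussion draws between a distribution of events and a distribution of estimates.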
829 00:49:01,570 --> 00:49:04,470 Only if you simulate a certain number of trials, you can 830 00:49:04,470 --> 00:49:08,240 actually summarize the outcome of the events in mean and 831 00:49:08,240 --> 00:49:10,530 standard deviation. 832 00:49:10,530 --> 00:49:14,480 This is exactly like the difference in the times of the 833 00:49:14,480 --> 00:49:20,350 bus arriving and the quoted times. 834 00:49:20,350 --> 00:49:23,700 Let's check this example. 835 00:49:23,700 --> 00:49:25,010 Let's plot this and see. 836 00:49:25,010 --> 00:49:41,830 837 00:49:41,830 --> 00:49:43,080 It's going to take a while. 838 00:49:43,080 --> 00:49:48,390 839 00:49:48,390 --> 00:49:51,590 OK, that's another thing I want to explain here because 840 00:49:51,590 --> 00:49:53,555 since you're going to be going to plot-- 841 00:49:53,555 --> 00:49:58,050 we are going to use PyLab extensively and plot graphs. 842 00:49:58,050 --> 00:50:02,090 You'll need to put a title and labels to all the plots you're 843 00:50:02,090 --> 00:50:03,040 generating. 844 00:50:03,040 --> 00:50:06,160 Plus, you can use this text to actually put 845 00:50:06,160 --> 00:50:07,190 the text in the graph. 846 00:50:07,190 --> 00:50:09,720 We will show that in a while. 847 00:50:09,720 --> 00:50:10,970 Plus-- 848 00:50:10,970 --> 00:50:13,250 849 00:50:13,250 --> 00:50:14,230 here, sorry. 850 00:50:14,230 --> 00:50:19,310 If you want to change the axis to log-log scale, you can call 851 00:50:19,310 --> 00:50:24,310 this command at the end after calling the plot. 852 00:50:24,310 --> 00:50:27,250 Because you might sometimes need to change the axis to log 853 00:50:27,250 --> 00:50:28,913 scale in x and y-axis. 854 00:50:28,913 --> 00:50:33,930 855 00:50:33,930 --> 00:50:40,260 So this is the mean, heads versus tails. 856 00:50:40,260 --> 00:50:45,760 And if you can see it, the mean tends to be 1 when we 857 00:50:45,760 --> 00:50:48,870 have a large number of flips.
858 00:50:48,870 --> 00:50:52,860 So to get the consistency, we need to simulate 859 00:50:52,860 --> 00:50:55,950 large number of trials. 860 00:50:55,950 --> 00:51:00,610 Then only it will tend to be close to the mean, OK? 861 00:51:00,610 --> 00:51:04,000 862 00:51:04,000 --> 00:51:07,740 This is sort of a way of checking the evolution of the 863 00:51:07,740 --> 00:51:13,540 series by actually doing it for a certain number of flips 864 00:51:13,540 --> 00:51:14,926 at every time. 865 00:51:14,926 --> 00:51:19,280 So it's quite like a scatter plot. 866 00:51:19,280 --> 00:51:27,250 A scatter plot is like plotting the outcomes of our 867 00:51:27,250 --> 00:51:28,500 experiments. 868 00:51:28,500 --> 00:51:30,530 869 00:51:30,530 --> 00:51:36,230 Suppose it's x1 and x2 in a graph. 870 00:51:36,230 --> 00:51:37,420 So we are going to say-- 871 00:51:37,420 --> 00:51:42,800 so for example, suppose you have a variable, and the 872 00:51:42,800 --> 00:51:46,090 variable causes an outcome-- 873 00:51:46,090 --> 00:51:51,360 a probability of the coin flip, so p of H. And it can 874 00:51:51,360 --> 00:51:57,290 result in a certain number of heads appearing, say n of H. 875 00:51:57,290 --> 00:52:02,150 Now, you can do a scatter plot between these two variables. 876 00:52:02,150 --> 00:52:04,050 And it will be probably a spread. 877 00:52:04,050 --> 00:52:07,990 But we know that if you increase the probability of 878 00:52:07,990 --> 00:52:11,510 heads, the number of heads is going to increase as well. 879 00:52:11,510 --> 00:52:13,340 So it would be probably something like this. 880 00:52:13,340 --> 00:52:18,320 881 00:52:18,320 --> 00:52:20,742 From this, we can assume that it's linear or 882 00:52:20,742 --> 00:52:21,140 something like that. 
883 00:52:21,140 --> 00:52:24,660 But the scatter plot is actually representing the 884 00:52:24,660 --> 00:52:28,620 outcomes of the trial versus some other variable in the 885 00:52:28,620 --> 00:52:30,877 graph and visualize it. 886 00:52:30,877 --> 00:52:33,560 887 00:52:33,560 --> 00:52:36,230 And let me show the last graph, and we'll 888 00:52:36,230 --> 00:52:37,480 be done with that. 889 00:52:37,480 --> 00:52:52,040 890 00:52:52,040 --> 00:52:55,710 So this, again, we actually know, instead of putting a 891 00:52:55,710 --> 00:53:00,080 scatter plot, we're actually giving the distribution as a 892 00:53:00,080 --> 00:53:06,340 histogram and printing a text box in the graph. 893 00:53:06,340 --> 00:53:09,970 This might be useful if you want to display something on 894 00:53:09,970 --> 00:53:12,840 your graph. 895 00:53:12,840 --> 00:53:15,990 I guess we will be uploading the code to the site. 896 00:53:15,990 --> 00:53:19,080 So you can check the code if you want later, OK? 897 00:53:19,080 --> 00:53:20,510 Sure. 898 00:53:20,510 --> 00:53:21,760 See you next week. 899 00:53:21,760 --> 00:53:27,615