1
00:00:00,000 --> 00:00:00,530

2
00:00:00,530 --> 00:00:02,960
The following content is
provided under a Creative

3
00:00:02,960 --> 00:00:04,370
Commons license.

4
00:00:04,370 --> 00:00:07,410
Your support will help MIT
OpenCourseWare continue to

5
00:00:07,410 --> 00:00:11,060
offer high quality educational
resources for free.

6
00:00:11,060 --> 00:00:13,960
To make a donation or view
additional materials from

7
00:00:13,960 --> 00:00:19,790
hundreds of MIT courses, visit
MIT OpenCourseWare at

8
00:00:19,790 --> 00:00:22,456
ocw.mit.edu.

9
00:00:22,456 --> 00:00:25,760
PROFESSOR: Today's focus is
probability and statistics.

10
00:00:25,760 --> 00:00:29,180
So let's start with
probability.

11
00:00:29,180 --> 00:00:33,246
Let's look at probability
for binary variables.

12
00:00:33,246 --> 00:00:39,160

13
00:00:39,160 --> 00:00:43,070
What do you mean by
a binary variable?

14
00:00:43,070 --> 00:00:45,650
It can take only two outcomes.

15
00:00:45,650 --> 00:00:48,920
So it can take only
two values.

16
00:00:48,920 --> 00:00:55,320
For example, it could be 0 or
1, head or tail, on or off.

17
00:00:55,320 --> 00:00:58,460

18
00:00:58,460 --> 00:01:03,670
So we are going to call this
variable A, for instance.

19
00:01:03,670 --> 00:01:11,900
So A could be H, or A is equal
to T. But that could happen.

20
00:01:11,900 --> 00:01:16,110
That event could happen with
a certain probability.

21
00:01:16,110 --> 00:01:18,920
So by that, I mean the
probabilities, like we are

22
00:01:18,920 --> 00:01:21,520
expressing the belief that the

23
00:01:21,520 --> 00:01:24,170
particular event could happen.

24
00:01:24,170 --> 00:01:28,190
So we could assign
a value to that.

25
00:01:28,190 --> 00:01:36,380
That is the probability
of A taking value H.

26
00:01:36,380 --> 00:01:41,170
So here, the values
of A and B--

27
00:01:41,170 --> 00:01:45,410
sorry, here, the value of A can
be either H or T, which

28
00:01:45,410 --> 00:01:49,370
means it has only two
possible outcomes.

29
00:01:49,370 --> 00:01:51,190
That's why we call it
a binary variable.

30
00:01:51,190 --> 00:02:06,470
However, P of A is equal to H
can lie anywhere between 0 and 1,

31
00:02:06,470 --> 00:02:08,878
including 0 and 1.

32
00:02:08,878 --> 00:02:10,190
AUDIENCE: They don't
have to be even?

33
00:02:10,190 --> 00:02:10,876
PROFESSOR: Sorry?

34
00:02:10,876 --> 00:02:12,750
AUDIENCE: They don't
have to be even?

35
00:02:12,750 --> 00:02:13,090
PROFESSOR: Even?

36
00:02:13,090 --> 00:02:17,190
AUDIENCE: Even chance, even
probability, like the same.

37
00:02:17,190 --> 00:02:19,900
PROFESSOR: Sorry, I didn't
get your question.

38
00:02:19,900 --> 00:02:23,050
AUDIENCE: Even though they're
binary, don't you need be able

39
00:02:23,050 --> 00:02:27,780
to have the same probability?

40
00:02:27,780 --> 00:02:29,450
PROFESSOR: OK, we'll
look at that later.

41
00:02:29,450 --> 00:02:32,630
Like, this particular event
can take a particular

42
00:02:32,630 --> 00:02:33,600
probability.

43
00:02:33,600 --> 00:02:36,500
And we'll look at that
particular case later.

44
00:02:36,500 --> 00:02:39,130
But in general, a probability
will always lie

45
00:02:39,130 --> 00:02:40,740
between 0 and 1.

46
00:02:40,740 --> 00:02:43,770

47
00:02:43,770 --> 00:02:48,740
And it can take any value
between 0 and 1 since the

48
00:02:48,740 --> 00:02:51,395
range it can take is
continuous.

49
00:02:51,395 --> 00:02:57,870

50
00:02:57,870 --> 00:03:02,330
However, the value the variable
can take is going to

51
00:03:02,330 --> 00:03:03,330
be discrete.

52
00:03:03,330 --> 00:03:08,190
It can take only H or T. So
that's why you call it a

53
00:03:08,190 --> 00:03:09,520
binary variable.

54
00:03:09,520 --> 00:03:13,110
For example, take
a deck of cards.

55
00:03:13,110 --> 00:03:17,910
Here, the value could be, for
example if you consider only

56
00:03:17,910 --> 00:03:20,420
one particular suit,
then it can be any

57
00:03:20,420 --> 00:03:21,810
one of those 13 values.

58
00:03:21,810 --> 00:03:24,350

59
00:03:24,350 --> 00:03:27,560
So there, this variable
is not binary.

60
00:03:27,560 --> 00:03:30,380
However, the probability of a
particular event happening is

61
00:03:30,380 --> 00:03:33,830
always between 0 and 1.

62
00:03:33,830 --> 00:03:37,690
Now, let's look at some
probability, like what you

63
00:03:37,690 --> 00:03:42,470
asked earlier is whether they
will be equal, whether the

64
00:03:42,470 --> 00:03:46,430
probability of head and
tail can be equal.

65
00:03:46,430 --> 00:03:51,730
So let's represent the
probability of A of H. This

66
00:03:51,730 --> 00:03:54,570
can be between 0 and 1.

67
00:03:54,570 --> 00:04:00,480
What is the probability
of A not happening?

68
00:04:00,480 --> 00:04:01,810
So we call it by A bar.

69
00:04:01,810 --> 00:04:06,000

70
00:04:06,000 --> 00:04:09,700
Given P of A, can you
give me P of A bar?

71
00:04:09,700 --> 00:04:10,660
AUDIENCE: 1 minus P of A.

72
00:04:10,660 --> 00:04:16,070
PROFESSOR: 1 minus P of A.
If there are two events

73
00:04:16,070 --> 00:04:22,190
happening, for example, you're
throwing two coins, then we

74
00:04:22,190 --> 00:04:23,730
can consider their joint
probabilities.

75
00:04:23,730 --> 00:04:26,520

76
00:04:26,520 --> 00:04:33,030
So let's say we have a coin, A,
and this coin, B. So this

77
00:04:33,030 --> 00:04:34,940
coin can take two values.

78
00:04:34,940 --> 00:04:39,070
And so this coin can take
another two values.

79
00:04:39,070 --> 00:04:40,320
Sorry.

80
00:04:40,320 --> 00:04:51,618

81
00:04:51,618 --> 00:04:57,410
We know A can take H with
probability, say I assume it's

82
00:04:57,410 --> 00:04:59,380
unbiased, so it'll be 1/2.

83
00:04:59,380 --> 00:05:02,050

84
00:05:02,050 --> 00:05:03,330
All these are going to be 1/2.

85
00:05:03,330 --> 00:05:09,050

86
00:05:09,050 --> 00:05:11,415
What's the probability of HT?

87
00:05:11,415 --> 00:05:14,110

88
00:05:14,110 --> 00:05:20,020
So now, we are considering a
joint event, P of A is equal

89
00:05:20,020 --> 00:05:32,900
to H and P of B is equal to
T. So in probability, we

90
00:05:32,900 --> 00:05:35,160
represent it by something
like this.

91
00:05:35,160 --> 00:05:40,200
P A- do you know what is that?

92
00:05:40,200 --> 00:05:45,450
P A intersection B, you want
both events to happen.

93
00:05:45,450 --> 00:05:48,000

94
00:05:48,000 --> 00:05:57,350
That will be P of A times, in
this case, P of B. So we

95
00:05:57,350 --> 00:06:00,160
could simply say it's 1/4.

96
00:06:00,160 --> 00:06:01,570
Why is this possible?

97
00:06:01,570 --> 00:06:04,200

98
00:06:04,200 --> 00:06:06,380
It's because these two events
are independent.

99
00:06:06,380 --> 00:06:09,760

100
00:06:09,760 --> 00:06:14,100
The coin A getting head
doesn't affect

101
00:06:14,100 --> 00:06:17,810
coin B getting a tail.

102
00:06:17,810 --> 00:06:20,790
So it doesn't have
any influence.

103
00:06:20,790 --> 00:06:23,510
That's why these two events
are independent.
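
The multiplication step above can be checked by listing every outcome of two coins. This is a minimal sketch that assumes both coins are fair; the names `outcomes`, `p_each`, and `p_joint` are illustrative and not from the lecture.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coins A and B: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))
p_each = Fraction(1, len(outcomes))  # assumption: all outcomes equally likely

p_a_heads = sum(p_each for a, b in outcomes if a == "H")
p_b_tails = sum(p_each for a, b in outcomes if b == "T")
p_joint = sum(p_each for a, b in outcomes if a == "H" and b == "T")

print(p_a_heads, p_b_tails, p_joint)  # 1/2 1/2 1/4

# For independent events the joint probability equals the product.
print(p_joint == p_a_heads * p_b_tails)  # True
```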

104
00:06:23,510 --> 00:06:27,360
The dependent events are a
bit complex to analyze.

105
00:06:27,360 --> 00:06:30,860
Let's skip them at the moment.

106
00:06:30,860 --> 00:06:33,190
So we know all these
probabilities

107
00:06:33,190 --> 00:06:34,440
are going to be 1/4.

108
00:06:34,440 --> 00:06:36,770

109
00:06:36,770 --> 00:06:41,380
So we looked at a particular
condition here.

110
00:06:41,380 --> 00:06:44,780
That is, A taking head
and B taking tail.

111
00:06:44,780 --> 00:06:53,640
What about the condition, what
about the case where either A

112
00:06:53,640 --> 00:06:57,290
or B takes a head?

113
00:06:57,290 --> 00:06:59,950
How can we represent that?

114
00:06:59,950 --> 00:07:05,560
So it will be something like A
is equal to H or B is equal to

115
00:07:05,560 --> 00:07:13,600
H. Or, at least one head. So by
that, I can also

116
00:07:13,600 --> 00:07:14,850
represent something like this.

117
00:07:14,850 --> 00:07:18,370

118
00:07:18,370 --> 00:07:20,100
OK, here, this is sufficient
anyway.

119
00:07:20,100 --> 00:07:27,900

120
00:07:27,900 --> 00:07:29,400
So what are the possible
events?

121
00:07:29,400 --> 00:07:54,000

122
00:07:54,000 --> 00:08:00,660
So these three events could give
rise to this probability.

123
00:08:00,660 --> 00:08:02,840
It's better if you can represent
this in a diagram.

124
00:08:02,840 --> 00:08:06,760
So let's go and represent
this in a diagram.

125
00:08:06,760 --> 00:08:12,175
This is A and this is B
getting, say, head.

126
00:08:12,175 --> 00:08:15,200

127
00:08:15,200 --> 00:08:18,250
In one case, both
can take head.

128
00:08:18,250 --> 00:08:22,410
That is, this particular
condition, intersection we

129
00:08:22,410 --> 00:08:23,660
earlier looked at.

130
00:08:23,660 --> 00:08:32,309

131
00:08:32,309 --> 00:08:35,789
So what is this whole thing?

132
00:08:35,789 --> 00:08:41,150

133
00:08:41,150 --> 00:08:46,010
That is, either A gets
H or B gets H,

134
00:08:46,010 --> 00:08:47,720
which is this condition.

135
00:08:47,720 --> 00:08:51,200

136
00:08:51,200 --> 00:08:54,450
We call it P of A union B. Ok.

137
00:08:54,450 --> 00:09:02,360

138
00:09:02,360 --> 00:09:07,720
Is there an efficient way of
finding this rather than

139
00:09:07,720 --> 00:09:09,450
writing down all
possible cases?

140
00:09:09,450 --> 00:09:11,950

141
00:09:11,950 --> 00:09:17,760
Is there an efficient way of
finding P of A union B?

142
00:09:17,760 --> 00:09:19,916
From high school maths,
probably?

143
00:09:19,916 --> 00:09:21,730
No idea?

144
00:09:21,730 --> 00:09:22,670
OK.

145
00:09:22,670 --> 00:09:29,350
P of A union B is equal to P of
A plus P of B minus P of A

146
00:09:29,350 --> 00:09:32,530
intersection B. Because if you
consider P of A, you would

147
00:09:32,530 --> 00:09:34,800
have taken this full circle.

148
00:09:34,800 --> 00:09:35,940
When you take P of
B, you would have

149
00:09:35,940 --> 00:09:38,010
taken this full circle.

150
00:09:38,010 --> 00:09:41,490
So which means you're counting
this area twice.

151
00:09:41,490 --> 00:09:42,740
So here, we deduct it once.
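
The union rule above can be compared against a direct count. This is a small sketch assuming two fair coins, where A means "coin A shows heads" and B means "coin B shows heads"; the helper `prob` is a made-up name for illustration.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))  # HH, HT, TH, TT

def prob(event):
    """Fraction of equally likely outcomes for which event(a, b) is true."""
    return Fraction(sum(1 for a, b in outcomes if event(a, b)), len(outcomes))

p_a = prob(lambda a, b: a == "H")
p_b = prob(lambda a, b: b == "H")
p_both = prob(lambda a, b: a == "H" and b == "H")
p_either = prob(lambda a, b: a == "H" or b == "H")

print(p_either)            # 3/4, counted directly (HH, HT, TH)
print(p_a + p_b - p_both)  # 3/4, from P(A) + P(B) - P(A intersection B)
```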

152
00:09:42,740 --> 00:09:47,130

153
00:09:47,130 --> 00:09:48,283
OK?

154
00:09:48,283 --> 00:09:49,533
Great.

155
00:09:49,533 --> 00:09:51,600

156
00:09:51,600 --> 00:09:54,720
So this is the basics
of the probability.

157
00:09:54,720 --> 00:10:00,900
Now, actually we looked at two
events, two joint events here.

158
00:10:00,900 --> 00:10:04,040
But we should have
a formal way of

159
00:10:04,040 --> 00:10:07,350
looking at multiple events.

160
00:10:07,350 --> 00:10:09,720
So how can we do that?

161
00:10:09,720 --> 00:10:11,640
The first way is doing
it by trees.

162
00:10:11,640 --> 00:10:16,020

163
00:10:16,020 --> 00:10:20,120
Let's say we represent
the outcome of the

164
00:10:20,120 --> 00:10:22,540
first trial by a branch.

165
00:10:22,540 --> 00:10:28,240

166
00:10:28,240 --> 00:10:32,240
We can represent the outcome of
the second trial by another

167
00:10:32,240 --> 00:10:37,160
branch from these two
previous branches.

168
00:10:37,160 --> 00:10:38,710
So this would be
H, T; HH, HT, TH, TT.

169
00:10:38,710 --> 00:10:51,840

170
00:10:51,840 --> 00:10:56,370
And we know this could happen
with probability 1/2.

171
00:10:56,370 --> 00:11:00,430
So we know it's, again,
1/2, 1/2, 1/2, 1/2.

172
00:11:00,430 --> 00:11:01,680
So this is 1/4.
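
The tree above can be sketched in code by multiplying branch probabilities along each path. The function `tree_leaves` is a hypothetical helper written for this illustration, and the coin is assumed to be fair.

```python
from fractions import Fraction

def tree_leaves(branches, depth):
    """Return {path: probability} for a tree that forks `depth` times."""
    leaves = {"": Fraction(1)}
    for _ in range(depth):
        # Extend every existing path by every branch, multiplying probabilities.
        leaves = {path + label: p * p_branch
                  for path, p in leaves.items()
                  for label, p_branch in branches.items()}
    return leaves

coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
leaves = tree_leaves(coin, 2)

for path, p in leaves.items():
    print(path, p)  # HH 1/4, HT 1/4, TH 1/4, TT 1/4 (one per line)
```

The same helper would work for a die by passing six branches, which is exactly where the tree gets too wide to draw by hand.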

173
00:11:01,680 --> 00:11:11,440

174
00:11:11,440 --> 00:11:18,955
Suppose we want to do this for
an outcome of throwing dice.

175
00:11:18,955 --> 00:11:22,060

176
00:11:22,060 --> 00:11:25,540
Then, probably we would
have 6 branches here.

177
00:11:25,540 --> 00:11:28,330

178
00:11:28,330 --> 00:11:33,270
Each of which, again, forks into
another 6, giving 36 branches.

179
00:11:33,270 --> 00:11:35,730
So there should be another
easier way.

180
00:11:35,730 --> 00:11:38,730
For that, we could use a second
method called grid.

181
00:11:38,730 --> 00:11:42,410

182
00:11:42,410 --> 00:11:44,130
We could simply put
that in a diagram.

183
00:11:44,130 --> 00:11:52,470

184
00:11:52,470 --> 00:11:54,310
So this is the first trial.

185
00:11:54,310 --> 00:11:57,040

186
00:11:57,040 --> 00:11:59,165
And this will be our
second trial.

187
00:11:59,165 --> 00:12:11,580

188
00:12:11,580 --> 00:12:17,710
So now, we can represent any
possible outcome on this grid.

189
00:12:17,710 --> 00:12:22,090
For example, can you give me an
example where you throw the

190
00:12:22,090 --> 00:12:27,430
same number in both
the trials?

191
00:12:27,430 --> 00:12:30,710
Then, what would be the layout
of it in this grid?

192
00:12:30,710 --> 00:12:34,150

193
00:12:34,150 --> 00:12:37,790
Throwing the same number
in both the trials.

194
00:12:37,790 --> 00:12:39,025
Here's the first trial.

195
00:12:39,025 --> 00:12:40,976
This, the second.

196
00:12:40,976 --> 00:12:42,720
Then it would be the diagonal.

197
00:12:42,720 --> 00:12:48,000

198
00:12:48,000 --> 00:12:51,940
If you want to calculate the
probability, do you know the

199
00:12:51,940 --> 00:13:01,580
probability is the ratio between
the outcomes we expect

200
00:13:01,580 --> 00:13:04,170
over all possible outcomes?

201
00:13:04,170 --> 00:13:09,170
So here, we know there will
be 6 instances in this

202
00:13:09,170 --> 00:13:11,060
highlighted area.

203
00:13:11,060 --> 00:13:15,840
Compare that to all
36 possibilities.

204
00:13:15,840 --> 00:13:17,266
So it'll be simply 6/36.

205
00:13:17,266 --> 00:13:22,620

206
00:13:22,620 --> 00:13:26,840
How can you find the probability
of getting a

207
00:13:26,840 --> 00:13:28,670
cumulative total of, say, 6?

208
00:13:28,670 --> 00:13:33,880

209
00:13:33,880 --> 00:13:36,480
Then again, it would
be very simple.

210
00:13:36,480 --> 00:13:43,930
It could be 1, 5; 2, 4;
3, 3; 4, 2; 5, 1.

211
00:13:43,930 --> 00:13:46,000
All right?

212
00:13:46,000 --> 00:13:49,360
So it'll be 5 by 36.
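
The grid method above amounts to listing all 36 cells and counting the ones of interest. This is a sketch that assumes two fair six-sided dice; the variable names are illustrative.

```python
from fractions import Fraction

faces = range(1, 7)
# Each cell of the grid is (first roll, second roll): 36 cells in total.
grid = [(first, second) for first in faces for second in faces]

same = [cell for cell in grid if cell[0] == cell[1]]    # the diagonal
total_six = [cell for cell in grid if sum(cell) == 6]   # rolls adding up to 6

print(len(same), len(grid))       # 6 36
print(total_six)                  # [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]
print(Fraction(len(total_six), len(grid)))  # 5/36
```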

213
00:13:49,360 --> 00:13:52,550

214
00:13:52,550 --> 00:13:53,480
OK?

215
00:13:53,480 --> 00:13:56,550
So either by using trees or
grid, you can easily find the

216
00:13:56,550 --> 00:13:57,800
probabilities.

217
00:13:57,800 --> 00:14:04,060

218
00:14:04,060 --> 00:14:06,806
Now, let's look at a few
concrete examples.

219
00:14:06,806 --> 00:14:22,440

220
00:14:22,440 --> 00:14:24,030
Let's see.

221
00:14:24,030 --> 00:14:27,710
Suppose we are throwing
three coins.

222
00:14:27,710 --> 00:14:33,610
Then, what is the probability
of one particular outcome in

223
00:14:33,610 --> 00:14:36,590
that trial, in all
three trials?

224
00:14:36,590 --> 00:14:38,980
What is the probability,
assuming that these are

225
00:14:38,980 --> 00:14:41,020
unbiased coins?

226
00:14:41,020 --> 00:14:44,170
What is the probability of
one particular outcome?

227
00:14:44,170 --> 00:14:46,915
Because how many possible
outcomes are there if you are

228
00:14:46,915 --> 00:14:48,165
throwing three coins?

229
00:14:48,165 --> 00:14:50,440

230
00:14:50,440 --> 00:14:53,410
Consider this tree.

231
00:14:53,410 --> 00:14:54,760
First, it splits into 2.

232
00:14:54,760 --> 00:14:56,040
Then, it splits into 4.

233
00:14:56,040 --> 00:14:57,940
Then?

234
00:14:57,940 --> 00:15:00,860
8, all right?

235
00:15:00,860 --> 00:15:04,990
OK, so there are 8 possible
outcomes.

236
00:15:04,990 --> 00:15:08,110
So each outcome will have
the probability 1/8.

237
00:15:08,110 --> 00:15:12,230

238
00:15:12,230 --> 00:15:16,310
So what is the probability of
heads appearing exactly twice?

239
00:15:16,310 --> 00:15:19,370

240
00:15:19,370 --> 00:15:21,760
How can you do that?

241
00:15:21,760 --> 00:15:24,620
Of course, you can write
the tree and count.

242
00:15:24,620 --> 00:15:26,480
What is the easier way
of doing that?

243
00:15:26,480 --> 00:15:30,450
Since we know this count, since
we know this probability

244
00:15:30,450 --> 00:15:32,310
of a particular event
happening?

245
00:15:32,310 --> 00:15:34,590
How can we come up with
the probability of

246
00:15:34,590 --> 00:15:36,510
getting exactly 2 heads?

247
00:15:36,510 --> 00:15:41,960

248
00:15:41,960 --> 00:15:44,880
It could be head, head, or
tail-- so this is by

249
00:15:44,880 --> 00:15:47,790
enumerating all the
possible outcomes.

250
00:15:47,790 --> 00:15:51,700
So it could have been head,
head, tail, where we put the

251
00:15:51,700 --> 00:15:54,080
tail only at the end.

252
00:15:54,080 --> 00:15:56,545
It could have been
head, tail, head.

253
00:15:56,545 --> 00:16:01,100
Or it could have been
tail, head, head.

254
00:16:01,100 --> 00:16:06,670
In these three cases, you're
getting exactly 2 heads.

255
00:16:06,670 --> 00:16:10,150
So we are enumerating all
possible outcomes.

256
00:16:10,150 --> 00:16:12,560
And we know each possible
outcome will take the

257
00:16:12,560 --> 00:16:14,400
probability 1/8.

258
00:16:14,400 --> 00:16:18,360
So the total probability
here is 3/8.
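
The enumeration above can be reproduced mechanically. This is a minimal sketch assuming three fair coins, so that each of the 8 outcomes has probability 1/8.

```python
from fractions import Fraction
from itertools import product

# All 2 * 2 * 2 = 8 outcomes of three coin flips, e.g. "HHT".
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]
two_heads = [o for o in outcomes if o.count("H") == 2]

print(len(outcomes))  # 8
print(two_heads)      # ['HHT', 'HTH', 'THH']

# Each outcome has probability 1/8, so the favourable ones add up.
p_two_heads = Fraction(len(two_heads), len(outcomes))
print(p_two_heads)    # 3/8
```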

259
00:16:18,360 --> 00:16:18,950
OK?

260
00:16:18,950 --> 00:16:21,540
So this is one way of handling
a probability question.

261
00:16:21,540 --> 00:16:25,600

262
00:16:25,600 --> 00:16:28,670
You can do that only because
these are independent events.

263
00:16:28,670 --> 00:16:31,240
And you can sum them.

264
00:16:31,240 --> 00:16:32,490
We'll come to that later.

265
00:16:32,490 --> 00:16:43,070

266
00:16:43,070 --> 00:16:47,500
Suppose you are rolling
two four-sided dice.

267
00:16:47,500 --> 00:16:52,000
And assuming they're fair,
how many possible

268
00:16:52,000 --> 00:16:53,900
outcomes are there?

269
00:16:53,900 --> 00:16:59,585
Two four-sided dice, and
assuming that each of them are

270
00:16:59,585 --> 00:17:02,890
fair-- that means unbiased--

271
00:17:02,890 --> 00:17:05,740
how many possible outcomes
are there?

272
00:17:05,740 --> 00:17:08,839
Consider this tree.

273
00:17:08,839 --> 00:17:13,040
First, it branches into 4, OK?

274
00:17:13,040 --> 00:17:15,849
In the first trial, it's a
four-sided dice, so there are

275
00:17:15,849 --> 00:17:17,710
4 possible outcomes.

276
00:17:17,710 --> 00:17:18,960
So it branches into 4.

277
00:17:18,960 --> 00:17:22,770

278
00:17:22,770 --> 00:17:25,329
Then, each branch will,
in turn, fork

279
00:17:25,329 --> 00:17:27,280
into another 4 branches.

280
00:17:27,280 --> 00:17:31,100
So there are totally
16 outcomes.

281
00:17:31,100 --> 00:17:35,540
So what is the probability
of rolling a 2 and a 3?

282
00:17:35,540 --> 00:17:39,900
What is the probability of
rolling a 2 and a 3?

283
00:17:39,900 --> 00:17:44,950
Not in a given order, not
in the given order.

284
00:17:44,950 --> 00:17:46,770
Can anyone give the answer?

285
00:17:46,770 --> 00:17:49,870

286
00:17:49,870 --> 00:17:51,130
OK, let's see.

287
00:17:51,130 --> 00:17:54,510
So we have to roll
a 2 and a 3.

288
00:17:54,510 --> 00:17:57,250
So which means it could have
been 2, 3, or 3, 2.

289
00:17:57,250 --> 00:18:00,350

290
00:18:00,350 --> 00:18:05,730
And we know the probability
of each event is 1/16.

291
00:18:05,730 --> 00:18:07,750
So this will be 1/16.

292
00:18:07,750 --> 00:18:12,230
And this will be 1/16.

293
00:18:12,230 --> 00:18:14,810
So the total probability
is 1/8.
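
The count above, a 2 and a 3 in either order, can be checked by listing the 16 rolls. This sketch assumes two fair four-sided dice.

```python
from fractions import Fraction

sides = 4
rolls = [(a, b) for a in range(1, sides + 1) for b in range(1, sides + 1)]

# Order does not matter, so compare the sorted pair against [2, 3].
hits = [roll for roll in rolls if sorted(roll) == [2, 3]]

print(hits)  # [(2, 3), (3, 2)]
p_two_and_three = Fraction(len(hits), len(rolls))
print(p_two_and_three)  # 1/8
```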

294
00:18:14,810 --> 00:18:17,560

295
00:18:17,560 --> 00:18:23,820
What is the probability of
getting the sum of the rolls

296
00:18:23,820 --> 00:18:25,960
an odd number?

297
00:18:25,960 --> 00:18:28,090
What is the probability of
getting an odd number as sum

298
00:18:28,090 --> 00:18:30,110
of the rolls?

299
00:18:30,110 --> 00:18:33,790
Now, this is getting a bit
tricky because now it's maybe

300
00:18:33,790 --> 00:18:37,880
a bit harder to enumerate
all possible cases.

301
00:18:37,880 --> 00:18:39,130
So how can we do that?

302
00:18:39,130 --> 00:18:46,250

303
00:18:46,250 --> 00:18:47,190
There should be a short cut.

304
00:18:47,190 --> 00:18:48,681
AUDIENCE: It can either
be odd or even.

305
00:18:48,681 --> 00:18:49,540
PROFESSOR: Sorry?

306
00:18:49,540 --> 00:18:51,640
AUDIENCE: You can either
get odd or even.

307
00:18:51,640 --> 00:18:53,620
PROFESSOR: It can be either
odd or even, right?

308
00:18:53,620 --> 00:18:55,230
So it will be 1/2.
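
The odd-or-even shortcut above gives 1/2 here because a four-sided die has as many odd faces as even ones. A quick enumeration confirms it, and shows that the same shortcut would not hold for a die with an odd number of faces. The function name `p_odd_sum` is an assumption made for this sketch.

```python
from fractions import Fraction

def p_odd_sum(sides):
    """Probability that two fair `sides`-sided dice sum to an odd number."""
    faces = range(1, sides + 1)
    rolls = [(a, b) for a in faces for b in faces]
    odd = sum(1 for a, b in rolls if (a + b) % 2 == 1)
    return Fraction(odd, len(rolls))

print(p_odd_sum(4))  # 1/2
print(p_odd_sum(3))  # 4/9, so "odd or even" alone does not guarantee 1/2
```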

309
00:18:55,230 --> 00:18:59,000
OK, there's another trick we
might be able to use to get

310
00:18:59,000 --> 00:19:00,250
the answers quickly.

311
00:19:00,250 --> 00:19:02,640

312
00:19:02,640 --> 00:19:07,060
What is the probability of the
first roll being equal to the

313
00:19:07,060 --> 00:19:08,310
second roll?

314
00:19:08,310 --> 00:19:13,200

315
00:19:13,200 --> 00:19:16,240
In the same line,
you can think.

316
00:19:16,240 --> 00:19:19,840
What is the probability of
getting the first roll equal

317
00:19:19,840 --> 00:19:21,320
to the second roll?

318
00:19:21,320 --> 00:19:22,580
It's quite similar to this.

319
00:19:22,580 --> 00:19:25,870

320
00:19:25,870 --> 00:19:27,120
Any ideas?

321
00:19:27,120 --> 00:19:30,550

322
00:19:30,550 --> 00:19:31,840
It's a four-sided dice.

323
00:19:31,840 --> 00:19:34,380
There are 4 possible outcomes.

324
00:19:34,380 --> 00:19:37,510
This is one case where it could
be 1, 1, or it could be

325
00:19:37,510 --> 00:19:39,900
2, 2, or 3, 3, or 4, 4.

326
00:19:39,900 --> 00:19:45,320
And if it's an n-sided dice,
it would be n, right?

327
00:19:45,320 --> 00:19:50,690
So if it's n-sided dice, there
are n possible outcomes

328
00:19:50,690 --> 00:19:56,550
desired, and totally
n by n outcomes.

329
00:19:56,550 --> 00:19:58,960
So you get 1/n probability.
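
The 1/n result above can be checked for a few values of n. This is a sketch assuming fair dice; `p_doubles` is an illustrative name.

```python
from fractions import Fraction

def p_doubles(sides):
    """Probability that two fair `sides`-sided dice show the same number."""
    faces = range(1, sides + 1)
    rolls = [(a, b) for a in faces for b in faces]      # n * n outcomes
    matches = sum(1 for a, b in rolls if a == b)        # n desired outcomes
    return Fraction(matches, len(rolls))

print(p_doubles(4))  # 1/4
print(p_doubles(6))  # 1/6
```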

330
00:19:58,960 --> 00:20:03,240

331
00:20:03,240 --> 00:20:08,340
What is the probability of at
least 1 roll equal to 4?

332
00:20:08,340 --> 00:20:10,340
At least 1 roll equal to 4?

333
00:20:10,340 --> 00:20:14,490

334
00:20:14,490 --> 00:20:15,750
This is very interesting.

335
00:20:15,750 --> 00:20:17,040
These types of questions,
you'll get in

336
00:20:17,040 --> 00:20:19,770
that Psets, I know.

337
00:20:19,770 --> 00:20:21,890
Probably in the quiz, too.

338
00:20:21,890 --> 00:20:25,060
What is the probability
of getting at least 1

339
00:20:25,060 --> 00:20:26,310
roll equal to 4?

340
00:20:26,310 --> 00:20:28,690

341
00:20:28,690 --> 00:20:30,780
OK, so what are the
possible outcomes?

342
00:20:30,780 --> 00:20:35,330
First roll, could be a 4.

343
00:20:35,330 --> 00:20:39,300
And the second roll
could be anything.

344
00:20:39,300 --> 00:20:42,565

345
00:20:42,565 --> 00:20:44,480
Or it could be 4, and
the first roll

346
00:20:44,480 --> 00:20:46,650
could have been anything.

347
00:20:46,650 --> 00:20:49,740
Or both could have been 4, but
we would have considered that

348
00:20:49,740 --> 00:20:50,990
here, as well.

349
00:20:50,990 --> 00:20:56,640

350
00:20:56,640 --> 00:20:59,340
So what we had to do is we had
to calculate this probability

351
00:20:59,340 --> 00:21:02,030
and this probability, add them,
and deduct this, because

352
00:21:02,030 --> 00:21:04,760
this would have been
double counted.

353
00:21:04,760 --> 00:21:08,230
It's quite like, this
intersection.

354
00:21:08,230 --> 00:21:12,490
We want to remove that, and we
want to find the union OK?

355
00:21:12,490 --> 00:21:15,560
So what is this probability?

356
00:21:15,560 --> 00:21:18,455
Since we don't care about the
second roll, we have to care

357
00:21:18,455 --> 00:21:21,300
only about the first roll,
our first roll

358
00:21:21,300 --> 00:21:24,590
getting 4, which is 1/4.

359
00:21:24,590 --> 00:21:28,450
And this is 1/4 similarly.

360
00:21:28,450 --> 00:21:32,770
And this is 1/4 by
1/4, so 1/16.

361
00:21:32,770 --> 00:21:35,280
So it'll be 1/2 minus 1/16.
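
The "at least one 4" answer above can be reached three ways: add and deduct the overlap, take 1 minus the probability of no 4 at all, or count directly. This sketch assumes two fair four-sided dice; the helper `prob` is hypothetical.

```python
from fractions import Fraction

sides = 4
rolls = [(a, b) for a in range(1, sides + 1) for b in range(1, sides + 1)]

def prob(event):
    """Fraction of the 16 equally likely rolls for which event(roll) is true."""
    return Fraction(sum(1 for roll in rolls if event(roll)), len(rolls))

p_first = prob(lambda r: r[0] == 4)    # 1/4
p_second = prob(lambda r: r[1] == 4)   # 1/4
p_both = prob(lambda r: r == (4, 4))   # 1/16

by_union = p_first + p_second - p_both          # 1/2 minus 1/16
by_complement = 1 - prob(lambda r: 4 not in r)  # 1 minus P(no 4)
by_counting = prob(lambda r: 4 in r)

print(by_union, by_complement, by_counting)  # 7/16 7/16 7/16
```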

362
00:21:35,280 --> 00:21:37,890

363
00:21:37,890 --> 00:21:41,460
And when you give the answers,
if it's hard, you can just

364
00:21:41,460 --> 00:21:43,050
leave it like this.

365
00:21:43,050 --> 00:21:46,190
So this is what we call giving
the answers as formula instead

366
00:21:46,190 --> 00:21:47,990
of giving exact fractions.

367
00:21:47,990 --> 00:21:50,280
Because sometimes it might be
hard to find the fraction.

368
00:21:50,280 --> 00:21:53,500
Suppose it's something like 1
over, say, 2 to the power 5

369
00:21:53,500 --> 00:21:55,370
and a 3 to the 2, something
like this.

370
00:21:55,370 --> 00:21:56,800
Or we'll say 5.

371
00:21:56,800 --> 00:22:00,095
You're not supposed to give the
exact value in this amount

372
00:22:00,095 --> 00:22:01,100
or even the fractions.

373
00:22:01,100 --> 00:22:03,180
You can give such formulas.

374
00:22:03,180 --> 00:22:07,205
You can give something like
this, too, to give the inverse

375
00:22:07,205 --> 00:22:10,930
probability of that
not happening.

376
00:22:10,930 --> 00:22:11,400
Let's see.

377
00:22:11,400 --> 00:22:16,310
Let's move into a little bit
more complicated example.

378
00:22:16,310 --> 00:22:18,710
A pack of cards--

379
00:22:18,710 --> 00:22:21,750
what is the probability
of getting an ace?

380
00:22:21,750 --> 00:22:23,000
Anyone?

381
00:22:23,000 --> 00:22:25,442

382
00:22:25,442 --> 00:22:26,354
AUDIENCE: 1 out of 2?

383
00:22:26,354 --> 00:22:28,180
PROFESSOR: 1 out of 2?

384
00:22:28,180 --> 00:22:30,880
AUDIENCE: out of 52.

385
00:22:30,880 --> 00:22:32,920
PROFESSOR: Not a particular--

386
00:22:32,920 --> 00:22:37,055
an ace, yes, just ace.

387
00:22:37,055 --> 00:22:38,438
AUDIENCE: Is it 4 out of 52?

388
00:22:38,438 --> 00:22:42,030
PROFESSOR: 4/52, yes.

389
00:22:42,030 --> 00:22:44,400
Or if you consider one
suit, it would have

390
00:22:44,400 --> 00:22:46,100
been like 1/13, right?

391
00:22:46,100 --> 00:22:48,480
You could have considered
one suit, and out of--

392
00:22:48,480 --> 00:22:50,690
OK.

393
00:22:50,690 --> 00:22:52,612
It's the same analysis, right?

394
00:22:52,612 --> 00:22:54,220
OK.

395
00:22:54,220 --> 00:22:57,630
What is the probability of
getting a specific card, which

396
00:22:57,630 --> 00:22:59,795
means, say, the ace of hearts?

397
00:22:59,795 --> 00:23:04,690

398
00:23:04,690 --> 00:23:08,560
It's what she said,
yeah, 1/52.

399
00:23:08,560 --> 00:23:10,990
What is the probability
of not getting an ace?

400
00:23:10,990 --> 00:23:14,190

401
00:23:14,190 --> 00:23:15,170
AUDIENCE: [INAUDIBLE]?

402
00:23:15,170 --> 00:23:16,750
PROFESSOR: Sorry?

403
00:23:16,750 --> 00:23:18,220
AUDIENCE: 1 minus--

404
00:23:18,220 --> 00:23:19,470
PROFESSOR: 1/13.

405
00:23:19,470 --> 00:23:22,060

406
00:23:22,060 --> 00:23:25,950
OK, this is where we make you
solve the inverse probability.

407
00:23:25,950 --> 00:23:29,480
OK, so that will come into
play very often.

408
00:23:29,480 --> 00:23:33,980
OK, now let's get into two
decks of playing cards.

409
00:23:33,980 --> 00:23:39,160
OK, what is the sample size?

410
00:23:39,160 --> 00:23:42,930
What is the sample size
of drawing cards from

411
00:23:42,930 --> 00:23:44,630
two decks of cards?

412
00:23:44,630 --> 00:23:45,420
Two cards, actually.

413
00:23:45,420 --> 00:23:48,110
You're going to draw two cards
from two different decks.

414
00:23:48,110 --> 00:23:51,930

415
00:23:51,930 --> 00:23:53,530
Sorry?

416
00:23:53,530 --> 00:23:54,470
OK.

417
00:23:54,470 --> 00:23:59,850
What is the sample size of
drawing a card from one deck?

418
00:23:59,850 --> 00:24:03,530
There are 52 possible
outcomes.

419
00:24:03,530 --> 00:24:07,890
So for each outcome here, we
have 52 outcomes there, right?

420
00:24:07,890 --> 00:24:09,500
So it's 52 by 52.

421
00:24:09,500 --> 00:24:11,810
It's like the tree, but here,
we have 52 branches.

422
00:24:11,810 --> 00:24:15,060

423
00:24:15,060 --> 00:24:17,830
So eventually, you will
have 52 by 52.

424
00:24:17,830 --> 00:24:19,220
This is where you can't
enumerate all

425
00:24:19,220 --> 00:24:20,650
the possible cases.

426
00:24:20,650 --> 00:24:24,420
So you should have a way
to find the final

427
00:24:24,420 --> 00:24:26,317
probability, OK?

428
00:24:26,317 --> 00:24:29,810

429
00:24:29,810 --> 00:24:33,440
So in this case, what is the
probability of getting at

430
00:24:33,440 --> 00:24:34,775
least one ace?

431
00:24:34,775 --> 00:24:37,820

432
00:24:37,820 --> 00:24:42,280
What's the probability of
getting at least one ace?

433
00:24:42,280 --> 00:24:46,150
This is, again, similar
to this case.

434
00:24:46,150 --> 00:24:47,000
Remember this diagram.

435
00:24:47,000 --> 00:24:48,250
It's called Venn diagram.

436
00:24:48,250 --> 00:24:53,250

437
00:24:53,250 --> 00:24:54,720
Remember this.

438
00:24:54,720 --> 00:24:58,260
So what is the probability of
getting at least one ace,

439
00:24:58,260 --> 00:25:01,170
which means you could have got
the ace from the first deck,

440
00:25:01,170 --> 00:25:03,940
or the second deck, or both.

441
00:25:03,940 --> 00:25:05,950
But if you're getting from both,
you have to deduct it

442
00:25:05,950 --> 00:25:11,510
because otherwise, you would
have double counted it.

443
00:25:11,510 --> 00:25:16,220
So getting an ace from the
first deck is 1/13.

444
00:25:16,220 --> 00:25:18,130
Second deck, 1/13.

445
00:25:18,130 --> 00:25:22,240
Getting from both
is 1/52 by 52.

446
00:25:22,240 --> 00:25:27,460
Sorry, 1/13 by 1/13.

447
00:25:27,460 --> 00:25:39,900

448
00:25:39,900 --> 00:25:40,375
Sorry.

449
00:25:40,375 --> 00:25:41,784
AUDIENCE: Are you adding them?

450
00:25:41,784 --> 00:25:46,010
PROFESSOR: Yeah, that's what
I explained earlier.

451
00:25:46,010 --> 00:25:47,310
You're doing two trials.

452
00:25:47,310 --> 00:25:50,400

453
00:25:50,400 --> 00:25:52,290
You could have got the
ace from here.

454
00:25:52,290 --> 00:25:54,310
And this could have
been anything.

455
00:25:54,310 --> 00:25:56,270
You could have got the ace from
here, and this could have

456
00:25:56,270 --> 00:25:57,150
been anything.

457
00:25:57,150 --> 00:26:00,400
You could have got
an ace from both.

458
00:26:00,400 --> 00:26:03,150
So you should add these two
probabilities because we need

459
00:26:03,150 --> 00:26:07,590
a case where at least
one card is ace.

460
00:26:07,590 --> 00:26:10,775
But the problem is, this could
have happened here and here.

461
00:26:10,775 --> 00:26:12,025
And so you will deduct it.

462
00:26:12,025 --> 00:26:16,330

463
00:26:16,330 --> 00:26:20,690
What is the probability of
getting neither card--

464
00:26:20,690 --> 00:26:22,690
what is the probability of
neither card being an ace?

465
00:26:22,690 --> 00:26:26,394

466
00:26:26,394 --> 00:26:27,320
AUDIENCE: 1 minus that?

467
00:26:27,320 --> 00:26:31,890
PROFESSOR: 1 minus
this, exactly.
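
The two-deck answers above can be confirmed by brute force, since 52 by 52 pairs is small for a computer even if it is too many to write by hand. This sketch assumes one card is drawn uniformly from each of two standard 52-card decks; the rank and suit labels are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "clubs", "diamonds"]
deck = list(product(ranks, suits))   # 52 (rank, suit) cards
draws = list(product(deck, deck))    # 52 * 52 = 2704 pairs, one card per deck

def is_ace(card):
    return card[0] == "A"

at_least_one = sum(1 for c1, c2 in draws if is_ace(c1) or is_ace(c2))
p_at_least_one = Fraction(at_least_one, len(draws))

# Add the two single-deck probabilities and deduct the double-counted overlap.
p_formula = Fraction(1, 13) + Fraction(1, 13) - Fraction(1, 13) * Fraction(1, 13)

# Neither card is an ace: 1 minus the probability of at least one ace.
p_neither = 1 - p_at_least_one

print(p_at_least_one, p_formula, p_neither)  # 25/169 25/169 144/169
```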

468
00:26:31,890 --> 00:26:33,040
OK, you're getting comfortable
with the

469
00:26:33,040 --> 00:26:35,550
inverse probability now.

470
00:26:35,550 --> 00:26:42,320
What's the probability of two
cards from the same suit?

471
00:26:42,320 --> 00:26:44,060
What is the probability
of getting two cards

472
00:26:44,060 --> 00:26:45,310
from the same suit?

473
00:26:45,310 --> 00:26:50,290

474
00:26:50,290 --> 00:26:52,810
Now, it's getting interesting.

475
00:26:52,810 --> 00:26:55,390
Two cards from the same suit.

476
00:26:55,390 --> 00:26:58,600
So how can we think
about this?

477
00:26:58,600 --> 00:27:02,910
Of course, you can enumerate all
possible cases and count.

478
00:27:02,910 --> 00:27:04,160
We don't want to do that.

479
00:27:04,160 --> 00:27:08,970

480
00:27:08,970 --> 00:27:13,315
OK, you're going to use the grid
here to visualize this.

481
00:27:13,315 --> 00:27:18,100

482
00:27:18,100 --> 00:27:19,110
OK?

483
00:27:19,110 --> 00:27:21,240
It could have been a
spades, or hearts,

484
00:27:21,240 --> 00:27:22,890
or clubs, or a diamond.

485
00:27:22,890 --> 00:27:29,270

486
00:27:29,270 --> 00:27:32,270
So we want two cards of
the same suit, right?

487
00:27:32,270 --> 00:27:38,440

488
00:27:38,440 --> 00:27:42,280
So it's 4/16 possible
outcomes.

489
00:27:42,280 --> 00:27:45,480

490
00:27:45,480 --> 00:27:47,310
Do you see that?

491
00:27:47,310 --> 00:27:50,270
So see, we are using
all the tools

492
00:27:50,270 --> 00:27:51,650
available at our disposal--

493
00:27:51,650 --> 00:27:58,340
trees, grids, counting, Venn
diagrams, inverse probability.

494
00:27:58,340 --> 00:28:01,000
Yeah, you should be able to
do that to get the answers

495
00:28:01,000 --> 00:28:04,270
quickly because you could have
actually done-- you could have

496
00:28:04,270 --> 00:28:06,130
done something like this, too.

497
00:28:06,130 --> 00:28:08,180
But it will take more
time, right?

498
00:28:08,180 --> 00:28:14,240
So this will be a simpler way
of visualizing things.

499
00:28:14,240 --> 00:28:18,170
What is the probability of
getting neither card a diamond

500
00:28:18,170 --> 00:28:19,420
nor a club?

501
00:28:19,420 --> 00:28:25,300

502
00:28:25,300 --> 00:28:27,615
Neither card is diamond
nor club.

503
00:28:27,615 --> 00:28:28,865
That is tricky.

504
00:28:28,865 --> 00:28:31,000

505
00:28:31,000 --> 00:28:36,080
But since we have this grid, we
can easily visualize that.

506
00:28:36,080 --> 00:28:39,360

507
00:28:39,360 --> 00:28:43,590
So if neither card is diamond
nor club, then it could have

508
00:28:43,590 --> 00:28:45,130
been only these two
values, right?

509
00:28:45,130 --> 00:28:48,740

510
00:28:48,740 --> 00:28:52,530
Which is, again, 4/16.

511
00:28:52,530 --> 00:28:54,200
So there are 4 possible cases.
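A minimal sketch of the grid argument above. It assumes, as the 4/16 answers imply, that each card's suit is equally likely and that the two suits are independent (for example, the first card is put back before the second is drawn). The names `SUITS`, `grid`, and `probability` are illustrative, not from the lecture.

```python
from fractions import Fraction
from itertools import product

SUITS = ["spades", "hearts", "clubs", "diamonds"]

# Every cell of the 4x4 grid: (suit of first card, suit of second card).
grid = list(product(SUITS, repeat=2))


def probability(event):
    """Fraction of grid cells for which event(first, second) is true."""
    favorable = sum(1 for first, second in grid if event(first, second))
    return Fraction(favorable, len(grid))


# Both cards from the same suit: the 4 diagonal cells.
same_suit = probability(lambda a, b: a == b)

# Neither card is a diamond nor a club: both must be spades or hearts.
allowed = ("spades", "hearts")
no_diamond_or_club = probability(lambda a, b: a in allowed and b in allowed)

print(same_suit)           # 1/4, i.e. 4 of 16 cells
print(no_diamond_or_club)  # 1/4, i.e. 4 of 16 cells
```

Counting cells by a predicate is the same thing as shading squares on the grid by hand.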

512
00:28:54,200 --> 00:28:57,490

513
00:28:57,490 --> 00:28:58,740
OK?

514
00:28:58,740 --> 00:29:06,930

515
00:29:06,930 --> 00:29:09,860
So what is the summary?

516
00:29:09,860 --> 00:29:11,870
What is the take home
message here?

517
00:29:11,870 --> 00:29:18,680

518
00:29:18,680 --> 00:29:22,940
Probability is our belief, or

519
00:29:22,940 --> 00:29:26,635
the way of expressing
the belief, of a

520
00:29:26,635 --> 00:29:29,320
particular event happening.

521
00:29:29,320 --> 00:29:32,990
Now, there could be several
possible outcomes.

522
00:29:32,990 --> 00:29:35,390
Out of those possible outcomes,
you have a certain

523
00:29:35,390 --> 00:29:37,400
number of desired outcomes.

524
00:29:37,400 --> 00:29:39,900
How can you find that?

525
00:29:39,900 --> 00:29:41,600
You can either enumerate
all of them.

526
00:29:41,600 --> 00:29:44,610
You can put them in a tree, or
you can put them in a grid.

527
00:29:44,610 --> 00:29:48,430
Or you can use some sort of Venn
diagram and come up with

528
00:29:48,430 --> 00:29:50,470
some sort of analysis.

529
00:29:50,470 --> 00:29:57,310
Here, we start with our belief
that the coin is unbiased, or

530
00:29:57,310 --> 00:29:59,350
we have a fair chance
of drawing any card

531
00:29:59,350 --> 00:30:00,820
from the deck of cards.

532
00:30:00,820 --> 00:30:06,650
So we have all these unbiased
beliefs, or beliefs about the

533
00:30:06,650 --> 00:30:09,680
characteristics of each trial.

534
00:30:09,680 --> 00:30:11,140
So we start from that.

535
00:30:11,140 --> 00:30:13,690

536
00:30:13,690 --> 00:30:19,440
Then, we find the probability
of a particular event

537
00:30:19,440 --> 00:30:22,540
happening in a certain
number of trials.

538
00:30:22,540 --> 00:30:30,230
But what if you don't have the
knowledge about the coin?

539
00:30:30,230 --> 00:30:32,250
What if you don't know whether
it's fair or not?

540
00:30:32,250 --> 00:30:37,920
What if you don't know P of A is
equal to H is equal to 1/2?

541
00:30:37,920 --> 00:30:38,910
Suppose you don't know that.

542
00:30:38,910 --> 00:30:42,380
Suppose it's P. How
can you find it?

543
00:30:42,380 --> 00:30:47,250

544
00:30:47,250 --> 00:30:50,800
What you could do is you
could simulate this.

545
00:30:50,800 --> 00:30:55,040
You can throw the coin several
times and count the total

546
00:30:55,040 --> 00:30:58,910
number of heads you get, OK?

547
00:30:58,910 --> 00:31:04,140
So n of heads
over n of trials will

548
00:31:04,140 --> 00:31:05,550
give you the P, right?

549
00:31:05,550 --> 00:31:10,860

550
00:31:10,860 --> 00:31:14,950
This is a way of finding the
probabilities through a

551
00:31:14,950 --> 00:31:16,290
certain number of trials.

552
00:31:16,290 --> 00:31:20,150
It's like simulating
the experiments.

553
00:31:20,150 --> 00:31:21,515
It's called Monte Carlo
simulation.
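A minimal sketch of estimating an unknown P from repeated trials, as described above. The coin here is simulated, so `TRUE_P` is a made-up value that the estimator never looks at; `estimate_p_heads` and `biased_flip` are illustrative names.

```python
import random


def estimate_p_heads(flip, n_trials):
    """Estimate P(heads) as (number of heads) / (number of trials)."""
    n_heads = sum(1 for _ in range(n_trials) if flip())
    return n_heads / n_trials


# Hypothetical coin whose bias we pretend not to know.
rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_P = 0.3


def biased_flip():
    return rng.random() < TRUE_P   # True means heads


estimate = estimate_p_heads(biased_flip, 100_000)
print(estimate)   # close to TRUE_P, but not exactly equal
```

More trials give an estimate that is typically closer to the true value.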

554
00:31:21,515 --> 00:31:24,400

555
00:31:24,400 --> 00:31:27,950
And using that, we try
to find a particular

556
00:31:27,950 --> 00:31:29,986
parameter of the model.

557
00:31:29,986 --> 00:31:33,750
You know how they actually found
the value of pi at the

558
00:31:33,750 --> 00:31:35,240
beginning, pi?

559
00:31:35,240 --> 00:31:38,210

560
00:31:38,210 --> 00:31:40,290
It's again using a Monte
Carlo simulation.

561
00:31:40,290 --> 00:31:49,980
What you could do is for a given
radius, you can actually

562
00:31:49,980 --> 00:31:51,920
check whether it lies within
a circle or not.

563
00:31:51,920 --> 00:31:53,680
You can run the Monte
Carlo simulation.

564
00:31:53,680 --> 00:31:59,070
And given this radius, you can
come up with a particular

565
00:31:59,070 --> 00:32:04,520
location at random and check
whether it's within this

566
00:32:04,520 --> 00:32:07,700
boundary or not, OK?

567
00:32:07,700 --> 00:32:10,080
So then, you know the outcome.

568
00:32:10,080 --> 00:32:11,170
You know the outcomes, right?

569
00:32:11,170 --> 00:32:22,280
So suppose this is n_a, and
the total outcome is n_t.

570
00:32:22,280 --> 00:32:24,250
This gives you the
area, right?

571
00:32:24,250 --> 00:32:29,212
We know this is r-squared,
and this is pi r-squared.

572
00:32:29,212 --> 00:32:30,462
Sorry.

573
00:32:30,462 --> 00:32:36,920

574
00:32:36,920 --> 00:32:39,621
When this is 4 r-squared,
this is 2r, right?

575
00:32:39,621 --> 00:32:44,700

576
00:32:44,700 --> 00:32:47,135
So using this, you can
easily calculate pi.
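A minimal sketch of the pi estimate described above, assuming a circle of radius r inscribed in a square of side 2r. The fraction n_a / n_t of random points landing inside the circle approximates the area ratio pi r^2 / (4 r^2), so pi is roughly 4 * n_a / n_t. The function name and seed are illustrative.

```python
import random


def estimate_pi(n_t, r=1.0, seed=0):
    """Estimate pi from n_t random points in a square of side 2r."""
    rng = random.Random(seed)
    n_a = 0   # points that fall inside the circle
    for _ in range(n_t):
        x = rng.uniform(-r, r)
        y = rng.uniform(-r, r)
        if x * x + y * y <= r * r:
            n_a += 1
    # n_a / n_t approximates (pi * r**2) / (4 * r**2) = pi / 4
    return 4 * n_a / n_t


pi_estimate = estimate_pi(200_000)
print(pi_estimate)   # near 3.14; the error shrinks slowly as n_t grows
```

The radius cancels out of the ratio, so any positive r gives the same kind of estimate.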

577
00:32:47,135 --> 00:32:55,020

578
00:32:55,020 --> 00:32:59,270
So now, since we are going to
come up with these parameters

579
00:32:59,270 --> 00:33:05,800
through repeating the trials, we
need to have a standardized

580
00:33:05,800 --> 00:33:08,740
way of finding these
parameters.

581
00:33:08,740 --> 00:33:11,470
We can't simply say
this, right?

582
00:33:11,470 --> 00:33:13,615
Take this example.

583
00:33:13,615 --> 00:33:17,640
You know this MIT
shuttle, right?

584
00:33:17,640 --> 00:33:21,380
A shuttle arriving at the
right time, or the time

585
00:33:21,380 --> 00:33:24,870
difference between the arrival
and the actual quoted time can

586
00:33:24,870 --> 00:33:27,380
be plotted in a graph.

587
00:33:27,380 --> 00:33:31,220
So if you plot that, it is
spread around 0, right?

588
00:33:31,220 --> 00:33:35,010
Probably, or we hope so.

589
00:33:35,010 --> 00:33:36,370
OK?

590
00:33:36,370 --> 00:33:47,840
Now, from this, we can see that
actually the mean of this

591
00:33:47,840 --> 00:33:52,950
simulation will give you the
expected difference in the

592
00:33:52,950 --> 00:33:58,330
time, the expected difference
in the arrival time from the

593
00:33:58,330 --> 00:33:59,580
actual quoted time.

594
00:33:59,580 --> 00:34:01,890

595
00:34:01,890 --> 00:34:06,830
And we hope this expectation
to be 0.

596
00:34:06,830 --> 00:34:09,150
We call that mean.

597
00:34:09,150 --> 00:34:10,400
Mean is taking the average.

598
00:34:10,400 --> 00:34:20,550

599
00:34:20,550 --> 00:34:26,389
But this distribution might
actually give you some

600
00:34:26,389 --> 00:34:29,650
information, some extra
information, as well.

601
00:34:29,650 --> 00:34:34,150
That is, how well we can
actually believe this, how

602
00:34:34,150 --> 00:34:35,700
much we can rely on this.

603
00:34:35,700 --> 00:34:41,340
If the spread is greater,
something like this, then

604
00:34:41,340 --> 00:34:44,449
probably you might actually not
trust the system, right?

605
00:34:44,449 --> 00:34:47,340
Although the mean is 0,
it's going to come

606
00:34:47,340 --> 00:34:48,449
early or late, right?

607
00:34:48,449 --> 00:34:49,699
Which means it's useless.

608
00:34:49,699 --> 00:34:52,750

609
00:34:52,750 --> 00:35:00,090
Similarly, in this case, we have
a spread around mean 0.

610
00:35:00,090 --> 00:35:10,280
But if you take the score, the
marks you get for 6.00, it

611
00:35:10,280 --> 00:35:11,290
could be something like this.

612
00:35:11,290 --> 00:35:13,730
It's not centered
around 0, right?

613
00:35:13,730 --> 00:35:14,790
Hopefully.

614
00:35:14,790 --> 00:35:18,570
It's probably, say, 50.

615
00:35:18,570 --> 00:35:22,850
Then, we actually want the
spread to be small or large?

616
00:35:22,850 --> 00:35:25,420

617
00:35:25,420 --> 00:35:31,290
We want the spread to be large
because we want to distinguish

618
00:35:31,290 --> 00:35:32,580
the levels, right?

619
00:35:32,580 --> 00:35:34,360
The students' level
of understanding.

620
00:35:34,360 --> 00:35:40,270
6.00.

621
00:35:40,270 --> 00:35:44,930
Anyway, so the spread determines
what is the

622
00:35:44,930 --> 00:35:50,650
variation present in the
distribution of the scores.

623
00:35:50,650 --> 00:35:53,305
We measure that by a variable
called standard deviation.

624
00:35:53,305 --> 00:35:59,340

625
00:35:59,340 --> 00:36:05,630
In this case, this particular
sample will be different from

626
00:36:05,630 --> 00:36:10,770
its mean by a particular
value, right?

627
00:36:10,770 --> 00:36:17,980
We can express that as
x_i minus its mean.

628
00:36:17,980 --> 00:36:19,318
Let's call the mean mu.

629
00:36:19,318 --> 00:36:22,070

630
00:36:22,070 --> 00:36:24,440
So this would be
the difference.

631
00:36:24,440 --> 00:36:29,400
Standard deviation is summing
up all the differences.

632
00:36:29,400 --> 00:36:32,210
But the problem is, when you sum
up the differences, it'll

633
00:36:32,210 --> 00:36:34,210
be 0, right?

634
00:36:34,210 --> 00:36:36,890
The total summation of the
differences will be 0 if

635
00:36:36,890 --> 00:36:42,380
that's how you get the mean
because if you expand this,

636
00:36:42,380 --> 00:36:43,760
it'll be something
like this, right?

637
00:36:43,760 --> 00:36:49,000

638
00:36:49,000 --> 00:36:50,250
Which will be n mu.

639
00:36:50,250 --> 00:37:03,030

640
00:37:03,030 --> 00:37:05,160
Should be equal to 0.

641
00:37:05,160 --> 00:37:08,690
So we have to sum, or
actually take the

642
00:37:08,690 --> 00:37:10,490
differences into account.

643
00:37:10,490 --> 00:37:12,540
So, let's square this.

644
00:37:12,540 --> 00:37:17,350
So now, it will no
longer be 0.

645
00:37:17,350 --> 00:37:21,142
Now, this gives you
the differences.

646
00:37:21,142 --> 00:37:24,330
It's the sum of the squared
differences averaged across

647
00:37:24,330 --> 00:37:25,580
all the samples.

648
00:37:25,580 --> 00:37:27,820

649
00:37:27,820 --> 00:37:29,315
We call this variance.

650
00:37:29,315 --> 00:37:32,280

651
00:37:32,280 --> 00:37:33,370
And the square root of

652
00:37:33,370 --> 00:37:36,555
variance is standard deviation.
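A minimal sketch of the definitions above: the raw differences from the mean sum to zero, so they are squared before averaging (variance), and the square root of that is the standard deviation. It divides by n, matching "averaged across all the samples". The delay values are made up for illustration.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Average of the squared differences from the mean (divides by n)."""
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def std_dev(xs):
    return math.sqrt(variance(xs))


# Hypothetical shuttle delays in minutes (negative means early).
delays = [-2, -1, 0, 1, 2]
mu = mean(delays)

# The unsquared differences cancel out, which is why we square them.
raw_sum = sum(x - mu for x in delays)

print(mu)                # 0.0
print(raw_sum)           # 0.0
print(variance(delays))  # 2.0
print(std_dev(delays))   # square root of 2, about 1.414
```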

653
00:37:36,555 --> 00:37:45,530

654
00:37:45,530 --> 00:37:47,320
OK?

655
00:37:47,320 --> 00:37:51,930
Now, having a standard
deviation--

656
00:37:51,930 --> 00:37:54,650

657
00:37:54,650 --> 00:37:57,050
so we know the standard
deviation tells you how spread

658
00:37:57,050 --> 00:37:59,910
the distribution is.

659
00:37:59,910 --> 00:38:04,280
But can we actually rely only
on the standard deviation to

660
00:38:04,280 --> 00:38:09,230
determine the consistency
of some event?

661
00:38:09,230 --> 00:38:11,390
Can we?

662
00:38:11,390 --> 00:38:12,070
Probably not.

663
00:38:12,070 --> 00:38:19,050
Suppose we take two examples,
one is the scores, 50.

664
00:38:19,050 --> 00:38:20,920
And suppose the standard
deviation is minus

665
00:38:20,920 --> 00:38:24,270
10, plus 10, OK?

666
00:38:24,270 --> 00:38:26,290
So the standard deviation
is 10 here.

667
00:38:26,290 --> 00:38:29,720
Suppose it lies in this form.

668
00:38:29,720 --> 00:38:34,420
Consider another example, the
weight, the weight of the

669
00:38:34,420 --> 00:38:38,360
people, like say at MIT.

670
00:38:38,360 --> 00:38:44,850
And suppose it's centered
around 150.

671
00:38:44,850 --> 00:38:50,120
Now, if the standard deviation
is, say, 10, then the standard

672
00:38:50,120 --> 00:38:53,780
deviation 10 here and the
standard deviation 10 here

673
00:38:53,780 --> 00:38:59,640
don't convey the same
message, OK?

674
00:38:59,640 --> 00:39:07,110
So we need to have a different
way of expressing the

675
00:39:07,110 --> 00:39:10,650
consistency of a distribution.

676
00:39:10,650 --> 00:39:29,110
So we represent it by
coefficient of variation,

677
00:39:29,110 --> 00:39:37,690
which is equal to the standard
deviation divided by mean.

678
00:39:37,690 --> 00:39:42,810

679
00:39:42,810 --> 00:39:47,240
Now here, it will be 10/150.

680
00:39:47,240 --> 00:39:50,810
Here, it will be 10/50.

681
00:39:50,810 --> 00:39:55,100
So we know this is more
consistent than this.
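A minimal sketch of the comparison above, using the lecture's numbers (standard deviation 10 in both cases, means 150 and 50). The function name is illustrative, and the guard for a zero mean is an added assumption: the ratio is undefined there.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation divided by the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for mean 0")
    return std_dev / mean


cv_weights = coefficient_of_variation(10, 150)   # weights centered on 150
cv_scores = coefficient_of_variation(10, 50)     # scores centered on 50

print(cv_weights)   # about 0.067
print(cv_scores)    # 0.2

# A smaller coefficient means less spread relative to the mean.
print(cv_weights < cv_scores)   # True
```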

682
00:39:55,100 --> 00:40:01,110
The weights of the students at
MIT, it's more consistent than

683
00:40:01,110 --> 00:40:05,215
the marks you might get,
or you get, for 6.00.

684
00:40:05,215 --> 00:40:06,465
It might be true.

685
00:40:06,465 --> 00:40:11,120

686
00:40:11,120 --> 00:40:16,530
Now, what is the use of
the standard deviation?

687
00:40:16,530 --> 00:40:17,780
How can we use that?

688
00:40:17,780 --> 00:40:20,610

689
00:40:20,610 --> 00:40:28,220
Let's look at this graph where
suppose the mean is 0 and the

690
00:40:28,220 --> 00:40:31,150
standard deviation is, say, 5.

691
00:40:31,150 --> 00:40:34,370

692
00:40:34,370 --> 00:40:37,460
Consider another example where
standard deviation is 10.

693
00:40:37,460 --> 00:40:43,890

694
00:40:43,890 --> 00:40:46,510
It might have been
like this, OK?

695
00:40:46,510 --> 00:40:58,680
Now, before that, let me sort
of digress a little bit so I

696
00:40:58,680 --> 00:41:00,030
can explain this better.

697
00:41:00,030 --> 00:41:03,120

698
00:41:03,120 --> 00:41:08,106
We can take the outcome of a
particular event as a sample

699
00:41:08,106 --> 00:41:10,440
in our distribution.

700
00:41:10,440 --> 00:41:12,920
So suppose you're
throwing a die.

701
00:41:12,920 --> 00:41:15,560
So you get an outcome.

702
00:41:15,560 --> 00:41:21,650
You can represent that outcome
as a distribution, OK?

703
00:41:21,650 --> 00:41:29,810
So here, there's x, which
can take 1 to, say, 6.

704
00:41:29,810 --> 00:41:33,450
And we can represent x_i as
a sample point in our

705
00:41:33,450 --> 00:41:36,110
distribution.

706
00:41:36,110 --> 00:41:43,060
So I don't know, it might be
uniform, probably, we hope.

707
00:41:43,060 --> 00:41:46,750
So it's with 1/6 probability,
we always take

708
00:41:46,750 --> 00:41:47,402
one of these values.

709
00:41:47,402 --> 00:41:48,652
OK.

710
00:41:48,652 --> 00:41:50,120

711
00:41:50,120 --> 00:41:53,020
But this might not be the
case with all events.

712
00:41:53,020 --> 00:41:57,250

713
00:41:57,250 --> 00:42:02,090
OK, so what I'm trying to say
here is you can actually

714
00:42:02,090 --> 00:42:07,040
represent the outcome of the
trial in the distribution.

715
00:42:07,040 --> 00:42:10,410
Or you can also represent the
probability of something

716
00:42:10,410 --> 00:42:12,115
happening in a distribution.

717
00:42:12,115 --> 00:42:15,650

718
00:42:15,650 --> 00:42:16,700
How does it work?

719
00:42:16,700 --> 00:42:19,840
OK, in this case, we
throw our dice.

720
00:42:19,840 --> 00:42:20,930
We get an outcome.

721
00:42:20,930 --> 00:42:23,170
We go and put it
in the x-axis.

722
00:42:23,170 --> 00:42:26,140
It could be between 1 and 6.

723
00:42:26,140 --> 00:42:27,575
And it takes this
distribution.

724
00:42:27,575 --> 00:42:30,220

725
00:42:30,220 --> 00:42:34,550
In addition, what you could
do is you could

726
00:42:34,550 --> 00:42:36,780
have, say, 100 trials.

727
00:42:36,780 --> 00:42:38,810
So you throw a coin.

728
00:42:38,810 --> 00:42:40,270
You take 100 trials.

729
00:42:40,270 --> 00:42:44,570
You get the mean, you get the
probability of getting a head.

730
00:42:44,570 --> 00:42:46,850
And you have that mean, right?

731
00:42:46,850 --> 00:42:49,230
So probability of getting
a head for 100

732
00:42:49,230 --> 00:42:53,690
trials, say, 0.51.

733
00:42:53,690 --> 00:42:58,610
You do another 100 trials,
you get another one.

734
00:42:58,610 --> 00:43:00,900
So you have now another
distribution.

735
00:43:00,900 --> 00:43:03,600
So there's a distribution
of probabilities.

736
00:43:03,600 --> 00:43:05,910
So you can have a distribution
of probabilities, or you can

737
00:43:05,910 --> 00:43:08,600
have a distribution
for the events.
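A minimal sketch of the two kinds of distribution just described. In the first, every sample is one outcome of an event (a die roll). In the second, every sample is itself an estimated probability (the fraction of heads in 100 flips). The sample sizes and names are arbitrary choices for illustration.

```python
import random
from collections import Counter

rng = random.Random(0)   # fixed seed so the run is repeatable

# Distribution of events: each sample is one die outcome, 1 to 6.
die_rolls = [rng.randint(1, 6) for _ in range(6_000)]
event_counts = Counter(die_rolls)


# Distribution of probabilities: each sample is one estimate of
# P(heads) obtained from a batch of 100 fair-coin flips.
def estimate_p_heads(num_flips):
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips


estimates = [estimate_p_heads(100) for _ in range(1_000)]
estimate_counts = Counter(round(p, 2) for p in estimates)

for face in sorted(event_counts):
    print(face, event_counts[face])        # roughly equal counts
for p, count in sorted(estimate_counts.items()):
    print(p, count)                        # piles up around 0.5
```

The first table should look roughly flat, while the second is bunched around 0.5.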

738
00:43:08,600 --> 00:43:12,280
We handle these two cases
in the p-set.

739
00:43:12,280 --> 00:43:16,150
So probably you should be able
to distinguish those two.

740
00:43:16,150 --> 00:43:21,410
Anyway, so here in this
particular example, let's take

741
00:43:21,410 --> 00:43:23,710
this as our mu.

742
00:43:23,710 --> 00:43:25,700
Let's take this as our
standard deviation.

743
00:43:25,700 --> 00:43:29,090
And for the first distribution,
let's take the

744
00:43:29,090 --> 00:43:30,870
standard deviation to be 5.

745
00:43:30,870 --> 00:43:32,920
When the standard deviation
is greater, it's

746
00:43:32,920 --> 00:43:36,130
going to be more spread.

747
00:43:36,130 --> 00:43:39,320
It's going to be more
distributed than the former.

748
00:43:39,320 --> 00:43:41,960
So here, say the standard
deviation is 10.

749
00:43:41,960 --> 00:43:45,330

750
00:43:45,330 --> 00:43:49,200
The standard deviation is a
way of expressing how many

751
00:43:49,200 --> 00:43:53,790
items, how many samples are
going to lie between those

752
00:43:53,790 --> 00:43:56,610
particular boundaries.

753
00:43:56,610 --> 00:44:02,080
So for a normal distribution, we
know the exact area, exact

754
00:44:02,080 --> 00:44:03,950
probability of things
happening.

755
00:44:03,950 --> 00:44:07,670

756
00:44:07,670 --> 00:44:12,320
If this is mu, we know within
the first standard

757
00:44:12,320 --> 00:44:26,710
deviation, about 68%
of events lie in that area.

758
00:44:26,710 --> 00:44:28,135
Within two standard
deviations--

759
00:44:28,135 --> 00:44:34,150

760
00:44:34,150 --> 00:44:39,465
OK, one standard
deviation, 68%.

761
00:44:39,465 --> 00:44:42,580

762
00:44:42,580 --> 00:44:45,520
Two standard deviations
on either side,

763
00:44:45,520 --> 00:44:47,910
it's going to be 95%.

764
00:44:47,910 --> 00:44:53,110
Three standard deviations,
it's going to be about 99.7%.

765
00:44:53,110 --> 00:44:59,760
So suppose you conducted
so many trials.

766
00:44:59,760 --> 00:45:02,260
And you get the values.

767
00:45:02,260 --> 00:45:08,500
And in the distribution, suppose
mu, mean, is 10, and

768
00:45:08,500 --> 00:45:09,750
the standard deviation
is, say, 1.

769
00:45:09,750 --> 00:45:12,510

770
00:45:12,510 --> 00:45:19,430
So now, with 99% confidence, we
can say that the outcome of

771
00:45:19,430 --> 00:45:22,710
the next trial is going
to be between what?

772
00:45:22,710 --> 00:45:26,200

773
00:45:26,200 --> 00:45:31,250
7 and 13, right?
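A minimal sketch that checks these fractions empirically for the example above (mean 10, standard deviation 1). It assumes the samples really are normally distributed and draws them with `random.gauss`; the sample size and the helper name `fraction_within` are illustrative.

```python
import random


def fraction_within(samples, mu, sigma, k):
    """Fraction of samples within k standard deviations of the mean."""
    inside = sum(1 for x in samples if abs(x - mu) <= k * sigma)
    return inside / len(samples)


rng = random.Random(0)   # fixed seed so the run is repeatable
mu, sigma = 10.0, 1.0
samples = [rng.gauss(mu, sigma) for _ in range(100_000)]

for k in (1, 2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    # Expect roughly 0.68, 0.95, and 0.997 for k = 1, 2, 3.
    print(k, (low, high), fraction_within(samples, mu, sigma, k))
```

For k = 3 the interval is (7.0, 13.0), the range quoted above.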

774
00:45:31,250 --> 00:45:34,480
So this is where finding the
distribution and standard

775
00:45:34,480 --> 00:45:40,540
deviation helps us give
a confidence interval,

776
00:45:40,540 --> 00:45:43,340
expressing our belief of that
particular event happening.

777
00:45:43,340 --> 00:45:47,050

778
00:45:47,050 --> 00:45:51,290
We will look at a few examples
because you might need this in

779
00:45:51,290 --> 00:45:52,540
your p-set.

780
00:45:52,540 --> 00:46:18,600

781
00:46:18,600 --> 00:46:20,156
So this particular
function you have

782
00:46:20,156 --> 00:46:22,930
already seen in the lecture.

783
00:46:22,930 --> 00:46:27,310

784
00:46:27,310 --> 00:46:31,870
But we need to understand
this particular part.

785
00:46:31,870 --> 00:46:35,160

786
00:46:35,160 --> 00:46:39,150
Suppose you have a probability
of something happening.

787
00:46:39,150 --> 00:46:40,620
Suppose you estimated
the probability

788
00:46:40,620 --> 00:46:41,300
of something happening.

789
00:46:41,300 --> 00:46:46,880
Suppose you're given the
coin is biased, OK?

790
00:46:46,880 --> 00:46:47,940
Sorry, unbiased.

791
00:46:47,940 --> 00:46:51,840
So we know p of H
is equal to 1/2.

792
00:46:51,840 --> 00:46:55,210
How can we simulate
an outcome?

793
00:46:55,210 --> 00:46:57,740
How can you simulate an outcome
and see whether it's a

794
00:46:57,740 --> 00:47:01,030
head or a tail with this
particular probability?

795
00:47:01,030 --> 00:47:08,090
We do that by calling this
function, random.random(),

796
00:47:08,090 --> 00:47:12,160
which is going to give you a
random value between 0 and 1.

797
00:47:12,160 --> 00:47:14,160
And you're going to
check whether it's

798
00:47:14,160 --> 00:47:16,300
below this or not.

799
00:47:16,300 --> 00:47:19,710
If it's below this, we
can take it as head.

800
00:47:19,710 --> 00:47:21,620
If it's not, it's tail.

801
00:47:21,620 --> 00:47:25,740
And this will happen with
probability 1/2, because the

802
00:47:25,740 --> 00:47:29,780
random function is going to
return a value between 0 and 1

803
00:47:29,780 --> 00:47:31,180
with equal probabilities.

804
00:47:31,180 --> 00:47:34,270
It's uniform probabilities.

805
00:47:34,270 --> 00:47:37,970
So to simulate a head or tail,
you call that function.

806
00:47:37,970 --> 00:47:41,930
You write the expression
like that, OK?
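A minimal sketch of the expression described above. `random.random()` returns a uniform float in the range [0, 1), so comparing it with the probability of heads yields heads with exactly that probability. The function name `flip` and the "H"/"T" labels are illustrative.

```python
import random


def flip(p_heads=0.5):
    """Return "H" with probability p_heads, otherwise "T"."""
    return "H" if random.random() < p_heads else "T"


outcomes = [flip() for _ in range(10)]
print(outcomes)   # a different mix of "H" and "T" on each run
```

The same comparison also covers a biased coin: pass a different `p_heads`.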

807
00:47:41,930 --> 00:47:48,180

808
00:47:48,180 --> 00:47:53,110
Then, if you consider this
example, for a certain number

809
00:47:53,110 --> 00:47:56,890
of flips, we simulate
the event.

810
00:47:56,890 --> 00:47:58,970
And we count the number
of heads we obtain.

811
00:47:58,970 --> 00:48:03,950

812
00:48:03,950 --> 00:48:06,240
And also from that, you
can calculate the

813
00:48:06,240 --> 00:48:07,590
number of tails as well.

814
00:48:07,590 --> 00:48:11,580
If you know the total flips, you
know the number of tails.

815
00:48:11,580 --> 00:48:14,765
Using that, we are taking
two ratios.

816
00:48:14,765 --> 00:48:16,890
Now, the ratio between the
heads and tails, and the

817
00:48:16,890 --> 00:48:19,690
difference between
heads and tails.

818
00:48:19,690 --> 00:48:24,160
We are doing this for a certain
number of trials.

819
00:48:24,160 --> 00:48:27,530
And we're going to take the mean
and standard deviation of

820
00:48:27,530 --> 00:48:32,220
these trials, OK?
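A minimal sketch of the procedure described above, not the code shown in the lecture: for each trial, flip a fair coin a fixed number of times, record the heads/tails ratio and the absolute heads-tails difference, then summarize the trials by mean and standard deviation. All names, the seed, and the trial counts are assumptions.

```python
import math
import random


def run_trial(num_flips, rng):
    """Flip a fair coin num_flips times; return (heads, tails)."""
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads, num_flips - heads


def mean_and_std(xs):
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, math.sqrt(var)


def summarize(num_flips, num_trials, seed=0):
    """Mean and std of heads/tails ratios and |heads - tails| over trials."""
    rng = random.Random(seed)
    ratios, diffs = [], []
    for _ in range(num_trials):
        heads, tails = run_trial(num_flips, rng)
        if tails == 0:
            continue   # ratio undefined; practically never happens here
        ratios.append(heads / tails)
        diffs.append(abs(heads - tails))
    return mean_and_std(ratios), mean_and_std(diffs)


for num_flips in (100, 10_000):
    (ratio_mean, ratio_std), (diff_mean, diff_std) = summarize(num_flips, 20)
    print(num_flips, ratio_mean, ratio_std, diff_mean, diff_std)
```

Each sample in these distributions is the summary of one whole trial, which is why several trials are needed before a mean and standard deviation make sense.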

821
00:48:32,220 --> 00:48:38,170
So here in our distribution,
what are we considering?

822
00:48:38,170 --> 00:48:42,220

823
00:48:42,220 --> 00:48:46,000
What is going to build our
distribution here?

824
00:48:46,000 --> 00:48:49,010

825
00:48:49,010 --> 00:48:50,310
The ratios, right?

826
00:48:50,310 --> 00:48:53,560
The ratios of the events.

827
00:48:53,560 --> 00:48:58,130
And we simulated a certain number
of trials to get those

828
00:48:58,130 --> 00:49:01,570
events, OK?

829
00:49:01,570 --> 00:49:04,470
Only if you simulate a certain
number of trials, you can

830
00:49:04,470 --> 00:49:08,240
actually summarize the outcome
of the events in mean and

831
00:49:08,240 --> 00:49:10,530
standard deviation.

832
00:49:10,530 --> 00:49:14,480
This is exactly like the
difference in the times of the

833
00:49:14,480 --> 00:49:20,350
bus arriving and the
quoted times.

834
00:49:20,350 --> 00:49:23,700
Let's check this example.

835
00:49:23,700 --> 00:49:25,010
Let's plot this and see.

836
00:49:25,010 --> 00:49:41,830

837
00:49:41,830 --> 00:49:43,080
It's going to take a while.

838
00:49:43,080 --> 00:49:48,390

839
00:49:48,390 --> 00:49:51,590
OK, that's another thing I want
to explain here because

840
00:49:51,590 --> 00:49:53,555
since you're going to
plot--

841
00:49:53,555 --> 00:49:58,050
we are going to use PyLab
extensively and plot graphs.

842
00:49:58,050 --> 00:50:02,090
You'll need to put a title and
labels to all the plots you're

843
00:50:02,090 --> 00:50:03,040
generating.

844
00:50:03,040 --> 00:50:06,160
Plus, you can use this
text to actually put

845
00:50:06,160 --> 00:50:07,190
the text in the graph.

846
00:50:07,190 --> 00:50:09,720
We will show that in a while.

847
00:50:09,720 --> 00:50:10,970
Plus--

848
00:50:10,970 --> 00:50:13,250

849
00:50:13,250 --> 00:50:14,230
here, sorry.

850
00:50:14,230 --> 00:50:19,310
If you want to change the axis
to log-log scale, you can call

851
00:50:19,310 --> 00:50:24,310
this command at the end after
calling the plot.

852
00:50:24,310 --> 00:50:27,250
Because you might sometimes need
to change the axis to log

853
00:50:27,250 --> 00:50:28,913
scale in x and y-axis.

854
00:50:28,913 --> 00:50:33,930

855
00:50:33,930 --> 00:50:40,260
So this is the mean,
heads versus tails.

856
00:50:40,260 --> 00:50:45,760
And if you can see it, the mean
tends to be 1 when we

857
00:50:45,760 --> 00:50:48,870
have a large number of flips.

858
00:50:48,870 --> 00:50:52,860
So to get the consistency,
we need to simulate

859
00:50:52,860 --> 00:50:55,950
large number of trials.

860
00:50:55,950 --> 00:51:00,610
Only then will it tend to be
close to the mean, OK?

861
00:51:00,610 --> 00:51:04,000

862
00:51:04,000 --> 00:51:07,740
This is sort of a way of
checking the evolution of the

863
00:51:07,740 --> 00:51:13,540
series by actually doing it for
a certain number of flips

864
00:51:13,540 --> 00:51:14,926
at every time.

865
00:51:14,926 --> 00:51:19,280
So it's quite like
a scatter plot.

866
00:51:19,280 --> 00:51:27,250
A scatter plot is like plotting
the outcomes of our

867
00:51:27,250 --> 00:51:28,500
experiments.

868
00:51:28,500 --> 00:51:30,530

869
00:51:30,530 --> 00:51:36,230
Suppose it's x1 and
x2 in a graph.

870
00:51:36,230 --> 00:51:37,420
So we are going to say--

871
00:51:37,420 --> 00:51:42,800
so for example, suppose you
have a variable, and the

872
00:51:42,800 --> 00:51:46,090
variable causes an outcome--

873
00:51:46,090 --> 00:51:51,360
a probability of the coin flip,
so p of H. And it can

874
00:51:51,360 --> 00:51:57,290
result in a certain number of
heads appearing, say n of H.

875
00:51:57,290 --> 00:52:02,150
Now, you can do a scatter plot
between these two variables.

876
00:52:02,150 --> 00:52:04,050
And it will be probably
a spread.

877
00:52:04,050 --> 00:52:07,990
But we know that if you increase
the probability of

878
00:52:07,990 --> 00:52:11,510
heads, the number of heads is
going to increase as well.

879
00:52:11,510 --> 00:52:13,340
So it would be probably
something like this.

880
00:52:13,340 --> 00:52:18,320

881
00:52:18,320 --> 00:52:20,742
From this, we can assume
that it's linear or

882
00:52:20,742 --> 00:52:21,140
something like that.

883
00:52:21,140 --> 00:52:24,660
But the scatter plot is actually
representing the

884
00:52:24,660 --> 00:52:28,620
outcomes of the trial versus
some other variable in the

885
00:52:28,620 --> 00:52:30,877
graph and visualizing it.

886
00:52:30,877 --> 00:52:33,560

887
00:52:33,560 --> 00:52:36,230
And let me show the last
graph, and we'll

888
00:52:36,230 --> 00:52:37,480
be done with that.

889
00:52:37,480 --> 00:52:52,040

890
00:52:52,040 --> 00:52:55,710
So this, again, we actually
know, instead of putting a

891
00:52:55,710 --> 00:53:00,080
scatter plot, we're actually
giving the distribution as a

892
00:53:00,080 --> 00:53:06,340
histogram and printing a
text box in the graph.

893
00:53:06,340 --> 00:53:09,970
This might be useful if you want
to display something on

894
00:53:09,970 --> 00:53:12,840
your graph.

895
00:53:12,840 --> 00:53:15,990
I guess we will be uploading
the code to the site.

896
00:53:15,990 --> 00:53:19,080
So you can check the code
if you want later, OK?

897
00:53:19,080 --> 00:53:20,510
Sure.

898
00:53:20,510 --> 00:53:21,760
See you next week.

899
00:53:21,760 --> 00:53:27,615