The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: I wanted to give everybody a more conceptual idea of what big O notation is, as well as, hopefully, answer any lingering questions you might have about object-oriented programming. So I have these notes, and I typed them up, and they're pretty detailed, so I'm just going to go through some points kind of quickly. Who's still kind of unclear about why we even do big O notation?

So who can explain quickly why we do big O notation? What is it? Yeah?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Right. Exactly. So big O notation gives us an upper bound on how long something is going to take. Now, something that's important to remember is that it's not a time bound.
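One way to make the steps-not-seconds point concrete is to count comparisons explicitly. This is an editor's sketch, not code from the lecture: the step count a function performs is the same on any machine, even though the wall-clock time differs.

```python
def linear_search(items, target):
    """Return (index, steps) if target is found, (-1, steps) otherwise.

    The number of comparison steps is at most len(items) on any
    machine -- that machine-independent count is what big O describes.
    """
    steps = 0
    for i, item in enumerate(items):
        steps += 1              # one comparison per element examined
        if item == target:
            return i, steps
    return -1, steps

# Worst case (target absent): exactly len(items) steps, i.e. O(n).
idx, steps = linear_search([3, 1, 4, 1, 5], 9)
```

A faster computer finishes those five steps sooner, but it still performs five steps.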
So something that's often confusing is that people say, oh, this is something that'll tell us how long our program's going to run. That's actually not the case. Big O notation tells us how many steps something's going to take.

And so why is that important? Well, I look at all you guys, and a couple of you have laptops out. Everybody's computer runs things at a different speed, right? But if we say something is big O of n, what we're saying is that the worst-case number of steps your program takes is going to be linear with respect to the size of the input. So if my computer is five times faster than your computer, my computer will probably run it five times faster. But on either machine, as the size of the input grows, I expect the amount of time it takes to grow linearly.

So why is that important? At the bottom of page one: with big O notation, we are particularly concerned with the scalability of our functions.
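The scalability concern can be made concrete with a back-of-the-envelope step count (the numbers are illustrative and ignore constant factors; this is not from the handout): at input size n, an O(n) algorithm takes about n steps while an O(n^3) algorithm takes about n^3, so the gap between them widens by a factor of about n^2 as n grows.

```python
def step_ratio(n):
    """How many times more steps an O(n^3) algorithm takes than an
    O(n) algorithm at input size n (toy model, constants ignored)."""
    linear_steps = n
    cubic_steps = n ** 3
    return cubic_steps // linear_steps

# The ratio grows like n ** 2 -- the cubic algorithm scales badly.
for n in (10, 100, 1000):
    print(n, step_ratio(n))
```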
So big O notation might not predict what's going to be the fastest for really small inputs, say an array of size 10. You guys know a little bit about graphs, right? Suppose we have a graph of x squared and a graph of x cubed. There's a region where the graph of x squared is actually bigger than x cubed. But then all of a sudden, there's a point where x cubed just goes, whoosh, way bigger than x squared. So for some really small inputs, big O notation might not tell us which is the best function. But with big O notation, we're not concerned about small inputs. We're concerned about really big inputs. We're concerned about filtering the genome. We're concerned about analyzing data from Hubble, really huge blocks of data.

So suppose we're looking at a program that analyzes a segment of the human genome, say three million base pairs, and we have two algorithms. One runs in order n time, and one runs in order n cubed time.
What this means is that regardless of the machine we're running on (this is algorithm 1, this is algorithm 2), we'd expect algorithm 2 to run approximately n cubed over n, so approximately n squared, times slower. So with big O notation, you can compare two algorithms just by looking at the ratio of their big O running times. So if I'm looking at something that has an array of size two million as its input, is it clear that the order n algorithm is going to be a much better choice?

OK. So you'll run into that. Especially since a lot of you are taking this for the purposes of scientific computing, you'll run into big O notation a lot. It's important to have a grasp of what it means.

On the second page of the handout, I have some common ones that you'll see. The first one is constant time. We denote constant time as order 1. But you'll notice that what I have here is that order 1 is equal to order 10 is equal to order 2 to the 100th. That's unexpected to a lot of people who are learning about big O notation.
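In code terms, saying order 1 equals order 10 just means the step count is some fixed constant that never depends on the input size. A hypothetical pair of functions (the names are illustrative, not from the handout):

```python
def first_element(items):
    # one step, O(1)
    return items[0]

def first_element_checked(items):
    # several steps of bookkeeping, but still O(1): the amount of
    # work never depends on len(items)
    if not items:
        raise ValueError("empty sequence")
    x = items[0]
    return x
```

Whether the constant is 1 step or 10 steps, both are order 1: feeding either function a list of a million elements takes no more steps than a list of two.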
Why is this true? That seems kind of ridiculous. This is a really big number; this is a really small number. Yeah?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. Exactly. So we look at a graph of 1 and a graph of 2 to the 100th. We'll see that even though 2 to the 100th is much higher, much bigger than 1, as the size of our input grows, do we see any change in these two graphs? No. They're completely constant.

When you're doing big O notation, if you run across an algorithm that does not depend on the size of the input, it's always going to be order 1. Even if it takes 2 to the 100th steps, if it's a constant number of steps regardless of the size of the input, it's constant time.

Other ones you'll see are logarithmic time. Any base for logarithmic time is the same order, so order log base 2 of n is order log base 10 of n. This is the fastest time bound for search.
Does anybody know what type of search we'd be doing in logarithmic time? Something maybe we--

AUDIENCE: Bisection search.

PROFESSOR: Yeah. Exactly. Bisection search is logarithmic time, because we take our input, and at every step we cut it in half, cut it in half, cut it in half. That's the fastest search we can do.

Then order n is linear time. Order n log n is the fastest time bound we have for sort; we'll be talking about sort in a couple of weeks. And order n squared is quadratic time. Anything that is order n raised to some fixed power, so order n squared, order n cubed, order n to the fourth, all of that is going to be less than order something to the power of n. So if we have something that's order 2 to the n, that's ridiculous. That's a computationally very intensive algorithm.

So on page two, I have some questions for you: (1), (2), (3). Does order 100 n squared equal order n squared? Who says yes? All right. Very good.
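The bisection search mentioned a moment ago can be sketched as follows. The exact code from lecture isn't in the transcript, so this is one standard way to write it: each pass halves the remaining range, so a sorted list of n elements needs only about log base 2 of n iterations in the worst case.

```python
def bisection_search(sorted_items, target):
    """Return True if target is in sorted_items (which must be sorted).

    Each iteration halves the search range, so the worst case is
    about log2(len(sorted_items)) iterations: O(log n).
    """
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2        # cut the remaining range in half
        if sorted_items[mid] == target:
            return True
        elif sorted_items[mid] < target:
            lo = mid + 1            # discard the lower half
        else:
            hi = mid - 1            # discard the upper half
    return False
```

For a list of 1,000,000 elements, that's roughly 20 iterations instead of up to 1,000,000 for a linear scan.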
How about: does order one-quarter n cubed equal order n cubed? Does order n plus order n equal order n? The answer is yes to all of those. And the intuitive sense behind this is that big O notation deals with the limiting behavior of a function. So I made some nifty graphs for you guys to look at.

When we're comparing order 100 n squared to n squared, n cubed, and 1/4 n cubed, what people often think of is what I have here in the first figure. These are the four functions I just mentioned; there's a legend in the top left-hand corner, and the scale goes up to x equals 80. So you'll see at this scale, this line right here is 100 x squared. This is, I think, often a tripping point: when people are conceptualizing these functions, they say, well, yeah, 100 x squared is much bigger than x cubed, which is a lot bigger than 1/4 x cubed. For very small inputs, yes, that's true. But what we're concerned about is the behavior as the input gets very, very large.

So now we're looking at a size of up to 1,000.
So now we see here that x cubed, even though it's a little bit smaller than 100 x squared in the beginning, shoots off. x cubed is much bigger than either of the two x squared terms. And even 1/4 x cubed is becoming bigger than 100 x squared out at 1,000. So that's an intuitive sense of why x cubed, no matter what coefficient is in front of it, is going to dominate any term with x squared in it: x cubed is just going to go, whoosh, real big like that.

And if we go out even further, to an input size of 50,000, we see that even 100 x squared versus just x squared, all right? They're about the same. The x cubed terms are now way above x squared. So the two x squared terms, coefficient 100 versus coefficient 1, are about the same.

So this is the scale we're concerned about when we're talking about big O notation: the limiting behavior as your input size grows very large. And 50,000 is not even that large, if you think about the size of the genome. I mean, does anybody here do bio? What's the size of the human genome? How many base pairs?
Or even one gene or one chromosome?

AUDIENCE: [INAUDIBLE]

PROFESSOR: What's the biggest?

AUDIENCE: It's over 50,000.

PROFESSOR: Yeah, over 50,000. And think about the amount of data that we get back from the Hubble Space Telescope. I mean, the resolution on those things is absolutely ridiculous. And you run all sorts of algorithms on those images to try and see if there's life in the universe. So we're very concerned about the big, long-term behavior of these functions.

How about page three? One last question. Does order 100 n squared plus 1/4 n cubed equal order n cubed? Who says yes? And so I have one more graph.

Down here, these red dots are 100 x squared. These blue circles are 1/4 x cubed. And this line is the sum. We can see that this line is a little bit bigger than the 1/4 x cubed term. But really, the 100 x squared term has no effect this far out.
So that's why we're just going to drop any lower order terms. Whenever you're approached with a big O expression that has a bunch of constant factors and all sorts of different powers of n, you're always just going to drop all the constant factors and pick the biggest term. So this line right here is order n cubed. Is that clear to everybody?

So now I've gotten through the basics of how we analyze this and why we're looking at it. Let's look at some code.

So in the first example, for all of these statements right here, in Python we make the assumption that statements like x plus 1 and x times y, all these basic mathematical operations, are constant time. That's something you can just assume. So for this function down here, we have a constant time, constant time, constant time, constant time operation. So we'd say this function bar is what? What's its complexity?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. Constant time. So the complexity of all these functions is just order 1, because it doesn't matter how big the input is.
It's all going to run in constant time.

For this multiplication function here, we use a for loop. Oftentimes, when we see a for loop that's going through the input, that's a signal to us that it's probably going to contribute a factor of O(n). Why is that? What do we do in this for loop? We say for i in range(y). What does that mean? How many times do we execute that for loop? Yeah, y times. So if y is really small, we execute that for loop just a small number of times. But if y is really large, we execute that for loop a whole bunch of times. So when we're analyzing this, we see this for loop and we say, ah, that for loop must be O(y). Does that make sense to everybody? OK, good.

Let's look at factorial. Can anybody tell me what the complexity of factorial is?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. Order n. Why is it order n?

AUDIENCE: Because it has the same for loop.

PROFESSOR: Yeah. It's the exact same structure.
We have a for loop that's going through range(1, n + 1). So that's dependent on the size of n, so this for loop is order n. And inside the for loop, we just do a constant time operation. That's the other thing: just because we have this for loop doesn't mean that what's inside the for loop is going to be constant time. But in this case, order n times, we do a constant time operation, so this whole chunk, the for loop, is order n. Everything else is just constant time. So we have constant time, plus order n times constant time, plus constant time, and that's going to be order n.

How about this one? Factorial 2.

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. Exactly. This is also order n. The only thing that's different in this code is that we initialize this count variable, and inside the for loop we also increment this count variable. But both result *= num and count += 1 are constant time operations.
So if we do n times two constant time operations, that's still going to be order n.

So the takeaway I'm trying to demonstrate with these two examples is that a single line of code can generate a pretty complex thing, but a collection of lines of code might still be constant time. So you have to look at every line of code and consider it.

Now I've thrown in some conditionals here. What's the complexity of this guy?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. This is also linear. What's going on here? We initialize a variable count; that's constant time. We go through each character in a_string; this is linear in the size of a_string. Now we say if character == 't'. That check is also a constant time operation; it's just asking whether this one thing equals this other thing. When we're comparing two characters or two numbers, == or != is generally a constant time operation.
The exception to this might be equality on certain types. If you define a class and you define an equality method in your class, and that equality method is not constant time, then this == check might not be constant time. But on two characters, == is constant time, and the increment is constant time as well. So the whole thing is linear in the size of a_string.

Something that's important when you're doing this on exams: it's a good idea to define what n is before you give the complexity bound. So here I'm saying n is equal to the size of a_string, and now I can say this function is order n. What I'm saying is that it's linear with respect to the size, the length, of a_string. Because sometimes, like in the one where the inputs were x and y, the running time was only linear in the size of y, so you'd want to define n as the size of y before saying it was order n. So always be clear. If it's not clear, be sure to explicitly state what n is equal to.

This code's a little more tricky. What's going on here?

AUDIENCE: [INAUDIBLE]
PROFESSOR: Yeah. That was perfect. So just to reiterate: the for loop, we know, is linear with respect to the size of a_string; we have to go through every character in a_string. Now, the second part is if char in b_string. When we're looking at big O notation, we're worried about the worst case complexity, an upper bound. What's the worst case?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. If the character is not in b_string, we have to look at every single character in b_string before we can return False. So that is linear. This one single line, if char in b_string, is linear with respect to the size of b_string.

So how do we analyze the complexity of this? I want to be able to touch the screen. We have this for loop. Let's say n is the length of a_string; this for loop is executed n times. Every time we execute this for loop, we execute this inner body. And what's the time bound on the inner body?
Well, if we let m equal the length of b_string, then this check is order m every time we run it. So we run an order m operation order n times. So the complexity is-- we do something of size m, n times.

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah. Just order n times m. If we execute an order m check order n times, we say this function is order n times m. Does that make sense to everybody? You'll see this with nested for loops; nested for loops are very similar to this.

While loops combine the best of conditionals with the best of for loops, because a while loop has a chance to act like a for loop, but a while loop can also have a conditional. It's actually possible to write a while loop with a complex conditional that also executes many times, and so you could have one single line of code generating something like order n squared complexity.

Let's look at factorial 3. Who can tell me the complexity of factorial 3?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yeah.
It's also linear. It's interesting that factorial is always linear despite its name. We have constant time operations. How many times does the while loop execute?

AUDIENCE: n times.

PROFESSOR: Yeah, n times. And what's inside the body of the while loop? Constant time operations. So we execute a bunch of constant time operations n times: order n.

How about this char_split example? This one's a little tricky, because you're like, well, what's the complexity of len? In Python, len is actually a constant time operation. This example is carefully crafted so that all of the operations in it are constant time: appending to a list is constant time, and indexing a string is constant time. So what's the complexity of char_split? Constant time?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Who would agree with constant time? And who would say it's linear time? OK, yeah. Very good. It is linear time.
468 00:22:06,090 --> 00:22:08,150 That's a correct intuition. 469 00:22:08,150 --> 00:22:10,990 We say while the length of the a string is not equal to the 470 00:22:10,990 --> 00:22:13,550 length of the result, these are two constant time 471 00:22:13,550 --> 00:22:14,560 operations. 472 00:22:14,560 --> 00:22:15,360 Well, what do we do? 473 00:22:15,360 --> 00:22:18,850 We append a value to the result, and then we add up 474 00:22:18,850 --> 00:22:19,920 this index. 475 00:22:19,920 --> 00:22:23,190 So when is this check going to be equal? 476 00:22:23,190 --> 00:22:25,490 This check's going to be equal when the length of the result 477 00:22:25,490 --> 00:22:27,060 is equal to the length of a string. 478 00:22:27,060 --> 00:22:28,980 And that's only going to happen after we've gone 479 00:22:28,980 --> 00:22:31,730 through the entire a string, and we've added each of its 480 00:22:31,730 --> 00:22:33,650 characters to result. 481 00:22:33,650 --> 00:22:38,520 So this is linear with respect to the size of a string. 482 00:22:38,520 --> 00:22:42,200 Something that's important to recognize is that not all 483 00:22:42,200 --> 00:22:45,390 string and list operations are constant time. 484 00:22:45,390 --> 00:22:50,110 There's a website here that, first off, says CPython if 485 00:22:50,110 --> 00:22:50,780 you go to it. 486 00:22:50,780 --> 00:22:53,710 CPython just means Python implemented in C, which is 487 00:22:53,710 --> 00:22:56,860 actually what you're running: CPython. 488 00:22:56,860 --> 00:22:59,200 So don't worry about that. 489 00:22:59,200 --> 00:23:01,920 There are often two time bound complexities. 490 00:23:01,920 --> 00:23:05,380 It says the amortized time and the worst case time. 491 00:23:05,380 --> 00:23:07,990 And so if you're looking for big O notation, you don't want 492 00:23:07,990 --> 00:23:09,010 to use the amortized time. 493 00:23:09,010 --> 00:23:12,150 You want to use the worst case time.
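Both linear examples just walked through can be sketched in Python. This is my reconstruction from the discussion, not the handout's exact code:

```python
def iterative_factorial(n):
    # The while loop runs n times, and each pass does only
    # constant-time work, so this is order n -- linear,
    # despite the name "factorial".
    result = 1
    while n > 0:
        result *= n
        n -= 1
    return result

def char_split(a_string):
    # len(), string indexing, and list append are all constant
    # time in Python.  The loop adds one character per pass, so
    # the lengths only become equal after len(a_string) passes:
    # linear in the size of a_string.
    result = []
    index = 0
    while len(a_string) != len(result):
        result.append(a_string[index])
        index += 1
    return result
```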
494 00:23:12,150 --> 00:23:14,870 And it's important to note that operations like slicing 495 00:23:14,870 --> 00:23:18,250 and copying actually aren't constant time. 496 00:23:18,250 --> 00:23:22,770 If you slice a list or a string, the complexity of that 497 00:23:22,770 --> 00:23:25,870 operation is going to depend on how big your slice is. 498 00:23:25,870 --> 00:23:27,670 Does that make sense? 499 00:23:27,670 --> 00:23:29,690 The way that a slice works is that it walks through 500 00:23:29,690 --> 00:23:33,750 the list until it gets to the index, and then keeps walking 501 00:23:33,750 --> 00:23:36,330 until the final index, and then copies that and 502 00:23:36,330 --> 00:23:37,870 returns it to you. 503 00:23:37,870 --> 00:23:40,990 So slicing is not constant time. 504 00:23:40,990 --> 00:23:43,150 Copying is similarly not constant time. 505 00:23:43,150 --> 00:23:47,070 506 00:23:47,070 --> 00:23:50,630 For this little snippet of code, this is just 507 00:23:50,630 --> 00:23:51,600 similar to what we-- 508 00:23:51,600 --> 00:23:52,850 yeah? 509 00:23:52,850 --> 00:23:54,100 AUDIENCE: [INAUDIBLE]. 510 00:23:54,100 --> 00:24:06,215 511 00:24:06,215 --> 00:24:09,140 PROFESSOR: So this is what I was saying. 512 00:24:09,140 --> 00:24:12,420 You want to define what n is. 513 00:24:12,420 --> 00:24:15,800 So we say something like n equals the length of a string. 514 00:24:15,800 --> 00:24:19,400 And then you can say it's order n. 515 00:24:19,400 --> 00:24:23,180 It's important to define what you're saying the complexity 516 00:24:23,180 --> 00:24:24,430 is related to. 517 00:24:24,430 --> 00:24:27,070 518 00:24:27,070 --> 00:24:31,280 So here, I'm saying if we let n equal the size of z, can 519 00:24:31,280 --> 00:24:33,280 anybody tell me what the complexity of this 520 00:24:33,280 --> 00:24:36,870 snippet of code is? 521 00:24:36,870 --> 00:24:37,285 [UNINTELLIGIBLE]. 522 00:24:37,285 --> 00:24:37,960 AUDIENCE: [INAUDIBLE].
523 00:24:37,960 --> 00:24:38,840 PROFESSOR: Yeah, precisely. 524 00:24:38,840 --> 00:24:39,750 Order n-squared. 525 00:24:39,750 --> 00:24:40,740 Why? 526 00:24:40,740 --> 00:24:44,360 Well, because we execute this for i for loop 527 00:24:44,360 --> 00:24:47,870 here order n times. 528 00:24:47,870 --> 00:24:50,860 Each time through this for loop, the body of this for 529 00:24:50,860 --> 00:24:53,880 loop is, in fact, another for loop. 530 00:24:53,880 --> 00:24:58,290 So my approach to problems like this is just step back a 531 00:24:58,290 --> 00:25:01,150 minute and ignore the outer loop. 532 00:25:01,150 --> 00:25:02,380 Just concentrate on the inner loop. 533 00:25:02,380 --> 00:25:04,010 What's the runtime of this inner loop? 534 00:25:04,010 --> 00:25:06,510 535 00:25:06,510 --> 00:25:06,740 Yeah. 536 00:25:06,740 --> 00:25:07,620 This is order n. 537 00:25:07,620 --> 00:25:08,830 We go through this. 538 00:25:08,830 --> 00:25:10,760 Now, go to the outer loop. 539 00:25:10,760 --> 00:25:12,630 Just ignore the body since we've already 540 00:25:12,630 --> 00:25:13,550 analyzed the body. 541 00:25:13,550 --> 00:25:14,500 Ignore it. 542 00:25:14,500 --> 00:25:17,640 What's the complexity of the outer loop? 543 00:25:17,640 --> 00:25:19,270 Also order n. 544 00:25:19,270 --> 00:25:21,200 So now you can combine the analysis. 545 00:25:21,200 --> 00:25:26,190 You can say for order n times, I execute this body. 546 00:25:26,190 --> 00:25:28,950 This body takes order n time. 547 00:25:28,950 --> 00:25:34,160 So if we execute something that's order n, order n times, that is 548 00:25:34,160 --> 00:25:36,040 order n-squared complexity. 549 00:25:36,040 --> 00:25:39,040 So we just multiply how long the outer loop takes by how 550 00:25:39,040 --> 00:25:40,550 long the inner body of the loop takes.
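That multiply-the-loops analysis can be sketched with a step counter. The function and variable names are my own (the lecture's snippet over z isn't shown in full here):

```python
def nested_steps(z):
    # Outer loop: order n.  Inner loop: order n on each pass.
    # Multiplying the two gives n * n = order n-squared work.
    n = len(z)
    steps = 0
    for i in range(n):        # outer loop runs n times
        for j in range(n):    # inner loop runs n times per pass
            steps += 1        # constant-time body
    return steps
```

For an input of size 3 this counts 9 steps, for size 10 it counts 100, matching the n-squared growth described above.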
551 00:25:40,550 --> 00:25:44,370 And so in this fashion, I could now give you probably a 552 00:25:44,370 --> 00:25:46,240 four- or five-level nested for loop, and you could tell me the 553 00:25:46,240 --> 00:25:47,490 complexity of it. 554 00:25:47,490 --> 00:25:52,900 555 00:25:52,900 --> 00:25:57,550 Harder sometimes to understand is recursion. 556 00:25:57,550 --> 00:26:00,180 I don't know how important it is to understand this because 557 00:26:00,180 --> 00:26:01,820 I've never actually taught this class before. 558 00:26:01,820 --> 00:26:03,440 But Mitch did tell me to go over this. 559 00:26:03,440 --> 00:26:06,500 So I'd like to touch on it. 560 00:26:06,500 --> 00:26:09,080 So consider recursive factorial. 561 00:26:09,080 --> 00:26:10,630 What's the time complexity of this? 562 00:26:10,630 --> 00:26:13,740 How can we figure out the time complexity 563 00:26:13,740 --> 00:26:14,990 of a recursive function? 564 00:26:14,990 --> 00:26:23,020 565 00:26:23,020 --> 00:26:24,950 The way we want to figure out the time complexity of a 566 00:26:24,950 --> 00:26:27,530 recursive function is just to figure out how many times 567 00:26:27,530 --> 00:26:30,570 we're executing said recursive function. 568 00:26:30,570 --> 00:26:35,290 So here I have recursive factorial of n. 569 00:26:35,290 --> 00:26:39,750 When I make a call to this, what do I do? 570 00:26:39,750 --> 00:26:45,850 I make a call to recursive factorial of n minus 1. 571 00:26:45,850 --> 00:26:47,330 And then what does this do? 572 00:26:47,330 --> 00:26:51,430 This calls recursive factorial on a sub problem of 573 00:26:51,430 --> 00:26:54,340 size n minus 2. 574 00:26:54,340 --> 00:27:00,180 So oftentimes, when you're dealing with recursive 575 00:27:00,180 --> 00:27:02,620 problems to figure out the complexity, what you need to 576 00:27:02,620 --> 00:27:06,190 do is you need to figure out how many times you're going to 577 00:27:06,190 --> 00:27:10,040 make a recursive call before a result is returned.
578 00:27:10,040 --> 00:27:12,590 Intuitively, we can start to see a pattern. 579 00:27:12,590 --> 00:27:16,630 We can say, I called on n, and then n minus 1, and then n 580 00:27:16,630 --> 00:27:22,850 minus 2, and I keep calling recursive factorial until n is 581 00:27:22,850 --> 00:27:24,740 less than or equal to 0. 582 00:27:24,740 --> 00:27:27,180 When is n going to be less than or equal to 0? 583 00:27:27,180 --> 00:27:28,740 Well, when I get to n minus n. 584 00:27:28,740 --> 00:27:31,360 585 00:27:31,360 --> 00:27:34,626 So how many calls is that? 586 00:27:34,626 --> 00:27:35,594 AUDIENCE: [INAUDIBLE]. 587 00:27:35,594 --> 00:27:36,195 PROFESSOR: Yeah. 588 00:27:36,195 --> 00:27:38,030 This is n calls. 589 00:27:38,030 --> 00:27:43,530 So it's a good practice to get into being able to draw this 590 00:27:43,530 --> 00:27:46,430 out and work yourself through how many times you're running 591 00:27:46,430 --> 00:27:47,680 the recursion. 592 00:27:47,680 --> 00:27:51,000 And when we see we're making n calls, we can say, oh, this 593 00:27:51,000 --> 00:27:52,250 must be linear in time. 594 00:27:52,250 --> 00:27:56,720 595 00:27:56,720 --> 00:27:58,815 How about this one, this foo function? 596 00:27:58,815 --> 00:28:06,410 597 00:28:06,410 --> 00:28:09,720 This one's a little harder to see. 598 00:28:09,720 --> 00:28:12,330 But what are we doing? 599 00:28:12,330 --> 00:28:20,480 We call foo on input of size n, which then makes a call to a 600 00:28:20,480 --> 00:28:24,370 sub problem of size n/2, which makes a call to a sub 601 00:28:24,370 --> 00:28:34,200 problem of size n/4 and so on until I make a call to a sub 602 00:28:34,200 --> 00:28:35,760 problem of some size. 603 00:28:35,760 --> 00:28:38,260 So this is n. 604 00:28:38,260 --> 00:28:40,840 This is 2 to the 1st. 605 00:28:40,840 --> 00:28:43,150 This is 2-squared. 606 00:28:43,150 --> 00:28:44,350 We start to see a pattern-- 607 00:28:44,350 --> 00:28:47,610 2-squared, 2-cubed, 2 to the fourth.
608 00:28:47,610 --> 00:28:50,570 So we're going to keep making calls on a smaller, and 609 00:28:50,570 --> 00:28:52,820 smaller, and smaller sub problem size. 610 00:28:52,820 --> 00:28:56,770 But instead of decreasing linearly like before, we're decreasing 611 00:28:56,770 --> 00:28:58,100 at an exponential rate. 612 00:28:58,100 --> 00:29:01,630 613 00:29:01,630 --> 00:29:03,470 There's a bunch of different ways to try and work this out 614 00:29:03,470 --> 00:29:04,140 in your head. 615 00:29:04,140 --> 00:29:06,160 I wrote up one possible description. 616 00:29:06,160 --> 00:29:10,330 But when we're decreasing at this exponential rate, what's 617 00:29:10,330 --> 00:29:15,360 going to end up happening is that this recursive problem, 618 00:29:15,360 --> 00:29:21,900 where we make a recursive call to a sub problem of 619 00:29:21,900 --> 00:29:28,310 size n/b, the complexity of that is always going to be log 620 00:29:28,310 --> 00:29:30,450 base b of n. 621 00:29:30,450 --> 00:29:33,840 So this is just like bisection search, where in 622 00:29:33,840 --> 00:29:36,620 bisection search, we essentially 623 00:29:36,620 --> 00:29:39,450 restrict the problem size by half every time. 624 00:29:39,450 --> 00:29:41,950 And that leads to logarithmic time, actually 625 00:29:41,950 --> 00:29:43,610 log base 2 of n. 626 00:29:43,610 --> 00:29:46,540 This problem is also log base 2 of n. 627 00:29:46,540 --> 00:29:54,520 If we change this recursive call from n/2 to n/6, we get a 628 00:29:54,520 --> 00:29:58,590 time complexity of log base 6 of n. 629 00:29:58,590 --> 00:30:00,120 So try and work that through. 630 00:30:00,120 --> 00:30:02,610 You can read this closer later. 631 00:30:02,610 --> 00:30:06,280 Definitely ask me if you need more help on that one. 632 00:30:06,280 --> 00:30:09,460 The last one is how do we deal with the time complexity of something 633 00:30:09,460 --> 00:30:10,710 like Fibonacci?
634 00:30:10,710 --> 00:30:13,250 635 00:30:13,250 --> 00:30:19,260 Fibonacci, fib n minus 1 plus fib n minus 2, initially, that 636 00:30:19,260 --> 00:30:20,360 kind of looks linear. 637 00:30:20,360 --> 00:30:20,700 Right? 638 00:30:20,700 --> 00:30:24,960 We just went over the recursive factorial, and it 639 00:30:24,960 --> 00:30:28,520 made the call to a sub problem of size n minus 1. 640 00:30:28,520 --> 00:30:31,280 And that was linear. 641 00:30:31,280 --> 00:30:33,170 Fibonacci's a little bit different. 642 00:30:33,170 --> 00:30:36,870 If you actually draw it out in a tree, you start to see that at 643 00:30:36,870 --> 00:30:44,090 every level of the tree, we expand the calls by 2. 644 00:30:44,090 --> 00:30:47,690 Now imagine this is just for Fibonacci of 6. 645 00:30:47,690 --> 00:30:49,610 Whenever you're doing big O complexity, you want to 646 00:30:49,610 --> 00:30:53,516 imagine it and put in 100,000, 50,000. 647 00:30:53,516 --> 00:30:55,585 And you could imagine how big that tree grows. 648 00:30:55,585 --> 00:30:58,710 649 00:30:58,710 --> 00:31:03,640 Intuitively, the point to see here is that there are going to 650 00:31:03,640 --> 00:31:10,340 be about n levels to get down to 1 from your 651 00:31:10,340 --> 00:31:12,500 initial input of 6. 652 00:31:12,500 --> 00:31:15,150 So to get down to 1 from an initial input of size n is 653 00:31:15,150 --> 00:31:17,260 going to take about n levels. 654 00:31:17,260 --> 00:31:21,790 The branching factor of this tree at each level is 2. 655 00:31:21,790 --> 00:31:26,460 So if we have n levels, and at each level, the number of 656 00:31:26,460 --> 00:31:30,780 calls doubles again, we can say that a loose bound 657 00:31:30,780 --> 00:31:32,840 on the complexity of this is actually 2 to the n. 658 00:31:32,840 --> 00:31:35,700 659 00:31:35,700 --> 00:31:39,450 This is something that's even less intuitive, I think, than 660 00:31:39,450 --> 00:31:41,430 what we did before with the logarithms.
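One way to check all three recursion intuitions is to instrument each shape with a call counter. These are sketches of my own (each function returns the number of calls made, not the usual return value):

```python
def factorial_calls(n):
    # Recurse on n - 1 until n <= 0: about n calls total,
    # so recursive factorial is linear.
    if n <= 0:
        return 1
    return 1 + factorial_calls(n - 1)

def halving_calls(n):
    # Recurse on n // 2 (the foo shape): the input shrinks by a
    # factor of 2 per call, so only about log base 2 of n calls.
    if n <= 1:
        return 1
    return 1 + halving_calls(n // 2)

def fib_calls(n):
    # Two recursive calls per call, a tree about n levels deep:
    # the call count grows exponentially, loosely order 2 to the n.
    if n <= 1:
        return 1
    return 1 + fib_calls(n - 1) + fib_calls(n - 2)
```

Running these for growing n shows the three growth rates side by side: linear, logarithmic, and exponential.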
661 00:31:41,430 --> 00:31:43,640 So try and work through it again. 662 00:31:43,640 --> 00:31:45,020 Play with it a little bit. 663 00:31:45,020 --> 00:31:47,370 There's actually a tighter bound on this, which is like 664 00:31:47,370 --> 00:31:51,170 1.62 to the n, which involves a lot more complicated math that you 665 00:31:51,170 --> 00:31:53,130 could look up. 666 00:31:53,130 --> 00:31:56,530 But for the purposes of this class, it's sufficient to say 667 00:31:56,530 --> 00:31:58,610 that Fibonacci is order 2 to the n. 668 00:31:58,610 --> 00:32:02,900 669 00:32:02,900 --> 00:32:07,870 So does that roughly clear up some time complexity stuff 670 00:32:07,870 --> 00:32:08,910 for you guys? 671 00:32:08,910 --> 00:32:09,400 OK, awesome. 672 00:32:09,400 --> 00:32:10,320 Does anybody have the time? 673 00:32:10,320 --> 00:32:11,700 I forgot my watch today. 674 00:32:11,700 --> 00:32:12,660 AUDIENCE: 12:42. 675 00:32:12,660 --> 00:32:15,350 PROFESSOR: OK, excellent. 676 00:32:15,350 --> 00:32:17,590 That gives us a little bit of time to talk about 677 00:32:17,590 --> 00:32:18,555 object-oriented programming. 678 00:32:18,555 --> 00:32:21,650 Does anybody have any specific questions about object-oriented 679 00:32:21,650 --> 00:32:24,500 programming? 680 00:32:24,500 --> 00:32:24,990 How about this? 681 00:32:24,990 --> 00:32:27,106 How many of you guys finished the problem set and turned it 682 00:32:27,106 --> 00:32:28,330 in already? 683 00:32:28,330 --> 00:32:32,890 Or did any of you guys not turn in the problem set yet? 684 00:32:32,890 --> 00:32:36,640 I'll talk loosely about it then, not too specifically. 685 00:32:36,640 --> 00:32:39,120 Does anybody have any questions from, I guess, at 686 00:32:39,120 --> 00:32:40,210 least the first part? 687 00:32:40,210 --> 00:32:43,720 We're making some classes, making some trigger classes. 688 00:32:43,720 --> 00:32:44,593 Yeah? 689 00:32:44,593 --> 00:32:45,843 AUDIENCE: [INAUDIBLE]?
690 00:32:45,843 --> 00:32:50,251 691 00:32:50,251 --> 00:32:51,535 PROFESSOR: Self dot what? 692 00:32:51,535 --> 00:32:52,785 AUDIENCE: [INAUDIBLE]. 693 00:32:52,785 --> 00:32:56,415 694 00:32:56,415 --> 00:32:58,050 PROFESSOR: When we have like self-- 695 00:32:58,050 --> 00:33:00,120 we have like the getter methods. 696 00:33:00,120 --> 00:33:02,505 So what's important about that? 697 00:33:02,505 --> 00:33:04,870 I'll tell you what's important about that. 698 00:33:04,870 --> 00:33:07,600 So we have a class. 699 00:33:07,600 --> 00:33:08,930 Let's say we have a class person. 700 00:33:08,930 --> 00:33:16,490 701 00:33:16,490 --> 00:33:21,900 So we define our __init__ method to just take a name. 702 00:33:21,900 --> 00:33:32,230 703 00:33:32,230 --> 00:33:34,600 And so now, what the problem set asked you to do was to 704 00:33:34,600 --> 00:33:36,310 define a getter method. 705 00:33:36,310 --> 00:33:44,410 Define a getter method called get_name that 706 00:33:44,410 --> 00:33:45,660 just returns the attribute. 707 00:33:45,660 --> 00:33:50,690 708 00:33:50,690 --> 00:33:52,500 So what's the point of this? 709 00:33:52,500 --> 00:33:58,333 Because I can just say Sally equals person. 710 00:33:58,333 --> 00:34:07,550 711 00:34:07,550 --> 00:34:09,639 So here, I defined a person named Sally. 712 00:34:09,639 --> 00:34:15,580 And I initialized a person with the string Sally. 713 00:34:15,580 --> 00:34:19,730 If I just look at sally.name, that's going to just directly 714 00:34:19,730 --> 00:34:21,630 print the attribute. 715 00:34:21,630 --> 00:34:25,010 So why do we need this get name function? 716 00:34:25,010 --> 00:34:27,989 What's the point of this additional getter method? 717 00:34:27,989 --> 00:34:30,590 Does anybody know why that is? 718 00:34:30,590 --> 00:34:31,840 AUDIENCE: [INAUDIBLE]. 719 00:34:31,840 --> 00:34:34,510 720 00:34:34,510 --> 00:34:35,239 PROFESSOR: Right. 721 00:34:35,239 --> 00:34:36,070 So that's what it does.
722 00:34:36,070 --> 00:34:38,380 This get_name does return the attribute name. 723 00:34:38,380 --> 00:34:42,130 But we don't need this method to just look at 724 00:34:42,130 --> 00:34:43,380 the attribute name. 725 00:34:43,380 --> 00:34:46,070 726 00:34:46,070 --> 00:34:47,320 Let's actually code this up. 727 00:34:47,320 --> 00:34:58,970 728 00:34:58,970 --> 00:35:00,220 So we have class person. 729 00:35:00,220 --> 00:35:18,760 730 00:35:18,760 --> 00:35:22,470 So if we run this code, and over here in the shell, we 731 00:35:22,470 --> 00:35:28,210 define Sally equals person with the name Sally. 732 00:35:28,210 --> 00:35:31,110 733 00:35:31,110 --> 00:35:37,200 If I just print sally.name, it prints the attribute. 734 00:35:37,200 --> 00:35:42,970 So why did I need to provide this getter method called 735 00:35:42,970 --> 00:35:46,660 get_name that does the same thing? 736 00:35:46,660 --> 00:35:47,890 That's the question. 737 00:35:47,890 --> 00:35:51,200 That seems sort of redundant. 738 00:35:51,200 --> 00:35:54,610 But there's actually a pretty big important reason for it. 739 00:35:54,610 --> 00:35:59,200 Let's say we set s name equal to the attribute sally.name. 740 00:35:59,200 --> 00:36:02,930 741 00:36:02,930 --> 00:36:07,300 If we look at s name, we see Sally. 742 00:36:07,300 --> 00:36:09,202 Now if I say-- 743 00:36:09,202 --> 00:36:11,830 actually, I'm not sure if this is the correct reasoning. 744 00:36:11,830 --> 00:36:41,770 745 00:36:41,770 --> 00:36:43,970 This is going to be better. 746 00:36:43,970 --> 00:36:49,910 Let's say Sally equals a person Sally 747 00:36:49,910 --> 00:36:51,600 who's taking what? 748 00:36:51,600 --> 00:36:59,560 1803, 605, 11.1. 749 00:36:59,560 --> 00:37:04,460 So now I can look at the attribute classes to show 750 00:37:04,460 --> 00:37:08,020 Sally's classes, which are weird flows. 751 00:37:08,020 --> 00:37:12,610 And I can also use sally.getclasses to look at 752 00:37:12,610 --> 00:37:15,300 Sally's classes. 
753 00:37:15,300 --> 00:37:21,740 If I set a variable s classes equal to sally.classes, this 754 00:37:21,740 --> 00:37:25,240 binds this variable s classes to the attribute 755 00:37:25,240 --> 00:37:26,960 sally.classes. 756 00:37:26,960 --> 00:37:38,290 Now if I say sclasses.append 1401, if I now look at the 757 00:37:38,290 --> 00:37:46,990 attribute sally.classes, it now has 1401 in it. 758 00:37:46,990 --> 00:37:48,140 This is not safe. 759 00:37:48,140 --> 00:37:49,500 This is not type safe. 760 00:37:49,500 --> 00:37:52,970 The reason for that is if you define a class, and you 761 00:37:52,970 --> 00:37:56,310 access the class's attributes directly instead of through a 762 00:37:56,310 --> 00:37:58,910 getter method, you can then do this. 763 00:37:58,910 --> 00:38:01,250 And sometimes, it's accidental. 764 00:38:01,250 --> 00:38:05,080 You'll set some variable equal to some attribute of a class. 765 00:38:05,080 --> 00:38:10,130 Then later on in your code, you'll alter that variable. 766 00:38:10,130 --> 00:38:14,010 But that variable is not a copy of the attribute. 767 00:38:14,010 --> 00:38:17,250 Yes, you can make copies of that attribute and stuff, but 768 00:38:17,250 --> 00:38:20,680 the overall takeaway is that in programming, we try to do 769 00:38:20,680 --> 00:38:22,610 something called defensive programming. 770 00:38:22,610 --> 00:38:23,900 This isn't defensive. 771 00:38:23,900 --> 00:38:29,770 Because it is possible if you code it incorrectly to alter 772 00:38:29,770 --> 00:38:33,570 the attribute of the instance of the class. 773 00:38:33,570 --> 00:38:36,030 But if we use the getter method, if instead of 774 00:38:36,030 --> 00:38:38,470 sally.classes, instead of directly accessing the 775 00:38:38,470 --> 00:38:41,160 attribute here, we had set s classes equal to 776 00:38:41,160 --> 00:38:42,930 sally.getclasses. 777 00:38:42,930 --> 00:38:45,780 And then, we had changed s classes around.
778 00:38:45,780 --> 00:38:50,180 That wouldn't have happened, because the getter method, 779 00:38:50,180 --> 00:38:51,900 while it does return self.classes, 780 00:38:51,900 --> 00:38:55,830 can be written to return a copy of the list instead of the 781 00:38:55,830 --> 00:39:00,630 list itself, so the caller never gets a reference to the 782 00:39:00,630 --> 00:39:02,850 actual attribute and can't change it. 783 00:39:02,850 --> 00:39:04,075 Does that make sense? 784 00:39:04,075 --> 00:39:04,670 All right. 785 00:39:04,670 --> 00:39:05,730 Cool. 786 00:39:05,730 --> 00:39:07,360 Other questions about classes? 787 00:39:07,360 --> 00:39:09,770 We have a little class up here if there's like some basic 788 00:39:09,770 --> 00:39:12,820 stuff that you'd like explained again. 789 00:39:12,820 --> 00:39:14,050 Now's the time. 790 00:39:14,050 --> 00:39:15,300 AUDIENCE: [INAUDIBLE]. 791 00:39:15,300 --> 00:39:20,290 792 00:39:20,290 --> 00:39:23,900 PROFESSOR: So here, I'm setting just some variable s 793 00:39:23,900 --> 00:39:28,620 classes equal to the attribute sally.classes. 794 00:39:28,620 --> 00:39:31,800 It's just like setting any sort of variable equal to some 795 00:39:31,800 --> 00:39:32,370 other quantity. 796 00:39:32,370 --> 00:39:35,230 AUDIENCE: So you appended to the variable, but it also appended 797 00:39:35,230 --> 00:39:38,282 to like the attribute of Sally? 798 00:39:38,282 --> 00:39:42,540 PROFESSOR: So what I did here was I set the variable s 799 00:39:42,540 --> 00:39:46,710 classes equal to this attribute sally.classes. 800 00:39:46,710 --> 00:39:49,730 And then, because I know this is a list, I appended another 801 00:39:49,730 --> 00:39:51,550 value to it. 802 00:39:51,550 --> 00:39:54,440 But this is the same as when we have two lists. 803 00:39:54,440 --> 00:39:58,500 If we have a list called a, and we say a is equal to 1, 2, 804 00:39:58,500 --> 00:40:02,440 3, then I say b is equal to a. 805 00:40:02,440 --> 00:40:04,894 What is b?
806 00:40:04,894 --> 00:40:12,910 Now if I say b.append 1401, what does b look like? 807 00:40:12,910 --> 00:40:15,500 What does a look like? 808 00:40:15,500 --> 00:40:17,430 Because they're aliases of each other. 809 00:40:17,430 --> 00:40:20,900 So what I did here, when I set s classes directly equal to 810 00:40:20,900 --> 00:40:24,170 the attribute sally.classes, I made s classes an 811 00:40:24,170 --> 00:40:26,180 alias of the attribute. 812 00:40:26,180 --> 00:40:30,440 But the problem with that is that then I can change them. 813 00:40:30,440 --> 00:40:31,770 And because they're aliases, the 814 00:40:31,770 --> 00:40:33,720 attribute itself has changed. 815 00:40:33,720 --> 00:40:36,100 And we don't want to do that in object-oriented 816 00:40:36,100 --> 00:40:36,680 programming. 817 00:40:36,680 --> 00:40:38,300 When we define an object, 818 00:40:38,300 --> 00:40:41,520 the only way you should be able to change an attribute is 819 00:40:41,520 --> 00:40:44,330 through some method of the class that allows you to 820 00:40:44,330 --> 00:40:46,760 change that attribute. 821 00:40:46,760 --> 00:40:49,970 So if I want to be able to add a class to Sally's class 822 00:40:49,970 --> 00:40:59,700 list, I should define a method called add_class 823 00:40:59,700 --> 00:41:04,260 that does self.classes.append(new_class). 824 00:41:04,260 --> 00:41:08,280 825 00:41:08,280 --> 00:41:11,740 While technically, it's possible to directly access an 826 00:41:11,740 --> 00:41:15,540 attribute, it's really bad practice to do so simply 827 00:41:15,540 --> 00:41:18,260 because this unexpected behavior can result. 828 00:41:18,260 --> 00:41:21,340 And also because if you say, oh, well, it's not going to 829 00:41:21,340 --> 00:41:23,240 matter for this one time, I'll remember how to 830 00:41:23,240 --> 00:41:24,420 do the right thing.
831 00:41:24,420 --> 00:41:26,750 The problem with that is it's often the case that you're not 832 00:41:26,750 --> 00:41:29,110 the only person using your code. 833 00:41:29,110 --> 00:41:32,320 So it's a better practice to provide all the 834 00:41:32,320 --> 00:41:35,230 methods that you would need in order to 835 00:41:35,230 --> 00:41:38,570 access and change attributes as 836 00:41:38,570 --> 00:41:41,930 methods within the class. 837 00:41:41,930 --> 00:41:43,180 Does that make sense? 838 00:41:43,180 --> 00:41:46,060 839 00:41:46,060 --> 00:41:48,380 So yeah, this is maybe our one violation, if you guys have 840 00:41:48,380 --> 00:41:50,990 been attending my recitation, of 841 00:41:50,990 --> 00:41:53,910 our mantra that programmers are lazy. 842 00:41:53,910 --> 00:41:56,320 This is less lazy than just directly accessing the 843 00:41:56,320 --> 00:41:57,240 attributes. 844 00:41:57,240 --> 00:41:59,690 But even though we know that programmers are super, super 845 00:41:59,690 --> 00:42:02,950 lazy, programmers also like to be super, super safe. 846 00:42:02,950 --> 00:42:05,580 So when there's a trade off between defensive programming 847 00:42:05,580 --> 00:42:07,780 and being lazy, always pick defensive programming. 848 00:42:07,780 --> 00:42:11,882
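The whole aliasing discussion and the defensive fix condense into one runnable sketch. The class layout and course names here are illustrative, modeled on the examples above rather than copied from the problem set:

```python
class Person(object):
    def __init__(self, name, classes):
        self.name = name
        self.classes = classes

    def get_classes(self):
        # Defensive getter: hand back a copy, not the list itself,
        # so callers can't mutate the attribute by accident.
        return self.classes[:]

    def add_class(self, new_class):
        # The one sanctioned way to change the attribute.
        self.classes.append(new_class)

sally = Person("Sally", ["18.03"])

alias = sally.classes        # direct access: an alias, not a copy
alias.append("6.00")         # ...so this silently changes sally too

safe = sally.get_classes()   # the getter returns a copy
safe.append("11.1")          # ...so this change stays local

sally.add_class("14.01")     # mutate only through the class's method
```

After running this, the appended "6.00" shows up on the instance (the alias leaked), while "11.1" does not (the copy protected it).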