1
00:00:00,000 --> 00:00:00,530

2
00:00:00,530 --> 00:00:02,960
The following content is
provided under a Creative

3
00:00:02,960 --> 00:00:04,370
Commons license.

4
00:00:04,370 --> 00:00:07,410
Your support will help MIT
OpenCourseWare continue to

5
00:00:07,410 --> 00:00:11,060
offer high quality educational
resources for free.

6
00:00:11,060 --> 00:00:13,960
To make a donation or view
additional materials from

7
00:00:13,960 --> 00:00:19,790
hundreds of MIT courses, visit
MIT OpenCourseWare at

8
00:00:19,790 --> 00:00:21,040
ocw.mit.edu.

9
00:00:21,040 --> 00:00:22,925

10
00:00:22,925 --> 00:00:25,470
PROFESSOR: So we're going to
pick up where we left off last

11
00:00:25,470 --> 00:00:26,720
Friday on recursion.

12
00:00:26,720 --> 00:00:30,420

13
00:00:30,420 --> 00:00:34,240
Well first of all, can anyone
tell me what recursion is or

14
00:00:34,240 --> 00:00:35,640
what a recursive function is?

15
00:00:35,640 --> 00:00:41,615

16
00:00:41,615 --> 00:00:43,180
No one knows.

17
00:00:43,180 --> 00:00:43,652
OK.

18
00:00:43,652 --> 00:00:45,540
AUDIENCE: To divide
and conquer.

19
00:00:45,540 --> 00:00:46,490
PROFESSOR: OK.

20
00:00:46,490 --> 00:00:49,710
It's a divide-and-conquer
technique.

21
00:00:49,710 --> 00:00:53,350
How does it do it?

22
00:00:53,350 --> 00:00:56,228
It's a recursive function.

23
00:00:56,228 --> 00:00:58,618
AUDIENCE: You have
a base case.

24
00:00:58,618 --> 00:01:01,804
And usually a return function,
you return

25
00:01:01,804 --> 00:01:03,398
the value of something.

26
00:01:03,398 --> 00:01:05,788
But because you keep returning
the function back.

27
00:01:05,788 --> 00:01:06,640
PROFESSOR: Right.

28
00:01:06,640 --> 00:01:10,590
So it's a function that
calls itself.

29
00:01:10,590 --> 00:01:14,510
And it works by one, identifying
a base case, which

30
00:01:14,510 --> 00:01:18,010
is the smallest sub-problem
possible.

31
00:01:18,010 --> 00:01:21,610
And then in the other case, the
recursive case, it tries

32
00:01:21,610 --> 00:01:25,910
to chunk the problem down into
a smaller sub-problem that it

33
00:01:25,910 --> 00:01:27,575
then solves by calling itself.

34
00:01:27,575 --> 00:01:31,300

35
00:01:31,300 --> 00:01:33,630
So we went over a
few examples.

36
00:01:33,630 --> 00:01:41,500
And one of the things that we
wanted to talk about is that

37
00:01:41,500 --> 00:01:44,310
for many problems, a recursive
function can be written

38
00:01:44,310 --> 00:01:49,360
iteratively, or actually
for most problems.

39
00:01:49,360 --> 00:01:56,070
So there usually is some sort of
a subjective choice in how

40
00:01:56,070 --> 00:01:57,540
to write a function.

41
00:01:57,540 --> 00:02:00,440
And it really comes down to
ease of understanding.

42
00:02:00,440 --> 00:02:03,840
So what we've done is we've
taken a couple of the

43
00:02:03,840 --> 00:02:06,810
algorithms that we
showed last week.

44
00:02:06,810 --> 00:02:09,130
And we've written them
iteratively.

45
00:02:09,130 --> 00:02:10,440
We also have the recursive
version.

46
00:02:10,440 --> 00:02:17,230
We'll compare and contrast and
see where we would want to use

47
00:02:17,230 --> 00:02:18,930
a recursive function.

48
00:02:18,930 --> 00:02:24,900
So on the screen on the
left here, we have the

49
00:02:24,900 --> 00:02:28,670
multiplication version, the
recursive multiplication, that

50
00:02:28,670 --> 00:02:30,440
we showed you last week.

51
00:02:30,440 --> 00:02:33,310
It's a little different because
it turns out there's a

52
00:02:33,310 --> 00:02:35,150
couple of simplifications
you can make.

53
00:02:35,150 --> 00:02:40,930

54
00:02:40,930 --> 00:02:42,870
Can someone walk me through
how this works?

55
00:02:42,870 --> 00:02:46,470
So what's my base case first?

56
00:02:46,470 --> 00:02:50,646

57
00:02:50,646 --> 00:02:51,574
AUDIENCE: 0

58
00:02:51,574 --> 00:02:53,060
PROFESSOR: 0, right?

59
00:02:53,060 --> 00:02:58,010
And so obviously when n is 0, if
we multiply by 0, then our

60
00:02:58,010 --> 00:02:59,540
result is 0.

61
00:02:59,540 --> 00:03:03,130
Now there are two
recursive cases.

62
00:03:03,130 --> 00:03:09,600
And I'm not really sure how to
explain this intuitively.

63
00:03:09,600 --> 00:03:12,840
But let's say that
my n is positive.

64
00:03:12,840 --> 00:03:15,210
So I'm multiplying
by a positive n.

65
00:03:15,210 --> 00:03:21,630
Well, then all I'm going to do
is take m and just add it to

66
00:03:21,630 --> 00:03:26,280
the recursive version of
itself n minus 1 times.

67
00:03:26,280 --> 00:03:29,610
That's how to read that.

68
00:03:29,610 --> 00:03:36,120
And then analogously, if n is
less than or equal to negative 1, then

69
00:03:36,120 --> 00:03:39,210
I'm going to take negative m and
add the recursive result

70
00:03:39,210 --> 00:03:43,060
of n plus 1 times m.
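[Editor's sketch of the recursive multiplication just described. The name rec_mult and the exact layout are assumptions; the code shown on screen in the lecture may differ.]
```python
def rec_mult(m, n):
    """Multiply m by an integer n using only addition and recursion."""
    if n == 0:
        # Base case: anything times 0 is 0.
        return 0
    elif n >= 1:
        # Positive n: m * n == m + m * (n - 1).
        return m + rec_mult(m, n - 1)
    else:
        # Negative n (n <= -1): m * n == -m + m * (n + 1).
        return -m + rec_mult(m, n + 1)
```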

71
00:03:43,060 --> 00:03:45,560

72
00:03:45,560 --> 00:03:48,100
It is not too intuitive right?

73
00:03:48,100 --> 00:03:53,506
So if we implement it
iteratively though, I think

74
00:03:53,506 --> 00:03:55,830
it's a little easier
to understand.

75
00:03:55,830 --> 00:03:58,490
Now this is also a subjective
judgment,

76
00:03:58,490 --> 00:03:59,810
so you might disagree.

77
00:03:59,810 --> 00:04:01,060
You're free to.

78
00:04:01,060 --> 00:04:02,940

79
00:04:02,940 --> 00:04:05,940
Here's our base case again. n is
equal to 0 or m is equal to

80
00:04:05,940 --> 00:04:08,240
0, return 0.

81
00:04:08,240 --> 00:04:12,750
In this case though, if we don't
have the base case, then

82
00:04:12,750 --> 00:04:15,560
we're going to initialize
the result variable.

83
00:04:15,560 --> 00:04:20,480
And then for n is greater than
or equal to 1, we're just

84
00:04:20,480 --> 00:04:23,000
going to enter a while
loop and keep adding

85
00:04:23,000 --> 00:04:25,535
m to result n times.

86
00:04:25,535 --> 00:04:28,670

87
00:04:28,670 --> 00:04:31,630
It's a little bit easier to
understand I think than the

88
00:04:31,630 --> 00:04:33,620
recursive version.

89
00:04:33,620 --> 00:04:36,200
And then same thing for--

90
00:04:36,200 --> 00:04:37,950
Oh I have a bug here.

91
00:04:37,950 --> 00:04:44,790

92
00:04:44,790 --> 00:04:49,020
Same thing for n less than or
equal to negative 1.
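[Editor's sketch of the iterative multiplication described here, with one while loop per sign of n. The name iter_mult is an assumption, not the name used in the lecture code.]
```python
def iter_mult(m, n):
    """Multiply m by an integer n using only addition and while loops."""
    if n == 0 or m == 0:
        # Base case: a zero factor gives a zero product.
        return 0
    result = 0
    while n >= 1:
        # Positive n: add m to the result, n times.
        result += m
        n -= 1
    while n <= -1:
        # Negative n: add -m to the result, abs(n) times.
        result += -m
        n += 1
    return result
```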

93
00:04:49,020 --> 00:04:55,020
So I'm going to run the two
versions of this function.

94
00:04:55,020 --> 00:05:00,250
So here's the recursive
version and here's the

95
00:05:00,250 --> 00:05:01,330
iterative function.

96
00:05:01,330 --> 00:05:03,870
They both return the
same exact thing.

97
00:05:03,870 --> 00:05:07,000
They both work in generally
the same manner.

98
00:05:07,000 --> 00:05:10,260
It's just that in one case we're
using recursion to solve

99
00:05:10,260 --> 00:05:15,350
it, which I don't find
too intuitive.

100
00:05:15,350 --> 00:05:19,430
And the other case, we're
using WHILE loops.

101
00:05:19,430 --> 00:05:20,580
All right.

102
00:05:20,580 --> 00:05:26,040
So in this case in my opinion,
writing this iteratively, it

103
00:05:26,040 --> 00:05:28,650
makes a little bit more
intuitive sense.

104
00:05:28,650 --> 00:05:47,260
But in other cases, let's say
good old Fibonacci, we can

105
00:05:47,260 --> 00:05:49,460
write the recursive
version and then

106
00:05:49,460 --> 00:05:50,710
the iterative version.

107
00:05:50,710 --> 00:05:53,480

108
00:05:53,480 --> 00:05:55,260
So here's recursive Fibonacci.

109
00:05:55,260 --> 00:05:58,975
We have our base
case or cases.

110
00:05:58,975 --> 00:06:02,360
And then we have our
recursive case.

111
00:06:02,360 --> 00:06:05,990
You can almost rewrite
the mathematical

112
00:06:05,990 --> 00:06:07,430
formula from this directly.
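[Editor's sketch of a recursive Fibonacci. It assumes the convention fib(0) = 0 and fib(1) = 1; the lecture code may use different base cases, and the name rec_fib is hypothetical.]
```python
def rec_fib(n):
    """Return the n-th Fibonacci number for an integer n >= 0."""
    if n == 0 or n == 1:
        # Base cases (assumed convention): fib(0) = 0, fib(1) = 1.
        return n
    # Recursive case mirrors the formula fib(n) = fib(n-1) + fib(n-2).
    return rec_fib(n - 1) + rec_fib(n - 2)
```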

113
00:06:07,430 --> 00:06:18,342

114
00:06:18,342 --> 00:06:19,592
All right now.

115
00:06:19,592 --> 00:06:22,438

116
00:06:22,438 --> 00:06:23,765
Now what was it?

117
00:06:23,765 --> 00:06:34,080

118
00:06:34,080 --> 00:06:36,250
So here's the iterative
version of Fibonacci.

119
00:06:36,250 --> 00:06:41,480

120
00:06:41,480 --> 00:06:45,070
We still have our base case.

121
00:06:45,070 --> 00:06:50,210
But when we get to what was the
recursive case, we have to

122
00:06:50,210 --> 00:06:53,310
do a lot of bookkeeping.

123
00:06:53,310 --> 00:06:58,000
We have to save off the previous
Fibonacci and what we're

124
00:06:58,000 --> 00:07:00,340
currently computing.

125
00:07:00,340 --> 00:07:07,360
And then we have to iterate,
get the next Fibonacci and

126
00:07:07,360 --> 00:07:13,200
then save off the prior
versions of it.

127
00:07:13,200 --> 00:07:16,760
This is all stuff that in the
recursive version gets done

128
00:07:16,760 --> 00:07:19,820
for us by virtue of just calling
another function.
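[Editor's sketch of an iterative Fibonacci that makes the bookkeeping explicit. It assumes fib(0) = 0 and fib(1) = 1 and uses range rather than the xrange mentioned later; the name iter_fib is hypothetical.]
```python
def iter_fib(n):
    """Return the n-th Fibonacci number for an integer n >= 0, using a loop."""
    if n == 0 or n == 1:
        return n
    prev = 0  # Fibonacci number two steps back
    curr = 1  # Fibonacci number currently being tracked
    for _ in range(2, n + 1):
        next_fib = prev + curr  # compute the next Fibonacci number
        prev = curr             # save off the prior value
        curr = next_fib
    return curr
```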

129
00:07:19,820 --> 00:07:22,470

130
00:07:22,470 --> 00:07:26,730
So this is an example of a
case where your recursive

131
00:07:26,730 --> 00:07:31,370
version is actually a little
bit easier to understand.

132
00:07:31,370 --> 00:07:33,400
Doesn't mean that it's
more efficient.

133
00:07:33,400 --> 00:07:36,250
And later on in the class we'll
actually use this to

134
00:07:36,250 --> 00:07:40,150
talk about complexity.

135
00:07:40,150 --> 00:07:44,090
But the left version I think is
easier to understand than

136
00:07:44,090 --> 00:07:45,170
the right version.

137
00:07:45,170 --> 00:07:46,575
Are there any disagreements?

138
00:07:46,575 --> 00:07:50,870

139
00:07:50,870 --> 00:07:53,310
If you disagree, I'm not
going to bite you.

140
00:07:53,310 --> 00:07:56,260
So anyway, we can run this.

141
00:07:56,260 --> 00:07:58,860
And we can see that the
output's identical.

142
00:07:58,860 --> 00:08:01,845

143
00:08:01,845 --> 00:08:03,673
AUDIENCE: What's xrange?

144
00:08:03,673 --> 00:08:04,923
PROFESSOR: xrange?

145
00:08:04,923 --> 00:08:07,445

146
00:08:07,445 --> 00:08:09,510
Probably something I shouldn't
have put in there.

147
00:08:09,510 --> 00:08:15,470
xrange is like range, except
that it returns what's known

148
00:08:15,470 --> 00:08:21,330
as a generator object that
you can iterate over.

149
00:08:21,330 --> 00:08:26,110
So that I don't have to explain
that right now--

150
00:08:26,110 --> 00:08:27,870
Well actually, we'll probably
talk about it

151
00:08:27,870 --> 00:08:30,760
later in the semester.

152
00:08:30,760 --> 00:08:35,669

153
00:08:35,669 --> 00:08:38,580
The difference is efficiency.

154
00:08:38,580 --> 00:08:41,970
Range will return an
entire list to you.

155
00:08:41,970 --> 00:08:45,260
Whereas xrange is a little bit
more conservative in how

156
00:08:45,260 --> 00:08:49,680
it manages its memory
for these purposes.

157
00:08:49,680 --> 00:08:53,630
But changing it won't make a
difference in the program.

158
00:08:53,630 --> 00:08:59,680
And for a program as simple as
this, range is perfectly fine.

159
00:08:59,680 --> 00:09:03,350
I just used xrange
out of habit.

160
00:09:03,350 --> 00:09:05,150
So we'll do one last example.

161
00:09:05,150 --> 00:09:06,795
And then we'll move on
to a different topic.

162
00:09:06,795 --> 00:09:10,500

163
00:09:10,500 --> 00:09:15,420
If I didn't mention it before,
in problem set 4, recursion is

164
00:09:15,420 --> 00:09:18,760
highly recommended for the
final portion of it.

165
00:09:18,760 --> 00:09:25,690
So it's kind of important you
understand what's going on.

166
00:09:25,690 --> 00:09:33,220
Anyway, so remember we
looked at bisection

167
00:09:33,220 --> 00:09:35,500
early on in the semester.

168
00:09:35,500 --> 00:09:39,960
And we showed you an iterative
version of bisection.

169
00:09:39,960 --> 00:09:42,520
This shouldn't really
be unfamiliar to

170
00:09:42,520 --> 00:09:44,490
anyone at this point.

171
00:09:44,490 --> 00:09:51,530

172
00:09:51,530 --> 00:09:55,120
So all this is doing is finding
the square root of a

173
00:09:55,120 --> 00:09:58,590
number using bisection search.

174
00:09:58,590 --> 00:10:04,900
And we set our low and our
high, get our midpoint.

175
00:10:04,900 --> 00:10:10,800
And we just keep looping until
we get a value that when we

176
00:10:10,800 --> 00:10:12,420
square it is close
enough to x.

177
00:10:12,420 --> 00:10:16,770

178
00:10:16,770 --> 00:10:21,600
And on each iteration we set
our lows and our highs,

179
00:10:21,600 --> 00:10:24,330
depending on how good
our guess was.
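[Editor's sketch of an iterative bisection square root along the lines described. The name iter_bisect_sqrt, the default epsilon, and the starting bounds are assumptions; it expects x >= 0.]
```python
def iter_bisect_sqrt(x, epsilon=0.0001):
    """Approximate the square root of x >= 0 by bisection search."""
    low = 0.0
    high = max(x, 1.0)  # the root of x < 1 is larger than x, so search up to 1
    mid = (low + high) / 2.0
    # Keep looping until mid squared is close enough to x.
    while abs(mid ** 2 - x) >= epsilon:
        if mid ** 2 < x:
            low = mid   # guess too low: search the upper half
        else:
            high = mid  # guess too high: search the lower half
        mid = (low + high) / 2.0
    return mid
```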

180
00:10:24,330 --> 00:10:26,940

181
00:10:26,940 --> 00:10:31,890
Now the recursive version
looks like this.

182
00:10:31,890 --> 00:10:35,570
It has a few more
lines of code.

183
00:10:35,570 --> 00:10:42,820
And before I launch into it,
did we explain default

184
00:10:42,820 --> 00:10:46,920
parameters to you,
for functions?

185
00:10:46,920 --> 00:10:57,640
So Python has this feature where
if you have a function

186
00:10:57,640 --> 00:11:03,400
such as rec bisection search,
you can specify that certain

187
00:11:03,400 --> 00:11:08,490
parameters are optional
or you can give

188
00:11:08,490 --> 00:11:11,170
default values to them.

189
00:11:11,170 --> 00:11:16,480
So let's just show a
simple example so I

190
00:11:16,480 --> 00:11:17,980
can get past this.

191
00:11:17,980 --> 00:11:40,610

192
00:11:40,610 --> 00:11:42,790
So if I define this function,
this one's really easy.

193
00:11:42,790 --> 00:11:45,310
All it's going to do
is print out x.

194
00:11:45,310 --> 00:11:53,540
I can call it like this, in
which case it's going to pass

195
00:11:53,540 --> 00:11:57,015
150 in and x will be 150 when
the function's executing.

196
00:11:57,015 --> 00:12:01,130

197
00:12:01,130 --> 00:12:03,530
See I'm not lying.

198
00:12:03,530 --> 00:12:04,780
Or I can call it like this.

199
00:12:04,780 --> 00:12:09,900

200
00:12:09,900 --> 00:12:12,750
And it'll be 100.

201
00:12:12,750 --> 00:12:15,720
So that's, in a nutshell, what
default parameters do for you.
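[Editor's sketch of the default-parameter demo. The name show is hypothetical, and the return value is added only so the result can be checked; the function in the lecture just prints x.]
```python
def show(x=100):
    """Print x and return it; x defaults to 100 when no argument is passed."""
    print(x)
    return x
show(150)  # x is 150 inside the function
show()     # no argument passed, so x takes its default value of 100
```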

202
00:12:15,720 --> 00:12:19,260
They're useful in some
instances, as in this example.

203
00:12:19,260 --> 00:12:33,740
So in this recursive version of
bisection square root, we

204
00:12:33,740 --> 00:12:36,790
have a low and a high parameter
that we specify.

205
00:12:36,790 --> 00:12:39,360
It's exactly equivalent to the
low and the high parameter in

206
00:12:39,360 --> 00:12:40,610
this iterative version.

207
00:12:40,610 --> 00:12:46,880

208
00:12:46,880 --> 00:12:51,110
This is a common idiom for
recursive functions in Python.

209
00:12:51,110 --> 00:12:52,840
If we're calling it for the
first time, we're not going to

210
00:12:52,840 --> 00:12:55,210
specify in a low and a high.

211
00:12:55,210 --> 00:12:58,090
So low and high will be None
coming into this function.

212
00:12:58,090 --> 00:13:00,610
And then we just set
them as we did in

213
00:13:00,610 --> 00:13:02,610
this iterative version.

214
00:13:02,610 --> 00:13:03,920
And then we set the midpoint.

215
00:13:03,920 --> 00:13:07,960
And then we have slightly
different structure here.

216
00:13:07,960 --> 00:13:13,850
If the midpoint that we guess is
close enough to the square

217
00:13:13,850 --> 00:13:20,230
root of x, then we just
return the midpoint.

218
00:13:20,230 --> 00:13:24,960
On the other hand, if it's too
low of a guess, then we're

219
00:13:24,960 --> 00:13:30,230
going to recursively call
ourselves with the same x,

220
00:13:30,230 --> 00:13:32,490
same epsilon, but we're
going to use

221
00:13:32,490 --> 00:13:34,380
midpoint for the low parameter.

222
00:13:34,380 --> 00:13:39,520
So midpoint, in this
case, is here and

223
00:13:39,520 --> 00:13:42,340
the same high parameter.

224
00:13:42,340 --> 00:13:46,970
And then if we've guessed
too high, then our low

225
00:13:46,970 --> 00:13:47,820
parameter is low.

226
00:13:47,820 --> 00:13:49,570
And then our high parameter's
the midpoint.

227
00:13:49,570 --> 00:13:51,320
So it's doing the exact
same thing as

228
00:13:51,320 --> 00:13:53,390
the iterative version.

229
00:13:53,390 --> 00:13:58,870
We have recursive, iterative,
recursive, iterative.

230
00:13:58,870 --> 00:14:00,120
Same thing, just different
forms.
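[Editor's sketch of the recursive bisection square root, using None defaults for low and high as described. The name rec_bisect_sqrt, the default epsilon, and the starting bounds are assumptions; it expects x >= 0.]
```python
def rec_bisect_sqrt(x, epsilon=0.0001, low=None, high=None):
    """Approximate the square root of x >= 0 by recursive bisection."""
    # On the first call low and high are None, so set the initial bounds.
    if low is None:
        low = 0.0
    if high is None:
        high = max(x, 1.0)
    mid = (low + high) / 2.0
    if abs(mid ** 2 - x) < epsilon:
        # Close enough: return the midpoint.
        return mid
    elif mid ** 2 < x:
        # Guess too low: same x and epsilon, midpoint becomes the new low.
        return rec_bisect_sqrt(x, epsilon, mid, high)
    else:
        # Guess too high: keep low, midpoint becomes the new high.
        return rec_bisect_sqrt(x, epsilon, low, mid)
```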

231
00:14:00,120 --> 00:14:02,790

232
00:14:02,790 --> 00:14:03,420
All right.

233
00:14:03,420 --> 00:14:09,310
Before I leave recursion, does
anyone have any questions, or

234
00:14:09,310 --> 00:14:15,048
want to ask anything,
or complain?

235
00:14:15,048 --> 00:14:15,544
No?

236
00:14:15,544 --> 00:14:16,040
All right.

237
00:14:16,040 --> 00:14:19,512
AUDIENCE: Do you use a lot of
recursion in your work?

238
00:14:19,512 --> 00:14:22,984
Do you normally use iterative
or recursion?

239
00:14:22,984 --> 00:14:25,040
Or is it just case by case?

240
00:14:25,040 --> 00:14:26,140
PROFESSOR: It's case by case.

241
00:14:26,140 --> 00:14:28,570
It depends on the problem.

242
00:14:28,570 --> 00:14:36,250
And what we are trying to show
here is that there are some

243
00:14:36,250 --> 00:14:38,420
problems that are better
expressed recursively and

244
00:14:38,420 --> 00:14:41,640
others that are better expressed
iteratively.

245
00:14:41,640 --> 00:14:46,650
And by better, it's a very
subjective term.

246
00:14:46,650 --> 00:14:48,620
In my mind, it means more
intuitive, easier to

247
00:14:48,620 --> 00:14:50,450
understand.

248
00:14:50,450 --> 00:14:54,560
It allows you to focus on
solving the problem rather

249
00:14:54,560 --> 00:14:55,850
than fiddling with code.

250
00:14:55,850 --> 00:14:59,630

251
00:14:59,630 --> 00:15:02,230
On the other hand, sometimes
efficiency comes into play.

252
00:15:02,230 --> 00:15:06,060
And we're going to be talking
about that pretty shortly.

253
00:15:06,060 --> 00:15:09,650
And in that case, you might want
to do a recursive version

254
00:15:09,650 --> 00:15:10,960
because it's easier
to understand.

255
00:15:10,960 --> 00:15:13,220
But it takes too long
to run, so you write

256
00:15:13,220 --> 00:15:14,470
an iterative version.

257
00:15:14,470 --> 00:15:17,710

258
00:15:17,710 --> 00:15:20,370
Computer programming, in a lot
of cases, actually in all

259
00:15:20,370 --> 00:15:24,990
cases, is a bunch
of trade-offs.

260
00:15:24,990 --> 00:15:29,880
Often times you'll trade off
speed for memory, elegance for

261
00:15:29,880 --> 00:15:32,270
efficiency, that
sort of thing.

262
00:15:32,270 --> 00:15:35,950
And part of the skill of
becoming good computer

263
00:15:35,950 --> 00:15:38,420
programmers is figuring
out where those

264
00:15:38,420 --> 00:15:40,830
balance points are.

265
00:15:40,830 --> 00:15:43,700
And it's something that I think
only comes with

266
00:15:43,700 --> 00:15:45,430
experience.

267
00:15:45,430 --> 00:15:50,700
All right, so we've talked about
floating point to death.

268
00:15:50,700 --> 00:15:53,260
But we just want to really
emphasize it.

269
00:15:53,260 --> 00:15:55,540
Because it's something that
even for experienced

270
00:15:55,540 --> 00:15:59,680
programmers, still
trips us up.

271
00:15:59,680 --> 00:16:04,310
So the thing that we want you to
understand is that floating

272
00:16:04,310 --> 00:16:05,730
point is inexact.

273
00:16:05,730 --> 00:16:12,430
So you shouldn't compare
for exact equality.

274
00:16:12,430 --> 00:16:16,250
So looking at the code here,
I have defined a variable

275
00:16:16,250 --> 00:16:20,610
10/100, which is just 10 over
100, and 1/100, which is just

276
00:16:20,610 --> 00:16:25,370
1 over 100, and then 9/100,
which is 9 over 100.

277
00:16:25,370 --> 00:16:31,330
And so in real math, this
condition would be true.

278
00:16:31,330 --> 00:16:34,910
I add 1/100 and 9/100.

279
00:16:34,910 --> 00:16:36,810
I should get 10/100.

280
00:16:36,810 --> 00:16:40,260
So if we were not dealing
in computer land,

281
00:16:40,260 --> 00:16:43,310
this would print out.

282
00:16:43,310 --> 00:16:49,580
But because we are dealing in
computer-land, we get that.

283
00:16:49,580 --> 00:16:52,190

284
00:16:52,190 --> 00:16:58,460
And the reason is because of
Python's representation.

285
00:16:58,460 --> 00:17:02,510
Now when you write print x, if
x is a float variable, Python

286
00:17:02,510 --> 00:17:05,194
does a little bit of nice
formatting for you.

287
00:17:05,194 --> 00:17:07,490
It kind of saves you from its
internal representation.

288
00:17:07,490 --> 00:17:11,740

289
00:17:11,740 --> 00:17:16,780
So here is 10/100, as you
just printed out.

290
00:17:16,780 --> 00:17:19,069
It's what you would
expect it to be.

291
00:17:19,069 --> 00:17:21,898
But this is what Python sees
when it does its math.

292
00:17:21,898 --> 00:17:25,119

293
00:17:25,119 --> 00:17:26,290
And it's not just Python.

294
00:17:26,290 --> 00:17:29,650
This applies for anything
on a binary computer.

295
00:17:29,650 --> 00:17:32,290
It's an inherent limitation.

296
00:17:32,290 --> 00:17:35,660
And you know we can
get arbitrarily

297
00:17:35,660 --> 00:17:36,910
close, but never exact.
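[Editor's sketch of the comparison being discussed. The variable names are assumptions standing in for the ones on screen. The formatted prints show more digits than a plain print does.]
```python
ten_hundredths = 10 / 100.0
one_hundredth = 1 / 100.0
nine_hundredths = 9 / 100.0
# In exact arithmetic this would be True; with binary floats it is False.
is_equal = (one_hundredth + nine_hundredths == ten_hundredths)
print(is_equal)
# Ask for 20 decimal places to see the stored approximations.
print("%.20f" % ten_hundredths)
print("%.20f" % (one_hundredth + nine_hundredths))
```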

298
00:17:36,910 --> 00:17:41,300

299
00:17:41,300 --> 00:17:48,050
And then again, we have 1/100
and 9/100, Python

300
00:17:48,050 --> 00:17:49,950
will show us 0.1.

301
00:17:49,950 --> 00:17:53,420
So when you print these out,
they'll look fine.

302
00:17:53,420 --> 00:17:56,480
If you were writing debugging
code and you were wondering

303
00:17:56,480 --> 00:18:01,040
why if you compared x to y, it
wasn't exactly equal, you

304
00:18:01,040 --> 00:18:03,610
would naturally print out
x and then print out y.

305
00:18:03,610 --> 00:18:04,780
But it would look
equal to you.

306
00:18:04,780 --> 00:18:08,100
But the code wouldn't
be working properly.

307
00:18:08,100 --> 00:18:11,370
Well, the reason is that the
internal representation that

308
00:18:11,370 --> 00:18:14,890
Python is using to compare
them is that.

309
00:18:14,890 --> 00:18:17,410

310
00:18:17,410 --> 00:18:20,050
So what's the solution?

311
00:18:20,050 --> 00:18:24,620

312
00:18:24,620 --> 00:18:25,490
It's a natural question.

313
00:18:25,490 --> 00:18:26,740
I don't know the answer.

314
00:18:26,740 --> 00:18:30,220

315
00:18:30,220 --> 00:18:33,545
AUDIENCE: If they're close
enough, then it would be

316
00:18:33,545 --> 00:18:34,970
inside variance--

317
00:18:34,970 --> 00:18:37,440
PROFESSOR: Right.

318
00:18:37,440 --> 00:18:39,781
We're going to say
good enough.

319
00:18:39,781 --> 00:18:43,470
And the traditional way of
representing that is epsilon.

320
00:18:43,470 --> 00:18:47,130
Epsilon, you've seen in
your problem sets.

321
00:18:47,130 --> 00:18:48,480
And you've seen it
in code before.

322
00:18:48,480 --> 00:18:50,600
And if you've come to office
hours, someone's probably

323
00:18:50,600 --> 00:18:52,320
explained it to you.

324
00:18:52,320 --> 00:18:56,740
Epsilon is the amount of error
we're willing to tolerate in

325
00:18:56,740 --> 00:18:58,360
our calculations.

326
00:18:58,360 --> 00:19:07,000
So in Python-land, you can
have arbitrary precision.

327
00:19:07,000 --> 00:19:08,910
Don't quote me on that though.

328
00:19:08,910 --> 00:19:13,290
But for purposes of this class,
if you're using an

329
00:19:13,290 --> 00:19:17,390
epsilon that's like 0.0001,
we're not

330
00:19:17,390 --> 00:19:18,640
going to get too upset.

331
00:19:18,640 --> 00:19:21,750

332
00:19:21,750 --> 00:19:24,530
All this function does, and this
is a handy function to

333
00:19:24,530 --> 00:19:31,220
keep around, is it just tests to
see if the distance between

334
00:19:31,220 --> 00:19:33,800
x and y is less than epsilon.

335
00:19:33,800 --> 00:19:35,970
If it is, then we say they're
close enough to each other to

336
00:19:35,970 --> 00:19:37,220
be considered equal.
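[Editor's sketch of an epsilon comparison like the one described. The name close_enough and the default epsilon of 0.0001 are assumptions based on the discussion, not the exact lecture code.]
```python
def close_enough(x, y, epsilon=0.0001):
    """Return True if x and y differ by less than epsilon."""
    return abs(x - y) < epsilon
ten_hundredths = 10 / 100.0
total = 1 / 100.0 + 9 / 100.0
# With the default epsilon the two values count as equal.
default_result = close_enough(ten_hundredths, total)
# With a far smaller epsilon the tiny representation error is no longer tolerated.
tiny_result = close_enough(ten_hundredths, total, epsilon=1e-20)
print(default_result)
print(tiny_result)
```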

337
00:19:37,220 --> 00:19:40,350

338
00:19:40,350 --> 00:19:42,740
So I don't like the function
name compare.

339
00:19:42,740 --> 00:19:43,990
I don't think it's intuitive.

340
00:19:43,990 --> 00:19:49,810

341
00:19:49,810 --> 00:19:51,060
Close enough is probably
better.

342
00:19:51,060 --> 00:19:55,450

343
00:19:55,450 --> 00:19:57,250
But it's also going
to break my code.

344
00:19:57,250 --> 00:20:07,590

345
00:20:07,590 --> 00:20:09,030
Uh oh.

346
00:20:09,030 --> 00:20:10,280
This is an actual bug.

347
00:20:10,280 --> 00:20:14,390

348
00:20:14,390 --> 00:20:16,670
Line 203.

349
00:20:16,670 --> 00:20:17,920
What did I do to myself?

350
00:20:17,920 --> 00:20:23,130

351
00:20:23,130 --> 00:20:25,253
I commented out my definition
is what I did.

352
00:20:25,253 --> 00:20:32,020

353
00:20:32,020 --> 00:20:32,990
All right.

354
00:20:32,990 --> 00:20:37,000
So if we compare the two values,
10/100 and 1/100 plus

355
00:20:37,000 --> 00:20:43,720
9/100 and we use our close
enough, our compare function,

356
00:20:43,720 --> 00:20:45,360
then yeah it's within epsilon.

357
00:20:45,360 --> 00:20:49,060

358
00:20:49,060 --> 00:20:51,780
Again, notice here that we're
using a default parameter.

359
00:20:51,780 --> 00:20:54,480
So if we don't pass in
something explicitly.

360
00:20:54,480 --> 00:20:56,653
So I can say something
like this.

361
00:20:56,653 --> 00:21:01,560

362
00:21:01,560 --> 00:21:02,980
Let's make epsilon
really tiny.

363
00:21:02,980 --> 00:21:23,140

364
00:21:23,140 --> 00:21:26,770
So if I make epsilon really,
really tiny, then it's

365
00:21:26,770 --> 00:21:27,540
going to say no.

366
00:21:27,540 --> 00:21:34,470
So how you determine epsilon
really depends on your

367
00:21:34,470 --> 00:21:37,520
specific application.

368
00:21:37,520 --> 00:21:42,590
If you're doing high precision
mathematics, you're modeling

369
00:21:42,590 --> 00:21:45,400
faults on a bridge or something,
probably I want to

370
00:21:45,400 --> 00:21:47,850
be pretty precise.

371
00:21:47,850 --> 00:21:50,790
Because if you have the wrong
epsilon, then you might have

372
00:21:50,790 --> 00:21:53,630
cars falling off the bridge
or the bridge collapsing.

373
00:21:53,630 --> 00:21:56,970
And it would just
be a bad day.

374
00:21:56,970 --> 00:22:03,080
So are there any questions so
far about floating point?

375
00:22:03,080 --> 00:22:03,520
No.

376
00:22:03,520 --> 00:22:06,075
AUDIENCE: You normally
go as close as the

377
00:22:06,075 --> 00:22:09,130
areas will let you.

378
00:22:09,130 --> 00:22:10,291
PROFESSOR: What's that?

379
00:22:10,291 --> 00:22:12,215
AUDIENCE: You can't get
as close anymore.

380
00:22:12,215 --> 00:22:16,063
How can you make the
error that small.

381
00:22:16,063 --> 00:22:17,987
Because it's not going
to get that close.

382
00:22:17,987 --> 00:22:19,580
PROFESSOR: Well yeah,
that's what I mean.

383
00:22:19,580 --> 00:22:23,700
So there is a limit to how
close you can get.

384
00:22:23,700 --> 00:22:26,110
And it depends on the
language and it also

385
00:22:26,110 --> 00:22:27,360
depends on the hardware.

386
00:22:27,360 --> 00:22:29,390

387
00:22:29,390 --> 00:22:33,030
There are, and this is
getting a little bit more

388
00:22:33,030 --> 00:22:38,470
technical than I want, but you
can define pretty precisely

389
00:22:38,470 --> 00:22:43,580
the smallest value that
epsilon can be.

390
00:22:43,580 --> 00:22:52,570
In a language like C, it's
defined as the minimum

391
00:22:52,570 --> 00:22:56,440
difference between two floating
point variables

392
00:22:56,440 --> 00:23:00,640
that's representable on the
host machine's hardware.
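[Editor's sketch showing how Python exposes a machine epsilon. Note that sys.float_info.epsilon is defined as the gap between 1.0 and the next representable float, which is narrower than the description given here; the variable names are hypothetical.]
```python
import sys
eps = sys.float_info.epsilon
print(eps)
# Adding eps to 1.0 gives a different float.
one_plus_eps_differs = (1.0 + eps != 1.0)
# Adding half of eps is rounded away, so the sum is still 1.0.
one_plus_half_eps_same = (1.0 + eps / 2 == 1.0)
print(one_plus_eps_differs)
print(one_plus_half_eps_same)
```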

393
00:23:00,640 --> 00:23:05,530
So yeah, there is a limit.

394
00:23:05,530 --> 00:23:09,550
There are some math packages
though, and we'll be using

395
00:23:09,550 --> 00:23:13,930
something called NumPy later
on the semester, that allow

396
00:23:13,930 --> 00:23:16,033
you to do pretty high precision
mathematics.

397
00:23:16,033 --> 00:23:19,450

398
00:23:19,450 --> 00:23:20,730
Keep that in the back
of your mind.

399
00:23:20,730 --> 00:23:22,180
But yeah you're right.

400
00:23:22,180 --> 00:23:23,620
You do eventually hit a limit.

401
00:23:23,620 --> 00:23:27,380

402
00:23:27,380 --> 00:23:31,790
OK so the last thing that I
want to cover on floating

403
00:23:31,790 --> 00:23:40,690
point is that even though it's
inexact, it's consistent.

404
00:23:40,690 --> 00:23:47,870
So let's say I define a variable
9/100 plus 1/100.

405
00:23:47,870 --> 00:23:51,310
And it's exactly what it
says, 9/100 plus 1/100.

406
00:23:51,310 --> 00:23:54,570
Now we know that this is not
going to equal 10/100, right.

407
00:23:54,570 --> 00:23:58,020
We just demonstrated
that ad nauseam.

408
00:23:58,020 --> 00:24:02,230
And also, yeah, still defined.

409
00:24:02,230 --> 00:24:07,630
The question though is, if I
subtract 1/100 from this

410
00:24:07,630 --> 00:24:14,380
variable that I've defined, this
9/100 plus 1/100, will

411
00:24:14,380 --> 00:24:19,760
9/100 now be equal to
9/100 plus 1/100?

412
00:24:19,760 --> 00:24:24,390
So in other words, will
this be true?

413
00:24:24,390 --> 00:24:29,100
And the answer is yes.

414
00:24:29,100 --> 00:24:36,510
And the reason is that if I'm
adding or subtracting, even

415
00:24:36,510 --> 00:24:40,080
though 1/100 we know is an
inexact representation, it's

416
00:24:40,080 --> 00:24:42,490
still the same.

417
00:24:42,490 --> 00:24:45,020
And so when we do the
subtraction, we're subtracting

418
00:24:45,020 --> 00:24:47,160
the same inexact value.
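[Editor's sketch of the subtraction check for these particular values. The variable names are assumptions. The equality holds for 9/100 and 1/100, but undoing an addition by subtracting is not guaranteed to recover the original float for arbitrary values.]
```python
one_hundredth = 1 / 100.0
nine_hundredths = 9 / 100.0
total = nine_hundredths + one_hundredth
# Subtract the same inexact 1/100 that was added.
recovered = total - one_hundredth
print(recovered == nine_hundredths)
```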

419
00:24:47,160 --> 00:24:57,605
So this appeared as a quiz
question at one point.

420
00:24:57,605 --> 00:24:59,360
It probably won't
this semester.

421
00:24:59,360 --> 00:25:04,890
But it's something
to keep in mind.

422
00:25:04,890 --> 00:25:11,780
So any questions on
floating point?

423
00:25:11,780 --> 00:25:21,680
If you're a mathy type and want
to, look up IEEE 754.

424
00:25:21,680 --> 00:25:24,060
And this will give you all
the gory details about

425
00:25:24,060 --> 00:25:27,100
representation and mathematical
operations on

426
00:25:27,100 --> 00:25:29,230
floating point.

427
00:25:29,230 --> 00:25:31,670
And if you don't, then
don't worry about it.

428
00:25:31,670 --> 00:25:34,450
It's not required
for the class.

429
00:25:34,450 --> 00:25:35,460
OK.

430
00:25:35,460 --> 00:25:41,730
So the next topic we want
to cover is pseudocode.

431
00:25:41,730 --> 00:25:45,801
So can someone take a stab at
defining pseudocode for me?

432
00:25:45,801 --> 00:25:50,220
AUDIENCE: From what I gathered,
it's basically

433
00:25:50,220 --> 00:25:54,148
you're writing out what you're
planning on doing in just

434
00:25:54,148 --> 00:25:55,621
normal English.

435
00:25:55,621 --> 00:25:57,890
PROFESSOR: I wouldn't just
say normal English.

436
00:25:57,890 --> 00:25:59,895
But it's an English of sorts.

437
00:25:59,895 --> 00:26:04,000

438
00:26:04,000 --> 00:26:07,190
And a lot of the difficulty
that programmers have with

439
00:26:07,190 --> 00:26:12,310
writing programs or new programs
is that we don't

440
00:26:12,310 --> 00:26:14,023
naturally think in computer
languages.

441
00:26:14,023 --> 00:26:16,540

442
00:26:16,540 --> 00:26:18,350
You think in English.

443
00:26:18,350 --> 00:26:22,400
Or well, you think in
a human language.

444
00:26:22,400 --> 00:26:28,080
So what pseudocode allows us to
do is to kind of be in the

445
00:26:28,080 --> 00:26:29,040
intermediate.

446
00:26:29,040 --> 00:26:32,160
We still want to develop a
step by step process for

447
00:26:32,160 --> 00:26:35,280
solving a problem, but we want
to be able to describe it in

448
00:26:35,280 --> 00:26:42,670
words and not variables
and syntax.

449
00:26:42,670 --> 00:26:45,000
Sometimes what'll happen is
programmers will get so

450
00:26:45,000 --> 00:26:47,680
wrapped around kind of getting
the syntax right that they'll

451
00:26:47,680 --> 00:26:49,710
forget the problem that they're

452
00:26:49,710 --> 00:26:51,730
actually trying to solve.

453
00:26:51,730 --> 00:26:58,150
So let's walk through
an example.

454
00:26:58,150 --> 00:27:00,050
Let's talk about pseudocode
for Hangman.

455
00:27:00,050 --> 00:27:03,590

456
00:27:03,590 --> 00:27:06,500
And because you've all done
this on the problem set, I

457
00:27:06,500 --> 00:27:08,430
don't have to explain
the rules, right?

458
00:27:08,430 --> 00:27:10,930
So what would be a good
kind of English

459
00:27:10,930 --> 00:27:12,300
first step for Hangman?

460
00:27:12,300 --> 00:27:19,715

461
00:27:19,715 --> 00:27:22,100
AUDIENCE: You have
to choose a word.

462
00:27:22,100 --> 00:27:22,670
PROFESSOR: Right.

463
00:27:22,670 --> 00:27:26,175
So let's not be too specific.

464
00:27:26,175 --> 00:27:30,410
We'll just say, select
random word.

465
00:27:30,410 --> 00:27:35,170

466
00:27:35,170 --> 00:27:36,180
OK.

467
00:27:36,180 --> 00:27:38,993
Now what would be another
good step, a next step?

468
00:27:38,993 --> 00:27:42,065

469
00:27:42,065 --> 00:27:46,871
AUDIENCE: Display the amount
of spaces maybe?

470
00:27:46,871 --> 00:27:50,352
PROFESSOR: So display a masked
version of the word.

471
00:27:50,352 --> 00:27:51,013
AUDIENCE: Exactly.

472
00:27:51,013 --> 00:27:53,328
Hide the word but display it.

473
00:27:53,328 --> 00:27:56,422
PROFESSOR: Hide the word
but display it.

474
00:27:56,422 --> 00:27:58,912
AUDIENCE: Well, display
the amount of spaces.

475
00:27:58,912 --> 00:28:03,892

476
00:28:03,892 --> 00:28:07,378
You probably want to state how
many letters are in the word

477
00:28:07,378 --> 00:28:07,890
at some point.

478
00:28:07,890 --> 00:28:09,450
PROFESSOR: Ah, that's
a good point.

479
00:28:09,450 --> 00:28:10,380
Where should that go?

480
00:28:10,380 --> 00:28:12,675
AUDIENCE: That should probably
be before the display.

481
00:28:12,675 --> 00:28:14,052
PROFESSOR: OK.

482
00:28:14,052 --> 00:28:17,630
So tell how many letters.

483
00:28:17,630 --> 00:28:23,430

484
00:28:23,430 --> 00:28:24,080
All right.

485
00:28:24,080 --> 00:28:25,977
Now what would come
after this?

486
00:28:25,977 --> 00:28:26,955
AUDIENCE: After display?

487
00:28:26,955 --> 00:28:27,933
PROFESSOR: Yeah.

488
00:28:27,933 --> 00:28:30,378
AUDIENCE: See how many letters
you have to choose from.

489
00:28:30,378 --> 00:28:31,628
PROFESSOR: OK.

490
00:28:31,628 --> 00:28:37,713

491
00:28:37,713 --> 00:28:40,647
AUDIENCE: First time
you don't.

492
00:28:40,647 --> 00:28:41,930
PROFESSOR: At first
time you don't.

493
00:28:41,930 --> 00:28:45,840
But the nice thing about
pseudocode is that we can barf

494
00:28:45,840 --> 00:28:48,895
things onto paper and then
rearrange them as we--

495
00:28:48,895 --> 00:28:51,070
It's sort of like brainstorming
in a sense.

496
00:28:51,070 --> 00:28:54,820

497
00:28:54,820 --> 00:28:57,580
You're trying to derive
the structure.

498
00:28:57,580 --> 00:29:01,480
And it's easier to do like
this than to try

499
00:29:01,480 --> 00:29:03,660
and do it in code.

500
00:29:03,660 --> 00:29:04,292
But yeah, you're right.

501
00:29:04,292 --> 00:29:06,580
You don't have to.

502
00:29:06,580 --> 00:29:10,540
AUDIENCE: You would ask the
person to put a letter.

503
00:29:10,540 --> 00:29:12,520
PROFESSOR: OK.

504
00:29:12,520 --> 00:29:14,995
For a letter.

505
00:29:14,995 --> 00:29:17,965
And then what would
come after that?

506
00:29:17,965 --> 00:29:23,410
AUDIENCE: So then you want
to check if it's the --

507
00:29:23,410 --> 00:29:24,895
PROFESSOR: Check if
it's in the word.

508
00:29:24,895 --> 00:29:32,320

509
00:29:32,320 --> 00:29:33,310
All right.

510
00:29:33,310 --> 00:29:33,850
And?

511
00:29:33,850 --> 00:29:34,590
AUDIENCE: If it is--

512
00:29:34,590 --> 00:29:37,721
PROFESSOR: If it is?

513
00:29:37,721 --> 00:29:39,589
AUDIENCE: Add it to the word.

514
00:29:39,589 --> 00:29:43,030
PROFESSOR: Add, let's say,
to correct letters guessed.

515
00:29:43,030 --> 00:29:44,280
AUDIENCE: Yeah.

516
00:29:44,280 --> 00:29:50,825

517
00:29:50,825 --> 00:29:51,811
PROFESSOR: OK.

518
00:29:51,811 --> 00:29:52,797
And if it isn't?

519
00:29:52,797 --> 00:29:55,262
AUDIENCE: If it isn't,
reject it.

520
00:29:55,262 --> 00:29:58,940

521
00:29:58,940 --> 00:30:00,305
PROFESSOR: Let's say--

522
00:30:00,305 --> 00:30:04,462
AUDIENCE: You want to remove
it from the options.

523
00:30:04,462 --> 00:30:06,960
PROFESSOR: So if it's not, then
we're going to remove

524
00:30:06,960 --> 00:30:08,210
from options.

525
00:30:08,210 --> 00:30:11,219

526
00:30:11,219 --> 00:30:12,650
So letters remaining.

527
00:30:12,650 --> 00:30:16,470

528
00:30:16,470 --> 00:30:20,370
Probably want to tell the
user they're wrong too.

529
00:30:20,370 --> 00:30:22,080
AUDIENCE: And use up a turn.

530
00:30:22,080 --> 00:30:22,970
PROFESSOR: What's that?

531
00:30:22,970 --> 00:30:25,492
AUDIENCE: You want
to use up a turn.

532
00:30:25,492 --> 00:30:26,500
PROFESSOR: I'm sorry.

533
00:30:26,500 --> 00:30:29,218
AUDIENCE: Then you
use up a turn.

534
00:30:29,218 --> 00:30:31,030
If you had a set amount
of turns.

535
00:30:31,030 --> 00:30:32,284
PROFESSOR: OK, so we're
actually going to get

536
00:30:32,284 --> 00:30:32,934
to that in a second.

537
00:30:32,934 --> 00:30:34,395
AUDIENCE: I sent last week.

538
00:30:34,395 --> 00:30:36,343
[INAUDIBLE] game.

539
00:30:36,343 --> 00:30:38,930
PROFESSOR: I actually played
all of your Hangman games.

540
00:30:38,930 --> 00:30:40,412
It was quite fun.

541
00:30:40,412 --> 00:30:45,350

542
00:30:45,350 --> 00:30:46,760
So again, yeah you're right.

543
00:30:46,760 --> 00:30:49,440
So we have a number of guesses
that are remaining.

544
00:30:49,440 --> 00:30:56,400
And the thing is that we know
that the user has a certain

545
00:30:56,400 --> 00:30:57,010
number of turns.

546
00:30:57,010 --> 00:30:59,760
So we're probably going to
repeat a lot of this.

547
00:30:59,760 --> 00:31:06,890
So at some point, we probably
want to have a WHILE.

548
00:31:06,890 --> 00:31:17,455
It'll just say, WHILE we have
guesses left remaining.

549
00:31:17,455 --> 00:31:21,190

550
00:31:21,190 --> 00:31:24,260
By the way, the reason why I
program computers is because

551
00:31:24,260 --> 00:31:26,090
my handwriting is horrible.

552
00:31:26,090 --> 00:31:29,560
So WHILE we have guesses
remaining, we're going to keep

553
00:31:29,560 --> 00:31:33,620
doing all this.

554
00:31:33,620 --> 00:31:34,430
All right?

555
00:31:34,430 --> 00:31:35,715
And then we're going
to remove--

556
00:31:35,715 --> 00:31:43,590

557
00:31:43,590 --> 00:31:47,410
But is this the only
stopping criteria?

558
00:31:47,410 --> 00:31:50,140
What if they win?

559
00:31:50,140 --> 00:31:55,630
So WHILE they have guesses
remaining and

560
00:31:55,630 --> 00:31:59,840
the word is not guessed.

561
00:31:59,840 --> 00:32:02,700

562
00:32:02,700 --> 00:32:07,570
So this is in essence your
Hangman program.

563
00:32:07,570 --> 00:32:09,560
It's English language.

564
00:32:09,560 --> 00:32:11,540
It's not easy to read because
that's my handwriting.

565
00:32:11,540 --> 00:32:13,370
But it's kind of easy
to understand it at

566
00:32:13,370 --> 00:32:14,620
an intuitive level.
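
The Hangman pseudocode built up on the board can be turned into Python fairly directly. What follows is only a rough sketch under stated assumptions, not the problem set solution: the names `choose_word` and `play_hangman`, the default of 6 wrong guesses, and the choice to pass guesses in as a list (rather than prompting the user) are all made up for illustration.

```python
import random
import string


def choose_word(words):
    """Select random word (hypothetical helper; the word list is supplied by the caller)."""
    return random.choice(words)


def play_hangman(secret_word, guesses, max_wrong=6):
    """Play one game from a prepared sequence of guesses; return True on a win."""
    correct = set()                                # correct letters guessed
    remaining = set(string.ascii_lowercase)        # letters remaining
    wrong_left = max_wrong                         # guesses remaining

    # Tell how many letters.
    print("The word has", len(secret_word), "letters.")

    guess_iter = iter(guesses)
    # WHILE guesses remaining AND the word is not guessed.
    while wrong_left > 0 and not set(secret_word) <= correct:
        # Display a masked version of the word.
        print(" ".join(c if c in correct else "_" for c in secret_word))

        # Ask for a letter (here: take the next prepared guess).
        letter = next(guess_iter, None)
        if letter is None:
            break                                  # ran out of prepared guesses
        remaining.discard(letter)                  # assumption: drop every guessed letter

        # Check if it's in the word.
        if letter in secret_word:
            correct.add(letter)
        else:
            wrong_left -= 1                        # use up a turn
            print("Sorry,", letter, "is not in the word.")

    return set(secret_word) <= correct
```

A call such as `play_hangman(choose_word(["apple", "tiger"]), "aeiou")` plays one game with the vowels as guesses. Notice that every comment in the loop is a line of the pseudocode; the structure was settled before any syntax was written.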

567
00:32:14,620 --> 00:32:16,990

568
00:32:16,990 --> 00:32:19,990
And the reason we're talking
about this is because we're

569
00:32:19,990 --> 00:32:24,160
going to get to some more
complicated programs as we

570
00:32:24,160 --> 00:32:26,670
move through the semester.

571
00:32:26,670 --> 00:32:30,140
And a good starting off point
for a lot of you, when you're

572
00:32:30,140 --> 00:32:33,190
trying to do your problem sets,
is instead of trying to

573
00:32:33,190 --> 00:32:36,200
jump right into the coding
portion of it, to sit down

574
00:32:36,200 --> 00:32:39,540
with a piece of paper, index
cards, or a whiteboard and

575
00:32:39,540 --> 00:32:41,460
kind of sketch out a high level
view of the algorithm.

576
00:32:41,460 --> 00:32:46,600

577
00:32:46,600 --> 00:32:52,250
So that we can see this
in code form.

578
00:32:52,250 --> 00:32:57,760
So let's say that I want to
write a function that tests a

579
00:32:57,760 --> 00:33:01,490
number to see if it's prime.

580
00:33:01,490 --> 00:33:07,225
First question is, what
is a prime number?

581
00:33:07,225 --> 00:33:10,720
AUDIENCE: One where the only
factors are 1 and itself.

582
00:33:10,720 --> 00:33:11,370
PROFESSOR: Right.

583
00:33:11,370 --> 00:33:16,390
So a number that is only
divisible by itself and 1.

584
00:33:16,390 --> 00:33:20,040

585
00:33:20,040 --> 00:33:21,990
Are even numbers prime?

586
00:33:21,990 --> 00:33:24,040
Can they ever be prime?

587
00:33:24,040 --> 00:33:25,100
Really?

588
00:33:25,100 --> 00:33:26,220
What about 2?

589
00:33:26,220 --> 00:33:27,250
Right.

590
00:33:27,250 --> 00:33:29,630
So 2 is one of our
special cases.

591
00:33:29,630 --> 00:33:30,540
All right.

592
00:33:30,540 --> 00:33:36,550
So what would be maybe a good
starting off point for

593
00:33:36,550 --> 00:33:41,935
pseudocode to test primality,
knowing those facts.

594
00:33:41,935 --> 00:33:44,645

595
00:33:44,645 --> 00:33:47,020
AUDIENCE: [INAUDIBLE]

596
00:33:47,020 --> 00:33:49,460
PROFESSOR: All right.

597
00:33:49,460 --> 00:33:51,930
Can I erase this or should
I leave it up?

598
00:33:51,930 --> 00:33:53,180
Because I can go over there.

599
00:33:53,180 --> 00:33:59,570

600
00:33:59,570 --> 00:34:03,276
It's not like I'm erasing
any deep dark secrets.

601
00:34:03,276 --> 00:34:06,190
There's no magic here.

602
00:34:06,190 --> 00:34:19,510
All right so test number, if,
what, equal to, say what?

603
00:34:19,510 --> 00:34:21,215
2?

604
00:34:21,215 --> 00:34:21,560
Yeah.

605
00:34:21,560 --> 00:34:23,500
Why not?

606
00:34:23,500 --> 00:34:24,750
And maybe 3.

607
00:34:24,750 --> 00:34:29,030

608
00:34:29,030 --> 00:34:31,679
Now what do I do if it is?

609
00:34:31,679 --> 00:34:37,595

610
00:34:37,595 --> 00:34:38,845
Are they prime?

611
00:34:38,845 --> 00:34:40,920

612
00:34:40,920 --> 00:34:42,210
So I'm done, right?

613
00:34:42,210 --> 00:34:46,095
So I'm going to return true.

614
00:34:46,095 --> 00:34:48,630

615
00:34:48,630 --> 00:34:54,774
Now what do I do if the number
given is not 2 or 3?

616
00:34:54,774 --> 00:34:57,204
AUDIENCE: [INAUDIBLE]

617
00:34:57,204 --> 00:34:59,710
PROFESSOR: You're talking about
the modulo operator.

618
00:34:59,710 --> 00:35:00,090
Right.

619
00:35:00,090 --> 00:35:02,510
So we will use that.

620
00:35:02,510 --> 00:35:06,200
That will tell us whether or not
an integer divides evenly

621
00:35:06,200 --> 00:35:09,232
into another integer or the
remainder after an integer is

622
00:35:09,232 --> 00:35:10,594
divided into another integer.
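
As a small illustration of the modulo operator, which Python spells `%`: the remainder is 0 exactly when one integer divides evenly into another. The helper name `divides_evenly` is invented for this sketch.

```python
def divides_evenly(divisor, n):
    """Return True if divisor goes into n with no remainder."""
    return n % divisor == 0


print(10 % 5)                 # 0, so 5 divides 10 evenly
print(7 % 5)                  # 2, the remainder after dividing 7 by 5
print(divides_evenly(5, 10))  # True
print(divides_evenly(5, 7))   # False
```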

623
00:35:10,594 --> 00:35:13,290

624
00:35:13,290 --> 00:35:14,810
Let me ask you another
question.

625
00:35:14,810 --> 00:35:19,165
What is the maximum value of
an integer divisor for a

626
00:35:19,165 --> 00:35:20,320
non-prime number?

627
00:35:20,320 --> 00:35:22,122
So for a composite number.

628
00:35:22,122 --> 00:35:23,550
AUDIENCE: So itself.

629
00:35:23,550 --> 00:35:24,555
PROFESSOR: What's that?

630
00:35:24,555 --> 00:35:27,861
AUDIENCE: Number itself.

631
00:35:27,861 --> 00:35:30,060
PROFESSOR: OK, excluding
the number itself.

632
00:35:30,060 --> 00:35:33,740

633
00:35:33,740 --> 00:35:39,110
Well let's say that I have n
as the number I'm testing,

634
00:35:39,110 --> 00:35:43,300
square root of n, because I'm
not going to have a factor

635
00:35:43,300 --> 00:35:46,100
that's larger than that.

636
00:35:46,100 --> 00:35:50,130
And I ask that because there's
a loop involved.

637
00:35:50,130 --> 00:35:54,945
So how would I go about
this systematically?

638
00:35:54,945 --> 00:36:02,000
I'd probably start
at, let's say 5.

639
00:36:02,000 --> 00:36:05,030

640
00:36:05,030 --> 00:36:05,930
OK.

641
00:36:05,930 --> 00:36:18,100
And then test if n modulo, let's
say 5, is equal to 0.

642
00:36:18,100 --> 00:36:23,250
Now if n is evenly divisible by
5, then that must mean that

643
00:36:23,250 --> 00:36:27,010
n is composite, because
5 is a factor.

644
00:36:27,010 --> 00:36:36,065
So if it is then return false.

645
00:36:36,065 --> 00:36:39,130

646
00:36:39,130 --> 00:36:45,690
Now what if it isn't?

647
00:36:45,690 --> 00:36:48,490
So that means that n is not
evenly divisible by 5.

648
00:36:48,490 --> 00:36:52,760
Does that mean that the number's
automatically prime?

649
00:36:52,760 --> 00:36:55,527
So after 5, what would
be a good number to

650
00:36:55,527 --> 00:36:56,777
test, to move to?

651
00:36:56,777 --> 00:37:02,450

652
00:37:02,450 --> 00:37:04,170
All right.

653
00:37:04,170 --> 00:37:05,840
6?

654
00:37:05,840 --> 00:37:06,900
No.

655
00:37:06,900 --> 00:37:08,090
That wouldn't be it.

656
00:37:08,090 --> 00:37:15,790
Because if 6 is a factor,
then obviously it's not.

657
00:37:15,790 --> 00:37:17,840
Whatever.

658
00:37:17,840 --> 00:37:19,250
So we're going to
move on to 7.

659
00:37:19,250 --> 00:37:25,370

660
00:37:25,370 --> 00:37:29,210
So basically we're going to
test all the odd numbers.

661
00:37:29,210 --> 00:37:31,540
And this is going to be
the same as that.

662
00:37:31,540 --> 00:37:34,520
So this repetition indicates
here that I

663
00:37:34,520 --> 00:37:36,730
probably need a loop.

664
00:37:36,730 --> 00:37:50,390
So instead of doing this, I want
to say x is equal to 5

665
00:37:50,390 --> 00:37:57,075
while x is less than--

666
00:37:57,075 --> 00:38:00,600

667
00:38:00,600 --> 00:38:10,340
We're going to test if
x evenly divides n.

668
00:38:10,340 --> 00:38:16,100

669
00:38:16,100 --> 00:38:23,280
And if it does, return false.

670
00:38:23,280 --> 00:38:36,740
And if it doesn't, then we just
increment x and repeat.

671
00:38:36,740 --> 00:38:40,440
And what happens when x
becomes greater than

672
00:38:40,440 --> 00:38:43,420
square root of n?

673
00:38:43,420 --> 00:38:45,330
Well the WHILE loop's
going to stop.

674
00:38:45,330 --> 00:38:48,510
And that also means that if I've
made it to that point,

675
00:38:48,510 --> 00:38:52,360
then I've not found any numbers
between 5 and square

676
00:38:52,360 --> 00:38:55,960
root of n that will
evenly divide n.

677
00:38:55,960 --> 00:39:03,160
So that means that n is prime.

678
00:39:03,160 --> 00:39:28,020
So if I translate this into
code, it would look

679
00:39:28,020 --> 00:39:29,270
something like this.

680
00:39:29,270 --> 00:39:33,350

681
00:39:33,350 --> 00:39:38,700
Now I see.

682
00:39:38,700 --> 00:39:42,420

683
00:39:42,420 --> 00:39:44,490
So first we're going to
check if n is less

684
00:39:44,490 --> 00:39:46,800
than or equal to 3.

685
00:39:46,800 --> 00:39:50,370
If it's 2 or 3, then
we'll return true.

686
00:39:50,370 --> 00:39:54,940
If it's not 2 or 3, then
that means it's 1 or 0.

687
00:39:54,940 --> 00:39:57,040
So return false.

688
00:39:57,040 --> 00:40:00,030
So we've got those cases.

689
00:40:00,030 --> 00:40:01,440
And then we're going
to iterate--

690
00:40:01,440 --> 00:40:04,950
or if n is greater than 3,
we're going to iterate--

691
00:40:04,950 --> 00:40:07,554

692
00:40:07,554 --> 00:40:11,450
now why would you go from 2--

693
00:40:11,450 --> 00:40:13,410
we're going to iterate through
all the possible

694
00:40:13,410 --> 00:40:18,120
divisors and check
for divisibility.

695
00:40:18,120 --> 00:40:20,790

696
00:40:20,790 --> 00:40:23,760
And if we evenly divide
it, return false.

697
00:40:23,760 --> 00:40:25,920
And if we make it through the
loop, we'd return true.

698
00:40:25,920 --> 00:40:29,399

699
00:40:29,399 --> 00:40:33,240
AUDIENCE: Does that RETURN
stop the loop?

700
00:40:33,240 --> 00:40:36,120
PROFESSOR: Yes.

701
00:40:36,120 --> 00:40:39,000
Well think about what
RETURN is doing.

702
00:40:39,000 --> 00:40:42,800
You're in this function, test
primality, and as soon as

703
00:40:42,800 --> 00:40:47,440
Python sees return, that's
telling Python to kick out of

704
00:40:47,440 --> 00:40:49,990
the function and return
whatever is

705
00:40:49,990 --> 00:40:51,080
after the return statement.

706
00:40:51,080 --> 00:40:55,250
So this false here, it says
return false, that means that

707
00:40:55,250 --> 00:40:57,220
it doesn't matter where you are,
it's just going to kick

708
00:40:57,220 --> 00:40:59,830
out of the innermost function or
the function that encloses

709
00:40:59,830 --> 00:41:01,290
that return and return
that value.

710
00:41:01,290 --> 00:41:06,130

711
00:41:06,130 --> 00:41:07,380
Any questions?

712
00:41:07,380 --> 00:41:09,255

713
00:41:09,255 --> 00:41:10,505
All right.

714
00:41:10,505 --> 00:41:21,630

715
00:41:21,630 --> 00:41:27,220
So testing primality, 1 is
false, 2 is true, 3 is true, 4

716
00:41:27,220 --> 00:41:30,760
is false, and 5 is true.

717
00:41:30,760 --> 00:41:34,730
So it looks like the
program works.

718
00:41:34,730 --> 00:41:37,440
And if no one has any questions,
I'm going to move

719
00:41:37,440 --> 00:41:38,690
on to the last major topic.

720
00:41:38,690 --> 00:41:43,720

721
00:41:43,720 --> 00:41:45,879
Everyone's good on pseudocode?

722
00:41:45,879 --> 00:41:47,875
All right.

723
00:41:47,875 --> 00:41:51,867
AUDIENCE: Would the main
purpose of pseudocode be for

724
00:41:51,867 --> 00:41:53,863
yourself when you're writing a
program or when you want to

725
00:41:53,863 --> 00:41:55,859
explain it to other people?

726
00:41:55,859 --> 00:41:58,770
PROFESSOR: Both.

727
00:41:58,770 --> 00:42:01,670
So the question was, is writing
pseudocode useful for

728
00:42:01,670 --> 00:42:03,750
just understanding a program
yourself or for explaining it

729
00:42:03,750 --> 00:42:05,390
to other people?

730
00:42:05,390 --> 00:42:06,640
The answer is both.

731
00:42:06,640 --> 00:42:11,000

732
00:42:11,000 --> 00:42:11,640
I don't know.

733
00:42:11,640 --> 00:42:15,430
It's the difference between
showing someone the derivative

734
00:42:15,430 --> 00:42:18,060
of the function and then
explaining that what you're

735
00:42:18,060 --> 00:42:21,580
doing is finding a function that
gives you the slope of a

736
00:42:21,580 --> 00:42:23,620
function at that point.

737
00:42:23,620 --> 00:42:27,590
So one is more
intuitive for some

738
00:42:27,590 --> 00:42:30,415
people than the other.

739
00:42:30,415 --> 00:42:31,920
A mathematician would
understand the

740
00:42:31,920 --> 00:42:33,910
former pretty quickly.

741
00:42:33,910 --> 00:42:36,720
An English major would
understand the latter maybe.

742
00:42:36,720 --> 00:42:42,440

743
00:42:42,440 --> 00:42:47,850
So when I explain my research
to people, I don't tell them

744
00:42:47,850 --> 00:42:52,530
that I mess around with Gaussian
Mixture models and

745
00:42:52,530 --> 00:42:54,310
Hidden Markov models.

746
00:42:54,310 --> 00:42:56,980
I tell them that I'm trying
to figure out how people

747
00:42:56,980 --> 00:43:00,870
mispronounce words when they
speak foreign languages.

748
00:43:00,870 --> 00:43:04,350
A lot easier for people
to digest.

749
00:43:04,350 --> 00:43:07,500
With debugging, what are bugs?

750
00:43:07,500 --> 00:43:09,000
AUDIENCE: Mistakes.

751
00:43:09,000 --> 00:43:10,340
PROFESSOR: Mistakes.

752
00:43:10,340 --> 00:43:15,615
And if you see one bug, there
are probably many more.

753
00:43:15,615 --> 00:43:18,310

754
00:43:18,310 --> 00:43:24,520
So when you're debugging, your
goal is not to move quickly.

755
00:43:24,520 --> 00:43:29,320
This is an instance where the
maxim fast is slow and slow is

756
00:43:29,320 --> 00:43:31,890
fast comes into play.

757
00:43:31,890 --> 00:43:34,560
You want to be very deliberate
and systematic when you're

758
00:43:34,560 --> 00:43:37,380
trying to debug code.

759
00:43:37,380 --> 00:43:39,360
You want to ask the question
why your code is

760
00:43:39,360 --> 00:43:41,610
doing what it does.

761
00:43:41,610 --> 00:43:46,720
And remember, in the first
recitation I said that your

762
00:43:46,720 --> 00:43:48,110
computer's not going to do
anything that you do

763
00:43:48,110 --> 00:43:49,360
not tell it to do.

764
00:43:49,360 --> 00:43:55,420

765
00:43:55,420 --> 00:43:59,770
It's not something that
people do naturally.

766
00:43:59,770 --> 00:44:04,370
If you watch some of the TAs and
sometimes a student will

767
00:44:04,370 --> 00:44:06,860
say, how do you find
the bug so quickly?

768
00:44:06,860 --> 00:44:11,590
Well it's because I've been
programming for 18 years.

769
00:44:11,590 --> 00:44:12,860
Professor Guttag's
been programming

770
00:44:12,860 --> 00:44:14,030
for longer than that.

771
00:44:14,030 --> 00:44:16,430
So a lot of it is experience.

772
00:44:16,430 --> 00:44:21,050
And it's just when we've debugged
our own programs and when we

773
00:44:21,050 --> 00:44:24,100
were learning to program, it
was as painful for us as it

774
00:44:24,100 --> 00:44:25,620
was for you.

775
00:44:25,620 --> 00:44:31,990
So that said, you want to start
with asking, how could

776
00:44:31,990 --> 00:44:36,070
your code have produced the
output that it did?

777
00:44:36,070 --> 00:44:40,200
Then you want to figure out
some experiments that are

778
00:44:40,200 --> 00:44:45,000
repeatable and that you have
an idea of what the

779
00:44:45,000 --> 00:44:47,020
output should be.

780
00:44:47,020 --> 00:44:54,920
So after you do that, then you
want to test your code one by

781
00:44:54,920 --> 00:45:00,510
one on these different test
cases and see what it does.

782
00:45:00,510 --> 00:45:02,020
And in order to see
what it does, you

783
00:45:02,020 --> 00:45:04,390
can use a print statement.

784
00:45:04,390 --> 00:45:10,460
So when you think you found a
bug and you think you have a

785
00:45:10,460 --> 00:45:15,220
solution to your code, you want
to make as few changes as

786
00:45:15,220 --> 00:45:20,500
possible at a time, because
as you're making

787
00:45:20,500 --> 00:45:22,190
corrections, you can still
introduce bugs.

788
00:45:22,190 --> 00:45:28,230

789
00:45:28,230 --> 00:45:29,870
Let's see.

790
00:45:29,870 --> 00:45:32,460
So a useful way to do this
is to use a test harness.

791
00:45:32,460 --> 00:45:35,340
So when we actually grade your
problem sets, a lot of the

792
00:45:35,340 --> 00:45:39,270
time the TAs will put together
a set of test

793
00:45:39,270 --> 00:45:41,840
cases for your code.

794
00:45:41,840 --> 00:45:46,020
So one of the things is a lot of
the times when you get one

795
00:45:46,020 --> 00:45:48,130
of the problems or when you look
at the problems, it'll

796
00:45:48,130 --> 00:45:51,430
have some example input
and output.

797
00:45:51,430 --> 00:45:53,250
But that doesn't necessarily
mean that we

798
00:45:53,250 --> 00:45:55,060
only test on that.

799
00:45:55,060 --> 00:45:57,250
There's additional test
cases that we use.

800
00:45:57,250 --> 00:45:58,630
And it's not to trip you up.

801
00:45:58,630 --> 00:46:00,615
It's because there's a lot
of different variations.

802
00:46:00,615 --> 00:46:03,400

803
00:46:03,400 --> 00:46:07,470
And it's also, if you read the
specification, you follow the

804
00:46:07,470 --> 00:46:12,250
specification, then
you'll be fine.

805
00:46:12,250 --> 00:46:13,420
Moving on.

806
00:46:13,420 --> 00:46:15,100
So let's look at an example.

807
00:46:15,100 --> 00:46:18,570
I have a function here,
is_palindrome.

808
00:46:18,570 --> 00:46:21,580
You've seen this
before, right?

809
00:46:21,580 --> 00:46:21,880
Yes?

810
00:46:21,880 --> 00:46:22,750
Yeah.

811
00:46:22,750 --> 00:46:23,940
OK.

812
00:46:23,940 --> 00:46:28,750
So it's supposed to return
true if string s is a

813
00:46:28,750 --> 00:46:30,340
palindrome.

814
00:46:30,340 --> 00:46:32,370
And so I've written
this function.

815
00:46:32,370 --> 00:46:34,920
And I've also written
a test harness.

816
00:46:34,920 --> 00:46:38,130
Now there's a lot more code in
the test harness, but it's

817
00:46:38,130 --> 00:46:39,380
pretty simple code.

818
00:46:39,380 --> 00:46:44,040

819
00:46:44,040 --> 00:46:48,300
When you're writing functions,
you want to think of the type

820
00:46:48,300 --> 00:46:49,970
of input you could receive.

821
00:46:49,970 --> 00:46:54,680
And you want to think
of, what are the

822
00:46:54,680 --> 00:46:55,520
kind of boundary cases?

823
00:46:55,520 --> 00:46:57,940
So the extremes of input
that you can get.

824
00:46:57,940 --> 00:47:01,000
We call these boundary
cases, edge cases.

825
00:47:01,000 --> 00:47:04,150
For the is_palindrome function,
it would be like the

826
00:47:04,150 --> 00:47:07,960
empty string would be one.

827
00:47:07,960 --> 00:47:11,850
Or just a single character.

828
00:47:11,850 --> 00:47:15,600
These are the kind of minimum
we can have or

829
00:47:15,600 --> 00:47:16,880
we could think of.

830
00:47:16,880 --> 00:47:21,270
On the opposite end of the
spectrum, theoretically we

831
00:47:21,270 --> 00:47:23,920
could have an infinitely
long string.

832
00:47:23,920 --> 00:47:26,350
So we're not going to
actually test for an

833
00:47:26,350 --> 00:47:29,160
infinitely long string.

834
00:47:29,160 --> 00:47:32,710
Anyway, all we're going to do is
in our test harness, we're

835
00:47:32,710 --> 00:47:35,320
just going to run the function
on these inputs.

836
00:47:35,320 --> 00:47:38,740
And we know that an empty
string should be true.

837
00:47:38,740 --> 00:47:40,010
We know that a single character

838
00:47:40,010 --> 00:47:41,260
string should be true.

839
00:47:41,260 --> 00:47:43,640

840
00:47:43,640 --> 00:47:47,630
We know that if I have a string
that's two characters

841
00:47:47,630 --> 00:47:49,150
long and they're the
same character,

842
00:47:49,150 --> 00:47:51,140
that should be true.

843
00:47:51,140 --> 00:47:53,100
If they're two different characters,
then it should be false.

844
00:47:53,100 --> 00:47:55,620

845
00:47:55,620 --> 00:47:59,694
And what I'm going to do now
is I'm looking at kind of

846
00:47:59,694 --> 00:48:01,410
what we call
expected input.

847
00:48:01,410 --> 00:48:05,710
So after I've hit my edge cases,
I'm going to look at

848
00:48:05,710 --> 00:48:10,250
all the strings of an even
length and make sure that the

849
00:48:10,250 --> 00:48:11,890
function works properly.

850
00:48:11,890 --> 00:48:15,120
And then I'm going to look at
strings with an odd length.

851
00:48:15,120 --> 00:48:19,490
And then once I get to this
point where I've tested a

852
00:48:19,490 --> 00:48:22,140
number of different lengths, and
in this case, it's just 2

853
00:48:22,140 --> 00:48:24,470
through 5 or 0 through
5, if you want to

854
00:48:24,470 --> 00:48:26,630
include the edge cases.

855
00:48:26,630 --> 00:48:29,050
Then I'm going to say, well it
looks like all tests passed.

856
00:48:29,050 --> 00:48:32,210
And I think that this function
works pretty good for anything

857
00:48:32,210 --> 00:48:35,520
we can expect it to encounter
reasonably.
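
A test harness along the lines just described might look like the sketch below. It is not the harness from the recitation, which the transcript does not show: the test strings, the name `test_is_palindrome`, and this particular recursive `is_palindrome` are assumptions chosen to cover the edge cases (lengths 0 and 1) and then even and odd lengths up to 5.

```python
def is_palindrome(s):
    """Return True if string s reads the same forwards and backwards."""
    if len(s) <= 1:
        return True                      # empty and one-character strings
    # Compare the outer characters, then recurse on what lies between them.
    return s[0] == s[-1] and is_palindrome(s[1:-1])


def test_is_palindrome():
    """Run is_palindrome on known inputs; return the number of failures."""
    cases = [
        ("", True), ("a", True),         # edge cases
        ("aa", True), ("ab", False),     # length 2
        ("aba", True), ("abc", False),   # length 3
        ("abba", True), ("abca", False), # length 4
        ("abcba", True), ("abcda", False),  # length 5
    ]
    failures = 0
    for text, expected in cases:
        actual = is_palindrome(text)
        if actual != expected:
            failures += 1
            print("FAILED:", repr(text), "expected", expected, "got", actual)
    if failures == 0:
        print("All tests passed")
    return failures


test_is_palindrome()
```

Because the harness is a single function call, it is cheap to rerun after every change to the code under test.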

858
00:48:35,520 --> 00:48:41,540
So the way that you use test
harnesses, is every time you

859
00:48:41,540 --> 00:48:44,060
make a change to your program,
you want to run the test

860
00:48:44,060 --> 00:48:47,950
harness, because that'll catch
any bugs you may have

861
00:48:47,950 --> 00:48:48,870
introduced.

862
00:48:48,870 --> 00:48:50,700
And so I'm going to finish up
with this really quickly

863
00:48:50,700 --> 00:48:53,120
because I know my time's up.

864
00:48:53,120 --> 00:48:56,000

865
00:48:56,000 --> 00:48:57,730
So I got a bug.

866
00:48:57,730 --> 00:49:01,910
It's telling me that one of
my test cases failed.

867
00:49:01,910 --> 00:49:08,340
So line 299, which is
this test case.

868
00:49:08,340 --> 00:49:16,430
So what we can do is now that we
know that it fails, we can

869
00:49:16,430 --> 00:49:26,270
say, maybe print out our input
and see what we have.

870
00:49:26,270 --> 00:49:33,150
And instead of just running, I'm
just going to run that one

871
00:49:33,150 --> 00:49:34,400
test case that failed.

872
00:49:34,400 --> 00:49:46,450

873
00:49:46,450 --> 00:49:49,330
So obviously this
should be true.

874
00:49:49,330 --> 00:49:51,760
And what we're seeing is that
on the first call to

875
00:49:51,760 --> 00:49:55,410
is_palindrome, s is abba.

876
00:49:55,410 --> 00:50:02,120
And then on the recursive call
to it, we only get bba.

877
00:50:02,120 --> 00:50:03,330
That means that we've
only chopped

878
00:50:03,330 --> 00:50:05,920
off the front character.

879
00:50:05,920 --> 00:50:07,170
So you see the bug?

880
00:50:07,170 --> 00:50:11,350

881
00:50:11,350 --> 00:50:12,600
Well here.

882
00:50:12,600 --> 00:50:15,010

883
00:50:15,010 --> 00:50:18,330
So there we go.
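
The buggy source is not visible in the transcript, but the trace (first `abba`, then `bba`) is consistent with a recursive call that slices off only the first character. The sketch below assumes that is the bug: `is_palindrome_buggy` and `is_palindrome_fixed` are hypothetical names, and the `print` call is the kind of debugging statement described above.

```python
def is_palindrome_buggy(s):
    """Assumed reconstruction of the bug: the recursion drops only the front character."""
    print("is_palindrome called with", repr(s))   # debugging print
    if len(s) <= 1:
        return True
    return s[0] == s[-1] and is_palindrome_buggy(s[1:])     # bug: s[1:] keeps the last char


def is_palindrome_fixed(s):
    """Same function with both ends removed before recursing."""
    if len(s) <= 1:
        return True
    return s[0] == s[-1] and is_palindrome_fixed(s[1:-1])   # fix: s[1:-1]


print(is_palindrome_buggy("abba"))   # trace shows 'abba' then 'bba'; result is False
print(is_palindrome_fixed("abba"))   # True
```

The fix is a single-character-range change, which fits the advice to make as few changes as possible at a time and then rerun the test harness.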

884
00:50:18,330 --> 00:50:23,273