Mr. White, a
teacher at Colorful Academy, gave his class a pop quiz comprised of 20 true or
false questions. In an attempt to thwart
cheating, he gave two versions of the quiz in which the questions were the same
but in a different order. As he graded one
particularly bad quiz, he noticed that 10 answers in a row matched the answer
key of the wrong version. He ran to the teacher lounge and asked his fellow
colleagues what the probability of randomly matching 10 questions in a row on a
20 question true or false quiz was, and if they thought the student had cheated.
Like a
typical faculty, no teacher could agree.
“I wouldn’t get too worked up,” blurted out Mr. Red. “Since every true or false question has a 50%
success rate, it is highly probable for a guessing student to get half of the
questions to match the wrong version.”
Mrs. Yellow
disagreed. “Since the questions are all
in a row, the 50% chance should be compounded, so there is a (½)10
or 0.1% chance of this happening, which is highly unlikely.”
“I think Mr.
Red’s guess is too high,” Mr. Orange said, “since he did not consider that the
10 questions that matched were all in a row, and I think Mrs. Yellow’s guess is
too low since she did not consider the other 10 questions that are not in a
row. So I’ll split the difference and
say that there is a 50% chance that the student was cheating, which is of some
concern but not outside the realm of likelihood.”
Mrs. Green shook
her head. “Splitting the difference is
not very mathematical,” she said. “I
agree with Mrs. Yellow that 10 questions in a row would be a (½)10
or 0.1% chance, but since this pattern could start at any number between #1 and
#11, the probability should be multiplied by 11, which means that there is
approximately an 11(½)10 or 1% chance that the student was not
cheating. In other words, there is a 99%
chance that the student was cheating.”
“Isn’t there
some formula with factorials that should be used for this?” Mrs. Blue questioned as she pulled out her
phone and started scrolling. “Here it
is: ‘For a binomial experiment consisting of n trials, the probability of
exactly k successes is nCkpk(1 – p)n – k.’ So in this case, we want 10 successes out of
20 trials with a probability of 50%, which is,” she paused as she used her
calculator app, “20C10 (0.5)10(1 – 0.5)20
– 10 which equals 1.76 x 10-19% or nearly 0%. Either way
you look at it, the student was cheating.”
“Let’s not
jump to conclusions!” cried Mr. Purple.
“While you were all talking I wrote down possible combinations on this
napkin, using an ‘O’ to note a match and an ‘X’ to note a miss. Now, it became quickly apparent that all the
combinations for a 20 question quiz would not fit on the napkin, so I wrote
down all the combinations for a 4 question quiz instead, and then underlined
all the instances of getting half or 2 matches in a row. Here’s what I found.” Mr. Purple held up a napkin with the
following scribbled on it: OOOO, OOOX,
OOXO, OOXX, OXOO, OXOX, OXXO, OXXX, XOOO, XOOX,
XOXO, XOXX, XXOO, XXOX, XXXO, XXXX. “As you can see, there are 8 instances out of
16 which meet the criteria of 2 matches in a row, which is a 50% chance. So Mr. Orange was right after all.
And so the dialogue
continued, with every teacher choosing sides and every teacher expressing his
or her own opinion. Finally, the
computer teacher, Mr. Black, volunteered to write a computer program using Mr.
Purple’s method of counting combinations, but for all 20 questions instead of just
4 questions. It was finally agreed that
the outcome of such a program would yield the correct answer.
Being a
computer teacher, Mr. Black changed Mr. Purple’s O’s to 0’s and X’s to 1’s so
that the combinations could be expressed as binary numbers instead. This gave the added bonus that he could just
have the computer program count from 0 to 2^20 in binary and be assured of getting
all the possible combinations. He could
then use a built-in substring function and count all the instances of 10 0’s in
a row. Here’s what his computer program
looked like:
01
|
#The Suspicious Quiz Solution
|
02
|
#Python 2.7.3
|
03
|
|
04
|
#Number of questions on the quiz.
|
05
|
intQuestions = 20
|
06
|
|
07
|
#Number of matching questions in a row.
|
08
|
intInARow = 10
|
09
|
|
10
|
#Number of the instances which meets the criteria
of having the
|
11
|
# correct number (or more) of matching questions
in a row,
|
12
|
# starting at 0 and to be added to later.
|
13
|
intCount = 0
|
14
|
|
15
|
#Create a string of 0's the same length as the
number of
|
16
|
# matching questions that are in a row.
|
17
|
strInARow = ""
|
18
|
for i in range(0, intInARow):
|
19
|
strInARow = strInARow + "0"
|
20
|
|
21
|
#Create an array of all possible quiz
combinations.
|
22
|
strEvents = []
|
23
|
for i in range(0, 2 **
intQuestions):
|
24
|
#Change
the number to a binary string with leading zeroes.
|
25
|
strEvents.append(bin(i)[2:].zfill(intQuestions))
|
26
|
|
27
|
#If the
string of 0's can be found in the current array
|
28
|
# item,
add it to the total count.
|
29
|
if (strInARow in strEvents[i]):
|
30
|
intCount += 1
|
31
|
|
32
|
#Display the total count and percentage.
|
33
|
print "Getting at least
" + str(intInARow) + " matches in " + (
|
34
|
"a row on a ") + str(intQuestions)
+ " question quiz: "
|
35
|
print str(intCount) +
"/" + str(2 ** intQuestions) + " (" + (
|
36
|
str(100.0 * intCount / 2 **
intQuestions)) + "%)"
|
and here is
the outcome after running the computer program:
>>
|
Getting at least 10 matches in
a row on a 20 question quiz:
|
>>
|
6144/1048576 (0.5859375%)
|
With his
computer program, Mr. Black discovered that there were 6,144 instances out of a
total of 1,048,576 true or false combinations in which there are at least 10
matching questions in a row on a 20 question quiz, or about a 0.6% probability. In other words, there was about a 99.4%
chance that the student had cheated. Mr.
White decided to investigate the matter further.
The above
probability of matching at least 10 questions in a row to the wrong answer key for
20 true or false questions is the same for a few other situations as well, such
as the probability of correctly answering at least 10 questions in a row on 20
true or false questions by randomly guessing, or getting at least 10 heads in a
row on 20 coin tosses.
Unfortunately,
the formula is limited and complex. If r
is the number of successes in a row, and k is the total number of trials, and c
is the number of outcomes each trial has, then the probability p can be
represented as:
Therefore,
in our above scenario, r = 10, k = 20, and c
= 2, so p = (1 + ∑n=110(2∙2n–1∙1 + (n –
1)2n – 2∙12)) / 220, or 6144/1048576 ≈ 0.6%. For Mr. Purple’s napkin calculation, r = 2, k
= 4, and c = 2, so p = (1 + ∑n=12(2∙2n–1∙1 +
(n – 1)2n – 2∙12)) / 24 = (1 + 2 + 5)/16 =
8/16 = 50%.
To derive
the above formula, it is helpful to first look at the number of successful
occurrences of exactly all trials, then the number of successful occurrences of
exactly all but one trial, then exactly all but two, then exactly all but
three, and so on, and generalize a pattern.
The number of successful occurrences of exactly all trials, or exactly 20
in a row out of 20, is, of course, 1:
OOOOOOOOOOOOOOOOOOOO
Exactly all
but 1 trial, or exactly 19 in a row out of 20, looks like this:
OOOOOOOOOOOOOOOOOOOX
XOOOOOOOOOOOOOOOOOOO
where “X”
represents all choices but “O” (otherwise it would be more than 19 in a row). The two X’s can be represented as 2(c –
1). When there are 2 choices, like on a
true or false question, c = 2, and exactly 19 in a row out of 20 would be 2(2 –
1) = 2 successful occurrences.
Exactly all
but 2 trials, or exactly 18 in a row out of 20, looks like this:
OOOOOOOOOOOOOOOOOOXX
OOOOOOOOOOOOOOOOOOXO
XOOOOOOOOOOOOOOOOOOX
OXOOOOOOOOOOOOOOOOOO
XXOOOOOOOOOOOOOOOOOO
To save
space, we will introduce the “#” symbol which can represent any choice, so
exactly 18 in a row out of 20 can be re-written as:
OOOOOOOOOOOOOOOOOOX#
XOOOOOOOOOOOOOOOOOOX
#XOOOOOOOOOOOOOOOOOO
The first
and last lines have 1 X (c – 1) and 1 # (c) each, for a total of 2c(c – 1)
choices, and the middle line has 2 X’s or (c – 1)2 choices, which is
2c(c – 1) + (c – 1)2. When c
= 2, there are 2(2)(2 – 1) + (2 – 1)2 = 5 successful occurrences.
Exactly all
but 3 trials, or exactly 17 in a row out of 20, looks like this:
OOOOOOOOOOOOOOOOOX##
XOOOOOOOOOOOOOOOOOX#
#XOOOOOOOOOOOOOOOOOX
##XOOOOOOOOOOOOOOOOO
The first
and last lines have 1 X (c – 1) and 2 #’s (c2) each, for a total of
2c2(c – 1) choices, and the middle 2 lines have 2 X’s ((c – 1)2)
and 1 # (c) each, for a total of 2c(c – 1)2 + choices, which is 2c2(c
– 1) + 2c(c – 1)2. When c =
2, there are 2(22)(2 – 1) + 2(2)(2 – 1)2 = 12 successful
occurrences.
Exactly all
but 4 trials, or exactly 16 in a row out of 20, looks like this:
OOOOOOOOOOOOOOOOX###
XOOOOOOOOOOOOOOOOX##
#XOOOOOOOOOOOOOOOOX#
##XOOOOOOOOOOOOOOOOX
###XOOOOOOOOOOOOOOOO
The first
and last lines have 1 X (c – 1) and 3 #’s (c3) each, for a total of
2c3(c – 1) choices, and the middle 3 lines have 2 X’s ((c – 1)2)
and 2 #’s (c2) each, for a total of 3c2(c – 1)2
+ choices, which is 2c3(c – 1) + 3c2(c – 1)2. When c = 2, there are 2(23)(2 – 1)
+ 3(22)(2 – 1)2 = 28 successful occurrences.
Exactly all
but 5 trials, or exactly 15 in a row out of 20, looks like this:
OOOOOOOOOOOOOOOX####
XOOOOOOOOOOOOOOOX###
#XOOOOOOOOOOOOOOOX##
##XOOOOOOOOOOOOOOOX#
###XOOOOOOOOOOOOOOOX
The first
and last lines have 1 X (c – 1) and 4 #’s (c4) each, for a total of
2c4(c – 1) choices, and the middle 4 lines have 2 X’s ((c – 1)2)
and 3 #’s (c3) each, for a total of 4c3(c – 1)2
+ choices, which is 2c4(c – 1) + 4c3(c – 1)2. When c = 2, there are 2(24)(2 – 1)
+ 4(23)(2 – 1)2 = 64 successful occurrences.
To
summarize:
All But…
|
Formula
|
0
|
1
|
1
|
2(c – 1)
|
2
|
2c(c – 1) + (c – 1)2
|
3
|
2c2(c – 1) + 2c(c – 1)2
|
4
|
2c3(c – 1) + 3c2(c – 1)2
|
5
|
2c4(c – 1) + 4c3(c – 1)2
|
Other than
the first one, the formula follows a pattern where exactly all but n results in
2cn – 1(c – 1) + (n – 1)cn – 2(c – 1)2 successful
occurrences. Considering “at least” r
occurrences in a row means adding all successful occurrences for exactly r, r +
1, r + 2, … k occurrences in a row, which is why there is a sigma in the
formula.
Unfortunately,
the formula breaks down if r < k/2, because the
technique we used counted two successful sequences in the same row twice. For example, for exactly 9 in a row out of
20, the following combination
OOOOOOOOOXXOOOOOOOOO
is counted
twice. Once for
OOOOOOOOOX##########
and once for
##########XOOOOOOOOO
A more
complex formula is therefore needed for r < k/2 (and an even more complex formula
for r < k/3, r < k/4, and so on).
Just because
Mr. Black mathematically showed that there was a 99.4% probability that Mr.
White’s student was cheating, it did not actually prove that the student was
cheating. According to National
Geographic, there is a 1 in 3000 chance of being struck by lightning in your
lifetime, or a 99.97% chance that you won’t get struck by lightning in your
lifetime. Despite these odds, there are
still people who do get struck by lightning, and we don’t discredit these
reports. The same reasoning should be
used for the suspicious quiz. Just like
lightning strikes, it is very unlikely that a student would randomly match at
least 10 true or false questions in a row to the wrong answer key, but we must
allow for the possibility that it could happen.