COMPUTER BASED NUMERICAL AND STATISTICAL TECHNIQUES
Similarly, we will get
$$b - E(b) = -\frac{ad - bc}{N} = c - E(c); \qquad d - E(d) = \frac{ad - bc}{N}$$
Substituting in (2), we get
$$
\begin{aligned}
\chi^2 &= \left(\frac{ad - bc}{N}\right)^2 \left[\frac{1}{E(a)} + \frac{1}{E(b)} + \frac{1}{E(c)} + \frac{1}{E(d)}\right] \\
&= \frac{(ad - bc)^2}{N} \left[\left\{\frac{1}{(a+b)(a+c)} + \frac{1}{(a+b)(b+d)}\right\} + \left\{\frac{1}{(a+c)(c+d)} + \frac{1}{(b+d)(c+d)}\right\}\right] \\
&= \frac{(ad - bc)^2}{N} \left[\frac{b+d+a+c}{(a+b)(a+c)(b+d)} + \frac{b+d+a+c}{(a+c)(c+d)(b+d)}\right] \\
&= (ad - bc)^2 \left[\frac{c+d+a+b}{(a+b)(a+c)(b+d)(c+d)}\right] \\
&= \frac{N(ad - bc)^2}{(a+b)(a+c)(b+d)(c+d)}
\end{aligned}
$$
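The closed form can be checked numerically against the cell-by-cell definition. The following is a minimal sketch, not part of the original text: the function names `chi2_cellwise` and `chi2_shortcut` and the cell counts are hypothetical, chosen only to show that the two expressions agree for a 2 × 2 table with cells a, b (first row) and c, d (second row).

```python
def chi2_cellwise(a, b, c, d):
    """Chi-square for the 2x2 table [[a, b], [c, d]] as the sum of (O - E)^2 / E."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected frequency under independence: row total * column total / N
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def chi2_shortcut(a, b, c, d):
    """Closed form N(ad - bc)^2 / [(a+b)(a+c)(b+d)(c+d)]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (a + c) * (b + d) * (c + d))


# Hypothetical cell counts, used only to exercise both functions.
a, b, c, d = 30, 10, 20, 40
print(chi2_cellwise(a, b, c, d), chi2_shortcut(a, b, c, d))
```

Both calls should print the same value up to floating-point rounding, which is what the algebra above asserts.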
Example 11. From the following table regarding the colour of eyes of father and son test if the colour
of son’s eye is associated with that of the father.
Eye colour of father | Eye colour of son: Light | Eye colour of son: Not light
Light | 471 | 51
Not light | 148 | 230
Sol. Null Hypothesis $H_0$: The colour of the son's eyes is not associated with that of the father, i.e., they are independent.
Under $H_0$, we calculate the expected frequency in each cell as
$$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
TESTING OF HYPOTHESIS
Expected frequencies are (the column totals are 471 + 148 = 619 and 51 + 230 = 281):

Eye colour of father | Son: Light | Son: Not light | Total
Light | $\frac{619 \times 522}{900} = 359.02$ | $\frac{281 \times 522}{900} = 162.98$ | 522
Not light | $\frac{619 \times 378}{900} = 259.98$ | $\frac{281 \times 378}{900} = 118.02$ | 378
Total | 619 | 281 | 900

$$\chi^2 = \frac{(471 - 359.02)^2}{359.02} + \frac{(51 - 162.98)^2}{162.98} + \frac{(148 - 259.98)^2}{259.98} + \frac{(230 - 118.02)^2}{118.02} = 266.35.$$
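Conclusion: At the 5% level of significance for 1 d.f., the tabulated value of $\chi^2$ is 3.841. Since the calculated value of $\chi^2$ is greater than the tabulated value, $H_0$ is rejected, i.e., the colour of the son's eyes is associated with that of the father.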
Example 12. The following table gives the number of good and bad parts produced by each of the
three shifts in a factory:
Shift | Good parts | Bad parts | Total
Day shift | 960 | 40 | 1000
Evening shift | 940 | 50 | 990
Night shift | 950 | 45 | 995
Total | 2850 | 135 | 2985
Test whether or not the production of bad parts is independent of the shift on which they were
produced.
Sol. Null Hypothesis $H_0$: The production of bad parts is independent of the shift on which they were produced, i.e., the two attributes, production and shifts, are independent.
Under $H_0$,
$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{3} \left[\frac{\left\{(A_iB_j) - (A_iB_j)_0\right\}^2}{(A_iB_j)_0}\right]$$
Calculation of expected frequencies
Let A and B be the two attributes, namely production and shifts. A is divided into two classes $A_1$, $A_2$ and B is divided into three classes $B_1$, $B_2$, $B_3$.
$$(A_1B_1)_0 = \frac{(A_1)(B_1)}{N} = \frac{2850 \times 1000}{2985} = 954.77; \qquad (A_1B_2)_0 = \frac{(A_1)(B_2)}{N} = \frac{2850 \times 990}{2985} = 945.226$$
$$(A_1B_3)_0 = \frac{(A_1)(B_3)}{N} = \frac{2850 \times 995}{2985} = 950; \qquad (A_2B_1)_0 = \frac{(A_2)(B_1)}{N} = \frac{135 \times 1000}{2985} = 45.226$$
$$(A_2B_2)_0 = \frac{(A_2)(B_2)}{N} = \frac{135 \times 990}{2985} = 44.773; \qquad (A_2B_3)_0 = \frac{(A_2)(B_3)}{N} = \frac{135 \times 995}{2985} = 45.$$
To calculate the value of $\chi^2$:

Class | $O_i$ | $E_i$ | $(O_i - E_i)^2$ | $(O_i - E_i)^2/E_i$
$(A_1B_1)$ | 960 | 954.77 | 27.3529 | 0.02864
$(A_1B_2)$ | 940 | 945.226 | 27.3110 | 0.02889
$(A_1B_3)$ | 950 | 950 | 0 | 0
$(A_2B_1)$ | 40 | 45.226 | 27.3110 | 0.60388
$(A_2B_2)$ | 50 | 44.773 | 27.3215 | 0.61022
$(A_2B_3)$ | 45 | 45 | 0 | 0
Total | | | | 1.27163
Conclusion: The tabulated value of $\chi^2$ at the 5% level of significance for $(r - 1)(s - 1) = (2 - 1)(3 - 1) = 2$ degrees of freedom is 5.991. Since the calculated value of $\chi^2$ is less than the tabulated value, we accept $H_0$, i.e., the production of bad parts is independent of the shift on which they were produced.
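The expected-frequency calculation generalises to any r × s table. The following is a minimal sketch under the assumption that the observed counts are supplied as a list of rows; the function name `chi_square_independence` is hypothetical and the code uses only the definition $E = $ (row total × column total)/N given in this section.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x s table.

    `observed` is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected frequency under independence of the two attributes
            e = row_totals[i] * col_totals[j] / n
            chi2 += (o - e) ** 2 / e
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Good and bad parts for the day, evening and night shifts (Example 12).
shifts = [[960, 40], [940, 50], [950, 45]]
chi2, dof = chi_square_independence(shifts)
print(f"chi-square = {chi2:.4f} on {dof} d.f.")
```

Because the sketch carries the expected frequencies at full precision, its result (about 1.27 on 2 d.f.) can differ in the second or third decimal place from a hand calculation that rounds each expected frequency first. The comparison with the tabulated value 5.991 is unaffected.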
12.7.2 Student's t-distribution
The t-distribution is used when the sample size is less than or equal to 30 (n ≤ 30) and the population standard deviation is unknown.
Let $X_i$, $i = 1, 2, \ldots, n$ be a random sample of size $n$ from a normal population with mean $\mu$ and variance $\sigma^2$. Then Student's $t$ is defined by
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t\,(n - 1 \text{ d.f.})$$
where $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the sample mean and
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2$$
is an unbiased estimate of the population variance $\sigma^2$.
The t-distribution has a different shape for each value of the d.f.; as the d.f. become infinitely large, the t-distribution tends to the normal distribution.
Example 13. The 9 items of a sample have the following values 45, 47, 50, 52, 48, 47, 49, 53, 51.
Does the mean of these values differ significantly from the assumed mean 47.5 ?
Sol. $H_0$: $\mu = 47.5$, i.e., there is no significant difference between the sample and population mean.
$H_1$: $\mu \neq 47.5$ (two-tailed test). Given: $n = 9$, $\mu = 47.5$.
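$X$ | 45 | 47 | 50 | 52 | 48 | 47 | 49 | 53 | 51
$X - \bar{X}$ | -4.1 | -2.1 | 0.9 | 2.9 | -1.1 | -2.1 | -0.1 | 3.9 | 1.9
$(X - \bar{X})^2$ | 16.81 | 4.41 | 0.81 | 8.41 | 1.21 | 4.41 | 0.01 | 15.21 | 3.61

$$\bar{X} = \frac{\sum X}{n} = \frac{442}{9} = 49.11; \qquad \sum (X - \bar{X})^2 = 54.89;$$
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{54.89}{8} = 6.86 \quad \therefore\ s = 2.619$$
Applying the t-test, with $s^2$ computed using the divisor $n - 1$,
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{49.11 - 47.5}{2.619/\sqrt{9}} = \frac{1.61 \times 3}{2.619} = 1.844$$
$t_{0.05} = 2.31$ for $\nu = 8$.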
Conclusion: Since $|t| < t_{0.05}$, the hypothesis is accepted, i.e., there is no significant difference between the sample mean and the assumed population mean.
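The one-sample calculation can be sketched in a few lines. This is an illustrative sketch using only the Python standard library; the function name `one_sample_t` is hypothetical, and it applies the definition $t = (\bar{X} - \mu)/(S/\sqrt{n})$ with $S^2$ computed using the divisor $n - 1$, as in the definition of Student's $t$ earlier in this section.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean = mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample S.D. with divisor n - 1
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1


# The nine sample values of Example 13, tested against the assumed mean 47.5.
values = [45, 47, 50, 52, 48, 47, 49, 53, 51]
t, dof = one_sample_t(values, 47.5)
print(f"t = {t:.3f} on {dof} d.f.")
```

The printed statistic is then compared with the tabulated value for the same d.f. (2.31 for 8 d.f. at the 5% level, as quoted in the example). Small differences from a hand calculation arise from rounding the mean and standard deviation.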
Example 14. A random sample of 10 boys had the following I.Q.'s: 70, 120, 110, 101, 88, 83, 95, 98, 107, 100. Do these data support the assumption of a population mean I.Q. of 100? Find a reasonable range in which most of the mean I.Q. values of samples of 10 boys lie.
Sol. Null hypothesis, $H_0$: The data are consistent with the assumption of a mean I.Q. of 100 in the population, i.e., $\mu = 100$.
Alternative hypothesis, $H_1$: $\mu \neq 100$.
Test Statistic. Under $H_0$, the test statistic is:
$$t = \frac{\bar{x} - \mu}{\sqrt{S^2/n}} \sim t_{(n-1)}$$
where $\bar{x}$ and $S^2$ are to be computed from the sample values of I.Q.'s.
Calculation for Sample Mean and S.D.

$X$ | $X - \bar{x}$ | $(X - \bar{x})^2$
70 | -27.2 | 739.84
120 | 22.8 | 519.84
110 | 12.8 | 163.84
101 | 3.8 | 14.44
88 | -9.2 | 84.64
83 | -14.2 | 201.64
95 | -2.2 | 4.84
98 | 0.8 | 0.64
107 | 9.8 | 96.04
100 | 2.8 | 7.84
Total 972 | | 1833.60
Hence $n = 10$, $\bar{x} = \frac{972}{10} = 97.2$ and $S^2 = \frac{1833.60}{9} = 203.73$
$$\therefore\ |t| = \frac{|97.2 - 100|}{\sqrt{203.73/10}} = \frac{2.8}{\sqrt{20.37}} = \frac{2.8}{4.514} = 0.62$$
Tabulated $t_{0.05}$ for (10 - 1), i.e., 9 d.f. for a two-tailed test is 2.262.
Conclusion: Since calculated $t$ is less than tabulated $t_{0.05}$ for 9 d.f., $H_0$ may be accepted at the 5% level of significance and we may conclude that the data are consistent with the assumption of a mean I.Q. of 100 in the population.
The 95% confidence limits within which the mean I.Q. values of samples of 10 boys will lie are given by
$$\bar{x} \pm t_{0.05}\, S/\sqrt{n} = 97.2 \pm 2.262 \times 4.514 = 97.2 \pm 10.21 = 107.41 \text{ and } 86.99$$
Hence the required 95% confidence interval is [86.99, 107.41].
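The confidence limits follow mechanically once the tabulated $t$ value is known. The sketch below is illustrative only: the function name `mean_confidence_interval` is hypothetical, and the critical value 2.262 is taken from the text rather than computed, since the Python standard library has no t-table.

```python
import math
import statistics


def mean_confidence_interval(sample, t_crit):
    """Return (lower, upper) limits: mean +/- t_crit * S / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)  # S uses divisor n - 1
    return mean - t_crit * std_error, mean + t_crit * std_error


# I.Q. values of Example 14; 2.262 is the tabulated two-tailed 5% value for 9 d.f.
iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
T_CRIT_9DF = 2.262

lo, hi = mean_confidence_interval(iq, T_CRIT_9DF)
t = (statistics.mean(iq) - 100) / (statistics.stdev(iq) / math.sqrt(len(iq)))
print(f"t = {t:.2f}, 95% limits = [{lo:.2f}, {hi:.2f}]")
```

Note that the hypothesised mean 100 lies inside the interval exactly when $|t|$ is below the tabulated value, so the test and the interval lead to the same conclusion.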
Example 15. The mean weekly sales of soap bars in departmental stores was 146.3 bars per store.
After an advertising campaign the mean weekly sales in 22 stores for a typical week increased to 153.7 and
showed a standard deviation of 17.2. Was the advertising campaign successful?
Sol. We are given: $n = 22$, $\bar{x} = 153.7$, $s = 17.2$.
Null Hypothesis: The advertising campaign is not successful, i.e., $H_0$: $\mu = 146.3$
Alternative Hypothesis: $H_1$: $\mu > 146.3$ (right-tailed).
Test Statistic: Under the null hypothesis, the test statistic is:
$$t = \frac{\bar{x} - \mu}{\sqrt{s^2/(n-1)}} \sim t_{22-1} = t_{21}$$
Now
$$t = \frac{153.7 - 146.3}{\sqrt{(17.2)^2/21}} = \frac{7.4 \times \sqrt{21}}{17.2} = 1.97$$
Conclusion: The tabulated value of $t$ for 21 d.f. at the 5% level of significance for a single-tailed test is 1.72. Since the calculated value is greater than the tabulated value, it is significant at the 5% level. Hence we reject the null hypothesis and conclude that the advertising campaign was successful.
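When only the sample size, mean and standard deviation are reported, the statistic can be evaluated directly from the formula. The sketch below is illustrative and assumes, as the formula $t = (\bar{x} - \mu)/\sqrt{s^2/(n-1)}$ does, that the reported $s$ was computed with divisor $n$; the function names are hypothetical. The second function converts $s$ to the divisor-$(n-1)$ estimate $S$ and uses $S/\sqrt{n}$, to show that the two forms agree.

```python
import math


def t_from_summary(mean, mu0, sd_n, n):
    """t = (mean - mu0) / sqrt(s^2 / (n - 1)), with s computed using divisor n."""
    return (mean - mu0) / (sd_n / math.sqrt(n - 1))


def t_from_unbiased(mean, mu0, sd_n, n):
    """Same statistic via S (divisor n - 1): t = (mean - mu0) / (S / sqrt(n))."""
    big_s = sd_n * math.sqrt(n / (n - 1))
    return (mean - mu0) / (big_s / math.sqrt(n))


# Summary figures of Example 15: n = 22, mean 153.7, s = 17.2, mu0 = 146.3.
t = t_from_summary(153.7, 146.3, 17.2, 22)
print(f"t = {t:.2f}")
```

With these figures the expression $(153.7 - 146.3)\sqrt{21}/17.2$ evaluates to about 1.97, since $\sqrt{21} \approx 4.583$.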
Example 16. A machinist is making engine parts with axle diameters of 0.700 inch. A random
sample of 10 parts shows a mean diameter of 0.742 inch with a standard deviation of 0.040 inch. Compute
the statistic you would use to test whether the work is meeting the specifications. Also state how you would
proceed further.
Sol. Here we are given:
$\mu = 0.700$ inches, $\bar{x} = 0.742$ inches, $s = 0.040$ inches and $n = 10$
Null Hypothesis, $H_0$: $\mu = 0.700$, i.e., the product is conforming to specifications.
Alternative Hypothesis, $H_1$: $\mu \neq 0.700$
Test Statistic: Under $H_0$, the test statistic is:
$$t = \frac{\bar{x} - \mu}{\sqrt{S^2/n}} = \frac{\bar{x} - \mu}{\sqrt{s^2/(n-1)}} \sim t_{(n-1)}$$
Now,
$$t = \frac{\sqrt{9}\,(0.742 - 0.700)}{0.040} = 3.15$$
Here the test statistic $t$ follows Student's t-distribution with 10 - 1 = 9 d.f. We will now compare this calculated value with the tabulated value of $t$ for 9 d.f. at a certain level of significance, say 5%. Let this tabulated value be denoted by $t_0$.
(i) If calculated $t$, viz., 3.15 > $t_0$, we say that the value of $t$ is significant. This implies that $\bar{x}$ differs significantly from $\mu$ and $H_0$ is rejected at this level of significance, and we conclude that the product is not meeting the specifications.
(ii) If calculated $t < t_0$, we say that the value of $t$ is not significant, i.e., there is no significant difference between $\bar{x}$ and $\mu$. In other words, the deviation $(\bar{x} - \mu)$ is just due to fluctuations of sampling and the null hypothesis $H_0$ may be retained at the 5% level of significance, i.e., we may take the product as conforming to specifications.
Example 17. A random sample of size 16 has 53 as mean. The sum of squares of the deviations from the mean is 135. Can this sample be regarded as taken from the population having 56 as mean? Obtain 95% and 99% confidence limits of the mean of the population.
Sol. $H_0$: There is no significant difference between the sample mean and the hypothetical population mean.
$H_0$: $\mu = 56$; $H_1$: $\mu \neq 56$ (two-tailed test)
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t\,(n - 1 \text{ d.f.})$$
Given: $\bar{X} = 53$, $\mu = 56$, $n = 16$, $\sum (X - \bar{X})^2 = 135$
$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{135}{15}} = 3; \qquad t = \frac{53 - 56}{3/\sqrt{16}} = -4$$
$|t| = 4$, d.f. = 16 - 1 = 15.
Conclusion: For a two-tailed test at the 5% level with 15 d.f., $t_{0.05} = 2.131$.
Since $|t| = 4 > t_{0.05} = 2.131$, i.e., the calculated value of $t$ is more than the table value, the hypothesis is rejected. Hence, the sample cannot be regarded as having come from a population having 56 as mean.
95% confidence limits of the population mean
$$= \bar{X} \pm \frac{s}{\sqrt{n}}\, t_{0.05} = 53 \pm \frac{3}{\sqrt{16}}\,(2.131) = 51.402;\ 54.598$$
99% confidence limits of the population mean
$$= \bar{X} \pm \frac{s}{\sqrt{n}}\, t_{0.01} = 53 \pm \frac{3}{\sqrt{16}}\,(2.947) = 50.790;\ 55.210.$$
(i) t-Test of Significance for Mean of a Random Sample: To test whether the mean of a sample drawn from a normal population deviates significantly from a stated value when the variance of the population is unknown.
$H_0$: There is no significant difference between the sample mean $\bar{x}$ and the population mean $\mu$, i.e., we use the statistic
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
where $\bar{X}$ is the mean of the sample and
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2$$
with degrees of freedom $(n - 1)$.
At the given level of significance $\alpha$ and degrees of freedom $(n - 1)$, we refer to the t-table for $t_\alpha$ (two-tailed or one-tailed). If the calculated $t$ value is such that $|t| < t_\alpha$, the null hypothesis is accepted, and for $|t| > t_\alpha$, $H_0$ is rejected.
(ii) t-Test for Difference of Means of Two Samples: This test is used to test whether the two samples $x_1, x_2, \ldots, x_{n_1}$ and $y_1, y_2, \ldots, y_{n_2}$ of sizes $n_1$, $n_2$ have been drawn from two normal populations with means $\mu_1$ and $\mu_2$ respectively, under the assumption that the population variances are equal ($\sigma_1 = \sigma_2 = \sigma$).
$H_0$: The samples have been drawn from normal populations with equal means, i.e., $H_0$: $\mu_1 = \mu_2$.
Let $\bar{X}$, $\bar{Y}$ be the means of the two samples.
Under this $H_0$ the test statistic $t$ is given by
$$t = \frac{\bar{X} - \bar{Y}}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t\,(n_1 + n_2 - 2 \text{ d.f.})$$
Also, if the two samples' standard deviations $s_1$, $s_2$ are given, then we have
$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}.$$
And, if $n_1 = n_2 = n$,
$$t = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{s_1^2 + s_2^2}{n - 1}}}$$
can be used as a test statistic.
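If the pairs of values are in some way associated (correlated), we can't use the test statistic as given in Note 2. In this case, we find the differences of the associated pairs of values and apply the test for a single mean, i.e., $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ with degrees of freedom $n - 1$.
The test statistic is
$$t = \frac{\bar{d}}{s/\sqrt{n}} \quad \text{or} \quad t = \frac{\bar{d}}{s/\sqrt{n-1}},$$
where $\bar{d}$ is the mean of the paired differences, i.e., $d_i = x_i - y_i$ and $\bar{d} = \bar{X} - \bar{Y}$, where $(x_i, y_i)$ are the paired data, $i = 1, 2, \ldots, n$. The first form applies when $s^2$ of the differences is computed with the divisor $n - 1$, the second when it is computed with the divisor $n$.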
Example 18. Samples of sizes 10 and 14 were taken from two normal populations with S.D. 3.5 and
5.2. The sample means were found to be 20.3 and 18.6. Test whether the means of the two populations are
the same at 5% level.
Sol. $H_0$: $\mu_1 = \mu_2$, i.e., the means of the two populations are the same.
$H_1$: $\mu_1 \neq \mu_2$.
Given $\bar{X}_1 = 20.3$, $\bar{X}_2 = 18.6$; $n_1 = 10$, $n_2 = 14$, $s_1 = 3.5$, $s_2 = 5.2$
$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \frac{10(3.5)^2 + 14(5.2)^2}{10 + 14 - 2} = 22.775 \quad \therefore\ s = 4.772$$
$$t = \frac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{20.3 - 18.6}{4.772\sqrt{\dfrac{1}{10} + \dfrac{1}{14}}} = 0.8604$$
The value of $t$ at the 5% level for 22 d.f. is $t_{0.05} = 2.0739$.
Conclusion: Since $|t| = 0.8604 < t_{0.05}$, the hypothesis is accepted, i.e., there is no significant difference between their means.
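The pooled two-sample statistic can be evaluated from summary figures alone. The following is a minimal sketch; the function name `pooled_t_from_summary` is hypothetical, and it assumes, as the formula $s^2 = (n_1 s_1^2 + n_2 s_2^2)/(n_1 + n_2 - 2)$ does, that the sample standard deviations were computed with divisors $n_1$ and $n_2$.

```python
import math


def pooled_t_from_summary(mean1, s1, n1, mean2, s2, n2):
    """Return (t, d.f.) for H0: mu1 = mu2 using the pooled variance estimate.

    s1 and s2 are sample S.D.s computed with divisors n1 and n2.
    """
    dof = n1 + n2 - 2
    pooled_var = (n1 * s1 ** 2 + n2 * s2 ** 2) / dof
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, dof


# Summary figures of Example 18.
t, dof = pooled_t_from_summary(20.3, 3.5, 10, 18.6, 5.2, 14)
print(f"t = {t:.4f} on {dof} d.f.")
```

Note that the standard error is $s\sqrt{1/n_1 + 1/n_2}$: the square root covers the sum of reciprocals, a step that is easy to drop in hand calculation.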
Example 19. Two samples of sodium vapour bulbs were tested for length of life and the following
results were got:
Type | Sample size | Sample mean | Sample S.D.
Type I | 8 | 1234 hrs | 36 hrs
Type II | 7 | 1036 hrs | 40 hrs
Is the difference in the means significant to generalise that Type I is superior to Type II regarding
length of life ?
Sol. $H_0$: $\mu_1 = \mu_2$, i.e., the two types of bulbs have the same lifetime.
$H_1$: $\mu_1 > \mu_2$, i.e., Type I is superior to Type II.
$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \frac{8(36)^2 + 7(40)^2}{8 + 7 - 2} = 1659.076 \quad \therefore\ s = 40.7317$$
The t-statistic
$$t = \frac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{1234 - 1036}{40.7317\sqrt{\dfrac{1}{8} + \dfrac{1}{7}}} = 9.39 \sim t\,(n_1 + n_2 - 2 \text{ d.f.})$$
$t_{0.05}$ at 13 d.f. is 1.77 (one-tailed test).
Conclusion: Since calculated $|t| > t_{0.05}$, $H_0$ is rejected, i.e., $H_1$ is accepted.
∴ Type I is definitely superior to Type II.
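When the individual observations are available, the quantities in the two-sample t-statistic are computed as
$$\bar{X} = \frac{1}{n_1}\sum_{i=1}^{n_1} X_i, \qquad \bar{Y} = \frac{1}{n_2}\sum_{j=1}^{n_2} Y_j; \qquad s^2 = \frac{1}{n_1 + n_2 - 2}\left[\sum_{i} \left(X_i - \bar{X}\right)^2 + \sum_{j} \left(Y_j - \bar{Y}\right)^2\right]$$
where $s^2$ is an unbiased estimate of the population variance $\sigma^2$.
$t$ follows the t-distribution with $n_1 + n_2 - 2$ degrees of freedom.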
Example 20. The following figures refer to observations in two independent samples:
Sample I | 25 | 30 | 28 | 34 | 24 | 20 | 13 | 32 | 22 | 38
Sample II | 40 | 34 | 22 | 20 | 31 | 40 | 30 | 23 | 36 | 17
Analyse whether the samples have been drawn from the populations of equal means.
Sol. $H_0$: The two samples have been drawn from populations of equal means, i.e., there is no significant difference between their means, i.e., $\mu_1 = \mu_2$.
$H_1$: $\mu_1 \neq \mu_2$ (two-tailed test)
Given $n_1$ = Sample I size = 10; $n_2$ = Sample II size = 10
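To calculate the two sample means and the sums of squares of deviations from the means, let $X_1$ denote Sample I and $X_2$ denote Sample II.

$X_1$ | 25 | 30 | 28 | 34 | 24 | 20 | 13 | 32 | 22 | 38
$X_1 - \bar{X}_1$ | -1.6 | 3.4 | 1.4 | 7.4 | -2.6 | -6.6 | -13.6 | 5.4 | -4.6 | 11.4
$(X_1 - \bar{X}_1)^2$ | 2.56 | 11.56 | 1.96 | 54.76 | 6.76 | 43.56 | 184.96 | 29.16 | 21.16 | 129.96
$X_2$ | 40 | 34 | 22 | 20 | 31 | 40 | 30 | 23 | 36 | 17
$X_2 - \bar{X}_2$ | 10.7 | 4.7 | -7.3 | -9.3 | 1.7 | 10.7 | 0.7 | -6.3 | 6.7 | -12.3
$(X_2 - \bar{X}_2)^2$ | 114.49 | 22.09 | 53.29 | 86.49 | 2.89 | 114.49 | 0.49 | 39.69 | 44.89 | 151.29

$$\bar{X}_1 = \sum_{i=1}^{10} \frac{X_1}{n_1} = \frac{266}{10} = 26.6 \qquad \bar{X}_2 = \sum_{i=1}^{10} \frac{X_2}{n_2} = \frac{293}{10} = 29.3$$
$$\sum \left(X_1 - \bar{X}_1\right)^2 = 486.4 \qquad \sum \left(X_2 - \bar{X}_2\right)^2 = 630.10$$
$$s^2 = \frac{1}{n_1 + n_2 - 2}\left[\sum \left(X_1 - \bar{X}_1\right)^2 + \sum \left(X_2 - \bar{X}_2\right)^2\right] = \frac{1}{10 + 10 - 2}\,[486.4 + 630.10] = 62.028 \quad \therefore\ s = 7.876$$
Under $H_0$ the test statistic is given by
$$t = \frac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{26.6 - 29.3}{7.876\sqrt{\dfrac{1}{10} + \dfrac{1}{10}}} = -0.7666 \sim t\,(n_1 + n_2 - 2 \text{ d.f.})$$
$|t| = 0.7666$.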
Conclusion: The tabulated value of $t$ at the 5% level of significance for 18 d.f. is 2.1. Since the calculated value $|t| = 0.7666 < t_{0.05}$, $H_0$ is accepted, i.e., there is no significant difference between their means, i.e., the two samples have been drawn from populations of equal means.
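When the raw observations are available, the whole calculation can be sketched directly from the sums of squared deviations. The function name `pooled_t` is hypothetical; the code follows the pooled estimate $s^2 = [\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2]/(n_1 + n_2 - 2)$ and assumes, like the test itself, equal population variances.

```python
import math
import statistics


def pooled_t(sample1, sample2):
    """Return (t, d.f.) for H0: mu1 = mu2 from two independent raw samples."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = statistics.mean(sample1), statistics.mean(sample2)
    # Sums of squared deviations from each sample's own mean
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    dof = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / dof
    t = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, dof


# The two samples of Example 20.
sample_1 = [25, 30, 28, 34, 24, 20, 13, 32, 22, 38]
sample_2 = [40, 34, 22, 20, 31, 40, 30, 23, 36, 17]
t, dof = pooled_t(sample_1, sample_2)
print(f"t = {t:.4f} on {dof} d.f.")
```

The sign of $t$ only reflects which sample is listed first; for a two-tailed test it is $|t|$ that is compared with the tabulated value.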
Applications of t-Distribution: The t-distribution has a wide number of applications in statistics, some of which are:
1. To test if the sample mean $\bar{X}$ differs significantly from the hypothetical value $\mu$ of the population mean.
2. To test the significance of the difference between two sample means.
3. To test the significance of observed partial and multiple correlation coefficients.
4. To test the significance of an observed sample correlation coefficient and sample regression coefficient.
Also, the critical value or significant value of $t$ at level of significance $\alpha$ and degrees of freedom $\nu$ for a two-tailed test is given by
$$P\left[\,|t| > t_\nu(\alpha)\right] = \alpha \ \Rightarrow\ P\left[\,|t| \le t_\nu(\alpha)\right] = 1 - \alpha$$
The significant value of $t$ at level of significance $\alpha$ for a single-tailed test can be obtained from those of the two-tailed test by considering the values at level of significance $2\alpha$.
12.7.3 Snedecor’s Variance Ratio Test or F-test
Suppose we want to test (i) whether two independent samples $x_i$ ($i = 1, 2, \ldots, n_1$) and $y_j$ ($j = 1, 2, \ldots, n_2$) have been drawn from normal populations with the same variance $\sigma^2$ (say), or (ii) whether two independent estimates of the population variance are homogeneous or not.