Presentation On
Analytical Characteristics of Bangladesh
By Division
Submitted to
M. Amir Hossain, Ph.D.
Professor, Applied Statistics
D.U.
East West University, Bangladesh.
Submitted By
Group C
H.M. Faisal Ahmed (2010291021)
INTRODUCTION
Data Presentation
We collected demographic data from the BBS
(Bangladesh Bureau of Statistics) website,
www.bbs.gov.bd/Home.aspx. We decided to collect two
types of data: qualitative and quantitative. For the qualitative
data we considered the land area and the number of males
and females in each division, and for the quantitative data
we considered age.
We applied different data presentation techniques to
these data.
TECHNIQUES USED
Bar Chart
Histogram
Frequency Polygon
Cumulative Frequency Curve
FACT TABLE
The bar chart and histogram are based on the
following fact table:
Based on enumerated population in 2011
DIVISION      AREA (sq. km)   MALE          FEMALE
BARISAL       13,645          4,006,000     4,140,000
CHITTAGONG    33,771          13,763,000    14,361,000
DHAKA         30,989          23,814,000    22,915,000
RAJSHAHI      34,495          9,183,000     9,146,000
KHULNA        22,285          7,782,000     7,781,000
SYLHET        12,596          4,882,000     4,925,000
BAR CHART
A bar chart or bar graph is a way of showing information by
the lengths of a set of bars. The bars are drawn horizontally
or vertically. If the bars are drawn vertically, the graph
can be called a column graph or a block graph. It displays
a set of frequencies using bars of equal width whose
heights are proportional to the frequencies.
In our presentation the height of each bar represents the
number of individuals: the X axis shows the divisions
and the Y axis shows the number of individuals.
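As a rough illustration of this mapping (division on one axis, number of individuals as bar length), the sketch below renders the 2011 male and female counts from the fact table as a text bar chart. The helper name `text_bar` and the scale of one character per million people are assumptions made for this example and are not part of the original presentation.

```python
# Male and female population per division, taken from the fact table (2011).
population = {
    "Barisal": (4_006_000, 4_140_000),
    "Chittagong": (13_763_000, 14_361_000),
    "Dhaka": (23_814_000, 22_915_000),
    "Rajshahi": (9_183_000, 9_146_000),
    "Khulna": (7_782_000, 7_781_000),
    "Sylhet": (4_882_000, 4_925_000),
}


def text_bar(count, per_char=1_000_000):
    """Return a bar whose length is proportional to count (assumed scale)."""
    return "#" * round(count / per_char)


# One pair of bars (male, female) per division.
for division, (male, female) in population.items():
    print(f"{division:<11} M {text_bar(male)}")
    print(f"{'':<11} F {text_bar(female)}")
```

Because all bars share the same scale, their lengths can be compared directly across divisions, which is the property a bar chart relies on.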
BAR CHART (CONTINUED)
Chart 01: Bar Chart of Male and Female per Division
BAR CHART (CONTINUED)
Chart 02: Bar Chart of Land Area per Division
(Y axis: land area in thousand square kilometers; bar values: Barisal 14,
Chittagong 34, Dhaka 31, Rajshahi 34, Khulna 22, Sylhet 13)
HISTOGRAM
A histogram is a graphical representation, similar to a bar chart
in structure, that organizes a group of data points into
user-specified ranges. The histogram condenses a data series into an
easily interpreted visual by taking many data points and
grouping them into logical ranges or bins. In statistics, a
histogram is a graphical display of tabulated frequencies,
shown as bars. It shows what proportion of cases fall into
each of several categories: it is a form of data binning. The
categories are usually specified as non-overlapping intervals
of some variable. The categories (bars) must be adjacent.
The intervals are generally of the same size.
Histograms are used to plot the density of data, and often for
density estimation: estimating the probability density
function of the underlying variable.
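The binning step described above can be sketched as follows. The function name `bin_ages`, the class width of 5, and the sample ages are hypothetical and serve only to show how individual values are grouped into non-overlapping intervals.

```python
from collections import Counter


def bin_ages(ages, width=5):
    """Count how many ages fall into each class of the given width.

    Classes with no observations are omitted from the result.
    """
    counts = Counter((age // width) * width for age in ages)
    return {
        f"{lower:02d}-{lower + width - 1:02d}": counts[lower]
        for lower in sorted(counts)
    }


# Hypothetical ages, used only to illustrate the binning step.
sample_ages = [1, 3, 4, 7, 8, 12, 13, 13, 21, 24]
print(bin_ages(sample_ages))
```

Each resulting count would become the height of one bar, and adjacent classes would be drawn without gaps between them.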
HISTOGRAM (CONTINUED)
Chart 04: Histogram of Male & Female per Division
HISTOGRAM (CONTINUED)
Chart 05: Histogram of Land Area per Division
(X axis: division; Y axis: land area, scale 0 to 40,000)
FREQUENCY POLYGON
A frequency polygon is a graphical display of a frequency
table. The intervals are shown on the X axis and the number
of scores in each interval is represented by the height of a
point located above the middle of the interval (the class mark).
The points are connected so that together with the X axis
they form a polygon.
In our presentation the class marks (class midpoints) are
plotted along the X axis and the number of individuals in each
class is plotted along the Y axis.
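A minimal sketch of how the points of a frequency polygon could be prepared is given below. It uses the first three age classes of the presentation's data; the function name `polygon_points` is an assumption, and closing the polygon with zero-frequency points one class width outside the data is a common convention assumed here rather than something stated in the slides.

```python
def polygon_points(classes, frequencies):
    """Return (x, y) points for a frequency polygon.

    classes: list of (lower_limit, upper_limit) tuples of equal width.
    frequencies: number of individuals in each class.
    """
    # Width of an inclusive class such as 0-4, which covers 5 ages.
    width = classes[0][1] - classes[0][0] + 1
    # The class mark is the midpoint of the class limits.
    marks = [(lower + upper) / 2 for lower, upper in classes]
    # Assumed convention: add a zero point on each side so the line
    # meets the X axis and the figure closes into a polygon.
    xs = [marks[0] - width] + marks + [marks[-1] + width]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))


classes = [(0, 4), (5, 9), (10, 14)]
frequencies = [14465810, 16534124, 15704322]
print(polygon_points(classes, frequencies))
```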
FREQUENCY POLYGON (CONTINUED)
Frequency Distribution Table (with Class Mark)
Class    Class Mark   Frequency
00-04    2            14465810
05-09    7            16534124
10-14    12           15704322
15-19    17           12186950
20-24    22           10688351
25-29    27           9858549
30-34    32           9363144
35-39    37           8198944
40-44    42           7133824
45-49    47           5152206
50-54    52           4322404
55-59    57           2774265
60-64    62           2662799
65-69    67           1758685
70-74    72           1461443
FREQUENCY POLYGON (CONTINUED)

Chart 06: Frequency Polygon of People's Age Information of Bangladesh
(X axis: age, 5 to 80; Y axis: population in millions, 2 to 18)
CUMULATIVE FREQUENCY CURVE
Also known as an ogive, this is a curve drawn
by plotting cumulative frequencies. The first
point is the frequency of the first class, the
next point is the sum of the first and second
frequencies, the third point is the sum of the
first, second, and third frequencies, and so on.
A cumulative frequency is the total of a frequency
and all frequencies below it in a frequency
distribution.
In our presentation the cumulative frequency of the
age groups is plotted along the Y axis and the
class marks are plotted along the X axis.
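The running-total rule described above can be sketched with `itertools.accumulate` from the Python standard library. The frequencies are the whole-number class frequencies from the frequency distribution table; running totals computed this way can differ by one from some cumulative values printed on the slides, presumably because the slide figures were rounded from unrounded source values.

```python
from itertools import accumulate

# Class marks and frequencies from the frequency distribution table.
class_marks = [2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52, 57, 62, 67, 72]
frequencies = [
    14465810, 16534124, 15704322, 12186950, 10688351,
    9858549, 9363144, 8198944, 7133824, 5152206,
    4322404, 2774265, 2662799, 1758685, 1461443,
]

# Running total: each entry is the sum of that class and all classes below it.
cumulative = list(accumulate(frequencies))

# (class mark, cumulative frequency) pairs are the points of the ogive.
ogive_points = list(zip(class_marks, cumulative))
for mark, total in ogive_points:
    print(f"{mark:>2}  {total:>11,}")
```

The last cumulative value equals the total of all class frequencies, which is why the curve never decreases and flattens at the top.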
CUMULATIVE FREQUENCY CURVE (CONT.)
Class    Class Mark   Frequency   Cumulative Frequency
00-04    2            14465810    14465810
05-09    7            16534124    30999935
10-14    12           15704322    46704257
15-19    17           12186950    58891207
20-24    22           10688351    69579559
25-29    27           9858549     79438108
30-34    32           9363144     88801253
35-39    37           8198944     97000197
CUMULATIVE FREQUENCY CURVE (CONT.)
Class    Class Mark   Frequency   Cumulative Frequency
40-44    42           7133824     104134021
45-49    47           5152206     109286228
50-54    52           4322404     113608632
55-59    57           2774265     116382897
60-64    62           2662799     119045696
65-69    67           1758685     120804382
70-74    72           1461443     122265825
CUMULATIVE FREQUENCY CURVE (CONT.)
Chart 07: Cumulative Frequency Curve of Age Information of Bangladesh
(X axis: age, 0 to 80; Y axis: population in millions, 0 to 140)
COMMENT
The assignment was done within a short time,
so there might be some errors in our
analysis, but the data should still give a
picture of the actual situation.
MEASURES OF DISPERSION
The descriptive statistics that measure the
amount of scatter are called measures of
dispersion. Measures of dispersion give a
more complete picture of the data set, because
they deal with the spread of the data. A small
value of a measure of dispersion indicates that
the data are clustered closely. A large value
indicates that the data are widely spread, so the
estimate of central tendency is less reliable.
TYPES OF MEASURES OF
DISPERSION
There are many types of measures of dispersion; here we discuss the
following.
Absolute Measures of Dispersion:
These measures give us an idea about the amount of dispersion in a set of
observations. They give the answers in the same units as the units of the
original observations. When the observations are in kilograms, the absolute
measure is also in kilograms. If we have two sets of observations, we cannot
always use the absolute measures to compare their dispersion. We shall
explain later when the absolute measures can be used for comparison
of dispersion in two or more sets of data. The absolute measures
which are commonly used are:
1. Range
2. Mean Deviation
3. Variance
4. Standard Deviation
Relative Measure of Dispersion:
These measures are calculated for the comparison of dispersion in
two or more sets of observations. These measures are free
of the units in which the original data are measured. If the original
data are in dollars or kilometers, we do not use these units with a relative
measure of dispersion. These measures are a sort of ratio and are
called coefficients. Each absolute measure of dispersion can be
converted into its relative measure. Here we discuss only:
1. Coefficient of Variation
RANGE
For ungrouped data: The simplest measure of dispersion is the
range. The range is calculated by simply taking the
difference between the maximum and minimum values in
the data set.
Range = Highest Value - Lowest Value
For grouped data: If the data are grouped, the range is
calculated by taking the difference between the upper limit
of the highest class and the lower limit of the lowest class.
Range = Upper limit of the highest class - Lower
limit of the lowest class
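Both versions of the range can be sketched in a few lines. The function names and the sample inputs are illustrative assumptions; the grouped example uses the age classes 0-4 through 70-74 from the presentation.

```python
def range_ungrouped(values):
    """Range of raw observations: highest value minus lowest value."""
    return max(values) - min(values)


def range_grouped(classes):
    """Range of grouped data.

    classes: list of (lower_limit, upper_limit) tuples.
    Returns the upper limit of the highest class minus the lower
    limit of the lowest class.
    """
    lowest = min(lower for lower, _ in classes)
    highest = max(upper for _, upper in classes)
    return highest - lowest


print(range_ungrouped([3, 7, 6, 5, 2, 1]))
print(range_grouped([(0, 4), (5, 9), (70, 74)]))
```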
MEAN DEVIATION
The mean deviation is the first measure
of dispersion discussed here that actually
uses each data value in its computation. It
is the mean of the distances between each
value and the mean. It gives us an idea of
how spread out from the center the set of
values is.
For ungrouped data:
$MD = \frac{\sum |X - \bar{X}|}{n}$
For grouped data:
$MD = \frac{\sum f\,|X - \bar{X}|}{\sum f}$
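The two mean deviation formulas can be sketched as below. The function names and the small data sets are hypothetical and chosen so that the arithmetic is easy to follow by hand.

```python
def mean_deviation(values):
    """Mean of the absolute distances between each value and the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


def mean_deviation_grouped(midpoints, frequencies):
    """Mean deviation for grouped data, weighting each midpoint by its frequency."""
    total = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / total
    return sum(f * abs(x - mean) for x, f in zip(midpoints, frequencies)) / total


# Hypothetical ungrouped data: mean is 5, distances are 3, 1, 1, 3.
print(mean_deviation([2, 4, 6, 8]))

# Hypothetical grouped data: midpoints 2 and 7 with frequencies 1 and 3.
print(mean_deviation_grouped([2, 7], [1, 3]))
```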
VARIANCE
Variance is a mathematical expression of
the average squared deviation from the
mean. In other words, it is the arithmetic
mean of the squares of the deviations of
all values in a set of numbers from their
arithmetic mean.
Population Variance:
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
Sample Variance:
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
VARIANCE
Working formula for population variance is:
$\sigma^2 = \frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2$
Working formula for sample variance is:
$S^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$
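The following sketch checks, on a small hypothetical data set, that the working formulas give the same results as the definitional formulas. The function names are assumptions for this example; `statistics.variance` from the Python standard library is used as an independent reference for the sample variance.

```python
import math
import statistics


def population_variance(values):
    """Definitional form: mean of squared deviations from the mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def population_variance_working(values):
    """Working form: mean of squares minus square of the mean."""
    n = len(values)
    return sum(x * x for x in values) / n - (sum(values) / n) ** 2


def sample_variance_working(values):
    """Working form of the sample variance with divisor n - 1."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

print(population_variance(data), population_variance_working(data))
print(statistics.variance(data), sample_variance_working(data))
# The standard deviation is the square root of the variance.
print(math.sqrt(population_variance(data)))
```

The working forms avoid computing each deviation separately, which is convenient for hand calculation, although with floating-point numbers the definitional form is usually the more numerically stable choice.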
RELATIVE DISPERSION
The usual (absolute) measures of dispersion cannot be
used to compare dispersion if the units
are different, or even if the units are the same but the
means are different.
A relative measure reports variation relative to the mean.
It is useful for comparing distributions with
different units.
Here we discuss only:
1. Coefficient of Variation
COEFFICIENT OF VARIATION
The CV is the ratio of the standard
deviation to the arithmetic mean,
expressed as a percentage. In other words,
to compare the variation (dispersion)
of two different series, a relative measure of the
standard deviation must be calculated.
This is known as the coefficient of variation.
The formula for the CV is given below:
$CV = \frac{s}{\bar{X}} \times 100$
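As a small illustration of why a unit-free measure is useful, the sketch below compares two hypothetical series measured in different units. The function name and both data sets are assumptions made for this example; `statistics.stdev` and `statistics.mean` are standard library functions.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical series in different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

print(round(coefficient_of_variation(heights_cm), 2))
print(round(coefficient_of_variation(weights_kg), 2))
```

Both series have the same standard deviation in their own units, but the weights vary more relative to their mean, which only the CV reveals.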
EXAMPLE
Determination for the year 2011 (figures in millions):
Class Interval   f        X (Midpoint)   fX        X - X̄     |X - X̄|   f|X - X̄|   (X - X̄)²   f(X - X̄)²
00-04            14.46    2              28.92     -22.18    22.18     320.72     491.95     7113.59
05-09            16.53    7              115.78    -17.18    17.18     283.98     295.15     4878.82
10-14            15.70    12             188.40    -12.18    12.18     191.22     148.35     2329.09
15-19            12.18    17             207.06    -7.18     7.18      87.45      51.55      627.87
20-24            10.68    22             234.96    -2.18     2.18      23.68      4.75       50.73
25-29            9.85     27             265.95    2.82      2.82      27.77      7.95       78.30
30-34            9.36     32             299.52    7.82      7.82      73.19      61.15      572.36
35-39            8.19     37             303.03    12.82     12.82     104.99     164.35     1346.02
40-44            7.13     42             299.46    17.82     17.82     127.05     317.55     2264.13
45-49            5.15     47             242.05    22.82     22.82     117.52     520.75     2681.86
50-54            4.32     52             224.64    27.82     27.82     120.18     773.95     3343.46
55-59            2.77     57             157.89    32.82     32.82     90.91      1077.15    2983.70
60-64            2.66     62             164.92    37.82     37.82     100.60     1430.35    3804.73
65-69            1.75     67             117.25    42.82     42.82     74.93      1833.55    3208.71
70-74            1.46     72             105.12    47.82     47.82     69.81      2286.75    3338.65
Total            122.19                  2954.95                       1814                  38700.32
Range = 74 - 0 = 74
Mean: $\bar{X} = \frac{\sum fX}{\sum f}$ = 2954.95 / 122.19 = 24.18
Mean Deviation: $MD = \frac{\sum f\,|X - \bar{X}|}{\sum f}$ = 1814 / 122.19 = 14.8457
Variance: $S^2 = \frac{\sum f\,(X - \bar{X})^2}{\sum f}$ = 38700.32 / 122.19 = 316.72
Standard Deviation: $S = \sqrt{S^2} = \sqrt{\frac{38700.32}{122.19}}$ = 17.7966
Coefficient of Variation: $CV = \frac{s}{\bar{X}} \times 100$
= (17.7966 / 24) × 100
= 74.15% (with the mean rounded to 24; using 24.18 gives about 73.60%)
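The worked example can be recomputed from the frequency and midpoint columns with a short script. This is a sketch under the assumption that the grouped formulas with divisor Σf are intended, as in the example; because it works from the rounded frequencies and carries full precision through each step, its results are close to but not identical with the figures on the slides.

```python
import math

# Frequencies (in millions) and class midpoints from the example table.
f = [14.46, 16.53, 15.70, 12.18, 10.68, 9.85, 9.36, 8.19,
     7.13, 5.15, 4.32, 2.77, 2.66, 1.75, 1.46]
x = [2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52, 57, 62, 67, 72]

n = sum(f)                                            # total frequency
mean = sum(fi * xi for fi, xi in zip(f, x)) / n       # weighted mean age
mean_dev = sum(fi * abs(xi - mean) for fi, xi in zip(f, x)) / n
variance = sum(fi * (xi - mean) ** 2 for fi, xi in zip(f, x)) / n
std_dev = math.sqrt(variance)
cv = std_dev / mean * 100                             # in percent

print(f"n = {n:.2f}, mean = {mean:.2f}")
print(f"mean deviation = {mean_dev:.2f}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
print(f"CV = {cv:.2f}%")
```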
CORRELATION
Correlation analysis helps in making business
and economic decisions.
It is helpful in identifying the nature of the relationship among many
business and economic variables.
It shows whether one variable depends on another and can be estimated from it.
The Coefficient of Correlation (r) is a measure of the strength of
the linear relationship between two variables.
It requires interval or ratio-scaled data (variables).
It can range from -1.00 to 1.00.
Values of -1.00 or 1.00 indicate perfect correlation.
Values close to 0.0 indicate no linear correlation.
Negative values indicate an inverse relationship and positive
values indicate a direct relationship.
DATA TABLE
X    Y     X - X̄    Y - Ȳ    (X - X̄)(Y - Ȳ)    (X - X̄)²    (Y - Ȳ)²
3    41    -2       -14      28                4           196
7    76    2        21       42                4           441
6    56    1        1        1                 1           1
5    78    0        23       0                 0           529
2    43    -3       -12      36                9           144
1    34    -4       -21      84                16          441
X̄ = 5, Ȳ = 55
Σ(X - X̄)(Y - Ȳ) = 191
Σ(X - X̄)² = 34
Σ(Y - Ȳ)² = 1752
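VALUE OF 'r'
$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} = \frac{191}{\sqrt{34 \times 1752}} = 0.78$
Comment: As the value of r is positive and fairly
close to 1, the variables have a fairly strong
direct relationship.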
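The calculation of r can be sketched as follows. The function takes the means as explicit arguments so that it can reproduce the data table, which takes deviations about X̄ = 5 and Ȳ = 55; note that the six X values listed average 4, so recomputing the means from the data gives a somewhat different coefficient. The function name `correlation` is an assumption for this example.

```python
import math

# The six (X, Y) pairs from the data table.
x = [3, 7, 6, 5, 2, 1]
y = [41, 76, 56, 78, 43, 34]


def correlation(x, y, x_mean, y_mean):
    """Coefficient of correlation from deviations about the given means."""
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Means as stated in the data table (taken as given, not recomputed).
r_table = correlation(x, y, x_mean=5, y_mean=55)

# Means computed from the six listed pairs.
r_data = correlation(x, y, sum(x) / len(x), sum(y) / len(y))

print(round(r_table, 2), round(r_data, 2))
```

Either way the coefficient is positive and well above zero, so the qualitative conclusion of a direct relationship is unchanged.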
REGRESSION
A regression is a statistical analysis assessing the association
between two variables. It is used to find the relationship
between two variables.
General form of linear regression model
Y = a + bX + e
Where,
Y : dependent variable
a : intercept term
b : slope of the line
X : independent variable
e : error term
We want to estimate a and b such that Σe² is minimum.
REGRESSION ANALYSIS
X    Y     X - X̄    Y - Ȳ    (X - X̄)(Y - Ȳ)    (X - X̄)²
3    41    -2       -14      28                4
7    76    2        21       42                4
6    56    1        1        1                 1
5    78    0        23       0                 0
2    43    -3       -12      36                9
1    34    -4       -21      84                16
X̄ = 5, Ȳ = 55
Σ(X - X̄)(Y - Ȳ) = 191
Σ(X - X̄)² = 34
REGRESSION ANALYSIS
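So,
$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$ and $a = \bar{Y} - b\bar{X}$
After putting in the values,
b = 191 / 34 = 5.6
a = 55 - 5.6(5) = 27
The fitted linear regression model is Y = 27 + 5.6X.
Here the regression coefficient is 5.6, which means that if the independent
variable increases by 1 unit, the dependent variable is expected to increase by 5.6 units.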
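The least-squares estimates can be sketched in the same way. As in the regression table, the deviations are taken about the stated means of 5 and 55, which are passed in as explicit assumptions; the names `fit_line` and `predict` are hypothetical. The sketch keeps full precision for b, whereas the slide rounds b to 5.6 before computing a.

```python
# The six (X, Y) pairs from the regression table.
x = [3, 7, 6, 5, 2, 1]
y = [41, 76, 56, 78, 43, 34]


def fit_line(x, y, x_mean, y_mean):
    """Return (a, b) for Y = a + bX using deviations about the given means."""
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx              # slope
    a = y_mean - b * x_mean    # intercept
    return a, b


# Means as stated in the regression table (taken as given).
a, b = fit_line(x, y, x_mean=5, y_mean=55)


def predict(x_value):
    """Predicted Y for a given X under the fitted line."""
    return a + b * x_value


print(f"Y = {a:.1f} + {b:.1f}X")
print(predict(4))
```

By construction the fitted line passes through the point formed by the two means, so `predict(5)` returns 55.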
THANK YOU