Measure of Central tendency

Statistics is the branch of mathematics that studies data. Data are raw resources that have been collected and must be processed into meaningful information. In short, data are measured values, and they can be classified from four different perspectives.

Based on collection (Primary and Secondary)
Based on scale of measurement (Nominal, Ordinal, Interval, Ratio)
Based on nature (Qualitative and Quantitative)
Based on distribution (Individual, Discrete, Continuous)

Based on these data, there are two common types of statistics.

Descriptive statistics
Inferential statistics

Descriptive statistics

Descriptive statistics collects, organizes and summarizes information. Examples are a bar graph and the mean.

Inferential statistics

Inferential statistics uses the current data to draw conclusions or make predictions beyond that data. Examples are a hypothesis test and regression analysis.

Measure of Central tendency

A measure of central tendency is a statistic that summarizes an entire quantitative data set in a single value (a representative value of the data set) located where the data tend to concentrate. The tendency of the observations to cluster in the central part of the data is called the central tendency. Because it measures the central location (or position) of the data set, it is also called a measure of location, or an average.

NOTE

A measure of central tendency always lies within the range of the data set.
A measure of central tendency remains unchanged when the data set is rearranged.

The most common measures of central tendency are:

Mean
Median, Quartile, Decile, Percentile
Mode

Mean

The mean is a measure of central tendency that uses every data value to give a single best value. The arithmetic mean, or simply the mean, is also known as the average; it is obtained by dividing the sum of all the observations by the total number of observations.
It is denoted by $\bar{X}$ and defined as follows.

	Individual data	Discrete data	Continuous data
Arithmetic Mean	$\bar{X}=\frac{\sum{x}}{n}$	$\bar{X}=\frac{\sum{fx}}{n}$	$\bar{X}=\frac{\sum{fm}}{n}$
Geometric Mean	$\bar{X}= \left (\prod x \right)^{\frac{1}{n}}$	$\bar{X}=\left (\prod x^{f} \right )^{\frac{1}{n}}$	$\bar{X}=\left( \prod m^{f} \right )^{\frac{1}{n}}$
Harmonic Mean	$\bar{X}= \frac{n}{\sum \left( \frac{1}{x}\right )}$	$\bar{X}=\frac{n}{\sum \left( \frac{f}{x}\right )}$	$\bar{X}=\frac{n}{\sum \left( \frac{f}{m}\right )}$
Weighted Mean	$\bar{X}= \frac{\sum (w.x)}{\sum w}$	$\bar{X}= \frac{\sum (w.x)}{\sum w}$	$\bar{X}= \frac{\sum (w.x)}{\sum w}$

The common types of mean are:

Arithmetic Mean (AM)
The arithmetic mean answers the question, “if all the quantities had the same value, what would that value have to be to achieve the same total?” For example, if Ram has Rs 100 and Shyam has Rs 120, then the average amount is the AM, given by
$AM =\frac{a+b}{2} =\frac{100+120}{2} =110$, that is, Rs 110.
In the figure below, $a+b$ is the same as $AM+AM$.
Geometric Mean (GM)
The geometric mean answers the question, “if all the quantities had the same value, what would that value have to be to achieve the same product?” The geometric mean is useful when the data change by percentages, rates of change or ratios. It is used in finance to determine average growth rates, also known as the compounded annual growth rate. For example, if Ram deposits Rs 100 in a bank and it grows by 80% in the first year and by 25% in the second year, then the average growth factor per year is the GM, given by
$GM =\sqrt{1.80 \times 1.25}=1.50$, so the average growth is 50% per year.
Please note that the situation can NOT be explained by $\frac{80+25}{2} =52.5\%$.
In the figure below, $a \times b$ is the same as $GM \times GM$.
Harmonic Mean (HM)
The harmonic mean is used to average rates, such as the average speed over several equal distances. For example, if Ram travels 100 km with a fuel efficiency of 25 km per liter and the next 100 km with a fuel efficiency of 16 km per liter, then the average fuel efficiency is the HM, given by
$HM =\frac{2 \times 25 \times 16}{25+16}=19.51$ km per liter
Please note that the situation can NOT be explained by the AM or the GM, because
the fuel used for the first 100 km is $\frac{100}{25}=4$ liters,
the fuel used for the second 100 km is $\frac{100}{16}=6.25$ liters,
so the overall fuel efficiency is
$\frac{200}{4+6.25}=19.51$ km per liter.
Weighted Mean (WM)
A weighted mean is a kind of average where some data points contribute more “weight” than others. If all the weights are equal, then the weighted mean equals the arithmetic mean.
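The four means described above can be sketched in a few lines of Python. This is only an illustration: the function names are chosen here and are not part of the original text, and the inputs are the Ram and Shyam figures from the examples (growth rates are entered as growth factors 1.80 and 1.25).

```python
import math

def arithmetic_mean(xs):
    # Same total: sum of the values divided by how many there are.
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # Same product: n-th root of the product of the values.
    return math.prod(xs) ** (1 / len(xs))

def harmonic_mean(xs):
    # Reciprocal of the arithmetic mean of the reciprocals.
    return len(xs) / sum(1 / x for x in xs)

def weighted_mean(xs, ws):
    # Each value counts in proportion to its weight.
    return sum(w * x for x, w in zip(xs, ws)) / sum(ws)

am = arithmetic_mean([100, 120])         # 110.0 (Rs)
gm = geometric_mean([1.80, 1.25])        # about 1.5, i.e. 50% average growth
hm = harmonic_mean([25, 16])             # about 19.51 km per liter
wm = weighted_mean([100, 120], [1, 1])   # equal weights give the AM again
```

With equal weights the weighted mean reproduces the arithmetic mean, as stated above.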

Application of Mean

The mean is calculated from all data values, so it is affected by each and every value of the data set. It is applicable if the data are

Quantitative data
Closed ended
Normally distributed data

Relation between AM, GM and HM

Let a and b be two positive numbers. Then

$GM^2=AM \times HM$
$AM \ge GM \ge HM$: the arithmetic mean is greater than or equal to the geometric mean, and the geometric mean is greater than or equal to the harmonic mean.

Here
$AM=\frac{a+b}{2}, GM=\sqrt{ab}, HM=\frac{2ab}{a+b}$
The proofs are as follows:

Now, we have
$GM^2=ab$
or $GM^2=\frac{a+b}{2} \times \frac{2ab}{a+b}$
or $GM^2=AM \times HM$
Now, we have
$AM-GM=\frac{a+b}{2}-\sqrt{ab}$
or $AM-GM=\frac{a+b-2\sqrt{ab}}{2}$
or $AM-GM=\frac{(\sqrt{a})^{2}+(\sqrt{b})^{2}-2\sqrt{a}\sqrt{b}}{2}$
or $AM-GM=\frac{( \sqrt{a}-\sqrt{b} )^{2}}{2} \ge 0$, because a square is never negative
or $AM\ge GM$ (1)
Similarly,
$GM-HM=\sqrt{ab}-\frac{2ab}{a+b}$
or $GM-HM=\frac{\sqrt{ab}( a+b )-2ab}{a+b}$
or $GM-HM=\frac{\sqrt{ab}( a+b )-2\sqrt{ab}\sqrt{ab}}{a+b}$
or $GM-HM=\frac{\sqrt{ab}}{a+b}( a+b-2\sqrt{ab} )$
or $GM-HM=\frac{\sqrt{ab}}{a+b}( \sqrt{a}-\sqrt{b} )^{2} \ge 0$, because every factor is non-negative
or $GM\ge HM$ (2)
Combining (1) and (2), we get
$AM\ge GM\ge HM$
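As a numerical sanity check of the two results just proved, the following sketch evaluates the three means for a few pairs of positive numbers. The function name and the sample pairs are assumptions made for illustration; a check on a handful of pairs does not replace the proof.

```python
import math

def pythagorean_means(a, b):
    """Return (AM, GM, HM) of two positive numbers a and b."""
    am = (a + b) / 2
    gm = math.sqrt(a * b)
    hm = 2 * a * b / (a + b)
    return am, gm, hm

# Illustrative pairs; (7, 7) shows the equality case a = b.
for a, b in [(100, 120), (25, 16), (4, 9), (7, 7)]:
    am, gm, hm = pythagorean_means(a, b)
    # GM^2 = AM x HM, up to floating-point rounding
    assert math.isclose(gm ** 2, am * hm)
    # AM >= GM >= HM, with a tiny tolerance for rounding
    assert am >= gm - 1e-12 and gm >= hm - 1e-12
```

For a = 4 and b = 9 the three values are 6.5, 6 and about 5.54, so the ordering is strict whenever the two numbers differ.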

Visualization of the proof

Suppose that a and b are two given positive numbers. Now, draw a semicircle with diameter $a+b$.

Visualization of AM
Since the radius is half of the diameter, the radius of the semicircle is the arithmetic mean
$AM =\frac{a+b}{2}$
Visualization of GM
By the mean proportional property (squaring a rectangle), DQ is the geometric mean, given by
$GM =\sqrt{ab}$
This follows by proportionality: triangles ADQ and QDB are similar with $AD=a$, $DB=b$ and $DQ=GM$, so we have
$\frac{GM}{a}=\frac{b}{GM}$
or $GM= \sqrt{ab}$
Visualization of HM
Again, by using the property of similarity on OCDE, QR is the harmonic mean, given by
$HM =\frac{2ab}{a+b}$
This follows by proportionality: triangles DRQ and ODQ are similar with $QR=HM$, $QD=\sqrt{ab}$ and $OQ=\frac{a+b}{2}$ (the radius), so we have
$\frac{HM}{\sqrt{ab}}=\frac{\sqrt{ab}}{\frac{a+b}{2}}$
or $HM= \frac{2ab}{a+b}$

Example 1

Find the mean of the numbers 3, −7, 5, 13, −2
The sum of the numbers is
$\sum X= 3 - 7 + 5 + 13 - 2 = 12$
There are 5 numbers, so n=5.
Hence, the mean of the numbers is
$\bar{X}=\frac{\sum X}{n}=\frac{12}{5}=2.4$

Example 2

Find the mean of the wages from the following data

Wages	50	70	90	110	130	150
Number of Workers	2	4	5	6	2	1

Based on the data given above, the frequency table is prepared as below.

Wages $(X)$	Number of workers $(f)$	$f.x$
50	2	100
70	4	280
90	5	450
110	6	660
130	2	260
150	1	150
	$\sum f=n=20$	$\sum f x=1900$

Based on the formula, the mean wage is
$\bar{X}=\frac{\sum fx}{n}=\frac{1900}{20}=95$

Example 3

Find the average marks from the following data

Marks of the Students	0-20	20-40	40-60	60-80	80-100
Number of Students	20	50	55	40	15

Based on the data given above, the frequency table is prepared as below.

Marks of students $(X)$	Mid value of marks $m$	Number of students $(f)$	$f.m$
0-20	10	20	200
20-40	30	50	1500
40-60	50	55	2750
60-80	70	40	2800
80-100	90	15	1350
		$\sum f=n=180$	$\sum fm=8600$

Based on the formula, the average mark is
$\bar{X}=\frac{\sum fm}{n}=\frac{8600}{180}=47.8$
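The frequency-table calculations of Example 2 and Example 3 can be reproduced with a short sketch. The helper names below are chosen for illustration only; for continuous data each class is represented by its mid value, exactly as in the table above.

```python
def frequency_mean(values, freqs):
    """Mean of a frequency table: sum(f * x) / sum(f)."""
    n = sum(freqs)
    return sum(f * x for x, f in zip(values, freqs)) / n

def class_midpoints(classes):
    """Mid value m of each class given as (lower, upper)."""
    return [(lower + upper) / 2 for lower, upper in classes]

# Example 2: discrete data (wages and number of workers)
wages = [50, 70, 90, 110, 130, 150]
workers = [2, 4, 5, 6, 2, 1]
mean_wage = frequency_mean(wages, workers)          # 1900 / 20 = 95.0

# Example 3: continuous data (mark classes and number of students)
mark_classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
students = [20, 50, 55, 40, 15]
mean_mark = frequency_mean(class_midpoints(mark_classes), students)  # 8600 / 180
```

The second result is about 47.78, which rounds to the 47.8 obtained by hand.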

Median

The median is a measure of central tendency that uses the middle of the data to give a single best value. It is the middle value of the data series when the values are placed in order of magnitude (ascending or descending). Therefore, the median is not affected by extreme values. It is denoted by $M_d$ and defined as follows.

	Individual	Discrete	Continuous
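Median	$M_d=\frac{n+1}{2} \text{th item}$	$M_d=\frac{n+1}{2} \text{th item}$	$M_d \text{ class}=\frac{n}{2} \text{th item}$ with $M_d=L+\frac{\frac{n}{2}-cf}{f} \times i$
Here $L$ is the lower limit of the median class, $cf$ is the cumulative frequency of the class before it, $f$ is the frequency of the median class and $i$ is the class width.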

Calculating the median is also very simple. Here are the steps:

Sort the data in ascending order.
Find the middle number of the sorted data.
If there’s an odd number of data, get the value exactly in the middle.
If there’s an even number of data, get the mean of the two middle values.
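The steps above can be sketched as a small function. The name `median` is chosen here for illustration; the first call uses the wage figures (in hundreds) from the first median example of this section, and the second uses a made-up even-sized list to show the averaging step.

```python
def median(data):
    ordered = sorted(data)          # step 1: sort in ascending order
    n = len(ordered)
    mid = n // 2                    # step 2: locate the middle
    if n % 2 == 1:
        return ordered[mid]         # odd count: the middle value itself
    # even count: mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median([40, 30, 35, 42, 32, 45, 48])   # 40
even_case = median([1, 2, 3, 4])                  # 2.5
```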

Application of Median: The median ignores how far individual values lie from the center; it only splits the ordered data into two equal parts. It is applicable if the distribution represents

Qualitative data that can be ranked (ordinal data)
Open ended or Skewed data

Example 1

Find the median of the following wages(in hundreds): $40,30,35,42,32,45,48$
Given wages (in hundreds) are
$40,30,35,42,32,45,48$
Arranging the wages in ascending order, we get

30	32	35	40	42	45	48
1st item	2nd item	3rd item	4th item	5th item	6th item	7th item

Here, the number of observations is $n=7$; thus, based on the formula, the median is
$M_d= \left (\frac{n+1}{2} \right )^{th}$ item
or $M_d= \left (\frac{7+1}{2} \right )^{th}$ item
or $M_d= 4^{th}$ item
or $M_d= 40$ hundreds

Example 2

Find the median of the wages from the following data

Wages	50	70	90	110	130	150
Number of Workers	2	4	5	6	2	1

Based on the data given above, the frequency table is prepared as below.

Wages $X$	Number of Workers $f$	Cumulative frequency $cf$
50	2	2
70	4	6
90	5	11
110	6	17
130	2	19
150	1	20

Here, the number of observations is $n=20$; thus, based on the formula, the median is
$M_d= \left (\frac{n+1}{2} \right )^{th}$ item
or $M_d= \left (\frac{20+1}{2} \right )^{th}$ item
or $M_d= 10.5^{th}$ item
The $10.5^{th}$ item lies in the $cf$ of 11, so $M_d= 90$
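The cumulative-frequency lookup used in this example can be sketched as follows. The function name is an assumption, and the rule coded here is the one applied above: take the first value whose cumulative frequency reaches the $\frac{n+1}{2}$th position.

```python
from itertools import accumulate

def discrete_median(values, freqs):
    """Median of a discrete frequency table (values sorted ascending)."""
    n = sum(freqs)
    position = (n + 1) / 2                      # e.g. 10.5 for n = 20
    # accumulate() yields the running totals, i.e. the cf column
    for value, cf in zip(values, accumulate(freqs)):
        if cf >= position:
            return value

wages = [50, 70, 90, 110, 130, 150]
workers = [2, 4, 5, 6, 2, 1]
median_wage = discrete_median(wages, workers)   # 90
```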

Example 3

Find the median marks from the following data

Marks of the Students	0-20	20-40	40-60	60-80	80-100
Number of Students	20	50	55	40	15

Based on the data given above, the frequency table is prepared as below.

Marks of the Students $X$	Number of Students $f$	Cumulative frequency $cf$
0-20	20	20
20-40	50	70
40-60	55	125
60-80	40	165
80-100	15	180

Here, the number of observations is $n=180$; thus, based on the formula, the median class is
Md Class $= \left (\frac{n}{2} \right )^{th}$ item
or Md Class $= \left (\frac{180}{2} \right )^{th}$ item
or Md Class $= 90^{th}$ item
Here, the $90^{th}$ item lies in the $cf$ of 125, so the median class is $40-60$ and
$L=40,f=55, cf=70,i=20$
Hence, the median is
$M_d=L+\frac{\frac{n}{2}-cf}{f} \times i$
or $M_d=40+\frac{\frac{180}{2}-70}{55} \times 20=47.27$
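The interpolation formula used here can be written as a short function. This is a sketch under the assumptions of the example: classes are given as (lower, upper) pairs in ascending order, the median class is the first class whose cumulative frequency reaches $\frac{n}{2}$, and the function name is chosen for illustration.

```python
def grouped_median(classes, freqs):
    """Median of grouped data: L + (n/2 - cf) / f * i."""
    n = sum(freqs)
    target = n / 2
    cf_before = 0                   # cumulative frequency before the class
    for (lower, upper), f in zip(classes, freqs):
        if cf_before + f >= target:
            width = upper - lower   # class width i
            return lower + (target - cf_before) / f * width
        cf_before += f

mark_classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
students = [20, 50, 55, 40, 15]
median_mark = grouped_median(mark_classes, students)   # about 47.27
```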

Example 4

Find the median marks from the following data

Marks of the Students	0-20	20-40	40-60	60-80	80-100
Number of Students	2	3	5	4	6

Based on the data given above, the frequency table is prepared as below.

Marks of the Students $X$	Number of Students $f$	Cumulative frequency $cf$
0-20	2	2
20-40	3	5
40-60	5	10
60-80	4	14
80-100	6	20

Here, the number of observations is $n=20$; thus, based on the formula, the median class is
Md Class $= \left (\frac{n}{2} \right )^{th}$ item
or Md Class $= \left (\frac{20}{2} \right )^{th}$ item
or Md Class $= 10^{th}$ item
Here, the $10^{th}$ item lies in the $cf$ of 10, so the median class is $40-60$ and
$L=40,f=5, cf=5,i=20$
Hence, the median is
$M_d=L+\frac{\frac{n}{2}-cf}{f} \times i$
or $M_d=40+\frac{\frac{20}{2}-5}{5} \times 20=60$
NOTE
In the example above, students may ask why the median $60$ does not lie strictly inside the median class $40-60$, as the class grouping convention would suggest; it falls exactly on the class boundary. Teachers need to encourage students to apply the usual rule for computing the median.

Quartile, Decile and Percentile

The formulas for Quartile, Decile and Percentile are similar to that of the Median.

	Individual	Discrete	Continuous
Quartile $k=1,2,3$	$Q_k=\frac{k(n+1)}{4} \text{th item}$	$Q_k=\frac{k(n+1)}{4} \text{th item}$	$Q_{k-class}=\frac{kn}{4} \text{th item}$ with $Q_k=L+\frac{\frac{kn}{4}-cf}{f} \times i$
Decile $k=1,2,\cdots 9$	$D_k=\frac{k(n+1)}{10} \text{th item}$	$D_k=\frac{k(n+1)}{10} \text{th item}$	$D_{k-class}=\frac{kn}{10} \text{th item}$ with $D_k=L+\frac{\frac{kn}{10}-cf}{f} \times i$
Percentile $k=1,2,\cdots 99$	$P_k=\frac{k(n+1)}{100} \text{th item}$	$P_k=\frac{k(n+1)}{100} \text{th item}$	$P_{k-class}=\frac{kn}{100} \text{th item}$ with $P_k=L+\frac{\frac{kn}{100}-cf}{f} \times i$
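Because the three continuous-data formulas differ only in the divisor (4, 10 or 100), one function can cover them all. The sketch below is an illustration, not a standard library routine: the function name and the `parts` parameter are assumptions, and the data are the marks classes used in the median examples of this section.

```python
def partition_value(classes, freqs, k, parts):
    """k-th partition value of grouped data: L + (k*n/parts - cf) / f * i.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    n = sum(freqs)
    target = k * n / parts
    cf_before = 0                   # cumulative frequency before the class
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cf_before + f >= target:
            return lower + (target - cf_before) / f * (upper - lower)
        cf_before += f
    raise ValueError("k must satisfy 0 < k <= parts")

mark_classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
students = [20, 50, 55, 40, 15]

q1 = partition_value(mark_classes, students, 1, 4)      # 30.0
q3 = partition_value(mark_classes, students, 3, 4)      # 65.0
d5 = partition_value(mark_classes, students, 5, 10)     # same as the median
p50 = partition_value(mark_classes, students, 50, 100)  # same as the median
```

The second quartile, fifth decile and fiftieth percentile all coincide with the median, which is a quick way to check the divisors.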

Mode

The concept of mode, as a measure of central tendency, is preferable when it is desired to know the most typical value, e.g., the most common size of shoes, the most common size of a ready-made garment, the most common size of income, the most common size of pocket expenditure of a college student, the most common size of a family in a locality, the most common duration of cure of viral-fever, the most popular candidate in an election, etc.
Thus, the mode is a measure of central tendency that uses the most fashionable (most repeated) value to give a single best value. It is the value which occurs most frequently in a set of data, i.e. it indicates the most common result. It is not affected by every value of the data set. It is denoted by $M_0$ and defined as follows.

	Individual	Discrete	Continuous
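Mode	Repeated data	Repeated data/ Table analysis	$M_0=L+\frac{f_1-f_0}{2f_1-f_0-f_2} \times i$ or $M_0=L+\frac{f_2}{f_0+f_2} \times i$
Here $L$ is the lower limit of the modal class, $f_1$ is its frequency, $f_0$ and $f_2$ are the frequencies of the classes before and after it, and $i$ is the class width.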

Application of Mode: The mode ignores every value in the collection except the one that appears most frequently. It is most applicable when dealing with

Frequency related data
Fashionable data

Example 1

Find the mode value of the following data: $3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29$
The given data set is
$3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29$
As a frequency table, the data set becomes

X	3	5	7	12	13	14	20	23	29	39	40	56
f	1	1	1	1	1	1	1	4	1	1	1	1

Since the highest frequency is 4, the mode is $23$.

Example 2

Find the mode of the wages from the following data

Wages	50	70	90	110	130	150
Number of Workers	2	4	5	6	2	1

Since the highest frequency is 6, the mode is $110$.

Example 3

Find the mode from the following data

Wages	0-10	10-20	20-30	30-40	40-50	50-60	60-70
Number of Workers	4	12	15	18	32	14	13

Since the highest frequency is 32, the modal class is $40-50$. Thus,
$L=40,f_0=18,f_1=32,f_2=14,i=10$
Hence, using the formula, the mode is
$M_0=L+\frac{f_1-f_0}{2f_1-f_0-f_2} \times i$
or $M_0=40+\frac{32-18}{2 \times 32-18-14} \times 10=40+\frac{14}{32} \times 10=44.38$
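The modal-class formula can be evaluated with a short function. This is a sketch: the function name is chosen here, the modal class is taken to be the first class with the highest frequency, and a missing neighbor at either end of the table is treated as frequency 0, which is an assumption not discussed in the text.

```python
def grouped_mode(classes, freqs):
    """Mode of grouped data: L + (f1 - f0) / (2*f1 - f0 - f2) * i."""
    m = freqs.index(max(freqs))                       # modal class index
    f1 = freqs[m]
    f0 = freqs[m - 1] if m > 0 else 0                 # class before (assumed 0 if none)
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0    # class after (assumed 0 if none)
    lower, upper = classes[m]
    # Note: the denominator is zero when f0 = f1 = f2; not handled here.
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)

wage_classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
workers = [4, 12, 15, 18, 32, 14, 13]
mode_wage = grouped_mode(wage_classes, workers)   # 40 + 14/32 * 10 = 44.375
```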

Analytical method to find the Mode

If the frequency distribution is regular, then the mode is determined by the value corresponding to the maximum frequency. The frequency distribution may, however, NOT be regular, meaning that the concentration of observations around the value having maximum frequency is less than the concentration of observations around some other value. In such a situation, the mode cannot be determined by the maximum frequency criterion. Further, observations may concentrate around more than one value of the variable, and the distribution is then said to be bi-modal or multi-modal depending upon whether it is around two or more than two values. In such cases, we use the analytical method (also called the tabular or grouping method) to find the mode.
This method is used when the mode of the given series is unclear, or in the following situations:

When more than one value has the highest frequency
When the highest frequency lies at the beginning or towards the end of the data
When there are large frequencies around the highest frequency
When the frequencies rise and fall irregularly

In such situations, either the empirical method (Mode = 3 Median - 2 Mean) or the analytical method can be used to find the mode. Of the two, the analytical method is considered more reliable.

Example 4

Find the mode from the following data

Wages	10	20	30	40	50	60	70	80	90
Number of Workers	1	5	17	22	21	20	9	3	4

Here, the maximum frequency is $22$; however, there are large frequencies around 22, so we use the analytical method to find the mode.
Hence, based on the rule, the analytic table is given as below.

Wages	$f$	1st + 2nd	2nd+ 3rd	1st+2nd+3rd	2nd+3rd+4th	3rd+4th+5th
10	1
		6
20	5			23
			22
30	17				44
		39
40	22					60
			43
50	21			63
		41
60	20				50
			29
70	9					32
		12
80	3			16
			7
90	4

Prepare a table consisting of 7 columns: the 1st column for X and the 2nd column for the frequencies of X.
In the third column, add the frequencies in twos, starting from the top.
In the fourth column, add the frequencies in twos, starting from the second.
In the fifth column, add the frequencies in threes, starting from the top.
In the sixth column, add the frequencies in threes, starting from the second.
In the seventh column, add the frequencies in threes, starting from the third.
Finally, prepare a frequency chart based on the analytic table: for each of the six frequency columns, mark the values of X that make up its largest entry, then count the marks for each value.

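Based on the analytic table, the frequency chart is prepared as below. Its rows 1 to 6 correspond to the six frequency columns (the 2nd to 7th columns) of the analytic table.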

Column	10	20	30	40	50	60	70	80	90
1				1
2					1	1
3				1	1
4				1	1	1
5					1	1	1
6			1	1	1
Total			1	4	5	3	1

Here, the highest total is aligned with 50; therefore, Mode = 50.
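The grouping procedure of this example can be sketched in code, which also makes the tallies easy to verify. The function name and the way ties are handled (every group that ties for the largest sum is marked) are assumptions made for this illustration.

```python
def grouping_mode(values, freqs):
    """Grouping (analytical) method: tally how often each value belongs
    to the largest entry of the six frequency columns."""
    n = len(freqs)
    tallies = [0] * n
    # (group size, starting index) for the six frequency columns:
    # single f, pairs from 1st, pairs from 2nd, triples from 1st, 2nd, 3rd
    for size, start in [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]:
        groups = [range(i, i + size) for i in range(start, n - size + 1, size)]
        sums = [sum(freqs[j] for j in group) for group in groups]
        best = max(sums)
        for group, total in zip(groups, sums):
            if total == best:
                for j in group:
                    tallies[j] += 1
    mode = values[tallies.index(max(tallies))]
    return mode, dict(zip(values, tallies))

wages = [10, 20, 30, 40, 50, 60, 70, 80, 90]
workers = [1, 5, 17, 22, 21, 20, 9, 3, 4]
mode, tallies = grouping_mode(wages, workers)
# mode is 50; the tallies for 30, 40, 50, 60, 70 are 1, 4, 5, 3, 1
```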

Relation between Mean Median and Mode

A distribution in which the values of mean, median and mode coincide (i.e. mean = median = mode) is known as a symmetrical distribution.
Conversely, when the values of mean, median and mode are not equal, the distribution is known as an asymmetrical or skewed distribution. In a moderately skewed distribution, a very important empirical relationship exists among these three measures of central tendency:
Mode = 3 Median – 2 Mean
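The empirical relation is simple enough to state as a one-line function. The sketch below is only an illustration: the function name is chosen here, and the inputs are the rounded mean (47.78) and median (47.27) computed for the students' marks data in the earlier examples. The result is an approximation that is only meaningful for moderately skewed data.

```python
def empirical_mode(mean, median):
    """Approximate mode from the relation Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean

# Marks data from the earlier examples: mean about 47.78, median about 47.27
approx_mode = empirical_mode(47.78, 47.27)   # about 46.25

# In a symmetrical distribution mean = median, so the relation returns the same value
symmetric_mode = empirical_mode(50, 50)      # 50
```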
