Descriptive statistics is the branch of statistics that characterizes the features of known data by summarizing either population or sample data. Its counterpart, inferential statistics, is the other major branch of statistics and is used to draw conclusions about a population.
In this article, we will learn about descriptive statistics, including its types, formulae, and examples in detail.

What is Descriptive Statistics?

Descriptive statistics is a branch of statistics focused on summarizing, organizing, and presenting data in a clear and understandable way. Its primary aim is to describe and analyze the fundamental characteristics of a dataset without generalizing beyond the data at hand. The main purpose of descriptive statistics is to provide a straightforward and concise overview of the data, enabling researchers or analysts to gain insights and understand patterns, trends, and distributions within the dataset.

Descriptive statistics typically involve measures of central tendency (such as mean, median, mode), dispersion (such as range, variance, standard deviation), and distribution shape (including skewness and kurtosis).

Additionally, graphical representations like charts, graphs, and tables are commonly used to visualize and interpret the data. Histograms, bar charts, pie charts, scatter plots, and box plots are some examples of widely used graphical techniques in descriptive statistics.
Types of Descriptive Statistics

There are three types of descriptive statistics: measures of central tendency, measures of dispersion, and measures of frequency distribution.
Measures of Central Tendency

A measure of central tendency is a statistical measure that describes a complete distribution or dataset with a single representative value. Each measure of central tendency summarizes the data in a different way, so the best choice depends on the data. In the following sections, we will look at the central tendency measures, their formulae, and worked examples.
Mean

Mean is the sum of all the items in a group or collection divided by the number of items in that group or collection. The mean of a data collection is typically represented as x̄ (pronounced "x bar"). For a series of ungrouped observations, the mean is calculated as follows:

x̄ = Σx / n

where,
Σx is the sum of all observations
n is the number of observations

Example: Weights of 7 girls in kg are 54, 32, 45, 61, 20, 66 and 50. Determine the mean weight for the provided collection of data.

Solution: Σx = 54 + 32 + 45 + 61 + 20 + 66 + 50 = 328 and n = 7, so x̄ = 328 / 7 ≈ 46.86 kg.
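The mean calculation above can be checked with a short Python sketch. The variable names (`weights`, `mean_weight`) are chosen here for illustration and are not part of the original article.

```python
# Weights (in kg) of the seven girls from the example.
weights = [54, 32, 45, 61, 20, 66, 50]

# Mean = sum of all observations / number of observations.
total = sum(weights)
n = len(weights)
mean_weight = total / n

print(f"Sum = {total}, n = {n}")        # Sum = 328, n = 7
print(f"Mean = {mean_weight:.2f} kg")   # Mean = 46.86 kg
```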
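Median

Median of a data set is the value of the middle-most observation obtained after arranging the data in ascending order. It is one of the measures of central tendency. A median formula exists for both grouped and ungrouped data. For ungrouped data arranged in ascending order:

If n is odd: Median = ((n + 1) / 2)th observation
If n is even: Median = [(n / 2)th observation + ((n / 2) + 1)th observation] / 2

where,
n is the number of observations

Example: Weights of 7 girls in kg are 54, 32, 45, 61, 20, 66 and 50. Determine the median weight for the provided collection of data.

Solution: Arranged in ascending order, the data are 20, 32, 45, 50, 54, 61, 66. Since n = 7 is odd, the median is the ((7 + 1) / 2)th = 4th observation, which is 50 kg.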
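As a rough sketch, the odd/even rule for ungrouped data can be written as a small Python function. The function name `median_ungrouped` and the extra value 70 used for the even-length case are illustrative assumptions, not part of the original example.

```python
import statistics


def median_ungrouped(values):
    """Median of ungrouped data: sort, then take the middle value(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the single middle observation.
        return ordered[mid]
    # Even n: average of the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


weights = [54, 32, 45, 61, 20, 66, 50]
odd_case = median_ungrouped(weights)            # 7 observations
even_case = median_ungrouped(weights + [70])    # 8 observations (70 is made up)

print(odd_case, even_case)
# The standard library gives the same answer for the example data.
print(statistics.median(weights))
```

Sorting first matters: taking the middle element of the unsorted list would return 61 here rather than the median.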
Mode

Mode is one of the measures of central tendency. It is defined as the value that appears most frequently in the data, i.e. the observation with the highest frequency. For ungrouped data:

Mode = the observation with the highest frequency

Example: Weights of 7 girls in kg are 54, 32, 45, 61, 20, 45 and 50. Determine the mode weight for the provided collection of data.

Solution: The value 45 occurs twice and every other value occurs once, so the mode is 45 kg.
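One possible way to find the mode in Python is to count how often each value occurs. This sketch uses `collections.Counter` and returns a list, since a dataset can have more than one mode; the variable names are illustrative.

```python
from collections import Counter

weights = [54, 32, 45, 61, 20, 45, 50]

# Count how many times each weight appears.
counts = Counter(weights)
highest = max(counts.values())

# Keep every value that reaches the highest frequency.
modes = [value for value, count in counts.items() if count == highest]

print(modes)  # [45]
```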
Measures of Dispersion

When the variability of data needs to be quantified, absolute measures of dispersion are used. These measures describe how far the observations in a dataset lie from one another or from a central value, typically in terms of deviations of the observations. The most common absolute measures of dispersion are described below, together with their formulae.
Range

The range represents the spread of your data from the lowest to the highest value in the distribution. It is the most straightforward measure of variability to compute. To get the range, subtract the data set's lowest value from its highest value.

Range = Highest value - Lowest value

Example: Calculate the range of the following data series: 5, 13, 32, 42, 15, 84

Solution: The highest value is 84 and the lowest value is 5, so Range = 84 - 5 = 79.
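In Python, the same range calculation is a one-liner with the built-in `max` and `min` functions. The name `data_range` is used here only to avoid shadowing the built-in `range`.

```python
data = [5, 13, 32, 42, 15, 84]

# Range = highest value - lowest value.
data_range = max(data) - min(data)

print(data_range)  # 79
```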
Standard Deviation

Standard deviation (s or SD) represents the typical level of variability in your dataset. It describes, on average, how far each score lies from the mean. The higher the standard deviation, the more varied the dataset is.

To calculate the standard deviation, follow these six steps:
Step 1: Make a list of each score and calculate the mean.
Step 2: Calculate the deviation from the mean by subtracting the mean from each score.
Step 3: Square each of these deviations.
Step 4: Sum up all the squared deviations.
Step 5: Divide the sum of squared deviations by n - 1.
Step 6: Find the square root of the number obtained in Step 5.

Example: Calculate standard deviation of the following data series: 5, 13, 32, 42, 15, 84.

Solution:
Step 1: Calculate the mean of the series using the formula Σx / n: (5 + 13 + 32 + 42 + 15 + 84) / 6 = 191 / 6 ≈ 31.83
Step 2: Calculate the deviation from the mean by subtracting the mean from each value: -26.83, -18.83, 0.17, 10.17, -16.83, 52.17
Step 3: Square each deviation: 720.03, 354.69, 0.03, 103.36, 283.36, 2721.36
Step 4: Add the squared deviations: ≈ 4182.83
Step 5: Divide the sum of squared deviations by n - 1 => 4182.83 / 5 ≈ 836.57
Step 6: √836.57 ≈ 28.92

So, the standard deviation is 28.92.

Variance

Variance is the average of the squared deviations from the mean (for sample data, the sum of squared deviations divided by n - 1). Variance measures the degree of dispersion in a data collection. The more scattered the data, the larger the variance in relation to the mean. The variance is the square of the standard deviation:

s² = Σ(x - x̄)² / (n - 1)

Example: Calculate the variance of the following data series: 5, 13, 32, 42, 15, 84.

Solution: From the standard deviation example, the sum of squared deviations is about 4182.83 and n = 6, so s² = 4182.83 / 5 ≈ 836.57.
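The six steps can be followed literally in Python. The sketch below uses the sample formula (dividing by n - 1) on the same series and then compares the result with `statistics.variance` and `statistics.stdev` from the standard library, which use the same n - 1 convention. Variable names are illustrative.

```python
import math
import statistics

data = [5, 13, 32, 42, 15, 84]
n = len(data)

# Step 1: mean of the series.
mean = sum(data) / n

# Step 2: deviation of each value from the mean.
deviations = [x - mean for x in data]

# Step 3: square each deviation.
squared = [d ** 2 for d in deviations]

# Step 4: sum of the squared deviations.
sum_squared = sum(squared)

# Step 5: divide by n - 1 to get the sample variance.
variance = sum_squared / (n - 1)

# Step 6: the square root of the variance is the standard deviation.
std_dev = math.sqrt(variance)

print(f"Mean = {mean:.2f}")
print(f"Sum of squared deviations = {sum_squared:.2f}")
print(f"Variance = {variance:.2f}")
print(f"Standard deviation = {std_dev:.2f}")

# Cross-check against the standard library.
print(statistics.variance(data), statistics.stdev(data))
```

Computing the variance first and taking its square root avoids the small rounding drift that appears when a rounded standard deviation is squared again.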
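Mean Deviation

Mean Deviation is the average of the absolute deviations of the data from a central value, which may be the mean, median, or mode. Mean Deviation is sometimes also known as absolute deviation. The formula for mean deviation is given as follows:

Mean Deviation = Σ|x - a| / n

where,
a is the central value (mean, median, or mode)
n is the number of observations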
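As an illustration, the mean deviation formula can be written as a small Python function that accepts the central value as a parameter. The function name `mean_deviation` and the choice of data series (reused from the range example) are assumptions made for this sketch.

```python
import statistics


def mean_deviation(values, center):
    """Average absolute deviation of the values from a chosen central value."""
    return sum(abs(x - center) for x in values) / len(values)


data = [5, 13, 32, 42, 15, 84]

# Mean deviation about the mean and about the median.
md_about_mean = mean_deviation(data, statistics.mean(data))
md_about_median = mean_deviation(data, statistics.median(data))

print(f"About the mean:   {md_about_mean:.2f}")
print(f"About the median: {md_about_median:.2f}")
```

For this particular series both results come out to the same value (about 20.83); in general the two can differ.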
Quartile Deviation

Quartile Deviation is half of the difference between the third and first quartiles. The formula for quartile deviation is given as follows:

Quartile Deviation = (Q3 - Q1) / 2

where,
Q3 is the third quartile
Q1 is the first quartile
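A minimal Python sketch of the quartile deviation is shown below. It relies on `statistics.quantiles` (available from Python 3.8) with its default "exclusive" method. Several conventions for computing quartiles exist, so other tools or textbooks may give slightly different values of Q1 and Q3 for the same data. The data series is reused from the earlier examples.

```python
import statistics

data = [5, 13, 32, 42, 15, 84]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Quartile deviation = half the distance between Q3 and Q1.
quartile_deviation = (q3 - q1) / 2

print(f"Q1 = {q1}, Q3 = {q3}")
print(f"Quartile deviation = {quartile_deviation}")
```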
Other measures of dispersion include the relative measures, also known as the coefficients of dispersion.

Measures of Frequency Distribution

Datasets consist of various scores or values. Statisticians use graphs and tables to summarize how often each possible value of a variable occurs, usually as counts or percentages. For instance, suppose you were conducting a poll to determine people's favorite Beatle. You would create one column listing all the options (John, Paul, George, and Ringo) and another column indicating the number of votes each received.

Univariate Descriptive Statistics

Univariate descriptive statistics focus on one variable at a time. Each variable is examined individually, using several summaries to understand it better. Programs like SPSS and Excel can help with this. Looking only at the mean of a variable, such as how much people earn, might not give the true picture, especially if some people earn far more or far less than others. For that reason, the median (middle value) and the mode (most frequent value) are examined as well. To understand how spread out the values are, we use the standard deviation and variance along with the range.

Bivariate Descriptive Statistics

When we have data on more than one variable, we can use bivariate or multivariate descriptive statistics to see whether the variables are related. Bivariate analysis compares two variables to see whether they change together. Before doing any more complicated tests, it is important to compare the central tendency of the two variables. Multivariate analysis is similar to bivariate analysis, but it looks at more than two variables at once, which helps us understand more complex relationships.

Representations of Data in Descriptive Statistics

Descriptive statistics use a variety of ways to summarize and present data in an understandable manner. This helps us grasp the data set's patterns, trends, and properties.

Frequency Distribution Tables: Frequency distribution tables divide data into categories or intervals and display the number of observations (frequency) that fall into each one. For example, suppose we have a class of 20 students and are tracking their test scores. We may make a frequency distribution table that contains score ranges (e.g., 0-10, 11-20) and displays how many students scored in each range.

Graphs and Charts: Graphs and charts display data visually, making it simpler to understand and analyze. For example, using the same test score data, we may generate a bar graph with the x-axis representing score ranges and the y-axis representing the number of students. Each bar on the graph represents a score range, and its height shows the number of students scoring within that range.

These approaches help us summarize and visualize data, making it easier to discover trends, patterns, and outliers, which is critical for making informed decisions and reaching meaningful conclusions in a variety of sectors.

Descriptive Statistics Applications

Descriptive statistics are used in a variety of sectors to summarize, organize, and display data in a meaningful and intelligible way.
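As one small illustration of such a summary, the frequency distribution table for a class of 20 students described earlier could be built along the following lines. The scores themselves are hypothetical values invented for this sketch, as are the function name `score_range` and the assumption that the test is marked out of 50.

```python
from collections import Counter

# Hypothetical test scores (out of 50) for a class of 20 students.
scores = [7, 12, 18, 23, 25, 27, 31, 33, 34, 36,
          38, 39, 41, 42, 44, 45, 47, 48, 49, 50]


def score_range(score):
    """Return the label of the range a score falls into: 0-10, 11-20, ..."""
    if score <= 10:
        return "0-10"
    lower = (score - 1) // 10 * 10 + 1
    return f"{lower}-{lower + 9}"


# Frequency = number of students whose score falls in each range.
frequency = Counter(score_range(s) for s in scores)

# Print the table; the row of '#' marks works as a rough text bar graph.
labels = ["0-10", "11-20", "21-30", "31-40", "41-50"]
for label in labels:
    count = frequency[label]
    print(f"{label:>6} | {count:2d} | {'#' * count}")
```

The same counts could be passed to a plotting library to draw the bar graph with score ranges on the x-axis and the number of students on the y-axis.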
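Difference Between Descriptive Statistics and Inferential Statistics

Descriptive statistics summarize the features of data that are already known, whether they come from a population or a sample. Inferential statistics go a step further and use sample data to draw conclusions about a population.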
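Descriptive Statistics Examples

Example 1: Calculate the Mean, Median and Mode for the following series: {4, 8, 9, 10, 6, 12, 14, 4, 5, 3, 4}

Solution:
Mean = Σx / n = 79 / 11 ≈ 7.18
Arranged in ascending order, the series is 3, 4, 4, 4, 5, 6, 8, 9, 10, 12, 14. Since n = 11 is odd, the median is the 6th observation, so Median = 6.
The value 4 occurs three times, more than any other value, so Mode = 4.

Example 2: Calculate the Range for the following series: {4, 8, 9, 10, 6, 12, 14, 4, 5, 3, 4}

Solution: The highest value is 14 and the lowest value is 3, so Range = 14 - 3 = 11.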
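The results of Examples 1 and 2 can be cross-checked with Python's standard `statistics` module. Collecting the values in a dictionary named `summary` is just one convenient layout chosen for this sketch.

```python
import statistics

series = [4, 8, 9, 10, 6, 12, 14, 4, 5, 3, 4]

# Collect the summary measures from Examples 1 and 2 in one place.
summary = {
    "mean": statistics.mean(series),
    "median": statistics.median(series),
    "mode": statistics.mode(series),
    "range": max(series) - min(series),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```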
Example 3: Calculate the standard deviation and variance of following data: {12, 24, 36, 48, 10, 18}

Solution: First we compute the standard deviation. For the standard deviation, calculate the mean, the deviations from the mean, and the squared deviations.
Mean = (12 + 24 + 36 + 48 + 10 + 18) / 6 = 148 / 6 ≈ 24.67
Sum of squared deviations ≈ 1093.33
Dividing the sum of squared deviations by n - 1 => 1093.33 / 5 ≈ 218.67
√(218.67) ≈ 14.79
So, the standard deviation is 14.79.

Now we calculate the variance. The variance is the square of the standard deviation, which is the value obtained before taking the square root:
s ≈ 14.79
s² ≈ 218.67
So, the variance is 218.67.

Practice Problems on Descriptive Statistics

P1) Determine the sample variance of the following series: {17, 21, 52, 28, 26, 23}
P2) Determine the mean and mode of the following series: {21, 14, 56, 41, 18, 15, 18, 21, 15, 18}
P3) Find the median of the following series: {7, 24, 12, 8, 6, 23, 11}
P4) Find the standard deviation and variance of the following series: {17, 28, 42, 48, 36, 42, 20}

FAQs on Descriptive Statistics

What is meant by descriptive statistics?
How is the mean computed in descriptive statistics?
What role do measures of variability play in descriptive statistics?
Can you explain the median in descriptive statistics?
How can frequency distribution measurements contribute to descriptive statistics?
How are inferential statistics distinguished from descriptive statistics?
Why are descriptive statistics necessary in data analysis?
What are the four types of descriptive statistics?
Which is an example of descriptive statistics?
Referred: https://www.geeksforgeeks.org