Sadiyya Holsey
Dalia Savy
Descriptive statistics involves the use of numerical data to measure and describe the characteristics of groups, and this includes measures of central tendency and variation. We'll be focusing on descriptive statistics in this study guide! Descriptive statistics does not involve making inferences about a population based on sample data.
Inferential statistics, on the other hand, involves using statistical methods to make inferences about a population based on sample data. It allows you to draw conclusions about a population based on the characteristics of a sample. Specifically, it provides a way to judge whether the conclusions drawn from the results of an experiment are valid🧪🔬.
Therefore, descriptive statistics describe the data, while inferential statistics tell us what the data means.
When you have a ton of data, how do you begin to go through it? Typically, a researcher constructs and interprets a graph of the data and uses descriptive statistics to summarize it. 📈
Measures of central tendency are statistical values that represent the center or typical value of a dataset. The three most commonly used measures of central tendency are the mean, median, and mode.
Consider the dataset 5, 10, 5, 7, 12, 15, 18. The easiest measure to spot is the mode: which value, if any, appears more often than the others? Here, 5 appears twice, so the mode of this dataset is 5.
Then, you may want to calculate the mean by adding all of the data values and dividing by the number of values. Since we have seven values, we divide by seven: (5 + 10 + 5 + 7 + 12 + 15 + 18)/7 = 72/7 ≈ 10.286
The median is the middle value of the dataset when the numbers are in order. Make sure you always put them in order! Here, the ordered values are 5, 5, 7, 10, 12, 15, 18, so the median is 10.
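The three calculations above can be checked with a short Python sketch. It uses the seven values from the worked example and Python's standard `statistics` module; the variable names are only illustrative.

```python
from statistics import mean, median, multimode

# The seven values from the worked example above.
scores = [5, 10, 5, 7, 12, 15, 18]

# multimode returns every value tied for most frequent, as a list.
mode_values = multimode(scores)

# Mean: sum of the values divided by how many values there are.
mean_value = mean(scores)

# Median: the middle value once the data is sorted (median sorts for us).
median_value = median(scores)

print("mode:", mode_values)
print("mean:", round(mean_value, 3))
print("median:", median_value)
```

Using `multimode` rather than `mode` makes it visible when a dataset has more than one mode, or when every value appears equally often.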
Measures of variation describe how spread out or dispersed the values in a dataset are. The most commonly used measure of variation is the standard deviation, which measures how much the values in a dataset typically deviate from the mean. In other words, it shows how far the values are spread below and above the mean. A dataset with a low standard deviation has values that are relatively close to the mean, while a dataset with a high standard deviation has values that are more spread out.
Another, less complex, measure of variation you should be familiar with for this course is the range of a dataset. Range is just the difference between the highest and lowest values in the dataset.
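As a rough illustration, both measures of variation can be computed for the same seven values used in the central tendency example. This is a sketch using Python's standard `statistics` module. The guide does not say whether the population or the sample formula is intended, so both are shown here as an assumption.

```python
from statistics import pstdev, stdev

# Same seven values as in the central tendency example.
scores = [5, 10, 5, 7, 12, 15, 18]

# Range: highest value minus lowest value.
data_range = max(scores) - min(scores)

# Population standard deviation: divides the squared deviations by n.
population_sd = pstdev(scores)

# Sample standard deviation: divides by n - 1 instead of n.
sample_sd = stdev(scores)

print("range:", data_range)
print("population SD:", round(population_sd, 2))
print("sample SD:", round(sample_sd, 2))
```

The sample version is always slightly larger than the population version because it divides by a smaller number.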
The correlation coefficient is a statistical measure that describes the strength and direction of the relationship between two variables. It can range from -1 to 1. A value of -1 indicates a perfect negative relationship, a value of 1 indicates a perfect positive relationship, and a value of 0 indicates no linear relationship.
You can simply think of it as a measure of how well two variables are correlated, and the closer it is to -1 or +1, the stronger the correlation.
Positive correlation shows that as one variable increases ⬆️, the other variable increases ⬆️. For example, a positively correlated group may show that as height increases, weight increases as well.
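To make the idea concrete, here is a minimal sketch that computes the Pearson correlation coefficient by hand. The function name `pearson_r` and the height and weight numbers are hypothetical, made up only to illustrate a positive correlation. They are not data from this guide.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together around their means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    if variation_x == 0 or variation_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return co_variation / sqrt(variation_x * variation_y)

# Hypothetical data: taller people in this made-up group tend to weigh more.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 74, 85]

r = pearson_r(heights_cm, weights_kg)
print("r =", round(r, 2))
```

A result close to +1 here reflects a strong positive correlation. Reversing the order of one list would give a value close to -1 instead.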
A frequency distribution is a breakdown of how the scores fall into different categories or ranges. There are several types of frequency distributions, including the normal distribution.
In a normal distribution, about 95% of the data falls within two standard deviations of the mean. For example, if the mean is 100 and the standard deviation is 15, two standard deviations equal 30, so about 95% of the data falls between 70 and 130, or ±30 points of 100.
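The arithmetic behind the 70 to 130 interval can be sketched as follows. This assumes a normal distribution with a mean of 100 and a standard deviation of 15 (the value implied by two standard deviations being equal to 30). `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

mean_score = 100   # center of the distribution
sd = 15            # assumed: two standard deviations equal 30

# Two standard deviations below and above the mean.
lower = mean_score - 2 * sd
upper = mean_score + 2 * sd

# Share of a normal distribution that lies between those two bounds.
dist = NormalDist(mu=mean_score, sigma=sd)
share_within_2_sd = dist.cdf(upper) - dist.cdf(lower)

print("interval:", lower, "to", upper)
print("share within 2 SD:", round(share_within_2_sd, 3))
```

The computed share is a little above 95%, which is why the rule is usually stated as "about 95%".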
Another term that you should be somewhat familiar with is statistical significance, which describes how likely it is that a result occurred by chance😲. If a result is statistically significant, it is unlikely to have occurred by chance alone, which suggests that some factor (such as the independent variable) influenced the data. If a result isn't statistically significant, chance cannot be ruled out as the explanation. To determine this, you would compare the mean of the control group and the mean of the experimental group.
The following question is taken from the College Board website (2017 AP Exam - Part B of #1).
A study was conducted to investigate the role of framing on concern for healthy eating🍏. Each participant (N = 100) was randomly assigned to one of the two conditions. In the first condition, the participants read an article indicating that obesity is a disease🦠. Participants in the second condition read an article indicating that obesity is the result of personal behaviors and decisions.
Participants were asked to indicate how important it would be for them to eat a healthy diet. Scores ranged from 1 (not very important) to 9 (very important). The results are presented in the table below.
Group | Mean Score - Concern for Healthy Eating | Standard Deviation |
Disease | 3.4 | 1.4 |
Behavior | 6.1 | 1.2 |
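One hedged way to see how the two group means in the table compare is to compute a t statistic from the summary values. This is only a sketch and is not part of the exam question. The table does not give group sizes, so the code assumes 50 participants per condition (N = 100 split evenly), and the helper name `welch_t` is hypothetical.

```python
from math import sqrt

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic for two independent groups, from summary statistics."""
    # Standard error of the difference between the two means.
    standard_error = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error

# Means and standard deviations come from the table above.
# ASSUMPTION: 50 participants per group; the source only states N = 100.
n_per_group = 50

mean_difference = 6.1 - 3.4
t_statistic = welch_t(6.1, 1.2, n_per_group, 3.4, 1.4, n_per_group)

print("difference in means:", round(mean_difference, 1))
print("t statistic:", round(t_statistic, 2))
```

A larger absolute t value means the difference between the means is large relative to the variation within the groups, which makes chance a less plausible explanation for the difference.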
© 2024 Fiveable Inc. All rights reserved.