Assessment in Learning 2

- 1. LESSON 8 : ANALYSIS, INTERPRETATION, AND USE OF TEST DATA Prepared by: Group 8
- 2. DESIRED SIGNIFICANT LEARNING OUTCOME In this lesson, you are expected to: Analyze, interpret, and use test data applying a.) measures of central tendency, b.) measures of variability, c.) measures of position, and d.) measures of covariability.
- 3. Significant Culminating Performance Task and Success Indicators At the end of the lesson, you are expected to analyze test scores using the measures of central tendency, variability, position, and covariability. Understanding these measures helps you see how to improve teaching and learning. Your success in this performance task will be determined by whether you have done the following.
  - Task: Calculate the mean, median, and mode of test scores from the given set of test scores. Success indicator: From your own actual set of test data, compute at least two measures of central tendency in two ways.
  - Task: Write a general statement of interpretation from the computed mean, median, and mode of test data, and of the relationship of the three measures to the skewness (shape) of the distribution of test scores. Success indicator: Give at least three interpretation statements of your own test work output aligned with the assessment objectives.
  - Task: Identify the level of measurement for a given variable. Success indicator: Explain in a statement or two the level of measurement that applies to the variable measured by your own test data.
  - Task: Explain the concept of variability in test results and calculate the different measures of variability. Success indicator: From your own test data sheet, compute the standard deviation and give a statement or two to describe the test results on the basis of such measures.
- 4. Significant Culminating Performance Task and Success Indicators
  - Task: Explain how measures of variability relate to the skewness and kurtosis of the frequency polygon of test data. Success indicator: Present empirical evidence of how measures of variation affect the shape of the frequency polygon of test scores.
  - Task: Determine the different measures of position. Success indicator: Apply the different measures of position in interpreting test scores in a distribution.
  - Task: Convert raw scores to standard scores. Success indicator: Use standard scores to compare two scores in a distribution in at least two practical and real teaching-learning situations.
  - Task: Determine the measures of covariability between two tests. Success indicator: From given test data, compute the covariability of two test scores.
- 5. Prerequisite of this Lesson The discussion in this lesson builds upon the concepts and examples presented in Lesson 7, which focused on the tabular and graphical presentation and interpretation of test results. This lesson discusses other ways of summarizing test data using descriptive statistics, which provide a more precise means of describing a set of scores. The word “measures” is commonly associated with numerical and quantitative data. Hence, the prerequisite to understanding the concepts in this lesson is basic knowledge of mathematics, e.g., summation of values, simple operations on integers, squaring, and finding square roots.
- 6. What are measures of central tendency? Central tendency means the central location or point of convergence of a set of values. Test scores have a tendency to converge at a central value. This value is the average of the set of scores. Three commonly used measures of central tendency, or measures of central location, are the MEAN, MEDIAN, and MODE.
- 7. What are measures of central tendency? Mean • Most preferred measure of central tendency for use with test scores. • Also referred to as the arithmetic mean. • The computation is very simple. That is, $\bar{X} = \frac{\sum X}{N}$, where $\bar{X}$ = the mean, $\sum X$ = the sum of all the scores, and $N$ = the number of scores in the set.
- 8. What are measures of central tendency? - Mean Consider again the test scores of students given in Table 8.1, which is the same set of test scores used in the previous lesson. The mean is the sum of all the scores, from 53 down to the last score, which is 35, divided by the number of scores. That is, $\bar{X} = \frac{\sum X}{N} = \frac{53 + 36 + 57 + \dots + 60 + 49 + 35}{N}$. Table 8.1 Scores of 100 College Students in a Final Examination: 53 30 21 42 33 41 42 45 32 36 51 42 49 64 46 57 35 45 57 38 49 54 61 36 53 48 52 41 58 42 43 49 51 42 50 62 33 43 37 57 35 33 50 42 62 75 66 78 52 58 45 53 40 60 46 45 79 33 46 43 47 37 33 37 36 36 46 41 43 42 47 56 50 53 49 39 52 52 50 37 53 34 43 43 57 48 43 42 42 65
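- 9. What are measures of central tendency? - Mean Table 8.2 Frequency Distribution of Grouped Test Scores

  | Class Interval | Midpoint ($X_i$) | $f$ | $X_i f$ | Cumulative Frequency (cf) | Cumulative Percentage |
  |---|---|---|---|---|---|
  | 75-80 | 77 | 3 | 231 | 100 | 100 |
  | 70-74 | 72 | 0 | 0 | 97 | 97 |
  | 65-69 | 67 | 2 | 134 | 97 | 97 |
  | 60-64 | 62 | 8 | 496 | 95 | 95 |
  | 55-59 | 57 | 8 | 456 | 87 | 87 |
  | 50-54 | 52 | 17 | 884 | 79 | 79 |
  | 45-49 | 47 | 18 | 846 | 62 | 62 |
  | 40-44 | 42 | 21 | 882 | 44 | 44 |
  | 35-39 | 37 | 13 | 481 | 23 | 23 |
  | 30-34 | 32 | 9 | 288 | 10 | 10 |
  | 25-29 | 27 | 0 | 0 | 1 | 1 |
  | 20-24 | 22 | 1 | 22 | 1 | 1 |
  | Total | | $N = 100$ | $\sum X_i f = 4720$ | | |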
- 10. What are measures of central tendency? - Mean In the traditional way, it cannot be denied that you can see at a glance how the scores are distributed across the range of values in a condensed manner. You can even estimate the average of the scores by looking at the frequency in each class interval. In the absence of a statistical program, the mean can be computed with the following formula: $\bar{X} = \frac{\sum X_i f}{N}$, where $X_i$ = midpoint of the class interval, $f$ = frequency of each class interval, and $N$ = total frequency. Thus, the mean of the test scores in Table 8.2 is calculated as follows: $\bar{X} = \frac{\sum X_i f}{N} = \frac{4720}{100} = 47.2$
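The grouped-mean formula can be checked with a short Python sketch. The midpoints and frequencies are copied from Table 8.2; the names `table_8_2` and `grouped_mean` are illustrative and not part of the lesson.

```python
# Rows of Table 8.2 as (class-interval midpoint, frequency) pairs.
table_8_2 = [
    (77, 3), (72, 0), (67, 2), (62, 8), (57, 8), (52, 17),
    (47, 18), (42, 21), (37, 13), (32, 9), (27, 0), (22, 1),
]


def grouped_mean(rows):
    """Return sum(X_i * f) / N for a list of (midpoint, frequency) rows."""
    total_frequency = sum(f for _, f in rows)
    weighted_sum = sum(midpoint * f for midpoint, f in rows)
    return weighted_sum / total_frequency


print(grouped_mean(table_8_2))  # 4720 / 100 = 47.2
```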
- 11. What are measures of central tendency? - Median Median • The value that divides the ranked scores into halves, or the middle value of the ranked scores. • If the number of scores is odd, then there is only one middle value, and it is the median. However, if the number of scores in the set is even, then there are two middle values, and the median is their average.
- 12. What are measures of central tendency? - Median Table 8.2 Frequency Distribution of Grouped Test Scores This formula will help you determine the median: $Mdn = LL + i\left[\frac{\frac{n}{2} - cf_b}{f_m}\right]$, where $LL$ = lower limit of the median class, $i$ = size of the class interval, $n$ = number of scores, $cf_b$ = cumulative frequency below the median class, and $f_m$ = frequency of the median class. In Table 8.2 the median class is 45-49 (midpoint 47, $f$ = 18, cf = 62), because it is the first interval whose cumulative frequency reaches $n/2 = 50$; the cumulative frequency below it is 44.
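As a rough illustration, the median formula can be applied to the median class 45-49 of Table 8.2. The slide only says "lower limit"; this sketch assumes the exact lower boundary 44.5 is meant, which is a common convention but is not stated in the lesson. The function and parameter names are hypothetical.

```python
def grouped_median(lower_limit, interval_size, n, cf_below, f_median):
    """Mdn = lower limit + interval size * ((n / 2 - cf below) / f of median class)."""
    return lower_limit + interval_size * ((n / 2 - cf_below) / f_median)


# Median class 45-49 from Table 8.2: f = 18, cumulative frequency below = 44.
# Assumption: the "lower limit" is the exact lower boundary 44.5.
median = grouped_median(lower_limit=44.5, interval_size=5, n=100,
                        cf_below=44, f_median=18)
print(round(median, 2))  # 46.17
```

If the stated limit 45 were used instead of 44.5, the result would shift up by half a point, so it is worth confirming which convention the course expects.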
- 13. What are measures of central tendency? - Mode Mode • The easiest measure of central tendency to obtain. • It is the score or value with the highest frequency in the set of scores. • If the scores are arranged in a frequency distribution, the mode is estimated as the midpoint of the class interval which has the highest frequency. • This class interval with the highest frequency is also called the modal class.
- 14. When are mean, median, and mode appropriately used? • To appreciate comparison of the three measures of central tendency, a brief background of level of measurement is important. • The level of measurement helps you decide how to interpret the data as measures of these attributes, and this serves as the guide in determining in part the kind of descriptive statistics to apply in analyzing the test data.
- 15. Scale of Measurement There are four levels of measurement that apply to the treatment of test data: nominal, ordinal, interval, and ratio. 1.) NOMINAL The number is used for labeling or identification purposes only. Example: a student’s identification number or section number. 2.) ORDINAL It is used when the values can be ranked in some order of characteristics. The numeric values indicate differences in the trait under consideration. Example: Academic awards are made on the basis of an order of performance: first honor, second honor, third honor, and so on.
- 16. Scale of Measurement 3.) INTERVAL It has the properties of both the nominal and ordinal scales. It is attained when the values can describe the magnitude of the differences between groups, or when the intervals between the numbers are equal. “Equal interval” means that the distance between the things represented by 3 and 4 is the same as the distance between those represented by 4 and 5. Example: the most common example of an interval scale is temperature readings (the difference between the temperatures 30° and 40° is the same as that between 90° and 100°).
- 17. Scale of Measurement 4.) RATIO It is the highest level of measurement. As such, it carries the properties of the nominal, ordinal, and interval scales. Its additional advantage is the presence of a true zero point, where zero indicates the total absence of the trait being measured.
- 18. How do measures of central tendency determine skewness? SYMMETRICAL DISTRIBUTION - The mean, median, and mode have the same value, so the median coincides with the mean and the mode. Figure 8.1. Mean, Median and Mode in a Symmetrical Distribution
- 19. How do measures of central tendency determine skewness? POSITIVELY-SKEWED DISTRIBUTION - The mode stays at the peak of the curve and its value will be the smallest. - The mean will be pulled away from the peak of the distribution toward the few high scores. Thus, the mean gets the largest value. The median is between the mode and the mean. Figure 8.2. Mean, Median, and Mode in a Positively-Skewed Distribution
- 20. How do measures of central tendency determine skewness? NEGATIVELY-SKEWED DISTRIBUTION - The mode remains at the peak of the curve, but it will have the largest value. - The mean will have the smallest value because it is influenced by the extremely low scores, and the median still lies between the mode and the mean. Figure 8.3. Mean, Median, and Mode in a Negatively-Skewed Distribution
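The ordering rules in the three slides above can be sketched as a small Python function. The function name and the sample values are hypothetical, and a real data set may not fit any of the three patterns cleanly, which the last branch reports.

```python
def describe_skew(mean, median, mode):
    """Classify a distribution from the ordering of its mean, median, and mode."""
    if mean == median == mode:
        return "symmetrical"
    if mode <= median <= mean:
        return "positively skewed"   # mean pulled toward a few high scores
    if mean <= median <= mode:
        return "negatively skewed"   # mean pulled toward a few low scores
    return "unclear from these three measures"


# Illustrative values only, not taken from the lesson's data.
print(describe_skew(50, 50, 50))
print(describe_skew(60, 55, 50))
print(describe_skew(40, 45, 50))
```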
- 21. What are measures of dispersion? - Measures of dispersion indicate “variability”, “spread”, or “scatter”. - Measures of variability give us an estimate of how compressed the scores are, which contributes to the “flatness” or “peakedness” of the distribution. - There are several indices of variability; the most commonly used in the area of assessment are the RANGE, VARIANCE, and STANDARD DEVIATION. Figure 8.4. Measures of Variability of Sets of Test Scores
- 22. Indices of Variability RANGE - It is the difference between the highest score and the lowest score in a distribution. It is the simplest measure of variability but is also considered the least accurate measure of dispersion. Determine the range for the following scores: 9, 9, 9, 12, 12, 13, 15, 15, 17, 17, 18, 18, 20, 20, 20. Range = Highest Score (HS) - Lowest Score (LS) = 20 - 9 = 11. Now replace one of the scores, say the last score, with a much higher score of 50. The range becomes: Range = HS - LS = 50 - 9 = 41. A single extreme score is enough to change the range drastically.
- 23. Indices of Variability VARIANCE AND STANDARD DEVIATION - The most widely used measures of variability, considered the most accurate in representing the deviations of individual scores from the mean of the distribution. Examine the following test score distributions:

  | | Class A | Class B | Class C |
  |---|---|---|---|
  | Scores | 22, 18, 16, 14, 12, 11, 9, 7, 6, 5 | 16, 15, 15, 14, 12, 11, 11, 9, 9, 8 | 12, 12, 12, 12, 12, 12, 12, 12, 12, 12 |
  | $\sum X$ | 120 | 120 | 120 |
  | $\bar{x}$ | $\frac{120}{10} = 12$ | $\frac{120}{10} = 12$ | $\frac{120}{10} = 12$ |
- 24. VARIANCE AND STANDARD DEVIATION You will note that while the distributions contain different scores, they have the same mean. If we ask how well each mean represents the scores in its respective distribution, there is no doubt about the mean of distribution C, because each score in that distribution is 12. How about distributions A and B? For these two distributions, the mean of 12 is a better estimate of the scores in distribution B than in distribution A. We can see that no score in B is more than 4 points away from the mean of 12. However, in distribution A, six of the ten scores are 4 points or more away from the mean. We can also say that there is less variability of scores in B than in A. However, we cannot determine which distribution is more dispersed merely by looking at the numbers, especially when there are many scores. We need a reliable index of variability, such as the variance or standard deviation, that takes all the scores into consideration. Recall that $\sum (X - \bar{x})$, the sum of the deviation scores from the mean, is always equal to zero. As such, we square each deviation score, sum all the squared deviation scores, and divide the sum by the number of cases. This yields the variance. Its square root is the standard deviation.
- 25. VARIANCE AND STANDARD DEVIATION The measure is generally defined by the formula: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$, where $\sigma^2$ = population variance, $\mu$ = population mean, $X$ = score in the distribution, and $N$ = number of scores. Taking the square root gives us the formula for the standard deviation. That is, $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$, where $\sigma$ = population standard deviation, $\mu$ = population mean, and $X$ = score in the distribution.
- 26. VARIANCE AND STANDARD DEVIATION If we are dealing with sample data and wish to calculate an estimate of $\sigma$, the following formula is used for the statistic: $s = \sqrt{\frac{\sum (X - \bar{x})^2}{N - 1}}$, where $s$ = standard deviation, $X$ = raw score, $\bar{x}$ = mean score, and $N$ = number of scores in the distribution. The values 276 and 74 are the sums of the squared deviations of the scores in Class A and Class B, respectively. Dividing each by $N - 1$, one less than the number of scores in the class, gives the variance ($S^2$): $S_A^2 = \frac{276}{10 - 1} = 30.67$ and $S_B^2 = \frac{74}{10 - 1} = 8.22$
- 27. VARIANCE AND STANDARD DEVIATION The values above are both in squared units, while our original scores are not. When we find their square roots, we obtain values that are on the same scale of units as the original set of scores. These give the respective standard deviation ($S$) of each class, computed as follows: $S_A = \sqrt{S_A^2} = \sqrt{30.67} = 5.538$ and $S_B = \sqrt{S_B^2} = \sqrt{8.22} = 2.867$. Using the scores in Class A and Class B in the above dataset, we can apply the formula:

  Class A

  | $X$ | $(X - \bar{x})$ | $(X - \bar{x})^2$ |
  |---|---|---|
  | 22 | 22 - 12 | 100 |
  | 18 | 18 - 12 | 36 |
  | 16 | 16 - 12 | 16 |
  | 14 | 14 - 12 | 4 |
  | 12 | 12 - 12 | 0 |
  | 11 | 11 - 12 | 1 |
  | 9 | 9 - 12 | 9 |
  | 7 | 7 - 12 | 25 |
  | 6 | 6 - 12 | 36 |
  | 5 | 5 - 12 | 49 |
  | $\bar{x} = 12$ | | $\sum (X - \bar{x})^2 = 276$ |

  Class B

  | $X$ | $(X - \bar{x})$ | $(X - \bar{x})^2$ |
  |---|---|---|
  | 16 | 16 - 12 | 16 |
  | 15 | 15 - 12 | 9 |
  | 15 | 15 - 12 | 9 |
  | 14 | 14 - 12 | 4 |
  | 12 | 12 - 12 | 0 |
  | 11 | 11 - 12 | 1 |
  | 11 | 11 - 12 | 1 |
  | 9 | 9 - 12 | 9 |
  | 9 | 9 - 12 | 9 |
  | 8 | 8 - 12 | 16 |
  | $\bar{x} = 12$ | | $\sum (X - \bar{x})^2 = 74$ |
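The Class A and Class B computations can be reproduced with Python's standard `statistics` module, whose `variance` and `stdev` functions use the sample ($N - 1$) denominator, matching the slides. The variable names are illustrative.

```python
import statistics

class_a = [22, 18, 16, 14, 12, 11, 9, 7, 6, 5]
class_b = [16, 15, 15, 14, 12, 11, 11, 9, 9, 8]

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    mean = statistics.mean(scores)
    # Sum of squared deviations from the mean (276 for A, 74 for B in the slides).
    sum_squared_dev = sum((x - mean) ** 2 for x in scores)
    variance = statistics.variance(scores)  # divides by N - 1
    std_dev = statistics.stdev(scores)      # square root of the sample variance
    print(name, sum_squared_dev, round(variance, 2), round(std_dev, 3))
```

For the population formulas with $N$ in the denominator, `statistics.pvariance` and `statistics.pstdev` are the corresponding calls.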
- 28. VARIANCE AND STANDARD DEVIATION In addition, since the standard deviation is a measure of dispersion, a large standard deviation indicates greater score variability than a small one. If the standard deviation is small, the scores are closely clustered around the mean, and the graph of the distribution is compressed whether it is symmetrical or skewed. Figure 8.5. Homogeneous Test Score Distributions in Different Skewness
- 29. What are the measures of position? While measures of central tendency and measures of dispersion are used often in assessment, there are other methods of describing data distributions, such as measures of position or location. What are these measures? QUARTILE, DECILE, and PERCENTILE. QUARTILE The quartiles are the three values that divide a set of scores into four equal parts, with one-fourth of the data values in each part. Quartiles are also used as a measure of the spread of data through the interquartile range (IQR), which is simply the difference between the third and first quartiles (Q3 - Q1). Half of this gives the semi-interquartile range or quartile deviation (Q). The following example illustrates these measures.
- 30. QUARTILE Example: Given the following scores, find the 1st quartile, 3rd quartile, and quartile deviation: 90, 85, 85, 86, 100, 105, 109, 110, 88, 105, 100, 112. Steps: 1. Arrange the scores in decreasing order. 2. From the bottom, find the points below which 25% of the score values and 75% of the score values fall. 3. Find the average of the two scores at each of these points to determine Q1 and Q3, respectively. 4. Find Q using the formula: $Q = \frac{Q_3 - Q_1}{2}$
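- 31. QUARTILE Applying these steps to the above example, the ordered scores are 112, 110, 109, 105, 105, 100, 100, 90, 88, 86, 85, 85. The lower half (85, 85, 86, 88, 90, 100) has the center values 86 and 88, so $Q_1 = 87$; the upper half (100, 105, 105, 109, 110, 112) has the center values 105 and 109, so $Q_3 = 107$. Note that the upper and lower 50% each contain an even number of values, so the median of each half is the average of its two center values. Consequently, applying the formula gives the quartile deviation. That is, $Q = \frac{Q_3 - Q_1}{2} = \frac{107 - 87}{2} = 10$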
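The quartile steps can be sketched in Python using the median-of-halves approach described above. The score list assumes the garbled sixth value in the example is 105; the function name is hypothetical. Other quartile conventions exist (for example, interpolation-based methods) and can give slightly different values.

```python
import statistics

# Scores from the example; the sixth value is assumed to be 105.
scores = [90, 85, 85, 86, 100, 105, 109, 110, 88, 105, 100, 112]


def quartile_deviation(values):
    """Return (Q1, Q3, Q) where Q1 and Q3 are the medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]    # lowest 50% of the scores
    upper_half = ordered[-half:]   # highest 50% of the scores
    q1 = statistics.median(lower_half)
    q3 = statistics.median(upper_half)
    return q1, q3, (q3 - q1) / 2


q1, q3, q = quartile_deviation(scores)
print(q1, q3, q)  # Q1 = 87, Q3 = 107, Q = 10
```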
- 32. DECILE It divides the distribution into 10 equal parts. There are 9 deciles such that 10% of the scores are equal to or less than decile 1 (D1), 20% of the scores are equal to or less than decile 2 (D2), and so on. A student whose mark is between the first and second deciles is in decile 2, and one whose mark is above the ninth decile belongs to decile 10. If there are only a small number of data values, the decile is not appropriate to use.
- 33. PERCENTILE It divides the distribution into one hundred equal parts. In the same manner, for percentiles, there are 99 percentiles such that 1% of the scores are less than the first percentile, 2% of the scores are less than the second percentile, and so on. For example, if you scored 95 in a 100-item test, and your percentile rank is 99th, then this means that 99% of those who took the test performed lower than you. This also means that you belong to the top 1% of those who took the test. In many cases, percentiles are wrongly interpreted as percentage score.
- 34. PERCENTILE For example, 75% as a percentage score means you got 75 items correct out of a hundred items, which is a mark or grade reflecting performance level. But a percentile is a measure of position, such that the 75th percentile as your mark means that 75% of the students who took the test got a lower score than you, or that your score is located in the upper 25% of the class who took the same test. For very large data sets, the percentile is appropriate to use for accuracy. This is one reason why percentiles are commonly used in national assessments or university entrance examinations, where the datasets contain thousands of scores.
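The difference between a percentage score and a percentile rank can be made concrete with a small Python sketch. The class scores below are invented for illustration, and the percentile rank is taken here as the percentage of scores strictly below the given score, which is one of several conventions in use.

```python
def percentage_score(items_correct, total_items):
    """Share of items answered correctly, as a percentage."""
    return 100 * items_correct / total_items


def percentile_rank(score, all_scores):
    """Percentage of scores in the group that fall below the given score."""
    below = sum(1 for s in all_scores if s < score)
    return 100 * below / len(all_scores)


# Hypothetical class results on a 100-item test.
class_scores = [60, 65, 70, 75, 80, 85, 90, 95]

print(percentage_score(75, 100))          # 75.0: three quarters of the items correct
print(percentile_rank(75, class_scores))  # 37.5: only 3 of 8 classmates scored lower
```

The same raw score of 75 is a 75% percentage score but, in this made-up class, sits well below the 75th percentile.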
- 35. What is coefficient of variation as a measure of relative dispersion? We need a measure of relative dispersion which is dimensionless or “unit free”. This measure of relative dispersion is known as the coefficient of variation. It is simply the ratio of the standard deviation of a distribution to the mean of the distribution: $cv = \frac{\sigma}{\mu}(100)$ • The above formula indicates that the coefficient of variation is a percentage value.
- 36. What is coefficient of variation as a measure of relative dispersion? • Thus, if the mean score of the students in mathematics is 40 with a standard deviation of 10, then the coefficient of variation is computed as: $cv = \frac{10}{40}(100) = 0.25(100) = 25\%$ • Suppose the mean score of students in science is 18 with a standard deviation of 5. The coefficient of variation is: $cv = \frac{5}{18}(100) = 0.278(100) \approx 28\%$ • From the computed coefficients of variation as measures of relative dispersion, we can clearly see that the scores in mathematics are more homogeneous than the scores in science.
- 37. How to Find a Coefficient of Variation? • Example question: Two versions of a test are given to students. One test has pre-set answers and a second test has randomized answers. Find the coefficient of variation of each.

  | | Regular Test | Randomized Answers |
  |---|---|---|
  | Mean | 50.1 | 45.8 |
  | SD | 11.2 | 12.9 |

  Step 1: Divide the standard deviation by the mean for the first sample: 11.2 / 50.1 = 0.22355 Step 2: Multiply Step 1 by 100: 0.22355 * 100 = 22.355% Step 3: Divide the standard deviation by the mean for the second sample: 12.9 / 45.8 = 0.28166 Step 4: Multiply Step 3 by 100: 0.28166 * 100 = 28.166%
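The coefficient-of-variation computations in the last three slides follow one pattern, sketched here in Python. The function name is illustrative; the inputs are the means and standard deviations quoted in the slides.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return std_dev / mean * 100


print(round(coefficient_of_variation(10, 40), 1))      # mathematics: 25.0
print(round(coefficient_of_variation(5, 18), 1))       # science: 27.8
print(round(coefficient_of_variation(11.2, 50.1), 3))  # regular test: 22.355
print(round(coefficient_of_variation(12.9, 45.8), 3))  # randomized answers: 28.166
```

A smaller coefficient means the scores are more homogeneous relative to their mean, so the regular test shows less relative spread than the randomized version.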
- 38. How is standard deviation applied in a normal distribution? • The standard deviation is viewed as the most useful measure of variability because in many distributions of scores, not only in assessment but in research as well, it approximates the percentage of scores that lie within one, two, or three standard deviations from the mean.
- 39. The Normal Distribution A special kind of symmetrical distribution that is most frequently used to compare scores. When distributions are drawn as smooth curves, one curve stands out: the bell-shaped curve. It is also called the Gaussian distribution, named after Carl Friedrich Gauss. The standard deviation is used to determine the percentage of scores that fall within a certain number of standard deviations from the mean.
- 40. The Normal Distribution Figure 8.6. The Normal Curve In assessment, the area under the curve refers to the share of scores that fall within a specific number of standard deviations from the mean score. In other words, each portion under the curve contains a fixed percentage of cases as follows: - 68% of the scores fall within one standard deviation below and above the mean - 95% of the scores fall within two standard deviations below and above the mean - 99.7% of the scores fall within three standard deviations below and above the mean.
- 41. The Normal Distribution • The following figure further illustrates the theoretical model: Figure 8.7. The Areas under the Normal Curve From the above figure, we can state the properties of the normal distribution: 1. The mean, median, and mode are all equal. 2. The curve is symmetrical. As such, the value in a specific area on the left is equal to the value of its corresponding area on the right. 3. The curve changes from concave to convex and approaches the X-axis, but the tails do not touch the horizontal axis. 4. The total area under the curve is equal to 1.
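The percentages under the normal curve can be verified with `statistics.NormalDist` from the Python standard library (available from Python 3.8). The helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 2))
```

The rounded results, about 68%, 95%, and 99.7%, are the figures usually quoted for one, two, and three standard deviations.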
- 42. WHAT ARE STANDARD SCORES? According to study.com, standard scores are a set of scores that have the same mean and standard deviation so they can be compared. The most useful is the z-score, which is often used to express a raw score in relation to the mean and standard deviation. This relationship is expressed in the following formula: $z = \frac{X - \bar{X}}{S}$ Recall that $X - \bar{X}$ is a deviation score. With this difference, we are able to know whether your test score, say, $X$, is above or below the average score. The standard deviation helps you locate the relative position of the score in a distribution. A z-score is called a standard score simply because it is a deviation score expressed in standard deviation units.
- 43. WHAT ARE STANDARD SCORES? If raw scores are expressed as z-scores, we can see their relative position in their respective distributions. Moreover, once the raw scores are converted into standard scores, we can compare two scores even when they come from different distributions or measure two different things. Figure 8.8. A Comparison of Score Distributions with Different Means and Standard Deviation
- 44. WHAT ARE STANDARD SCORES? In the figure, a score of 86 in English indicates better performance than a score of 90 in Physics. Let us suppose that the standard deviations in English and Physics are 3 and 2, respectively. $z_E = \frac{86 - 80}{3} = 2$ and $z_P = \frac{90 - 95}{2} = -2.5$ From the above, if 86 and 90 are your scores in the two subjects, you can say that, compared with the rest of your class, you performed better in English than in Physics. That is because in English, your performance is 2 standard deviations above the mean. While 90 is numerically higher, this score is two and a half standard deviations below the average performance of the class where you belong, while 86 is 2 standard deviations above its class mean.
- 45. WHAT ARE STANDARD SCORES? Figure 8.9. Different Raw Scores in one Z-score Distribution Note that in Figure 8.8, the shaded areas in the two graphs indicate the proportion of scores below yours. This is the same as saying that this proportion is the share of students in your class who scored lower than you. Examining the area under the normal curve, we can say that about 98% of your class scored below you in English, while only about 0.6% scored below you in Physics. We assume here that the scores are normally distributed.
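The English and Physics comparison can be worked through in Python. The means (80 and 95) and standard deviations (3 and 2) are those used in the slides; the function name is illustrative, and the proportions below each score assume normally distributed scores, as the lesson does.

```python
from statistics import NormalDist


def z_score(raw_score, mean, std_dev):
    """Express a raw score as a number of standard deviations from the mean."""
    return (raw_score - mean) / std_dev


z_english = z_score(86, mean=80, std_dev=3)   # 2.0
z_physics = z_score(90, mean=95, std_dev=2)   # -2.5

# Proportion of a normal distribution falling below each z-score.
below_english = NormalDist().cdf(z_english)
below_physics = NormalDist().cdf(z_physics)

print(z_english, round(100 * below_english, 1))
print(z_physics, round(100 * below_physics, 1))
```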
- 46. T-Score As you can see in the computation of the z-score, it can give you a negative number, which simply means the score is below the mean. However, a negative z-score may not be readily understood by others as a score below the mean. One option is to convert a z-score into a T-score, which is a transformed standard score. In this scaling, the mean of 0 in the z-score distribution is transformed into a mean of 50, and the standard deviation of 1 is multiplied by 10. The corresponding equation is: T-score = 50 + 10z
- 47. T-Score For example, a z-score of -2 is equivalent to a T-score of 30. That is: T-score = 50 + 10(-2) = 50 - 20 = 30 Looking back at the English score of 86, which resulted in a z-score of 2 as shown in Figure 8.8, the T-score equivalent is: T-score = 50 + 10(2) = 50 + 20 = 70
- 48. STANINE SCORES Another standard score is a stanine, shortened from standard nine. With nine in its name, the scores are on a nine-point scale. In a z-score distribution, the mean is 0, and the standard deviation is 1. In this scale, the mean is 5 and the standard deviation is 2. Each stanine is one-half standard deviation-wide. Like the T-Score, stanine score can be calculated from the z-score by multiplying the z-score by 2 and adding 5. Stanine=2z + 5
- 49. STANINE SCORES Going back to our example of a score of 86 in English, which is equivalent to a z-score of 2, its stanine equivalent is Stanine = 2(2) + 5 = 9 Scores on the stanine scale have some limitations. Since they are on a 9-point scale and expressed as whole numbers, they are not precise. Example:

  | Z-Score | T-Score | Stanine |
  |---|---|---|
  | 2.1 | 71 | 9 |
  | 2.0 | 70 | 9 |
  | 1.9 | 69 | 9 |
- 50. STANINE SCORES On the assumption that stanine scores are normally distributed, the percentage of cases in each band or range of scores in the scale is as follows:

  | Stanine Score | Band | Percentage of Scores |
  |---|---|---|
  | 1 | Lowest | 4% |
  | 2 | Next low | 7% |
  | 3 | Next low | 12% |
  | 4 | Next low | 17% |
  | 5 | Middle | 20% |
  | 6 | Next high | 17% |
  | 7 | Next high | 12% |
  | 8 | Next high | 7% |
  | 9 | Highest | 4% |
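The two conversions, T-score = 50 + 10z and Stanine = 2z + 5, can be sketched in Python. Rounding the stanine to a whole number and limiting it to the range 1 to 9 are assumptions made here to match the nine-point scale; the slides give only the linear formula.

```python
def t_score(z):
    """Transform a z-score to a scale with mean 50 and standard deviation 10."""
    return 50 + 10 * z


def stanine(z):
    """Transform a z-score to the nine-point stanine scale.

    Assumption: the result is rounded to a whole number and clamped to 1..9.
    """
    return max(1, min(9, round(2 * z + 5)))


for z in (2.1, 2.0, 1.9, -2.0):
    print(z, round(t_score(z)), stanine(z))
```

The first three z-scores map to different T-scores but to the same stanine of 9, which illustrates the loss of precision on the nine-point scale.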
- 51. STANINE SCORES With the above percentage distribution of scores in each stanine, you can directly convert a set of raw scores into stanine scores. Simply arrange the raw scores from the highest to lowest, and with the percentage of scores in each stanine, you can directly assign the appropriate stanine score in each raw score. Figure 8.10. The Normal Distribution and the Standard Scores
- 52. What are Measures of Covariability? • Measures of covariability tell us, to a certain extent, the relationship between two tests or two factors. A score one gets may be due not to a single factor but to several factors, directly or indirectly observable, which are also related to one another. This section is limited to introducing two scores that are hypothesized to be related to one another. • Correlation between two variables is finding the degree of relationship between two scores. • The statistical measure is the correlation coefficient, an index that ranges from -1.0 to +1.0. The value -1.0 indicates a perfect negative correlation, 0.00 no correlation at all, and 1.00 a perfect positive correlation. • In practice, correlation coefficients rarely come out as exactly 0.00, 1.0, or -1.0; instead, computed values fall between these limits, closer to 1.0, to -1.0, or to 0.
- 53. What are Measures of Covariability? • The various types of relationships are illustrated in the following scatter-plot diagrams: Figure 8.11. Various Types of Relationships
- 54. What are Measures of Covariability? • Some examples of interpreting correlation coefficients are as follows:

  | Coefficient | Direction | Strength |
  |---|---|---|
  | r = -0.90 | negative | strong |
  | r = 0.85 | positive | strong |
  | r = 0.68 | positive | moderate |
  | r = -0.30 | negative | weak |
  | r = 0.05 | positive | very weak/negligible |

  • To further analyze the relationship between two scores, a statistical test to determine the correlation coefficient of two variables is needed. In actual practice, we deal with test scores, so the dataset is at the interval or ratio level of measurement. Hence, the commonly used statistical analysis is the Pearson Product-Moment Correlation.
- 55. • Different versions of Pearson Product-Moment Correlation 1. Deviation-Score Formula $r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$ • where $r$ = correlation coefficient, $X$ = raw score in the first variable, $\bar{X}$ = mean score in the first variable, $Y$ = raw score in the second variable, $\bar{Y}$ = mean score in the second variable • Note that in $\sum xy$, $x$ and $y$ are deviation scores from $\bar{X}$ and $\bar{Y}$, respectively.
- 56. • Different versions of Pearson Product-Moment Correlation 2. Raw Score Formula $r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}$ • The raw score formula is very easy to use because a simple scientific calculator can provide the values for $\sum X$, $\sum X^2$, $\sum Y$, and $\sum Y^2$. This formula was introduced earlier in Lesson 6, when you were taught how to compute the reliability coefficient of scores. This is the same Pearson r, but this time it is used to establish the relationship between two sets of data.
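- 57. • Computation for Correlation for Raw Score Data Table 8.3. Computation for Correlation for Raw Score Data

  | X (Reading) | Y (Problem Solving) | $X^2$ | $Y^2$ | $XY$ |
  |---|---|---|---|---|
  | 4 | 6 | 16 | 36 | 24 |
  | 9 | 11 | 81 | 121 | 99 |
  | 4 | 5 | 16 | 25 | 20 |
  | 11 | 10 | 121 | 100 | 110 |
  | 12 | 8 | 144 | 64 | 96 |
  | 5 | 5 | 25 | 25 | 25 |
  | 7 | 10 | 49 | 100 | 70 |
  | 5 | 8 | 25 | 64 | 40 |
  | 6 | 9 | 36 | 81 | 54 |
  | 2 | 5 | 4 | 25 | 10 |
  | $\sum X = 65$ | $\sum Y = 77$ | $\sum X^2 = 517$ | $\sum Y^2 = 641$ | $\sum XY = 548$ |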
- 58. • Computation for Correlation for Raw Score Data So that the correlation coefficient is: $r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}} = \frac{10(548) - (65)(77)}{\sqrt{[10(517) - (65)^2][10(641) - (77)^2]}} = \frac{5480 - 5005}{\sqrt{(5170 - 4225)(6410 - 5929)}} = \frac{475}{\sqrt{(945)(481)}} = \frac{475}{\sqrt{454545}} = \frac{475}{674.20} = 0.7045$ The above mathematical processes give a correlation coefficient of 0.705 between performance scores in Reading and Problem Solving. This coefficient indicates a moderate correlation between performance in reading and problem solving.
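The raw score formula can be checked against the Table 8.3 data with a short Python sketch. The function and variable names are illustrative.

```python
import math

reading = [4, 9, 4, 11, 12, 5, 7, 5, 6, 2]
problem_solving = [6, 11, 5, 10, 8, 5, 10, 8, 9, 5]


def pearson_r(xs, ys):
    """Pearson product-moment correlation using the raw score formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(reading, problem_solving)
print(round(r, 4))  # 0.7045
```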
- 59. • Precautions to be observed with regard to the computed r as correlation coefficient: - It should not be interpreted as a percentage. o Thus, 0.705 should not be interpreted as 70%. If we want to extend the meaning of 0.705, we compute $r^2$, which is the coefficient of determination. This coefficient gives the percentage of the variance in one variable that is associated with the other variable. With reference to the two variables in Table 8.3 and the computed r of 0.70 (rounded off to the nearest hundredth), the result is an $r^2$ of 0.49, which can be taken as 49%. This can be interpreted as: 49% of the variance in Problem-Solving test scores is associated with the Reading scores. Thus, $r^2$ helps explain the variance observed in the problem-solving scores. The total variance is equal to 1, or 100% in percent. If 49% of the variance observed in problem-solving scores is associated with reading scores, then the other 51% of the variance in problem-solving test scores is due to other factors. This concept is illustrated in Figure 8.12.
- 60. • Precautions to be observed with regard to the computed r as correlation coefficient: - While the correlation coefficient shows the relationship between two variables, it should not be interpreted as causation. o Considering our example, we cannot say that the scores in the reading test cause 49% of the variance in the problem-solving test scores. Relationship is different from causation. Figure 8.12. Covariation of Performance in Reading and Problem-Solving
- 61. APPLY: 1. Refer to this figure below as the frequency polygons representing entrance test scores of three groups of students in different field of specialization.
- 62. APPLY: a. What is the mean score of Education students? b. What is the mean score of engineering students? c. What is the mean score of Business students? d. Which group of students had the most dispersed scores in the test? Why do you say so? e. What distribution is skewed? Why do you say so?
- 63. APPLY: 2. Determine the type of distribution depicted by the following measures: a. 𝑥 = 80.45 𝑀𝑑𝑛 = 80.78 𝑀𝑜𝑑𝑒 = 80.25 b. 𝑥 = 120 𝑀𝑑𝑛 = 130 𝑀𝑜𝑑𝑒 = 150 c. 𝑥 = 89.78 𝑀𝑑𝑛 = 82.16 𝑀𝑜𝑑𝑒 = 82.10 Note: The higher the mode from mean – negative The higher the mean from mode – positive
- 64. APPLY: 3. The following is a frequency distribution of scores of 10 persons. a.What is the mean of distribution? b.What is the median? c. What is the mode? d.What is the range? e.What is the standard deviation? X f 30 1 28 2 20 3 17 0 15 1 13 2 10 1
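- 65. APPLY: 4. The following is the frequency distribution of year-end examination marks in a certain secondary school.

  | Class Interval | f | Midpoint (m) | f × m | cf |
  |---|---|---|---|---|
  | 60-65 | 2 | 63 | 126 | 100 |
  | 55-59 | 5 | 57 | 285 | 98 |
  | 50-54 | 6 | 52 | 312 | 93 |
  | 45-49 | 8 | 47 | 376 | 87 |
  | 40-44 | 11 | 42 | 462 | 79 |
  | 35-39 | 10 | 37 | 370 | 68 |
  | 30-34 | 11 | 32 | 352 | 58 |
  | 25-29 | 20 | 27 | 540 | 47 |
  | 20-24 | 17 | 22 | 374 | 27 |
  | 15-19 | 6 | 17 | 102 | 10 |
  | 10-14 | 4 | 12 | 48 | 4 |
  | Total | 100 | | 3347 | |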
- 66. APPLY: a. Compute the mean, median, and mode of the frequency distribution. b. Find: 1. The third quartile, or 75th percentile (P75) 2. The first quartile, or 25th percentile (P25) 3. The semi-interquartile range
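The grouped-data mean and the percentile-based measures for item 4 can be sketched in Python as follows. The sketch rests on assumptions that the slides do not state: a class size of 5, exact lower boundaries taken as the lower limit minus 0.5, and the usual interpolation formula P_k = L + ((kN/100 − cf below) / f) × i. The top class is printed as 60-65 with midpoint 63, and the printed values are used as given. The names `classes`, `WIDTH`, and `grouped_percentile` are illustrative.

```python
# (lower limit, frequency, midpoint) for each class, lowest class first
classes = [
    (10, 4, 12), (15, 6, 17), (20, 17, 22), (25, 20, 27), (30, 11, 32),
    (35, 10, 37), (40, 11, 42), (45, 8, 47), (50, 6, 52), (55, 5, 57),
    (60, 2, 63),
]
WIDTH = 5  # assumed class size (i)

n = sum(f for _, f, _ in classes)
grouped_mean = sum(f * m for _, f, m in classes) / n  # sum of f*m over N


def grouped_percentile(k: float) -> float:
    """Interpolate the k-th percentile inside the class that contains it."""
    target = k * n / 100
    cf_below = 0  # cumulative frequency below the current class
    for lower, f, _ in classes:
        if f > 0 and cf_below + f >= target:
            boundary = lower - 0.5  # assumed exact lower boundary (L)
            return boundary + (target - cf_below) / f * WIDTH
        cf_below += f
    raise ValueError("k must be between 0 and 100")


median = grouped_percentile(50)
q1 = grouped_percentile(25)
q3 = grouped_percentile(75)
semi_interquartile_range = (q3 - q1) / 2

print(f"N = {n}, mean = {grouped_mean:.2f}, median = {median:.2f}")
print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}, SIQR = {semi_interquartile_range:.2f}")
```

If the lesson defines class boundaries or class size differently, the interpolated values will shift accordingly.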
- 67. APPLY: 5. A common exit examination is given to 400 students in a university. The scores are normally distributed, with a mean of 78 and a standard deviation of 6. Daniel had a score of 72 and Jane scored 84. What are the corresponding z-scores of Daniel and Jane? How many students would be expected to score between the scores of Daniel and Jane? Explain your answer. 6. James obtained a score of 40 in his Mathematics test and 34 in his Reading test. The class mean score in Mathematics is 45 with a standard deviation of 4, while in Reading the mean score is 50 with a standard deviation of 7. On which test did James do better compared with the rest of the class? Explain your work.
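Items 5 and 6 both rely on the standard score z = (X − mean) / SD. The sketch below shows the computation in Python, using `statistics.NormalDist` from the standard library for the normal-curve area. The helper name `z_score` and the variable names are illustrative, not from the lesson.

```python
from statistics import NormalDist


def z_score(raw: float, mean: float, sd: float) -> float:
    """Convert a raw score to a standard (z) score."""
    return (raw - mean) / sd


# Item 5: exit examination, mean 78, SD 6, 400 examinees
z_daniel = z_score(72, 78, 6)
z_jane = z_score(84, 78, 6)

standard_normal = NormalDist()  # mean 0, SD 1
share_between = standard_normal.cdf(z_jane) - standard_normal.cdf(z_daniel)
expected_students = share_between * 400

# Item 6: compare James's standing on two tests with different scales
z_math = z_score(40, 45, 4)
z_reading = z_score(34, 50, 7)
better_test = "Mathematics" if z_math > z_reading else "Reading"

print(f"Daniel z = {z_daniel}, Jane z = {z_jane}")
print(f"Share between them = {share_between:.4f}, about {expected_students:.0f} students")
print(f"James: Mathematics z = {z_math:.2f}, Reading z = {z_reading:.2f}")
print(f"Relative to the class, James did better in {better_test}")
```

Both of James's z-scores are negative, so "better" here means closer to the class mean, not above it.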
- 68. APPLY: Following are sets of scores on two variables, X for reading comprehension and Y for reasoning skills, administered to a sample of students. X: 11 9 15 7 5 9 8 4 8 11 Y: 13 8 14 9 8 7 7 5 10 12

  | X | Y | X² | Y² | XY |
  |---|---|---|---|---|
  | 11 | 13 | 121 | 169 | 143 |
  | 9 | 8 | 81 | 64 | 72 |
  | 15 | 14 | 225 | 196 | 210 |
  | 7 | 9 | 49 | 81 | 63 |
  | 5 | 8 | 25 | 64 | 40 |
  | 9 | 7 | 81 | 49 | 63 |
  | 8 | 7 | 64 | 49 | 56 |
  | 4 | 5 | 16 | 25 | 20 |
  | 8 | 10 | 64 | 100 | 80 |
  | 11 | 12 | 121 | 144 | 132 |
  | ΣX = 87 | ΣY = 93 | ΣX² = 847 | ΣY² = 941 | ΣXY = 879 |
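- 69. APPLY: a. Compute the Pearson product-moment correlation (Pearson r) for the above data: $r = \dfrac{N\sum XY - \sum X \sum Y}{\sqrt{\left[N\sum X^2 - \left(\sum X\right)^2\right]\left[N\sum Y^2 - \left(\sum Y\right)^2\right]}}$ b. Describe the direction and strength of the relationship between reading comprehension and reasoning skills. c. Compute the coefficient of determination and interpret the result.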
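The raw-score formula for Pearson r can be sketched in Python using the X and Y scores from the preceding slide. This is one straightforward way to carry out the computation, and the variable names are illustrative.

```python
from math import sqrt

# Scores from the preceding slide
x = [11, 9, 15, 7, 5, 9, 8, 4, 8, 11]   # reading comprehension
y = [13, 8, 14, 9, 8, 7, 7, 5, 10, 12]  # reasoning skills

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(value ** 2 for value in x)
sum_y2 = sum(value ** 2 for value in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# r = (N*SumXY - SumX*SumY) / sqrt([N*SumX2 - (SumX)^2][N*SumY2 - (SumY)^2])
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator
r_squared = r ** 2  # coefficient of determination

print(f"Sums: X = {sum_x}, Y = {sum_y}, X2 = {sum_x2}, Y2 = {sum_y2}, XY = {sum_xy}")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

The printed sums should match the totals in the table, which is a quick check before interpreting r.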
- 70. EVALUATION: In each item, choose the letter which you think can best represent the answer to the given problem situation. Give a statement to justify your choice. 1. Which distribution is negatively skewed in the figure shown? a. Distribution X b. Distribution Y c. Distribution Z d. Distribution W
- 71. EVALUATION: 2. What is the preferred measure of central tendency in a test distribution where there is a small number of scores that are extremely high or low? a. Median b. Mean c. Variance d. Mode 3. The following scores were obtained from a short spelling test: 10, 8, 11, 13, 12, 13, 8, 16, 11, 8, 7, 9. What is the modal score? a. 8 b. 10 c. 11 d. 13
- 72. EVALUATION: 4. What does it mean when a student got a score at the 70th percentile on a test? a. The performance of the student is above average. b. The student answered 70 percent of the items correctly. c. The student got at least 70 percent of the answers correct. d. The student's score is equal to or above the scores of 70 percent of the other students in the class. 5. Which best describes a normal distribution? a. Positively skewed b. Negatively skewed c. Symmetric d. Bimodal
- 73. EVALUATION: 6. What does a large standard deviation indicate? a. Scores are not normally distributed. b. Scores are not widely spread, and the median is an unreliable measure of central tendency. c. Scores are widely distributed, and the mean may not be a reliable measure of central tendency. d. Scores are not widely distributed, and the mean is recommended as a more reliable measure of central tendency. 7. In a normal distribution, approximately what percentage of scores is expected to fall within three standard deviations of the mean? a. 34% b. 68% c. 95% d. 99%
- 74. EVALUATION: 8. Which of the following is interpreted as the percentage of scores in a reference group that fall below a particular raw score? a. Standard scores b. Percentile rank c. Reference group d. T-score 9. For the data illustrated in the scatter plot below, which is a reasonable product-moment correlation coefficient? a. 1.0 b. -1.0 c. 0.90 d. -0.85
- 75. EVALUATION: 10. A Pearson test statistic yields a correlation coefficient (r) of 0.90. If X represents scores on vocabulary and Y the reading comprehension test scores, which of the following best explains r = 0.90? a. The degree of association between X and Y is 81%. b. The strength of the relationship between vocabulary and reading comprehension is 90%. c. There is an almost perfect positive relationship between vocabulary test scores and reading comprehension. d. 81% of the variance observed in Y can be attributed to the variance observed in X.