This study explores the practice of data science by those who practice it. We surveyed over 600 data professionals to understand their data skills, team makeup and more.
Factor analysis is a data reduction technique. It is used when your data set has a large number of variables and you would like to reduce them to a manageable number. In general, factor analysis examines the statistical relationships (e.g., correlations) among a large set of variables and tries to explain these correlations using a smaller number of underlying variables (factors).
The results of the factor analysis are presented in a table called the factor pattern matrix. The factor pattern matrix is an N x M table (N = number of original variables and M = number of underlying factors). The elements of the factor pattern matrix are regression coefficients (similar to correlation coefficients) between each of the variables and each of the factors. These elements (or factor loadings) represent the strength of the relationship between the variables and each of the underlying factors. The results of the factor analysis tell us two things:
the number of underlying factors that describe the initial set of variables
which variables are best represented by each factor
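To make the factor pattern matrix concrete, here is a minimal Python sketch using NumPy. It is an illustration only, not the procedure used in this study: it assumes principal-component extraction from the correlation matrix, picks the number of factors with the common "eigenvalue greater than 1" rule, and applies no rotation. The data are synthetic, and the names `factor_loadings`, `make_indicator`, `f1`, and `f2` are hypothetical.

```python
import numpy as np


def factor_loadings(data, n_factors=None):
    """Return an N x M loading matrix (N variables, M factors).

    Assumption: principal-component extraction on the correlation
    matrix, with no rotation. Real analyses often use other
    extraction and rotation methods.
    """
    corr = np.corrcoef(data, rowvar=False)       # N x N correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if n_factors is None:
        # Assumed rule of thumb: keep factors with eigenvalue > 1.
        n_factors = int(np.sum(eigvals > 1.0))
    # Scale each eigenvector by the square root of its eigenvalue so the
    # entries behave like variable-factor correlations.
    return eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])


# Synthetic example: six observed variables driven by two hidden factors.
rng = np.random.default_rng(0)
n_rows = 1000
f1 = rng.normal(size=n_rows)
f2 = rng.normal(size=n_rows)


def make_indicator(factor):
    """Observed variable = hidden factor plus independent noise."""
    return factor + 0.5 * rng.normal(size=n_rows)


data = np.column_stack(
    [make_indicator(f1) for _ in range(4)]       # variables 0-3 track f1
    + [make_indicator(f2) for _ in range(2)]     # variables 4-5 track f2
)

loadings = factor_loadings(data)                 # the factor pattern matrix
# For each variable, the factor with the largest absolute loading.
best_factor = np.argmax(np.abs(loadings), axis=1)

print(loadings.shape)
print(np.round(loadings, 2))
print(best_factor)
```

The shape of `loadings` answers the first question (how many factors), and `best_factor` answers the second (which factor best represents each variable). Loading signs are arbitrary in this approach, which is why the comparison uses absolute values.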