Chapter 6 Pre-test Questionnaire: Factor analysis
In this stage, the main idea is to analyze the presence of latent variables, that is, how the questions group together in clusters. The previously defined concepts are useful here: with factor analysis, the latent variables represented by the factors are expected to identify these concepts, so that questions linked to the same concept are grouped together. One of the main objectives of the factor analysis is to associate each item with only one factor. In addition, each factor, which aggregates several items, is expected to have a relevant meaning that encompasses the significance of the items involved; this meaning is, in the end, the latent variable.
After analyzing both the pre-genomic-test and post-genomic-test questionnaires and identifying potential items to be excluded, the factor analysis is performed to finally determine which items to keep and which to exclude. First, the expectations items are analyzed in both settings, pre- and post-genomic testing. Then the attitudes items are evaluated, again in both the pre- and post-genomic testing instances.
6.1 Expectations and concerns domain
6.1.1 Including all the items.
The first analysis is done for the expectations items plus the concerns items in order to explore the relationship between all of them. Then, the analysis is repeated excluding different candidate items to find the best combination for the whole expectations and concerns domain. In this first analysis, item 4 is considered inverted.

Figure 6.1: The items are: 1. I have enough knowledge of the benefits and risks to make an informed decision; 2. I have received enough information to understand the benefits and risks of the genomic analysis; 3. I am interested in learning more; 4. I need a formal visit with a genetic counseling specialist before the test; 5. The result will help to control my cancer; 6. The result will help to increase my life expectancy; 7i. My doctor will explain the results and their implications for my health; 7ii. I will receive a written report with the result; 7iii. My doctor will change my treatment according to the results; 7iv. I will have additional treatment options; 7v. I will be able to receive experimental treatments; 8. I am concerned that the results may not guide my treatment; 9. I am concerned that the results may be difficult to understand; 10. I am concerned that the results may give information about disease risk that I would prefer not to know; 11. The results may worry me or cause anxiety.
The correlations are mostly consistent with the expected groupings: for instance, between Q1 and Q2; between Q5 and Q6; between Q6 and both Q7.iii and Q7.iv; and among Q8 to Q11. In addition, the inverted Q4 has a moderate correlation with Q1.
As a first approach, Bartlett's sphericity test is performed.
Bartlett's sphericity test provides information about whether the correlations in the data are strong enough to use a dimension-reduction technique such as principal components or common factor analysis. The test asks whether the correlation matrix is the identity matrix, that is, a matrix with 1s on the diagonal and zeros elsewhere. Formally, it tests whether the data are a random sample from a multivariate normal population MVN(μ, Σ) in which the covariance matrix Σ is diagonal (zeros everywhere except on its diagonal). Equivalently, the variables in the population are MVN and uncorrelated. The null hypothesis H0 is that the covariance matrix Σ is diagonal.
Here the p-value is reported as 0, so H0 is rejected, which supports applying factor analysis to this dataset.
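To make the test statistic concrete, the following is a minimal sketch in Python using only the standard library. It is not the R code that produced the results in this chapter; the function names `determinant`, `chi2_sf`, and `bartlett_sphericity` and the 2x2 example matrix are assumptions made for illustration. It uses the usual form of the statistic, -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, where R is the correlation matrix of p items and n is the number of observations.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term = math.exp(-half)
        total = term
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    total = math.erfc(math.sqrt(half))
    term = math.exp(-half) * math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)
    return total


def bartlett_sphericity(corr, n_obs):
    """Return (statistic, df, p_value) for a correlation matrix."""
    p = len(corr)
    det = determinant(corr)
    if det <= 0:
        raise ValueError("correlation matrix is singular or not positive definite")
    statistic = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    df = p * (p - 1) // 2
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical example: two items correlated at 0.5, 24 respondents.
stat, df, p_value = bartlett_sphericity([[1.0, 0.5], [0.5, 1.0]], n_obs=24)
print(stat, df, p_value)
```

With many strongly correlated items the statistic becomes very large, and the tail probability falls below what floating-point arithmetic or the printed precision can represent, which is why software may display the p-value as 0.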
Because Bartlett's test usually rejects H0 (the scenario under the null hypothesis is too extreme), another measure can be studied to determine how well the data suit factor analysis and how useful each item is. This measure is provided by the KMO analysis.
Kaiser (1970) introduced a Measure of Sampling Adequacy (MSA), later modified by Kaiser and Rice (1974). The Kaiser-Meyer-Olkin (KMO) statistic, which ranges from 0 to 1, indicates the degree to which each variable in a set is predicted without error by the other variables. Kaiser (1974) suggested that KMO values above .9 are marvelous; in the .80s, meritorious; in the .70s, middling; in the .60s, mediocre; in the .50s, miserable; and below .5, unacceptable. Hair et al. (2006) suggest accepting values > 0.5; values between 0.5 and 0.7 are mediocre, and values between 0.7 and 0.8 are good.
Variables with individual KMO values below 0.5 could be considered for exclusion from the analysis (note that the KMO indices would then need to be re-computed, as they depend on the whole dataset).
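The KMO index can be sketched as follows. This is a minimal, standard-library Python illustration, not the R routine used for the output in this chapter; the names `invert`, `kmo`, and `kaiser_label` and the example matrix are hypothetical. It assumes the common definition in which squared correlations are compared with squared partial (anti-image) correlations obtained from the inverse of the correlation matrix.

```python
import math


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo(corr):
    """Return (overall MSA, list of per-item MSA values)."""
    p = len(corr)
    inv = invert(corr)
    # Partial correlation of items i and j given all the other items.
    partial = [[-inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                for j in range(p)] for i in range(p)]
    r2 = [sum(corr[i][j] ** 2 for j in range(p) if j != i) for i in range(p)]
    q2 = [sum(partial[i][j] ** 2 for j in range(p) if j != i) for i in range(p)]
    per_item = [r2[i] / (r2[i] + q2[i]) for i in range(p)]
    overall = sum(r2) / (sum(r2) + sum(q2))
    return overall, per_item


def kaiser_label(value):
    """Verbal label following the Kaiser (1974) ranges quoted above."""
    if value >= 0.9:
        return "marvelous"
    if value >= 0.8:
        return "meritorious"
    if value >= 0.7:
        return "middling"
    if value >= 0.6:
        return "mediocre"
    if value >= 0.5:
        return "miserable"
    return "unacceptable"


# Hypothetical example: three items, all pairwise correlations equal to 0.5.
example = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
overall, per_item = kmo(example)
print(overall, per_item, kaiser_label(overall))
```

Because each per-item value depends on the inverse of the full correlation matrix, removing one item changes the values of all the others, which is why the indices must be re-computed after any exclusion.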
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_ExpConcern_facAn_corr)
## Overall MSA = 0.57
## MSA for each item =
## q1 q2 q3 q5 q6 q7_i q7_ii q7_iii q7_iv q7_v q8
## 0.43 0.62 0.44 0.61 0.63 0.46 0.54 0.62 0.67 0.58 0.58
## q9 q10 q11 q4_Inv
## 0.61 0.51 0.73 0.47
According to these results, Q1, Q3, Q7_i, and Q4_Inv are under the threshold of 0.5; in addition, four items are below 0.6: Q7_ii, Q7_v, Q8, and Q10.
Despite having four items below 0.5, the analysis is continued. Moreover, some of these items were already found to be conflicting or less relevant in the previous steps.
Different approaches are available to explore the number of factors to retain. First, PCA can be applied and the variability explained by each component evaluated. The PCA results are shown in the table below:
Statistic | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 | PC13 | PC14 | PC15 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Standard deviation | 2.2396 | 1.8402 | 1.2981 | 1.1391 | 0.9796 | 0.8921 | 0.7367 | 0.6430 | 0.5615 | 0.4988 | 0.3443 | 0.2865 | 0.2763 | 0.1809 | 0.1717 |
Proportion of Variance | 0.3344 | 0.2258 | 0.1123 | 0.0865 | 0.0640 | 0.0531 | 0.0362 | 0.0276 | 0.0210 | 0.0166 | 0.0079 | 0.0055 | 0.0051 | 0.0022 | 0.0020 |
Cumulative Proportion | 0.3344 | 0.5601 | 0.6725 | 0.7590 | 0.8230 | 0.8760 | 0.9122 | 0.9398 | 0.9608 | 0.9774 | 0.9853 | 0.9908 | 0.9958 | 0.9980 | 1.0000 |
The scree plot for this PCA is displayed next.
According to these results, 3 factors could be the best number. The cumulative proportion shows that 3 components explain 67% of the variance, and an additional component adds only about 9%. In addition, the scree plot exhibits an elbow at the third component.
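The proportions in the table follow directly from the standard deviations of the components: each variance is the squared standard deviation, and each proportion is that variance divided by the total. A minimal Python sketch of this arithmetic, using the standard deviations reported in the table above (the function name `variance_proportions` is hypothetical):

```python
def variance_proportions(std_devs):
    """Return (proportions, cumulative proportions) of explained variance."""
    variances = [s ** 2 for s in std_devs]
    total = sum(variances)
    proportions = [v / total for v in variances]
    cumulative = []
    running = 0.0
    for prop in proportions:
        running += prop
        cumulative.append(running)
    return proportions, cumulative


# Standard deviations of PC1..PC15 as reported in the PCA table.
std_devs = [2.2396, 1.8402, 1.2981, 1.1391, 0.9796, 0.8921, 0.7367, 0.6430,
            0.5615, 0.4988, 0.3443, 0.2865, 0.2763, 0.1809, 0.1717]

proportions, cumulative = variance_proportions(std_devs)
for index in range(4):
    print(f"PC{index + 1}: proportion={proportions[index]:.4f}, "
          f"cumulative={cumulative[index]:.4f}")
```

The gain from a fourth component is the fourth proportion, which is the "about 9%" increase mentioned in the text.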
Then, another approach is implemented, in which several analyses are combined and depicted in the same figure. The tools used are: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
The acceleration factor retains the factors before the elbow, which was found at the third component. The Kaiser criterion is defined by the absolute value of the eigenvalues and retains those above a threshold (1, or here the mean eigenvalue). With this criterion, 4 factors are suggested as the best combination. However, the other two criteria point to 3 as the best number, similar to what was identified in the prior analysis.
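The Kaiser criterion alone can be reproduced from the PCA table, since the eigenvalues of the correlation matrix are the squared standard deviations of the components. The sketch below is a Python illustration under that assumption (the name `kaiser_rule` is hypothetical); it does not cover parallel analysis or the acceleration factor.

```python
def kaiser_rule(eigenvalues, threshold=1.0):
    """Count the components whose eigenvalue exceeds the threshold."""
    return sum(1 for value in eigenvalues if value > threshold)


# Standard deviations of PC1..PC15 as reported in the PCA table.
std_devs = [2.2396, 1.8402, 1.2981, 1.1391, 0.9796, 0.8921, 0.7367, 0.6430,
            0.5615, 0.4988, 0.3443, 0.2865, 0.2763, 0.1809, 0.1717]
eigenvalues = [s ** 2 for s in std_devs]

# Fixed threshold of 1 and the mean eigenvalue as an alternative threshold.
mean_eigenvalue = sum(eigenvalues) / len(eigenvalues)
print(kaiser_rule(eigenvalues))
print(kaiser_rule(eigenvalues, threshold=mean_eigenvalue))
```

For a correlation matrix the mean eigenvalue is 1 up to rounding, so both thresholds give the same count here.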
Initially, three factors will be considered.
Running the factor analysis with all the items (n=15), first the communalities are explored:
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In smc, smcs < 0 were set to .0
## In smc, smcs < 0 were set to .0
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## pre_exp_preoc_q1 pre_exp_preoc_q2 pre_exp_preoc_q3
## 0.3397 0.6007 0.3557
## pre_exp_preoc_q5 pre_exp_preoc_q6 pre_exp_preoc_q7_i
## 0.8704 0.9569 0.9950
## pre_exp_preoc_q7_ii pre_exp_preoc_q7_iii pre_exp_preoc_q7_iv
## 0.5992 0.7423 0.9619
## pre_exp_preoc_q7_v pre_exp_preoc_q8 pre_exp_preoc_q9
## 0.6863 0.7485 0.6234
## pre_exp_preoc_q10 pre_exp_preoc_q11 pre_exp_preoc_q4_Inverted
## 0.8103 0.7248 0.3524
The communality of an item is the proportion of its variance that is shared with the other items and can be explained by the factors. Thus, it is desirable for the factors to cover a considerable proportion of it. Note that there is always a share of variance (the unique variance) that is not explained by the factors; this is a difference from classical Principal Component Analysis, in which all the variance is treated as common variance and can be explained by the principal components.
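As a rough illustration of how a communality relates to the loadings, the sketch below computes it for an oblique solution as the corresponding diagonal element of Λ Φ Λᵀ, where Λ holds the pattern loadings and Φ the factor correlations; the uniqueness is then one minus the communality. This is Python written for illustration only: the function name `communalities` and all numeric values are hypothetical and are not taken from the analysis in this chapter.

```python
def communalities(loadings, factor_corr):
    """Communality of each item: sum over j, k of L[i][j] * Phi[j][k] * L[i][k]."""
    result = []
    for row in loadings:
        h2 = 0.0
        for j, loading_j in enumerate(row):
            for k, loading_k in enumerate(row):
                h2 += loading_j * factor_corr[j][k] * loading_k
        result.append(h2)
    return result


# Hypothetical two-factor pattern matrix (rows = items) and factor correlations.
pattern = [[0.8, 0.0],
           [0.6, 0.3]]
phi = [[1.0, 0.2],
       [0.2, 1.0]]

h2 = communalities(pattern, phi)
u2 = [1.0 - value for value in h2]  # unique variance of each item
print(h2, u2)
```

When the factors are uncorrelated (Φ is the identity), the formula reduces to the sum of the squared loadings of the item.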
Therefore, considering the values of the communalities, Q1, Q3, and Q4 inv are again the items with the lowest values (< 0.4).
Then, the whole output is displayed.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_ExpConcern_facAn, nfactors = 3, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML3 ML2 ML1 h2 u2 com
## pre_exp_preoc_q1 0.58 0.34 0.659 1.2
## pre_exp_preoc_q2 0.56 0.60 0.399 2.2
## pre_exp_preoc_q3 0.59 0.36 0.644 1.1
## pre_exp_preoc_q5 0.84 0.87 0.129 1.3
## pre_exp_preoc_q6 1.01 0.96 0.043 1.0
## pre_exp_preoc_q7_i 1.01 1.00 0.005 1.0
## pre_exp_preoc_q7_ii 0.75 0.60 0.401 1.0
## pre_exp_preoc_q7_iii 0.48 0.52 0.74 0.258 2.3
## pre_exp_preoc_q7_iv 0.82 0.96 0.038 1.4
## pre_exp_preoc_q7_v 0.64 0.69 0.314 1.6
## pre_exp_preoc_q8 0.85 0.75 0.251 1.2
## pre_exp_preoc_q9 0.74 0.62 0.376 1.1
## pre_exp_preoc_q10 0.89 0.81 0.190 1.1
## pre_exp_preoc_q11 0.83 0.73 0.275 1.0
## pre_exp_preoc_q4_Inverted -0.45 0.35 0.652 2.3
##
## ML3 ML2 ML1
## SS loadings 3.66 3.59 3.12
## Proportion Var 0.24 0.24 0.21
## Cumulative Var 0.24 0.48 0.69
## Proportion Explained 0.35 0.35 0.30
## Cumulative Proportion 0.35 0.70 1.00
##
## With factor correlations of
## ML3 ML2 ML1
## ML3 1.00 0.20 0.01
## ML2 0.20 1.00 0.42
## ML1 0.01 0.42 1.00
##
## Mean item complexity = 1.4
## Test of the hypothesis that 3 factors are sufficient.
##
## df null model = 105 with the objective function = 118.5 with Chi Square = 2034
## df of the model are 63 and the objective function was 105.7
##
## The root mean square of the residuals (RMSR) is 0.1
## The df corrected root mean square of the residuals is 0.13
##
## The harmonic n.obs is 24 with the empirical chi square 49.27 with prob < 0.9
## The total n.obs was 24 with Likelihood Chi Square = 1603 with prob < 2.7e-293
##
## Tucker Lewis Index of factoring reliability = -0.517
## RMSEA index = 1.008 and the 90 % confidence intervals are 0.987 NA
## BIC = 1402
## Fit based upon off diagonal values = 0.95
## Measures of factor score adequacy
## ML3 ML2 ML1
## Correlation of (regression) scores with factors 0.97 0.99 1.00
## Multiple R square of scores with factors 0.94 0.98 1.00
## Minimum correlation of possible factor scores 0.88 0.96 0.99
Exploring these results, there are three factors composed of:
* Q1, Q2, Q7.i, Q7.ii, and Q7.v.
* Q3, Q5, Q6, and Q7.iv.
* Q8, Q9, Q10, Q11, and Q4 inv.
Q7.iii is linked with two factors. In addition, the association between Q4 inv and ML3 is just above the threshold (and its communality is low as well). It is important to highlight that Q7.iii ("My doctor will change my treatment immediately according to the results.") was mentioned before as a complex item, since its meaning can differ between patients. For some it will be true, while for others already receiving a treatment the genomic testing only provides additional information for the future. Accordingly, this item may relate differently to the others depending on the subgroup of patients.
On the other hand, Q3 could be associated with Q5 and Q6, perhaps because the patients who have expectations about treatment options are the same ones who want to learn more.
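The idea of an item being linked with one or with two factors can be sketched as follows. This Python example is illustrative only: the loadings, the item and factor names, and the cutoff of 0.4 are hypothetical assumptions, not values from the analysis. It also computes Hofmann's complexity index, (Σ a²)² / Σ a⁴, which is the quantity reported in the `com` column of the output above.

```python
def salient_factors(loadings, cutoff=0.4):
    """Map each item to the factors on which its absolute loading reaches the cutoff."""
    return {item: [factor for factor, value in row.items() if abs(value) >= cutoff]
            for item, row in loadings.items()}


def hofmann_complexity(values):
    """Hofmann's index: 1 for a pure single-factor item, 2 for two equal loadings."""
    sum_squares = sum(v ** 2 for v in values)
    sum_fourths = sum(v ** 4 for v in values)
    return sum_squares ** 2 / sum_fourths


# Hypothetical loadings for three items on three factors.
example_loadings = {
    "item_a": {"F1": 0.85, "F2": 0.05, "F3": 0.02},
    "item_b": {"F1": 0.48, "F2": 0.52, "F3": 0.10},
    "item_c": {"F1": 0.10, "F2": 0.20, "F3": 0.30},
}

for item, factors in salient_factors(example_loadings).items():
    complexity = hofmann_complexity(list(example_loadings[item].values()))
    status = "cross-loading" if len(factors) > 1 else "ok" if factors else "no factor"
    print(item, factors, round(complexity, 2), status)
```

An item such as the hypothetical `item_b`, with two salient loadings of similar size, has a complexity close to 2 and is a candidate for review, which mirrors the situation described for Q7.iii.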
Finally, plots show the relationship between the items and the factors.
As a first approach, the factor analysis is done with all the variables in order to obtain an exploratory view of the relationship between all the variables and the three factors. In this setting, all the variables are associated with only one factor except item 4-inverted.
These three factors are principally associated with the spheres and concepts previously described:
- Expectations of genomic results on treatment impact: items 5, 6, 7.iii, 7.iv, and 7.v.
- Expectations of results communications and the information provided: items 1, 2, 3, 4, 7.i, and 7.ii.
- Concerns: items 8, 9, 10, and 11.
Hence, the first factor is linked with the information concept; nevertheless, it does not include items 3 and 4, and it includes item 7.v from the treatment sphere. Q7.v asks about experimental therapies, and perhaps the patients focused on experimental options or clinical trials share a certain information pattern. The second factor is related to treatment impact; however, it includes item 3. Finally, the last factor includes all the concerns items plus Q4 inv.
Now, the same analysis is performed excluding the items previously identified as candidates to be left out.
- From the descriptive and reliability analysis.
Potential items to be excluded according to the two expectations spheres:
- Expectations of genomic results on treatment impact: 7.iii, followed by 7.iv and 7.v. These last two items have adequate values, but they are the least critical ones in this sphere. Item 7.iv has a lower discrimination score than 7.v.
- Expectations of results communications and the information provided: 1 (if item 2 is selected), and 4 due to its conflicting results throughout the analysis. Items 3 and 7.ii showed conflicting results too. However, item 3 could be relevant if an interventional approach is considered.
With regard to the concerns domain, there is no clear difference between the items. At least item 11 seems to be relevant. The second item to be chosen can be identified once the rest of the analysis is completed.
- From the current factor analysis.
Considering the KMO, Q1, Q3, Q7_i, and Q4_Inv are under the threshold of 0.5.
Moreover, Q1, Q3, and Q4 inv have lower communalities. In addition, Q7.iii was found to be related to two factors.
- Considering all together.
Therefore, as a first approach, Q1 and Q4 are the items identified in both the first analysis and the factor analysis. Other items, such as Q3 and Q7.iii, could then be evaluated.
6.1.2 Excluding item 1 and 4.
Now, a new analysis is done considering all except Q1 and Q4.

Figure 6.2: The items are: 2. I have received enough information to understand the benefits and risks of the genomic analysis; 3. I am interested in learning more; 5. The result will help to control my cancer; 6. The result will help to increase my life expectancy; 7i. My doctor will explain the results and their implications for my health; 7ii. I will receive a written report with the result; 7iii. My doctor will change my treatment according to the results; 7iv. I will have additional treatment options; 7v. I will be able to receive experimental treatments; 8. I am concerned that the results may not guide my treatment; 9. I am concerned that the results may be difficult to understand; 10. I am concerned that the results may give information about disease risk that I would prefer not to know; 11. The results may worry me or cause anxiety.
Bartlett's sphericity test is performed.
The p-value is reported as 0, so H0 is rejected, which supports applying factor analysis to this dataset.
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_ExpConcern_facAn_2_corr)
## Overall MSA = 0.59
## MSA for each item =
## q2 q3 q5 q6 q7_i q7_ii q7_iii q7_iv q7_v q8 q9
## 0.67 0.42 0.71 0.64 0.44 0.56 0.61 0.72 0.54 0.58 0.54
## q10 q11
## 0.48 0.70
The same approach is followed to explore the number of factors. The PCA results are shown in the table below:
Statistic | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 | PC13 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Standard deviation | 2.1985 | 1.7278 | 1.2335 | 0.9751 | 0.7983 | 0.7524 | 0.7071 | 0.6274 | 0.4759 | 0.4660 | 0.2850 | 0.2244 | 0.1919 |
Proportion of Variance | 0.3718 | 0.2296 | 0.1170 | 0.0731 | 0.0490 | 0.0436 | 0.0385 | 0.0303 | 0.0174 | 0.0167 | 0.0063 | 0.0039 | 0.0028 |
Cumulative Proportion | 0.3718 | 0.6014 | 0.7185 | 0.7916 | 0.8406 | 0.8842 | 0.9226 | 0.9529 | 0.9703 | 0.9870 | 0.9933 | 0.9972 | 1.0000 |
The scree plot for this PCA is displayed next.
According to these results, 3 factors could be the best number. The cumulative proportion shows that 3 components explain 72% of the variance, and an additional component adds only about 7%. The scree plot exhibits two steps, at the second and third components, followed by an elbow between the third and fourth components.
Then, another approach is implemented, in which several analyses are combined and depicted in the same figure. The tools used are: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
With this approach, two factors are suggested as the best solution. Thus, both options (two and three factors) will be studied.
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## pre_exp_preoc_q2 pre_exp_preoc_q3 pre_exp_preoc_q5
## 0.6099 0.3795 0.8780
## pre_exp_preoc_q6 pre_exp_preoc_q7_i pre_exp_preoc_q7_ii
## 0.9640 0.9950 0.6127
## pre_exp_preoc_q7_iii pre_exp_preoc_q7_iv pre_exp_preoc_q7_v
## 0.7048 0.9950 0.6709
## pre_exp_preoc_q8 pre_exp_preoc_q9 pre_exp_preoc_q10
## 0.8428 0.6074 0.7961
## pre_exp_preoc_q11
## 0.7393
While most of the items are well explained by the factors, Q3 still has a low communality, in line with previous findings.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_ExpConcern_facAn_2, nfactors = 3, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML3 ML2 h2 u2 com
## pre_exp_preoc_q2 0.57 0.61 0.3912 2.2
## pre_exp_preoc_q3 0.61 0.37 0.6255 1.0
## pre_exp_preoc_q5 0.83 0.88 0.1224 1.4
## pre_exp_preoc_q6 1.01 0.96 0.0360 1.0
## pre_exp_preoc_q7_i 1.02 1.00 0.0050 1.0
## pre_exp_preoc_q7_ii 0.76 0.61 0.3868 1.0
## pre_exp_preoc_q7_iii 0.53 0.43 0.70 0.2951 2.3
## pre_exp_preoc_q7_iv 0.85 1.00 0.0049 1.4
## pre_exp_preoc_q7_v 0.64 0.67 0.3285 1.5
## pre_exp_preoc_q8 0.89 0.84 0.1569 1.2
## pre_exp_preoc_q9 0.72 0.61 0.3931 1.1
## pre_exp_preoc_q10 0.87 0.80 0.2042 1.1
## pre_exp_preoc_q11 0.83 0.74 0.2613 1.1
##
## ML1 ML3 ML2
## SS loadings 3.58 3.38 2.83
## Proportion Var 0.28 0.26 0.22
## Cumulative Var 0.28 0.54 0.75
## Proportion Explained 0.37 0.34 0.29
## Cumulative Proportion 0.37 0.71 1.00
##
## With factor correlations of
## ML1 ML3 ML2
## ML1 1.00 0.18 0.43
## ML3 0.18 1.00 0.04
## ML2 0.43 0.04 1.00
##
## Mean item complexity = 1.3
## Test of the hypothesis that 3 factors are sufficient.
##
## df null model = 78 with the objective function = 76.25 with Chi Square = 1360
## df of the model are 42 and the objective function was 63.28
##
## The root mean square of the residuals (RMSR) is 0.07
## The df corrected root mean square of the residuals is 0.1
##
## The harmonic n.obs is 24 with the empirical chi square 19.38 with prob < 1
## The total n.obs was 24 with Likelihood Chi Square = 1002 with prob < 1.2e-182
##
## Tucker Lewis Index of factoring reliability = -0.579
## RMSEA index = 0.975 and the 90 % confidence intervals are 0.944 NA
## BIC = 868.4
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy
## ML1 ML3 ML2
## Correlation of (regression) scores with factors 0.99 0.98 1.00
## Multiple R square of scores with factors 0.99 0.96 1.00
## Minimum correlation of possible factor scores 0.98 0.91 0.99
There are three factors composed of:
* Q3, Q5, Q6, and Q7.iv.
* Q8, Q9, Q10, and Q11.
* Q2, Q7.i, Q7.ii, and Q7.v.
Finally, plots show the relationship between the items and the factors.
The results are now more consistent. The only remaining issue is that item Q7.iii still loads on two factors.
A last approach will be considered, excluding this item.
6.1.3 Excluding item 1, 4, and 7.iii.
Bartlett's sphericity test is performed.
The p-value is reported as 0, so H0 is rejected, which supports applying factor analysis to this dataset.
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_ExpConcern_facAn_3_corr)
## Overall MSA = 0.58
## MSA for each item =
## q2 q3 q5 q6 q7_i q7_ii q7_iv q7_v q8 q9 q10 q11
## 0.66 0.43 0.74 0.61 0.53 0.64 0.65 0.51 0.62 0.49 0.45 0.67
The same approach is followed to explore the number of factors. The PCA results are shown in the table below:
Statistic | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Standard deviation | 2.0809 | 1.7154 | 1.2202 | 0.9222 | 0.7796 | 0.7383 | 0.7061 | 0.4865 | 0.4758 | 0.4203 | 0.2307 | 0.2081 |
Proportion of Variance | 0.3608 | 0.2452 | 0.1241 | 0.0709 | 0.0506 | 0.0454 | 0.0415 | 0.0197 | 0.0189 | 0.0147 | 0.0044 | 0.0036 |
Cumulative Proportion | 0.3608 | 0.6061 | 0.7302 | 0.8010 | 0.8517 | 0.8971 | 0.9386 | 0.9584 | 0.9772 | 0.9920 | 0.9964 | 1.0000 |
The scree plot for this PCA is displayed next.
According to these results, 3 factors could be the best number. The cumulative proportion shows that 3 components explain 73% of the variance, and an additional component adds only about 7%. The scree plot exhibits an elbow at the third component.
Then, another approach is implemented, in which several analyses are combined and depicted in the same figure. The tools used are: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
According to this analysis, three factors seem to be an adequate choice.
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## pre_exp_preoc_q2 pre_exp_preoc_q3 pre_exp_preoc_q5 pre_exp_preoc_q6
## 0.6126 0.3707 0.9000 0.9730
## pre_exp_preoc_q7_i pre_exp_preoc_q7_ii pre_exp_preoc_q7_iv pre_exp_preoc_q7_v
## 0.9950 0.6978 0.9950 0.6947
## pre_exp_preoc_q8 pre_exp_preoc_q9 pre_exp_preoc_q10 pre_exp_preoc_q11
## 0.8573 0.5934 0.7752 0.7491
While most of the items are well explained by the factors, Q3 still has a low communality, in line with previous findings.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_ExpConcern_facAn_3, nfactors = 3, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML3 ML2 h2 u2 com
## pre_exp_preoc_q2 0.58 0.61 0.3855 2.1
## pre_exp_preoc_q3 0.59 0.37 0.6281 1.0
## pre_exp_preoc_q5 0.81 0.90 0.1002 1.4
## pre_exp_preoc_q6 1.01 0.97 0.0270 1.0
## pre_exp_preoc_q7_i 1.02 1.00 0.0049 1.0
## pre_exp_preoc_q7_ii 0.82 0.70 0.3012 1.0
## pre_exp_preoc_q7_iv 0.85 1.00 0.0049 1.4
## pre_exp_preoc_q7_v 0.66 0.69 0.3059 1.5
## pre_exp_preoc_q8 0.89 0.86 0.1422 1.3
## pre_exp_preoc_q9 0.71 0.59 0.4078 1.2
## pre_exp_preoc_q10 0.86 0.77 0.2256 1.1
## pre_exp_preoc_q11 0.84 0.75 0.2498 1.1
##
## ML1 ML3 ML2
## SS loadings 3.19 3.13 2.89
## Proportion Var 0.27 0.26 0.24
## Cumulative Var 0.27 0.53 0.77
## Proportion Explained 0.35 0.34 0.31
## Cumulative Proportion 0.35 0.69 1.00
##
## With factor correlations of
## ML1 ML3 ML2
## ML1 1.00 0.14 0.41
## ML3 0.14 1.00 0.01
## ML2 0.41 0.01 1.00
##
## Mean item complexity = 1.3
## Test of the hypothesis that 3 factors are sufficient.
##
## df null model = 66 with the objective function = 74.06 with Chi Square = 1345
## df of the model are 33 and the objective function was 61.64
##
## The root mean square of the residuals (RMSR) is 0.07
## The df corrected root mean square of the residuals is 0.1
##
## The harmonic n.obs is 24 with the empirical chi square 15.81 with prob < 1
## The total n.obs was 24 with Likelihood Chi Square = 996.5 with prob < 5.3e-188
##
## Tucker Lewis Index of factoring reliability = -0.703
## RMSEA index = 1.102 and the 90 % confidence intervals are 1.067 NA
## BIC = 891.6
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy
## ML1 ML3 ML2
## Correlation of (regression) scores with factors 1.00 0.98 1.00
## Multiple R square of scores with factors 0.99 0.96 1.00
## Minimum correlation of possible factor scores 0.98 0.92 0.99
There are three factors composed of:
* Q3, Q5, Q6, and Q7.iv.
* Q8, Q9, Q10, and Q11.
* Q2, Q7.i, Q7.ii, and Q7.v.
Finally, plots show the relationship between the items and the factors.
6.1.4 Excluding items Q1, Q4, Q7_ii, Q7_iii and Q3.
The first two items seem to be conflicting and to provide no relevant information. Q7_ii is study dependent, because it focuses on giving the patient a written report, and this could be defined by the clinical context or even within the study (e.g., the HOPE study). On the other hand, Q7_iii and Q3 collect interesting data but are probably not related to the other questions in a particular domain. Therefore, they could be excluded from the factor analysis but retained in the questionnaire.
Bartlett's sphericity test is performed.
The p-value is reported as 0, so H0 is rejected, which supports applying factor analysis to this dataset.
Considering the fact that the Bartlett’s test usually rejects the H0 since the scenario of the null hypothesis is too extreme, the KMO analysis is studied to determine how well the data fit the factor analysis and how useful each item is.
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_ExpConcern_facAn_4_corr)
## Overall MSA = 0.64
## MSA for each item =
## q2 q5 q6 q7_i q7_iv q7_v q8 q9 q10 q11
## 0.76 0.75 0.65 0.51 0.66 0.62 0.66 0.60 0.51 0.81
According to these results, all the items are above 0.5. Two items are below 0.6.
To explore the number of factors, PCA is applied. The PCA results are shown in the table below:
Statistic | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 |
---|---|---|---|---|---|---|---|---|---|---|
Standard deviation | 1.9700 | 1.6480 | 1.0821 | 0.7451 | 0.7248 | 0.6501 | 0.5577 | 0.4424 | 0.3838 | 0.2743 |
Proportion of Variance | 0.3881 | 0.2716 | 0.1171 | 0.0555 | 0.0525 | 0.0423 | 0.0311 | 0.0196 | 0.0147 | 0.0075 |
Cumulative Proportion | 0.3881 | 0.6597 | 0.7768 | 0.8323 | 0.8848 | 0.9271 | 0.9582 | 0.9778 | 0.9925 | 1.0000 |
The scree plot for this PCA is displayed next.
According to these results, 2 or 3 factors could be the best number. The cumulative proportion is 78% for 3 components and 66% for two; in addition, two elbows can be identified, the first after the second component and the other at the third component.
Then, another approach is implemented, in which several analyses are combined and depicted in the same figure. The tools used are: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
Therefore, considering two factors seems to be an adequate approach.
Running the factor analysis with 10 items, first the communalities are explored:
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## pre_exp_preoc_q2 pre_exp_preoc_q5 pre_exp_preoc_q6 pre_exp_preoc_q7_i
## 0.4291 0.9066 0.8943 0.2557
## pre_exp_preoc_q7_iv pre_exp_preoc_q7_v pre_exp_preoc_q8 pre_exp_preoc_q9
## 0.9950 0.4134 0.6401 0.6241
## pre_exp_preoc_q10 pre_exp_preoc_q11
## 0.8089 0.7568
Considering the values of the communalities, only Q7_i is below 0.3.
Then, the whole output is displayed.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_ExpConcern_facAn_4, nfactors = 2, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML2 h2 u2 com
## pre_exp_preoc_q2 0.57 -0.40 0.43 0.5703 1.8
## pre_exp_preoc_q5 0.95 0.91 0.0934 1.1
## pre_exp_preoc_q6 0.93 0.89 0.1057 1.0
## pre_exp_preoc_q7_i 0.51 0.26 0.7450 1.1
## pre_exp_preoc_q7_iv 0.85 0.42 1.00 0.0049 1.5
## pre_exp_preoc_q7_v 0.62 0.41 0.5860 1.0
## pre_exp_preoc_q8 0.79 0.64 0.3599 1.0
## pre_exp_preoc_q9 0.74 0.62 0.3763 1.1
## pre_exp_preoc_q10 0.91 0.81 0.1910 1.1
## pre_exp_preoc_q11 0.85 0.76 0.2434 1.0
##
## ML1 ML2
## SS loadings 3.56 3.16
## Proportion Var 0.36 0.32
## Cumulative Var 0.36 0.67
## Proportion Explained 0.53 0.47
## Cumulative Proportion 0.53 1.00
##
## With factor correlations of
## ML1 ML2
## ML1 1.00 0.13
## ML2 0.13 1.00
##
## Mean item complexity = 1.2
## Test of the hypothesis that 2 factors are sufficient.
##
## df null model = 45 with the objective function = 50.58 with Chi Square = 1003
## df of the model are 26 and the objective function was 42.22
##
## The root mean square of the residuals (RMSR) is 0.12
## The df corrected root mean square of the residuals is 0.16
##
## The harmonic n.obs is 25 with the empirical chi square 34.67 with prob < 0.12
## The total n.obs was 25 with Likelihood Chi Square = 781 with prob < 6.8e-148
##
## Tucker Lewis Index of factoring reliability = -0.467
## RMSEA index = 1.077 and the 90 % confidence intervals are 1.034 NA
## BIC = 697.4
## Fit based upon off diagonal values = 0.93
## Measures of factor score adequacy
## ML1 ML2
## Correlation of (regression) scores with factors 0.99 0.97
## Multiple R square of scores with factors 0.98 0.95
## Minimum correlation of possible factor scores 0.97 0.89
There are 2 factors:
* Q2, Q8, Q9, Q10, Q11, and Q7_iv
* Q5, Q6, Q7_i, Q7_iv, Q7_v, Q2.
Item Q7.iv loads predominantly on the first factor. Q2 is also linked with both factors; although its loading on the first factor is larger than on the other, both are still low.
Finally, plots show the relationship between the items and the factors.
6.1.5 Conclusion
After excluding three items, the final number of questions is 12, which is a reasonable amount, and the results and metrics are adequate. With this approach, plus five items from knowledge and 2-3 sociodemographic questions, the number rises to 20. If five more items from attitudes are included, the whole questionnaire has 25 items. The initial number was 46 items.
Therefore, three items could be dropped from the questionnaire: Q1, Q4, and Q7.ii. Probably two other items will then be excluded from the factor analysis.
6.2 Attitudes domain
This domain gathers 11 items, 8 with a Likert scale and 3 with multiple options. Of these, 3 are about patients' attitudes to an additional procedure, 2 focus on how patients perceive the test, and 6 ask about motivations (3 with a Likert scale and 3 with multiple options). The overall idea is therefore to reduce the motivation items, preserving only the question that allows the patient to select the main motivation. This strategy significantly shrinks the number of motivation items, from 6 to 1, leaving 6 items in the domain (3 attitudes, 2 perception, 1 motivation). Since the target number of items for this domain is 5, one additional item should be excluded. In this setting, the candidate items for exclusion are Q4 and Q5, with Q5 being the worse of the two.
6.2.1 All the items.
First, a factor analysis with all the 8 items is performed.
Items 1-3 are correlated as well as those evaluationg motivation (Q6-8). Item Q5, which was conflicting in prior analysis, here does not have any correlation.
As a first approach, Bartlett’s sphericity test is performed.
The p-value is reported as 0, so H0 is rejected, which supports applying a factor analysis to this dataset.
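As a rough illustration of what the sphericity test computes, the sketch below evaluates Bartlett's chi-square statistic from the log-determinant of the item correlation matrix. The function name `bartlett_chi2` and its arguments are illustrative assumptions, not part of the original analysis (which was run in R). The inputs (24 observations, 8 items, and 38.07 as the value of -ln|R|) are taken from the null-model line of the factor analysis output reported later in this section, which appears to follow the same formula.

```python
def bartlett_chi2(n_obs, n_items, neg_log_det):
    """Bartlett's sphericity statistic and its degrees of freedom.

    neg_log_det is -ln|R|, where R is the item correlation matrix.
    """
    chi2 = (n_obs - 1 - (2 * n_items + 5) / 6) * neg_log_det
    df = n_items * (n_items - 1) // 2
    return chi2, df


# Values read from the report: n = 24, 8 items, objective function = 38.07.
chi2, df = bartlett_chi2(n_obs=24, n_items=8, neg_log_det=38.07)
print(round(chi2, 2), df)
```

The result is close to the chi-square of 742.3 on 28 degrees of freedom printed for the null model, which is why the p-value is displayed as 0.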
Because Bartlett’s test almost always rejects H0 (the null hypothesis of completely uncorrelated items is very extreme), the KMO measure is also examined to determine how well the data suit factor analysis and how useful each item is.
## Error in solve.default(r) :
## Lapack routine dgesv: system is exactly singular: U[7,7] = 0
## matrix is not invertible, image not found
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_Actitd_facAn_1_corr)
## Overall MSA = 0.5
## MSA for each item =
## q1 q2 q3 q4 q6 q7 q8 q5-inv
## 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
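The correlation matrix is singular (not invertible), so the KMO cannot be calculated; the uniform value of 0.5 printed for every item is a placeholder and should not be interpreted.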
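To make the failure concrete, the following sketch computes the overall KMO from a correlation matrix and shows why a singular matrix has no KMO. It is a simplified Python illustration under stated assumptions: the function `kmo_overall` and both toy matrices are hypothetical and are not the study data or the R routine used in the report.

```python
import numpy as np


def kmo_overall(R):
    """Overall KMO (MSA) from a correlation matrix R; R must be invertible."""
    R = np.asarray(R, dtype=float)
    if np.linalg.matrix_rank(R) < R.shape[0]:
        raise ValueError("correlation matrix is singular; KMO is undefined")
    Q = np.linalg.inv(R)
    d = np.sqrt(np.diag(Q))
    partial = -Q / np.outer(d, d)          # partial (anti-image) correlations
    off = ~np.eye(R.shape[0], dtype=bool)  # mask selecting off-diagonal cells
    r2 = np.sum(R[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + p2)


# Hypothetical 3-item matrix with all correlations equal to 0.5.
ok = np.array([[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]])
print(round(kmo_overall(ok), 3))

# Two perfectly correlated items make the matrix singular.
singular = np.array([[1.0, 1.0, 0.3], [1.0, 1.0, 0.3], [0.3, 0.3, 1.0]])
try:
    kmo_overall(singular)
except ValueError as err:
    print(err)
```

The KMO relies on the inverse of the correlation matrix to obtain partial correlations, so any exact linear dependence among items (for example, two items answered identically by every respondent) makes it undefined.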
To explore the number of factors, a PCA is run and the variability explained by each component is evaluated. The table with the PCA results is shown below:
Statistic | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 |
---|---|---|---|---|---|---|---|---|
Standard deviation | 2.0138 | 1.4387 | 0.8599 | 0.6794 | 0.6181 | 0.4683 | 0.2691 | 0 |
Proportion of Variance | 0.5069 | 0.2587 | 0.0924 | 0.0577 | 0.0478 | 0.0274 | 0.0091 | 0 |
Cumulative Proportion | 0.5069 | 0.7657 | 0.8581 | 0.9158 | 0.9635 | 0.9910 | 1.0000 | 1 |
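The rows of this table are linked by simple arithmetic: each component's variance is its squared standard deviation, and the proportion is that variance divided by the total. The short Python sketch below (variable names are illustrative) recomputes the last two rows from the first one.

```python
# Standard deviations of the 8 principal components, copied from the table.
sdev = [2.0138, 1.4387, 0.8599, 0.6794, 0.6181, 0.4683, 0.2691, 0.0]

eigenvalues = [s ** 2 for s in sdev]   # component variances
total = sum(eigenvalues)               # close to 8, the number of items
proportion = [ev / total for ev in eigenvalues]

cumulative = []
running = 0.0
for p in proportion:
    running += p
    cumulative.append(running)

for i, (p, c) in enumerate(zip(proportion, cumulative), start=1):
    print(f"PC{i}: proportion={p:.4f} cumulative={c:.4f}")
```

The zero standard deviation of the eighth component is another sign of the linear dependence that made the correlation matrix singular.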
Then, the scree plot for this PCA analysis is displayed.
According to these results, 2 or 3 factors could be adequate. With two components, 77% of the variability is explained.
Then, a second approach is applied that combines several analyses in the same figure: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
Therefore, two factors will be considered.
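Two of these criteria can be checked by hand from the PCA standard deviations reported for the eight items. The sketch below is a simplified Python version written for illustration: the function names are assumptions, and the R implementation behind the figure may apply additional conditions. Parallel analysis is omitted because it requires the raw responses.

```python
def kaiser_count(eigenvalues):
    """Number of components with an eigenvalue greater than 1 (Kaiser rule)."""
    return sum(1 for ev in eigenvalues if ev > 1.0)


def acceleration_factor_count(eigenvalues):
    """Components retained before the sharpest elbow of the scree curve.

    The elbow is the position with the largest second difference
    ev[i-1] - 2*ev[i] + ev[i+1]; the components before it are kept.
    """
    second_diff = [
        eigenvalues[i - 1] - 2 * eigenvalues[i] + eigenvalues[i + 1]
        for i in range(1, len(eigenvalues) - 1)
    ]
    elbow = second_diff.index(max(second_diff)) + 1  # 0-based elbow position
    return elbow  # equals the number of components before the elbow


# Standard deviations from the PCA of the eight attitude items.
sdev = [2.0138, 1.4387, 0.8599, 0.6794, 0.6181, 0.4683, 0.2691, 0.0]
eigenvalues = [s ** 2 for s in sdev]
print(kaiser_count(eigenvalues), acceleration_factor_count(eigenvalues))
```

Both criteria point to two components for these eigenvalues, in line with the choice made in the text.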
Running the factor analysis with all the items (n=8), first the communalities are explored:
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## In factor.scores, the correlation matrix is singular, the pseudo inverse is used
## I was unable to calculate the factor score weights, factor loadings used instead
## pre_actit_q1 pre_actit_q2 pre_actit_q3
## 0.5970 0.9950 0.9202
## pre_actit_q4 pre_actit_q6 pre_actit_q7
## 0.7857 0.9343 0.9950
## pre_actit_q8 pre_actit_q5_Inverted
## 0.9950 0.4544
All the communalities are above 0.4; the lowest is Q5 inverted (0.45).
Then, the whole output is displayed.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_Actitd_facAn_1, nfactors = 2, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML2 h2 u2 com
## pre_actit_q1 0.70 0.60 0.4032 1.1
## pre_actit_q2 1.01 1.00 0.0050 1.0
## pre_actit_q3 0.90 0.92 0.0799 1.1
## pre_actit_q4 0.67 0.42 0.79 0.2142 1.7
## pre_actit_q6 1.00 0.93 0.0657 1.0
## pre_actit_q7 0.98 1.00 0.0025 1.0
## pre_actit_q8 0.98 1.00 0.0025 1.0
## pre_actit_q5_Inverted -0.40 0.67 0.45 0.5466 1.6
##
## ML1 ML2
## SS loadings 3.65 3.03
## Proportion Var 0.46 0.38
## Cumulative Var 0.46 0.84
## Proportion Explained 0.55 0.45
## Cumulative Proportion 0.55 1.00
##
## With factor correlations of
## ML1 ML2
## ML1 1.0 0.3
## ML2 0.3 1.0
##
## Mean item complexity = 1.2
## Test of the hypothesis that 2 factors are sufficient.
##
## df null model = 28 with the objective function = 38.07 with Chi Square = 742.3
## df of the model are 13 and the objective function was 23.46
##
## The root mean square of the residuals (RMSR) is 0.05
## The df corrected root mean square of the residuals is 0.08
##
## The harmonic n.obs is 24 with the empirical chi square 3.56 with prob < 1
## The total n.obs was 24 with Likelihood Chi Square = 426.3 with prob < 0.000000000000000000000000000000000000000000000000000000000000000000000000000000000063
##
## Tucker Lewis Index of factoring reliability = -0.341
## RMSEA index = 1.15 and the 90 % confidence intervals are 1.081 NA
## BIC = 384.9
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy
## ML1 ML2
## Correlation of (regression) scores with factors 1 1.00
## Multiple R square of scores with factors 1 1.00
## Minimum correlation of possible factor scores 1 0.99
Exploring these results, there are two factors composed of:
* Q6, Q7, and Q8.
* Q1, Q2, and Q3.
Both Q4 and Q5 inv load on ML1 and ML2.
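For the two cross-loading items, the h2, u2, and com columns of the output can be recovered from the printed loadings and the factor correlation of 0.3. The sketch below is an illustrative Python calculation with assumed function names. Which loading belongs to which factor cannot be read from the flattened output, but both formulas are symmetric in the two factors, so the order does not matter here.

```python
def communality(loadings, phi):
    """h2 = sum over j, m of l_j * l_m * phi[j][m] for oblique pattern loadings."""
    k = len(loadings)
    return sum(
        loadings[j] * loadings[m] * phi[j][m] for j in range(k) for m in range(k)
    )


def complexity(loadings):
    """Hofmann's item complexity: (sum of l^2)^2 / (sum of l^4)."""
    squares = [l ** 2 for l in loadings]
    return sum(squares) ** 2 / sum(s ** 2 for s in squares)


phi = [[1.0, 0.3], [0.3, 1.0]]   # factor correlation reported in the output
q4 = [0.67, 0.42]                # loadings printed for pre_actit_q4
q5_inv = [-0.40, 0.67]           # loadings printed for pre_actit_q5_Inverted

for name, lam in [("q4", q4), ("q5_inv", q5_inv)]:
    h2 = communality(lam, phi)
    print(name, round(h2, 2), round(1 - h2, 2), round(complexity(lam), 1))
```

A complexity near 1 means the item belongs to a single factor; values of 1.6 to 1.7, as for these two items, indicate that the item is shared between both factors.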
Finally, plots show the relationship between the items and the factors.
According to the factor analysis, two factors are found: one is related to attitudes (Q1-3) and probably to the perception of benefit from genomic testing, and the other is related to motivation (Q6-8) and probably to the perception of imprecision of genomic testing (Q5 inv). In addition, when 3 factors were considered instead of two, item 5 was the only element in the third factor. Taken together, one approach could be to treat the first four items as one domain and the multiple-option item as the other domain.
6.2.2 Excluding item 5.
Items 1-3 are correlated as well as those evaluating motivation (Q6-8).
As a first approach, Bartlett’s sphericity test is performed.
The p-value is reported as 0, so H0 is rejected, which supports applying a factor analysis to this dataset.
Because Bartlett’s test almost always rejects H0 (the null hypothesis of completely uncorrelated items is very extreme), the KMO measure is also examined to determine how well the data suit factor analysis and how useful each item is.
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_Actitd_facAn_2_corr)
## Overall MSA = 0.76
## MSA for each item =
## q1 q2 q3 q4 q6 q7 q8
## 0.90 0.65 0.71 0.94 0.88 0.70 0.74
All the values are at or above 0.65; only Q2 (0.65) is below 0.7.
To explore the number of factors, a PCA is run and the variability explained by each component is evaluated. The table with the PCA results is shown below:
Statistic | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 |
---|---|---|---|---|---|---|---|
Standard deviation | 1.9949 | 1.3312 | 0.6795 | 0.6383 | 0.5015 | 0.2888 | 0.2103 |
Proportion of Variance | 0.5685 | 0.2532 | 0.0660 | 0.0582 | 0.0359 | 0.0119 | 0.0063 |
Cumulative Proportion | 0.5685 | 0.8217 | 0.8876 | 0.9458 | 0.9818 | 0.9937 | 1.0000 |
Then, the scree plot for this PCA analysis is displayed.
According to these results, 2 factors could be adequate. With two components, 82% of the variability is explained.
Then, a second approach is applied that combines several analyses in the same figure: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
Therefore, two factors will be considered.
Running the factor analysis with all the items (n=7), first the communalities are explored:
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## pre_actit_q1 pre_actit_q2 pre_actit_q3 pre_actit_q4 pre_actit_q6 pre_actit_q7
## 0.5954 0.9689 0.9905 0.7851 0.9154 0.9950
## pre_actit_q8
## 0.9778
Therefore, considering the values of the communalities all the items are above 0.4.
Then, the whole output is displayed.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_Actitd_facAn_2, nfactors = 2, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML2 h2 u2 com
## pre_actit_q1 0.72 0.60 0.4046 1.1
## pre_actit_q2 1.02 0.97 0.0311 1.0
## pre_actit_q3 0.97 0.99 0.0095 1.0
## pre_actit_q4 0.61 0.46 0.79 0.2149 1.9
## pre_actit_q6 0.99 0.92 0.0846 1.0
## pre_actit_q7 0.97 1.00 0.0041 1.0
## pre_actit_q8 1.00 0.98 0.0222 1.0
##
## ML1 ML2
## SS loadings 3.41 2.82
## Proportion Var 0.49 0.40
## Cumulative Var 0.49 0.89
## Proportion Explained 0.55 0.45
## Cumulative Proportion 0.55 1.00
##
## With factor correlations of
## ML1 ML2
## ML1 1.00 0.36
## ML2 0.36 1.00
##
## Mean item complexity = 1.1
## Test of the hypothesis that 2 factors are sufficient.
##
## df null model = 21 with the objective function = 30.28 with Chi Square = 630.8
## df of the model are 8 and the objective function was 18.21
##
## The root mean square of the residuals (RMSR) is 0.02
## The df corrected root mean square of the residuals is 0.03
##
## The harmonic n.obs is 25 with the empirical chi square 0.43 with prob < 1
## The total n.obs was 25 with Likelihood Chi Square = 355.1 with prob < 0.0000000000000000000000000000000000000000000000000000000000000000000000074
##
## Tucker Lewis Index of factoring reliability = -0.6
## RMSEA index = 1.317 and the 90 % confidence intervals are 1.227 NA
## BIC = 329.3
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## ML1 ML2
## Correlation of (regression) scores with factors 1.00 1.00
## Multiple R square of scores with factors 1.00 0.99
## Minimum correlation of possible factor scores 0.99 0.99
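Two of the fit figures in this output can be reproduced by hand, which helps when reading them. The sketch below is an illustrative Python calculation: the function names are assumptions, and the RMSEA formula is assumed to match the one behind the printed value (it does reproduce it). The inputs (7 items, 2 factors, chi-square 355.1, n = 25) are read from the output above.

```python
import math


def efa_degrees_of_freedom(n_items, n_factors):
    """Degrees of freedom of an exploratory factor model."""
    return ((n_items - n_factors) ** 2 - (n_items + n_factors)) // 2


def rmsea(chi_square, df, n_obs):
    """Point estimate of RMSEA from the likelihood chi-square."""
    return math.sqrt(max(chi_square / (df * n_obs) - 1 / (n_obs - 1), 0.0))


df = efa_degrees_of_freedom(n_items=7, n_factors=2)
print(df, round(rmsea(355.1, df, 25), 3))
```

Conventional cut-offs for RMSEA are far below 1, so a value this large suggests that the likelihood-based fit indices should be read with caution at a sample size of 25.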
Exploring these results, there are two factors composed of:
* Q6, Q7, and Q8.
* Q1, Q2, and Q3.
Again, Q4 loads on both factors.
Finally, plots show the relationship between the items and the factors.
Although Q4 is related to both factors, its loading is now higher on ML1, so it falls in the opposite factor compared with the first analysis.
6.2.3 Excluding item 4 (preserving item 5 inverted).
Items 1-3 are correlated as well as those evaluating motivation (Q6-8).
As a first approach, Bartlett’s sphericity test is performed.
The p-value is reported as 0, so H0 is rejected, which supports applying a factor analysis to this dataset.
Because Bartlett’s test almost always rejects H0 (the null hypothesis of completely uncorrelated items is very extreme), the KMO measure is also examined to determine how well the data suit factor analysis and how useful each item is.
## Error in solve.default(r) :
## Lapack routine dgesv: system is exactly singular: U[6,6] = 0
## matrix is not invertible, image not found
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_Actitd_facAn_3_corr)
## Overall MSA = 0.5
## MSA for each item =
## q1 q2 q3 q6 q7 q8 q5_inv
## 0.5 0.5 0.5 0.5 0.5 0.5 0.5
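The correlation matrix is singular (not invertible), so the KMO cannot be calculated; the uniform value of 0.5 printed for every item is a placeholder and should not be interpreted.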
To explore the number of factors, a PCA is run and the variability explained by each component is evaluated. The table with the PCA results is shown below:
Statistic | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 |
---|---|---|---|---|---|---|---|
Standard deviation | 1.800 | 1.4187 | 0.8631 | 0.7443 | 0.5997 | 0.2955 | 0 |
Proportion of Variance | 0.463 | 0.2875 | 0.1064 | 0.0791 | 0.0514 | 0.0125 | 0 |
Cumulative Proportion | 0.463 | 0.7506 | 0.8570 | 0.9361 | 0.9875 | 1.0000 | 1 |
Then, the scree plot for this PCA analysis is displayed.
According to these results, 2 factors could be adequate. With two components, 75% of the variability is explained.
Then, a second approach is applied that combines several analyses in the same figure: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
Therefore, two factors will be considered.
Running the factor analysis with all the items (n=7), first the communalities are explored:
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In smc, smcs < 0 were set to .0
## In smc, smcs < 0 were set to .0
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## In factor.scores, the correlation matrix is singular, the pseudo inverse is used
## I was unable to calculate the factor score weights, factor loadings used instead
## pre_actit_q1 pre_actit_q2 pre_actit_q3
## 0.6117 0.9950 0.8820
## pre_actit_q6 pre_actit_q7 pre_actit_q8
## 0.6996 0.9950 0.9950
## pre_actit_q5_Inverted
## 0.4516
All the communalities are above 0.4, but Q5 inv shows the lowest value (0.45).
Then, the whole output is displayed.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_Actitd_facAn_3, nfactors = 2, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML1 h2 u2 com
## pre_actit_q1 0.71 0.61 0.3884 1.1
## pre_actit_q2 1.00 1.00 0.0049 1.0
## pre_actit_q3 0.89 0.88 0.1180 1.1
## pre_actit_q6 0.86 0.70 0.3003 1.1
## pre_actit_q7 0.97 1.00 0.0025 1.0
## pre_actit_q8 0.97 1.00 0.0025 1.0
## pre_actit_q5_Inverted 0.67 0.45 0.5467 1.5
##
## ML2 ML1
## SS loadings 2.82 2.82
## Proportion Var 0.40 0.40
## Cumulative Var 0.40 0.81
## Proportion Explained 0.50 0.50
## Cumulative Proportion 0.50 1.00
##
## With factor correlations of
## ML2 ML1
## ML2 1.00 0.25
## ML1 0.25 1.00
##
## Mean item complexity = 1.1
## Test of the hypothesis that 2 factors are sufficient.
##
## df null model = 21 with the objective function = 34.92 with Chi Square = 727.6
## df of the model are 8 and the objective function was 23.69
##
## The root mean square of the residuals (RMSR) is 0.08
## The df corrected root mean square of the residuals is 0.13
##
## The harmonic n.obs is 25 with the empirical chi square 6.71 with prob < 0.57
## The total n.obs was 25 with Likelihood Chi Square = 462 with prob < 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
##
## Tucker Lewis Index of factoring reliability = -0.805
## RMSEA index = 1.506 and the 90 % confidence intervals are 1.42 NA
## BIC = 436.2
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy
## ML2 ML1
## Correlation of (regression) scores with factors 1.00 1.00
## Multiple R square of scores with factors 1.00 1.00
## Minimum correlation of possible factor scores 0.99 0.99
Exploring these results, there are two factors composed of:
* Q1, Q2, and Q3, plus Q5 inv.
* Q6, Q7, and Q8.
Finally, plots show the relationship between the items and the factors.
Again, the attitude and motivation items are located in separate factors. Q5 inv is now grouped with the attitude items in the first factor.
6.2.4 Including Q1-Q5.
Now, motivation items are excluded.
Item Q5 has only a moderate correlation with Q2 and no correlation with Q4.
As a first approach, Bartlett’s sphericity test is performed.
The p-value is reported as 0, so H0 is rejected, which supports applying a factor analysis to this dataset.
Because Bartlett’s test almost always rejects H0 (the null hypothesis of completely uncorrelated items is very extreme), the KMO measure is also examined to determine how well the data suit factor analysis and how useful each item is.
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_Actitd_facAn_4_corr)
## Overall MSA = 0.65
## MSA for each item =
## q1 q2 q3 q4 q5-inv
## 0.84 0.60 0.62 0.71 0.53
All the items are above 0.5, and only one (Q5 inv, 0.53) is below 0.6.
To explore the number of factors, a PCA is run and the variability explained by each component is evaluated. The table with the PCA results is shown below:
Statistic | PC1 | PC2 | PC3 | PC4 | PC5 |
---|---|---|---|---|---|
Standard deviation | 1.7243 | 0.9632 | 0.7603 | 0.6658 | 0.2788 |
Proportion of Variance | 0.5946 | 0.1856 | 0.1156 | 0.0887 | 0.0156 |
Cumulative Proportion | 0.5946 | 0.7802 | 0.8958 | 0.9844 | 1.0000 |
Then, the scree plot for this PCA analysis is displayed.
According to these results, 1 or perhaps 2 factors could be adequate. With two components, 78% of the variability is explained.
Then, a second approach is applied that combines several analyses in the same figure: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
Therefore, one factor will be considered.
Running the factor analysis with all the items (n=5), first the communalities are explored:
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In smc, smcs < 0 were set to .0
## In smc, smcs < 0 were set to .0
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## pre_actit_q1 pre_actit_q2 pre_actit_q3
## 0.5660 0.9950 0.9077
## pre_actit_q4 pre_actit_q5_Inverted
## 0.3594 0.3195
All the communalities are above 0.3, although those of Q4 (0.36) and Q5 inv (0.32) are low.
Then, the whole output is displayed.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_Actitd_facAn_4, nfactors = 1, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 h2 u2 com
## pre_actit_q1 0.75 0.57 0.434 1
## pre_actit_q2 1.00 1.00 0.005 1
## pre_actit_q3 0.95 0.91 0.092 1
## pre_actit_q4 0.60 0.36 0.641 1
## pre_actit_q5_Inverted 0.57 0.32 0.680 1
##
## ML1
## SS loadings 3.15
## Proportion Var 0.63
##
## Mean item complexity = 1
## Test of the hypothesis that 1 factor is sufficient.
##
## df null model = 10 with the objective function = 24.03 with Chi Square = 492.6
## df of the model are 5 and the objective function was 20.04
##
## The root mean square of the residuals (RMSR) is 0.11
## The df corrected root mean square of the residuals is 0.15
##
## The harmonic n.obs is 24 with the empirical chi square 5.54 with prob < 0.35
## The total n.obs was 24 with Likelihood Chi Square = 397.4 with prob < 0.000000000000000000000000000000000000000000000000000000000000000000000000000000000011
##
## Tucker Lewis Index of factoring reliability = -0.682
## RMSEA index = 1.808 and the 90 % confidence intervals are 1.696 NA
## BIC = 381.6
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy
## ML1
## Correlation of (regression) scores with factors 1.00
## Multiple R square of scores with factors 1.00
## Minimum correlation of possible factor scores 0.99
Exploring these results, the single factor is composed of all the items (Q1-Q5).
The lowest loading is for Q5 with 0.57 followed by Q4 with 0.60. The total variance explained is 0.63.
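The figure of 0.63 follows directly from the communalities: with a single factor, the sum of squared loadings equals the sum of the communalities, and dividing by the number of items gives the proportion of variance. A short Python check (variable names are illustrative) is:

```python
# Communalities (h2) reported for the one-factor solution on Q1-Q5.
h2 = {
    "q1": 0.5660,
    "q2": 0.9950,
    "q3": 0.9077,
    "q4": 0.3594,
    "q5_inv": 0.3195,
}

ss_loadings = sum(h2.values())           # sum of squared loadings
proportion_var = ss_loadings / len(h2)   # share of the total item variance

print(round(ss_loadings, 2), round(proportion_var, 2))

# With a single factor, each loading is the square root of its communality.
loadings = {item: value ** 0.5 for item, value in h2.items()}
print({item: round(l, 2) for item, l in loadings.items()})
```

This also shows why Q4 and Q5 inv contribute least: their communalities are roughly a third of those of Q2 and Q3.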
Finally, plots show the relationship between the items and the factors.
6.2.5 Conclusion
Three options could be considered:
1. Items Q1-3 + Q4 and Q5 + item 9 with multiple options.
2. Items Q1-3 + Q5 + item 9 with multiple options.
3. Items Q1-3 + Q4 + item 9 with multiple options.
The advantage of the 2nd option is that it keeps an inverted item, although Q5 showed some issues during the first analysis. On the other hand, the third option could perform better but is perhaps less informative.
6.3 Both domains, the expectations and concerns domain plus the attitudes domain
6.3.1 Expectations plus Q2 and Q3 from attitudes.
This analysis includes all the items selected for expectations (Q2, Q5, Q6, Q7.i, Q7.iv, Q7.v, Q8, Q9, Q10, and Q11), plus two items from the attitude domain (Q2 and Q3).
Bartlett’s sphericity test is performed.
The p-value is reported as 0, so H0 is rejected, which supports applying a factor analysis to this dataset.
Because Bartlett’s test almost always rejects H0 (the null hypothesis of completely uncorrelated items is very extreme), the KMO measure is also examined to determine how well the data suit factor analysis and how useful each item is.
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_Exp_Attit_facAn_1_corr)
## Overall MSA = 0.64
## MSA for each item =
## q2 q5 q6 q7_i q7_iv q7_v q8 q9
## 0.70 0.60 0.63 0.69 0.66 0.66 0.68 0.62
## q10 q11 actit_q2 actit_q3
## 0.56 0.73 0.56 0.67
According to these results, no items are under the threshold of 0.5. Two items (q10 and actit_q2) are below 0.6.
To explore the number of factors, a PCA is applied. The table with the PCA results is shown below:
Statistic | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Standard deviation | 2.023 | 1.9119 | 1.2291 | 0.7759 | 0.7453 | 0.6567 | 0.6098 | 0.5580 | 0.4418 | 0.3863 | 0.2685 | 0.2304 |
Proportion of Variance | 0.341 | 0.3046 | 0.1259 | 0.0502 | 0.0463 | 0.0359 | 0.0310 | 0.0260 | 0.0163 | 0.0124 | 0.0060 | 0.0044 |
Cumulative Proportion | 0.341 | 0.6456 | 0.7715 | 0.8217 | 0.8680 | 0.9039 | 0.9349 | 0.9609 | 0.9771 | 0.9896 | 0.9956 | 1.0000 |
Then, the scree plot for this PCA analysis is displayed.
According to these results, 3 factors could be the best number. With 3 components the cumulative proportion reaches 77%, and the scree plot shows two elbows, one after the second component and the other at the third.
Then, a second approach is applied that combines several analyses in the same figure: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
Therefore, considering three factors seems to be an adequate approach.
Running the factor analysis with 12 items, first the communalities are explored:
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## pre_exp_preoc_q2 pre_exp_preoc_q5 pre_exp_preoc_q6 pre_exp_preoc_q7_i
## 0.7043 0.8482 0.9534 0.7732
## pre_exp_preoc_q7_iv pre_exp_preoc_q7_v pre_exp_preoc_q8 pre_exp_preoc_q9
## 0.9950 0.7092 0.7629 0.6177
## pre_exp_preoc_q10 pre_exp_preoc_q11 pre_actit_q2 pre_actit_q3
## 0.6972 0.7771 0.9371 0.9629
Considering the values of the communalities, all are above 0.3.
Then, the whole output is displayed.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_Exp_Attit_facAn_1, nfactors = 3, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML3 ML1 h2 u2 com
## pre_exp_preoc_q2 0.70 0.70 0.2956 1.4
## pre_exp_preoc_q5 0.85 0.85 0.1518 1.2
## pre_exp_preoc_q6 1.02 0.95 0.0466 1.0
## pre_exp_preoc_q7_i 0.87 0.77 0.2266 1.2
## pre_exp_preoc_q7_iv 0.84 1.00 0.0048 1.4
## pre_exp_preoc_q7_v 0.63 0.71 0.2911 2.1
## pre_exp_preoc_q8 0.90 0.76 0.2370 1.0
## pre_exp_preoc_q9 0.73 0.62 0.3825 1.1
## pre_exp_preoc_q10 0.79 0.70 0.3030 1.2
## pre_exp_preoc_q11 0.84 0.78 0.2229 1.0
## pre_actit_q2 0.95 0.94 0.0629 1.1
## pre_actit_q3 0.94 0.96 0.0371 1.0
##
## ML2 ML3 ML1
## SS loadings 3.67 3.22 2.85
## Proportion Var 0.31 0.27 0.24
## Cumulative Var 0.31 0.57 0.81
## Proportion Explained 0.38 0.33 0.29
## Cumulative Proportion 0.38 0.71 1.00
##
## With factor correlations of
## ML2 ML3 ML1
## ML2 1.00 -0.16 0.33
## ML3 -0.16 1.00 0.25
## ML1 0.33 0.25 1.00
##
## Mean item complexity = 1.2
## Test of the hypothesis that 3 factors are sufficient.
##
## df null model = 66 with the objective function = 74.6 with Chi Square = 1430
## df of the model are 33 and the objective function was 60.9
##
## The root mean square of the residuals (RMSR) is 0.05
## The df corrected root mean square of the residuals is 0.07
##
## The harmonic n.obs is 25 with the empirical chi square 7.44 with prob < 1
## The total n.obs was 25 with Likelihood Chi Square = 1045 with prob < 2.7e-198
##
## Tucker Lewis Index of factoring reliability = -0.667
## RMSEA index = 1.107 and the 90 % confidence intervals are 1.072 NA
## BIC = 939.2
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy
## ML2 ML3 ML1
## Correlation of (regression) scores with factors 0.99 0.97 0.99
## Multiple R square of scores with factors 0.98 0.95 0.99
## Minimum correlation of possible factor scores 0.96 0.90 0.98
There are 3 factors:
* Q2, Q7_i, Q7_v, attQ2, attQ3.
* Q5, Q6, Q7_iv.
* Q8, Q9, Q10, Q11.
Finally, plots show the relationship between the items and the factors.
6.3.2 Expectations plus Q2, Q3, Q4, and Q5 inv from attitudes.
This analysis includes all the items selected for expectations (Q2, Q5, Q6, Q7.i, Q7.iv, Q7.v, Q8, Q9, Q10, and Q11), plus four items from the attitude domain (Q2, Q3, Q4, and Q5 inverted).
Bartlett’s sphericity test is performed.
The p-value is reported as 0, so H0 is rejected, which supports applying a factor analysis to this dataset.
Because Bartlett’s test almost always rejects H0 (the null hypothesis of completely uncorrelated items is very extreme), the KMO measure is also examined to determine how well the data suit factor analysis and how useful each item is.
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_Exp_Attit_facAn_2_corr)
## Overall MSA = 0.61
## MSA for each item =
## q2 q5 q6 q7_i q7_iv q7_v
## 0.76 0.67 0.55 0.57 0.67 0.46
## q8 q9 q10 q11 actit_q2 actit_q3
## 0.52 0.58 0.51 0.80 0.65 0.67
## actit_q4 actit_q5_Inv
## 0.73 0.53
According to these results, one item is under the threshold of 0.5 (q7_v, 0.46). Six further items are between 0.5 and 0.6.
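The per-item screening can be written as a simple filter on the MSA values. The sketch below is an illustrative Python check: the values are copied from the KMO output above, while the variable names and the two-band split at 0.5 and 0.6 are assumptions that mirror the thresholds used in the text.

```python
# Per-item MSA values copied from the KMO output above.
msa = {
    "q2": 0.76, "q5": 0.67, "q6": 0.55, "q7_i": 0.57, "q7_iv": 0.67,
    "q7_v": 0.46, "q8": 0.52, "q9": 0.58, "q10": 0.51, "q11": 0.80,
    "actit_q2": 0.65, "actit_q3": 0.67, "actit_q4": 0.73, "actit_q5_Inv": 0.53,
}

unacceptable = [item for item, v in msa.items() if v < 0.5]
borderline = [item for item, v in msa.items() if 0.5 <= v < 0.6]

print("below 0.5:", unacceptable)
print("0.5 to 0.6:", borderline)
```

Half of the 14 items fall below 0.6 in this combination, which is weaker than the 12-item and 13-item combinations analysed in the neighbouring subsections.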
To explore the number of factors, a PCA is applied. The table with the PCA results is shown below:
Statistic | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 | PC13 | PC14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Standard deviation | 2.2233 | 1.8626 | 1.3819 | 0.8605 | 0.8470 | 0.7602 | 0.6753 | 0.5996 | 0.5481 | 0.4938 | 0.3672 | 0.2475 | 0.2267 | 0.1869 |
Proportion of Variance | 0.3531 | 0.2478 | 0.1364 | 0.0529 | 0.0512 | 0.0413 | 0.0326 | 0.0257 | 0.0215 | 0.0174 | 0.0096 | 0.0044 | 0.0037 | 0.0025 |
Cumulative Proportion | 0.3531 | 0.6009 | 0.7373 | 0.7902 | 0.8414 | 0.8827 | 0.9153 | 0.9410 | 0.9624 | 0.9798 | 0.9895 | 0.9938 | 0.9975 | 1.0000 |
Then, the scree plot for this PCA analysis is displayed.
According to these results, 3-4 factors could be the best number. The cumulative proportion is 74% with 3 components and 79% with four. The greatest elbow is at the fourth component.
Then, a second approach is applied that combines several analyses in the same figure: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
Therefore, considering three factors seems to be an adequate approach.
Running the factor analysis with 14 items, first the communalities are explored:
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In smc, smcs < 0 were set to .0
## In smc, smcs < 0 were set to .0
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## pre_exp_preoc_q2 pre_exp_preoc_q5 pre_exp_preoc_q6
## 0.8616 0.8974 0.9731
## pre_exp_preoc_q7_i pre_exp_preoc_q7_iv pre_exp_preoc_q7_v
## 0.9950 0.9950 0.6193
## pre_exp_preoc_q8 pre_exp_preoc_q9 pre_exp_preoc_q10
## 0.6864 0.5281 0.7194
## pre_exp_preoc_q11 pre_actit_q2 pre_actit_q3
## 0.7535 0.8810 0.8167
## pre_actit_q4 pre_actit_q5_Inverted
## 0.6180 0.6246
Considering the values of the communalities, all are above 0.3.
Then, the whole output is displayed.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_Exp_Attit_facAn_2, nfactors = 3, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML3 ML1 h2 u2 com
## pre_exp_preoc_q2 0.77 0.86 0.1384 1.4
## pre_exp_preoc_q5 0.86 0.90 0.1024 1.3
## pre_exp_preoc_q6 1.00 0.97 0.0269 1.0
## pre_exp_preoc_q7_i 1.06 1.00 0.0049 1.1
## pre_exp_preoc_q7_iv 0.50 0.83 1.00 0.0049 1.6
## pre_exp_preoc_q7_v 0.65 0.62 0.3804 1.8
## pre_exp_preoc_q8 0.85 0.69 0.3142 1.1
## pre_exp_preoc_q9 0.70 0.53 0.4661 1.1
## pre_exp_preoc_q10 0.79 0.72 0.2813 1.1
## pre_exp_preoc_q11 0.85 0.75 0.2460 1.0
## pre_actit_q2 0.85 0.88 0.1188 1.2
## pre_actit_q3 0.81 0.82 0.1841 1.1
## pre_actit_q4 0.67 0.62 0.3803 1.3
## pre_actit_q5_Inverted -0.63 0.62 0.3752 1.6
##
## ML2 ML3 ML1
## SS loadings 4.43 3.64 2.90
## Proportion Var 0.32 0.26 0.21
## Cumulative Var 0.32 0.58 0.78
## Proportion Explained 0.40 0.33 0.26
## Cumulative Proportion 0.40 0.74 1.00
##
## With factor correlations of
## ML2 ML3 ML1
## ML2 1.00 -0.22 0.27
## ML3 -0.22 1.00 0.07
## ML1 0.27 0.07 1.00
##
## Mean item complexity = 1.3
## Test of the hypothesis that 3 factors are sufficient.
##
## df null model = 91 with the objective function = 97.53 with Chi Square = 1609
## df of the model are 52 and the objective function was 81.71
##
## The root mean square of the residuals (RMSR) is 0.06
## The df corrected root mean square of the residuals is 0.08
##
## The harmonic n.obs is 23 with the empirical chi square 14.47 with prob < 1
## The total n.obs was 23 with Likelihood Chi Square = 1185 with prob < 7.4e-214
##
## Tucker Lewis Index of factoring reliability = -0.498
## RMSEA index = 0.972 and the 90 % confidence intervals are 0.946 NA
## BIC = 1022
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy
## ML2 ML3 ML1
## Correlation of (regression) scores with factors 1.00 0.98 0.99
## Multiple R square of scores with factors 0.99 0.97 0.99
## Minimum correlation of possible factor scores 0.99 0.94 0.98
There are 3 factors:
* Q2, Q7_i, Q7_v, attQ2, attQ3, attQ4.
* Q5, Q6, Q7_iv.
* Q8, Q9, Q10, Q11, Q7_iv, attQ5 inv.
Even though items from attitudes are mixed with expectations, there is a conceptual relationship between them. Besides, the negative loading of attQ5 inv is consistent with its grouping with the concerns items.
Finally, plots show the relationship between the items and the factors.
6.3.3 Expectations plus Q2, Q3, and Q4 from attitudes.
This analysis includes all the items selected for expectations (Q2, Q5, Q6, Q7.i, Q7.iv, Q7.v, Q8, Q9, Q10, and Q11), plus three items from the attitude domain (Q2, Q3, and Q4).
Bartlett’s sphericity test is performed.
The p-value is reported as 0, so H0 is rejected, which supports applying a factor analysis to this dataset.
Because Bartlett’s test almost always rejects H0 (the null hypothesis of completely uncorrelated items is very extreme), the KMO measure is also examined to determine how well the data suit factor analysis and how useful each item is.
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_Exp_Attit_facAn_3_corr)
## Overall MSA = 0.69
## MSA for each item =
## q2 q5 q6 q7_i q7_iv q7_v q8 q9
## 0.75 0.66 0.61 0.77 0.61 0.68 0.71 0.62
## q10 q11 actit_q2 actit_q3 actit_q4
## 0.59 0.71 0.68 0.71 0.79
According to these results, no items are under the threshold of 0.5. One item (q10) is below 0.6.
To explore the number of factors, a PCA is applied. The table with the PCA results is shown below:
Statistic | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 | PC13 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Standard deviation | 2.1555 | 1.8185 | 1.3641 | 0.8423 | 0.7515 | 0.7050 | 0.6256 | 0.5872 | 0.5008 | 0.4077 | 0.3638 | 0.2792 | 0.2262 |
Proportion of Variance | 0.3574 | 0.2544 | 0.1431 | 0.0546 | 0.0435 | 0.0382 | 0.0301 | 0.0265 | 0.0193 | 0.0128 | 0.0102 | 0.0060 | 0.0039 |
Cumulative Proportion | 0.3574 | 0.6118 | 0.7549 | 0.8095 | 0.8529 | 0.8912 | 0.9213 | 0.9478 | 0.9671 | 0.9799 | 0.9901 | 0.9961 | 1.0000 |
Then, the scree plot for this PCA analysis is displayed.
According to these results, 3-4 factors could be the best number. The cumulative proportion is 75% with 3 components and 81% with four. The greatest elbow is at the fourth component.
Then, a second approach is applied that combines several analyses in the same figure: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
Therefore, considering three factors seems to be an adequate approach.
Running the factor analysis with 13 items, first the communalities are explored:
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## pre_exp_preoc_q2 pre_exp_preoc_q5 pre_exp_preoc_q6 pre_exp_preoc_q7_i
## 0.8800 0.9316 0.9271 0.9950
## pre_exp_preoc_q7_iv pre_exp_preoc_q7_v pre_exp_preoc_q8 pre_exp_preoc_q9
## 0.9817 0.6500 0.7389 0.5678
## pre_exp_preoc_q10 pre_exp_preoc_q11 pre_actit_q2 pre_actit_q3
## 0.7577 0.7799 0.8639 0.8234
## pre_actit_q4
## 0.6682
All the communalities are above 0.3.
The complete output is displayed below.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_Exp_Attit_facAn_3, nfactors = 3, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML3 ML2 h2 u2 com
## pre_exp_preoc_q2 0.79 0.88 0.1201 1.3
## pre_exp_preoc_q5 0.85 0.93 0.0684 1.3
## pre_exp_preoc_q6 0.98 0.93 0.0729 1.0
## pre_exp_preoc_q7_i 1.06 1.00 0.0049 1.1
## pre_exp_preoc_q7_iv 0.48 0.82 0.98 0.0183 1.6
## pre_exp_preoc_q7_v 0.67 0.65 0.3490 1.8
## pre_exp_preoc_q8 0.88 0.74 0.2610 1.1
## pre_exp_preoc_q9 0.73 0.57 0.4327 1.1
## pre_exp_preoc_q10 0.79 0.76 0.2421 1.2
## pre_exp_preoc_q11 0.86 0.78 0.2204 1.1
## pre_actit_q2 0.88 0.86 0.1361 1.1
## pre_actit_q3 0.83 0.82 0.1766 1.1
## pre_actit_q4 0.68 0.67 0.3316 1.3
##
## ML1 ML3 ML2
## SS loadings 4.43 3.27 2.86
## Proportion Var 0.34 0.25 0.22
## Cumulative Var 0.34 0.59 0.81
## Proportion Explained 0.42 0.31 0.27
## Cumulative Proportion 0.42 0.73 1.00
##
## With factor correlations of
## ML1 ML3 ML2
## ML1 1.00 -0.18 0.27
## ML3 -0.18 1.00 0.08
## ML2 0.27 0.08 1.00
##
## Mean item complexity = 1.2
## Test of the hypothesis that 3 factors are sufficient.
##
## df null model = 78 with the objective function = 77.04 with Chi Square = 1374
## df of the model are 42 and the objective function was 62
##
## The root mean square of the residuals (RMSR) is 0.05
## The df corrected root mean square of the residuals is 0.07
##
## The harmonic n.obs is 24 with the empirical chi square 8.91 with prob < 1
## The total n.obs was 24 with Likelihood Chi Square = 981.7 with prob < 1.9e-178
##
## Tucker Lewis Index of factoring reliability = -0.528
## RMSEA index = 0.965 and the 90 % confidence intervals are 0.933 NA
## BIC = 848.2
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy
## ML1 ML3 ML2
## Correlation of (regression) scores with factors 1.00 0.98 0.99
## Multiple R square of scores with factors 0.99 0.96 0.98
## Minimum correlation of possible factor scores 0.99 0.91 0.97
There are 3 factors:
* Q2, Q7_i, Q7_v, attQ2, attQ3, attQ4.
* Q5, Q6, Q7_iv.
* Q8, Q9, Q10, Q11, Q7_iv.
No additional effect is detected when item 5 is excluded.
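The variance rows of the output can be checked by hand: "Proportion Var" is each factor's SS loadings divided by the number of items, and "Proportion Explained" is the same quantity divided by the sum of SS loadings over the retained factors. A minimal Python sketch of this arithmetic, using the SS loadings copied from the 13-item output (variable names are illustrative):

```python
# SS loadings of the three factors, copied from the 13-item output.
ss_loadings = {"ML1": 4.43, "ML3": 3.27, "ML2": 2.86}
n_items = 13

# Share of the total item variance captured by each factor.
proportion_var = {f: ss / n_items for f, ss in ss_loadings.items()}

# Share of the variance captured by the three factors together.
cumulative_var = sum(proportion_var.values())

# Share of the common (explained) variance attributable to each factor.
total_ss = sum(ss_loadings.values())
proportion_explained = {f: ss / total_ss for f, ss in ss_loadings.items()}

for factor in ss_loadings:
    print(factor,
          round(proportion_var[factor], 2),
          round(proportion_explained[factor], 2))
print("Cumulative Var:", round(cumulative_var, 2))
```

The rounded results agree with the printed rows (0.34, 0.25, 0.22 and a cumulative 0.81).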
Finally, plots show the relationship between the items and the factors.
6.3.4 Evaluating the presence of outliers
As an exploratory step, the possibility that some patient answered every item with “totally agree” is analyzed. This possibility is considered because of the small number of patients, the behavior of some items, and the lack of correspondence with the HOPE study.
First, the existence of cases with this pattern is checked.
One patient (ID 27) answered all the items with 5 (“Totalmente de acuerdo”, totally agree).
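A check of this kind can be sketched as follows. This is an illustration in Python, not the code used for the report: the response vectors are invented, and only the ID 27 pattern (all answers equal to 5) mirrors the finding above.

```python
def straight_liners(responses, value=5):
    """Return the IDs whose answers are all equal to `value`."""
    return [
        patient_id
        for patient_id, answers in responses.items()
        if answers and all(answer == value for answer in answers)
    ]


# Hypothetical Likert answers (1-5) keyed by patient ID.
responses = {
    12: [4, 5, 3, 5, 2],
    27: [5, 5, 5, 5, 5],  # every item answered "totally agree"
    31: [5, 4, 5, 5, 5],
}

flagged = straight_liners(responses)
print("Patients answering 5 to every item:", flagged)
```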
6.3.4.1 Expectations plus Q2 and Q3 from attitudes.
The factor analysis is run again with the best combination of items, that is, all the items selected for expectations plus Q2 and Q3 from attitudes.
Bartlett’s sphericity test is performed.
The reported p-value is 0 (that is, below the displayed precision), so H0 is rejected, which supports applying factor analysis to this dataset.
Because Bartlett’s test almost always rejects H0 (the null hypothesis describes an extreme scenario), the KMO measure is also examined to determine how well the data suit factor analysis and how useful each item is.
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_Exp_Attit_facAn_4_corr)
## Overall MSA = 0.65
## MSA for each item =
## q2 q5 q6 q7_i q7_iv q7_v q8 q9
## 0.71 0.60 0.62 0.70 0.67 0.64 0.67 0.60
## q10 q11 actit_q2 actit_q3
## 0.59 0.73 0.58 0.69
According to these results, no item falls below the 0.5 threshold. Two items (q10, MSA = 0.59, and actit_q2, MSA = 0.58) are below 0.6.
To explore the number of factors, PCA is applied. The results are shown in the table below:
| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Standard deviation | 2.0216 | 1.8952 | 1.2310 | 0.7743 | 0.7618 | 0.6745 | 0.6202 | 0.5649 | 0.4302 | 0.3908 | 0.2747 | 0.2320 |
| Proportion of Variance | 0.3406 | 0.2993 | 0.1263 | 0.0500 | 0.0484 | 0.0379 | 0.0321 | 0.0266 | 0.0154 | 0.0127 | 0.0063 | 0.0045 |
| Cumulative Proportion | 0.3406 | 0.6399 | 0.7662 | 0.8162 | 0.8645 | 0.9024 | 0.9345 | 0.9611 | 0.9765 | 0.9892 | 0.9955 | 1.0000 |
Then, the scree plot for this PCA analysis is displayed.
According to these results, three factors appear to be the best choice. Three components account for 77% of the variance, and the scree plot shows two elbows: one after the second component and another at the third.
A second approach combines several criteria in a single figure: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
Therefore, retaining three factors seems adequate.
The factor analysis is run with 12 items. The communalities are explored first:
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In smc, smcs < 0 were set to .0
## In smc, smcs < 0 were set to .0
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## pre_exp_preoc_q2 pre_exp_preoc_q5 pre_exp_preoc_q6 pre_exp_preoc_q7_i
## 0.7762 0.9950 0.7690 0.6779
## pre_exp_preoc_q7_iv pre_exp_preoc_q7_v pre_exp_preoc_q8 pre_exp_preoc_q9
## 0.9066 0.7442 0.6739 0.6320
## pre_exp_preoc_q10 pre_exp_preoc_q11 pre_actit_q2 pre_actit_q3
## 0.8610 0.6427 0.9950 0.9312
All the communalities are above 0.3.
The complete output is displayed below.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_Exp_Attit_facAn_4, nfactors = 3, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML3 ML2 h2 u2 com
## pre_exp_preoc_q2 0.70 0.78 0.224 1.4
## pre_exp_preoc_q5 0.97 1.00 0.005 1.2
## pre_exp_preoc_q6 0.89 0.77 0.231 1.0
## pre_exp_preoc_q7_i 0.75 0.68 0.322 1.3
## pre_exp_preoc_q7_iv 0.41 0.79 0.91 0.093 1.5
## pre_exp_preoc_q7_v 0.59 0.40 0.74 0.256 2.5
## pre_exp_preoc_q8 0.83 0.67 0.326 1.0
## pre_exp_preoc_q9 0.75 0.63 0.367 1.1
## pre_exp_preoc_q10 0.87 0.86 0.139 1.2
## pre_exp_preoc_q11 0.78 0.64 0.357 1.0
## pre_actit_q2 1.01 1.00 0.005 1.1
## pre_actit_q3 0.93 0.93 0.069 1.0
##
## ML1 ML3 ML2
## SS loadings 3.52 3.15 2.94
## Proportion Var 0.29 0.26 0.24
## Cumulative Var 0.29 0.56 0.80
## Proportion Explained 0.37 0.33 0.31
## Cumulative Proportion 0.37 0.69 1.00
##
## With factor correlations of
## ML1 ML3 ML2
## ML1 1.00 -0.28 0.36
## ML3 -0.28 1.00 0.18
## ML2 0.36 0.18 1.00
##
## Mean item complexity = 1.3
## Test of the hypothesis that 3 factors are sufficient.
##
## df null model = 66 with the objective function = 74.45 with Chi Square = 1352
## df of the model are 33 and the objective function was 61.46
##
## The root mean square of the residuals (RMSR) is 0.06
## The df corrected root mean square of the residuals is 0.08
##
## The harmonic n.obs is 24 with the empirical chi square 9.97 with prob < 1
## The total n.obs was 24 with Likelihood Chi Square = 993.6 with prob < 2.1e-187
##
## Tucker Lewis Index of factoring reliability = -0.689
## RMSEA index = 1.101 and the 90 % confidence intervals are 1.065 NA
## BIC = 888.8
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy
## ML1 ML3 ML2
## Correlation of (regression) scores with factors 1.00 0.97 1.00
## Multiple R square of scores with factors 0.99 0.94 0.99
## Minimum correlation of possible factor scores 0.99 0.88 0.98
There are 3 factors:
* Q2, Q7_i, Q7_v, attQ2, attQ3.
* Q5, Q6, Q7_iv.
* Q8, Q9, Q10, Q11.
The only change concerns Q7_v, which now loads on two factors.
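This comparison can be made explicit by counting, for each item, how many loadings are printed in the pattern matrix. The Python sketch below is illustrative only. It assumes that a blank cell in the printed output means the loading fell below the print cutoff, so an item with two printed values loads noticeably on two factors. The values are copied from the 13-item and 12-item outputs, without assigning them to specific factors.

```python
# Printed loadings per item in the 13-item solution.
shown_13 = {
    "q2": [0.79], "q5": [0.85], "q6": [0.98], "q7_i": [1.06],
    "q7_iv": [0.48, 0.82], "q7_v": [0.67], "q8": [0.88], "q9": [0.73],
    "q10": [0.79], "q11": [0.86], "actit_q2": [0.88],
    "actit_q3": [0.83], "actit_q4": [0.68],
}

# Printed loadings per item in the 12-item solution (actit_q4 removed).
shown_12 = {
    "q2": [0.70], "q5": [0.97], "q6": [0.89], "q7_i": [0.75],
    "q7_iv": [0.41, 0.79], "q7_v": [0.59, 0.40], "q8": [0.83],
    "q9": [0.75], "q10": [0.87], "q11": [0.78], "actit_q2": [1.01],
    "actit_q3": [0.93],
}


def cross_loading_items(shown):
    """Items with more than one printed loading."""
    return {item for item, loadings in shown.items() if len(loadings) > 1}


cross_13 = cross_loading_items(shown_13)
cross_12 = cross_loading_items(shown_12)
newly_cross_loading = cross_12 - cross_13

print("13 items:", sorted(cross_13))
print("12 items:", sorted(cross_12))
print("New cross-loadings:", sorted(newly_cross_loading))
```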
Finally, plots show the relationship between the items and the factors.
6.3.4.2 Expectations with the final set of items.
The analysis is now re-run with the expectations items only, excluding Q1, Q3, Q4, Q7_ii and Q7_iii.
Bartlett’s sphericity test is performed.
The reported p-value is 0 (that is, below the displayed precision), so H0 is rejected, which supports applying factor analysis to this dataset.
Because Bartlett’s test almost always rejects H0 (the null hypothesis describes an extreme scenario), the KMO measure is also examined to determine how well the data suit factor analysis and how useful each item is.
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = pre_test_Q_ExpConcern_facAn_4b_corr)
## Overall MSA = 0.64
## MSA for each item =
## q2 q5 q6 q7_i q7_iv q7_v q8 q9 q10 q11
## 0.79 0.75 0.65 0.50 0.67 0.60 0.64 0.58 0.52 0.80
According to these results, no item falls below the 0.5 threshold, although q7_i sits exactly on it (0.50). Three items are below 0.6 (q7_i, q9 and q10).
To explore the number of factors, PCA is applied. The results are shown in the table below:
| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Standard deviation | 1.9065 | 1.6936 | 1.0922 | 0.7671 | 0.7219 | 0.6701 | 0.5677 | 0.4381 | 0.3888 | 0.2831 |
| Proportion of Variance | 0.3635 | 0.2868 | 0.1193 | 0.0588 | 0.0521 | 0.0449 | 0.0322 | 0.0192 | 0.0151 | 0.0080 |
| Cumulative Proportion | 0.3635 | 0.6503 | 0.7696 | 0.8284 | 0.8805 | 0.9254 | 0.9577 | 0.9769 | 0.9920 | 1.0000 |
Then, the scree plot for this PCA analysis is displayed.
According to these results, two or three factors appear to be the best choice. Three components account for 77% of the variance and two account for 65%. In addition, two elbows can be identified: the first after the second component and the other at the third.
A second approach combines several criteria in a single figure: the Kaiser rule (which drops the components with eigenvalues < 1), parallel analysis, the usual scree test (plotuScree), and the acceleration factor (which indicates where the elbow of the scree plot appears).
Therefore, retaining two factors seems adequate.
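Two of these criteria can be reproduced from the PCA standard deviations alone. The Python sketch below is an illustration, not the implementation behind the figure: it applies the Kaiser rule and computes the acceleration factor as the second difference of the eigenvalues, under the assumed convention that the components before the elbow are retained. Parallel analysis is not reproduced because it requires simulated data.

```python
# Standard deviations of the 10 principal components (from the PCA table).
sdev = [1.9065, 1.6936, 1.0922, 0.7671, 0.7219,
        0.6701, 0.5677, 0.4381, 0.3888, 0.2831]
eigenvalues = [s ** 2 for s in sdev]

# Kaiser rule: keep the components whose eigenvalue exceeds 1.
kaiser_count = sum(1 for e in eigenvalues if e > 1.0)

# Acceleration factor: second difference of the eigenvalue curve,
# defined for components 2 .. p-1 (1-based numbering).
acceleration = {
    k + 1: eigenvalues[k - 1] - 2 * eigenvalues[k] + eigenvalues[k + 1]
    for k in range(1, len(eigenvalues) - 1)
}
elbow = max(acceleration, key=acceleration.get)
af_count = elbow - 1  # assumption: retain the components before the elbow

print("Kaiser rule retains", kaiser_count, "components")
print("Elbow at component", elbow, "so", af_count, "components are retained")
```

On these values the Kaiser rule points to three components, while the acceleration factor places the elbow at the third component and therefore points to two, which is consistent with the two-or-three range discussed above.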
The factor analysis is run with 10 items. The communalities are explored first:
## Warning in cor.smooth(mat): Matrix was not positive definite, smoothing was
## done
## In factor.stats, I could not find the RMSEA upper bound . Sorry about that
## pre_exp_preoc_q2 pre_exp_preoc_q5 pre_exp_preoc_q6 pre_exp_preoc_q7_i
## 0.4707 0.9267 0.8766 0.2682
## pre_exp_preoc_q7_iv pre_exp_preoc_q7_v pre_exp_preoc_q8 pre_exp_preoc_q9
## 0.9950 0.4067 0.6034 0.5948
## pre_exp_preoc_q10 pre_exp_preoc_q11
## 0.8194 0.7168
Only Q7_i has a communality below 0.3 (0.27).
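The screening rule applied here can be written as a short Python sketch (illustrative only; the communalities are copied from the output above and the 0.3 cutoff is the one used in this chapter):

```python
# Communalities of the 10 expectations and concerns items.
communalities = {
    "pre_exp_preoc_q2": 0.4707,
    "pre_exp_preoc_q5": 0.9267,
    "pre_exp_preoc_q6": 0.8766,
    "pre_exp_preoc_q7_i": 0.2682,
    "pre_exp_preoc_q7_iv": 0.9950,
    "pre_exp_preoc_q7_v": 0.4067,
    "pre_exp_preoc_q8": 0.6034,
    "pre_exp_preoc_q9": 0.5948,
    "pre_exp_preoc_q10": 0.8194,
    "pre_exp_preoc_q11": 0.7168,
}
THRESHOLD = 0.3

# Items whose variance is poorly accounted for by the common factors.
low_communality = [
    item for item, h2 in communalities.items() if h2 < THRESHOLD
]

print("Communality below", THRESHOLD, ":", low_communality)
```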
Then, the whole output is displayed.
## Factor Analysis using method = ml
## Call: fa(r = pre_test_Q_ExpConcern_facAn_4b, nfactors = 2, rotate = "oblimin",
## fm = "ml", cor = "poly")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML2 h2 u2 com
## pre_exp_preoc_q2 0.55 -0.46 0.47 0.5295 1.9
## pre_exp_preoc_q5 0.94 0.93 0.0733 1.2
## pre_exp_preoc_q6 0.93 0.88 0.1233 1.0
## pre_exp_preoc_q7_i 0.50 0.27 0.7328 1.2
## pre_exp_preoc_q7_iv 0.88 0.42 1.00 0.0049 1.4
## pre_exp_preoc_q7_v 0.63 0.41 0.5936 1.0
## pre_exp_preoc_q8 0.77 0.60 0.3969 1.0
## pre_exp_preoc_q9 0.73 0.59 0.4058 1.2
## pre_exp_preoc_q10 0.90 0.82 0.1805 1.1
## pre_exp_preoc_q11 0.83 0.72 0.2838 1.0
##
## ML1 ML2
## SS loadings 3.55 3.12
## Proportion Var 0.36 0.31
## Cumulative Var 0.36 0.67
## Proportion Explained 0.53 0.47
## Cumulative Proportion 0.53 1.00
##
## With factor correlations of
## ML1 ML2
## ML1 1.00 0.07
## ML2 0.07 1.00
##
## Mean item complexity = 1.2
## Test of the hypothesis that 2 factors are sufficient.
##
## df null model = 45 with the objective function = 50.32 with Chi Square = 947.6
## df of the model are 26 and the objective function was 42.16
##
## The root mean square of the residuals (RMSR) is 0.12
## The df corrected root mean square of the residuals is 0.16
##
## The harmonic n.obs is 24 with the empirical chi square 32.22 with prob < 0.19
## The total n.obs was 24 with Likelihood Chi Square = 737.8 with prob < 8.3e-139
##
## Tucker Lewis Index of factoring reliability = -0.475
## RMSEA index = 1.067 and the 90 % confidence intervals are 1.024 NA
## BIC = 655.2
## Fit based upon off diagonal values = 0.93
## Measures of factor score adequacy
## ML1 ML2
## Correlation of (regression) scores with factors 0.99 0.97
## Multiple R square of scores with factors 0.99 0.95
## Minimum correlation of possible factor scores 0.97 0.90
There are 2 factors:
* Q2, Q8, Q9, Q10, Q11, and Q7_iv
* Q5, Q6, Q7_i, Q7_iv, Q7_v, Q2.
No significant changes are detected when this case is excluded.
Finally, plots show the relationship between the items and the factors.