Chapter 5 Pre-test Questionnaire- Descriptive and exploratory analysis

5.1 Demographics

5.1.1 Descriptive Statistics

First, the demographics characteristics of the population included is analyzed.

	Overall
n	27
edad_qpretest (mean (SD))	72.56 (8.46)
estado_civil (%)
Casado y/o vive en pareja	25 (92.6)
Soltero y no vive en pareja	1 ( 3.7)
Viudo	1 ( 3.7)
ocupacion (%)
Jubilado o de baja/incapacidad previamente al diagnostico	22 (81.5)
Trabajador en activo por cuenta ajena	2 ( 7.4)
Trabajador en activo por cuenta propia	3 (11.1)
formacion_academica (%)
Educacion Primaria	11 (40.7)
Educacion Secundaria o Bachillerato	5 (18.5)
Formacion profesional	3 (11.1)
Otro (puede especificar):	3 (11.1)
Preferiria no contestar	1 ( 3.7)
Titulacion universitaria	4 (14.8)
exp_lab_salud = Si (%)	1 ( 3.7)
fliar_salud = Si (%)	12 (44.4)
fliar_cancer = Si (%)	23 (85.2)
vivienda = Vivo en una ciudad o area ambito urbano (%)	20 (74.1)

As general conclusion, most of the patients had partner, were retired, had primary school as their higher education reached, did not have relationship with health, had a relative with cancer, and live in a city.
Then, the main variables that could be related to knowledge (education, laboral experience, or a relative with laboral experience in the healthcare system, and relatives with cancer) are summarized.

5.2 Knowledge domain

In this domain there are 12 items. Out of these, 11 items were designed to chose one (or more) correct answers and the last item is about the source of knowledge without a correct or wrong answer and includes multiple information source options and patients could chose more than one. The target number of knowledge items to be included in the final version of this questionnaire is set in 5.

5.2.1 Descriptive statistics and sociodemographics comparison

As the first approach, a summary of the answers is done, and comparisons between sociodemographics are performed.
Initially, new variables are built with the success of the answers. It is worth mentioning that, each variable has three answers (yes, no, do not know), except one that has many options to be selected. Thus, for each yes-no question, a new binary variable is created with 1 if the answer was correct and 0 otherwise. Then, for the multiple choice question, a fraction for each correct item was assigned (it has 5 options, and all were correct; thus, 1/5 for each option selected). Then, the sum of successes is calculated. The proportion of success is also calculated (considering the whole number of questions).

To start with the descriptive analysis, the missing data is explored.

##  /\     /\
## {  `---'  }
## {  O   O  }
## ==>  V <==  No need for mice. This data set is completely observed.
##  \  \|/  /
##   `-----'

##    pre_conocim_q1 pre_conocim_q2 pre_conocim_q3___a pre_conocim_q3___b
## 27              1              1                  1                  1
##                 0              0                  0                  0
##    pre_conocim_q3___c pre_conocim_q3___d pre_conocim_q3___e pre_conocim_q3___ni
## 27                  1                  1                  1                   1
##                     0                  0                  0                   0
##    pre_conocim_q4___a pre_conocim_q4___b pre_conocim_q4___c pre_conocim_q4___d
## 27                  1                  1                  1                  1
##                     0                  0                  0                  0
##    pre_conocim_q4___e pre_conocim_q4___ni pre_conocim_q5 pre_conocim_q6
## 27                  1                   1              1              1
##                     0                   0              0              0
##    pre_conocim_q7 pre_conocim_q8 pre_conocim_q9 pre_conocim_q10 pre_conocim_q11
## 27              1              1              1               1               1
##                 0              0              0               0               0
##    pre_conocim_q12___a pre_conocim_q12___b pre_conocim_q12___c
## 27                   1                   1                   1
##                      0                   0                   0
##    pre_conocim_q12___d pre_conocim_q12___e pre_conocim_q12___f
## 27                   1                   1                   1
##                      0                   0                   0
##    pre_conocim_q12___g pre_conocim_q12___ni pre_conocim_q12_g  
## 27                   1                    1                 1 0
##                      0                    0                 0 0

Knowledge is analyzed considering the sociodemographic variables. In this setting, three variables are compared: the health relationship, the highest education level acquired, and the presence of a relative with cancer. Plots and formal tests are performed.

First, the educational level is analyzed.

## # A tibble: 3 × 11
##   education_high     n missing Minimun Maximun  Mean    DS Median FirstQ ThirdQ
##   <fct>          <int>   <int>   <dbl>   <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
## 1 No                16       0     0.2    8.27  3.00  2.05   2.8    1.65   4.05
## 2 Yes                9       0     1.4    9.2   5.02  2.24   5.07   4.4    6.4 
## 3 Unknown            2       0     1.4    3.53  2.47  1.51   2.47   1.93   3   
## # ℹ 1 more variable: IQR <dbl>

## 
##  Shapiro-Wilk normality test
## 
## data:  resids_m_education_high
## W = 0.96, p-value = 0.5

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2    0.11    0.9
##       24

## 
##  Two Sample t-test
## 
## data:  knowg_correct_totalsum by education_high
## t = -2.3, df = 23, p-value = 0.03
## alternative hypothesis: true difference in means between group No and group Yes is not equal to 0
## 95 percent confidence interval:
##  -3.8502 -0.2026
## sample estimates:
##  mean in group No mean in group Yes 
##             2.996             5.022

There is a significant difference between scores from both groups with different educational levels, the mean score for the knowledge questions in the group with the highest educational level was 5.022 (sd 2.243), and in the other group it was 2.996 (sd 2.045) (p=0.03).

Secondly, the relationship with health is considered.

## # A tibble: 2 × 11
##   health_related     n missing Minimun Maximun  Mean    DS Median FirstQ ThirdQ
##   <chr>          <int>   <int>   <dbl>   <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
## 1 No                14       0     0.2    5.07  3.06  1.41    3.2   2.18    3.9
## 2 Yes               13       0     0.4    9.2   4.25  2.83    4.4   1.4     6.4
## # ℹ 1 more variable: IQR <dbl>

## 
##  Shapiro-Wilk normality test
## 
## data:  resids_m_health_related
## W = 0.97, p-value = 0.7

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)  
## group  1    5.15  0.032 *
##       25                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## 
##  Welch Two Sample t-test
## 
## data:  pre_test_Q_Conocim$knowg_correct_totalsum by pre_test_Q_Conocim$health_related
## t = -1.4, df = 17, p-value = 0.2
## alternative hypothesis: true difference in means between group No and group Yes is not equal to 0
## 95 percent confidence interval:
##  -3.0218  0.6533
## sample estimates:
##  mean in group No mean in group Yes 
##             3.062             4.246

No difference was seen between scores from both groups of health relationship, the mean score for the knowledge questions in the group related with health was 4.25 (sd 2.84), and it was 3.06 (sd 1.41) in the other group (p=0.2).

Finally, the influence of a relative with cancer is analyzed.

## # A tibble: 2 × 11
##   fliar_cancer     n missing Minimun Maximun  Mean    DS Median FirstQ ThirdQ
##   <chr>        <int>   <int>   <dbl>   <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
## 1 No               4       0    2.53    4.53  3.48 0.940   3.43   2.78   4.13
## 2 Si              23       0    0.2     9.2   3.66 2.42    3.53   1.57   5.07
## # ℹ 1 more variable: IQR <dbl>

## 
##  Shapiro-Wilk normality test
## 
## data:  resids_m_fliar_cancer
## W = 0.96, p-value = 0.3

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  1    2.37   0.14
##       25

## 
##  Two Sample t-test
## 
## data:  pre_test_Q_Conocim$knowg_correct_totalsum by pre_test_Q_Conocim$fliar_cancer
## t = -0.14, df = 25, p-value = 0.9
## alternative hypothesis: true difference in means between group No and group Si is not equal to 0
## 95 percent confidence interval:
##  -2.736  2.386
## sample estimates:
## mean in group No mean in group Si 
##            3.483            3.658

The same score was observed for both groups, without any significant difference, mean score in the group that has relatives with cancer was 3.66 (sd 2.42), and 3.48 (sd 0.94) in the other group (p=0.9).

Finally, the region where the patient lives is considered.

## # A tibble: 2 × 11
##   vivienda_cat     n missing Minimun Maximun  Mean    DS Median FirstQ ThirdQ
##   <fct>        <int>   <int>   <dbl>   <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
## 1 City            20       0    0.2      9.2  3.44  2.52    2.8   1.4    5.07
## 2 Town             7       0    2.73     6.4  4.17  1.15    4     3.57   4.47
## # ℹ 1 more variable: IQR <dbl>

## 
##  Shapiro-Wilk normality test
## 
## data:  resids_m_vivienda
## W = 0.94, p-value = 0.1

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)  
## group  1    3.43  0.076 .
##       25                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## 
##  Two Sample t-test
## 
## data:  pre_test_Q_Conocim$knowg_correct_totalsum by pre_test_Q_Conocim$vivienda_cat
## t = -0.73, df = 25, p-value = 0.5
## alternative hypothesis: true difference in means between group City and group Town is not equal to 0
## 95 percent confidence interval:
##  -2.783  1.327
## sample estimates:
## mean in group City mean in group Town 
##              3.443              4.171

The same score was observed for both groups, without any significant difference, mean score in the group that lives in the city was 3.44 (sd 2.52), and 4.17 (sd 1.15) for those who lives in a town (p=0.5).

Results according to the type of job or if they are retired are describe considering the knowledge score.

## # A tibble: 3 × 11
##   ocupacion_cat       n missing Minimun Maximun  Mean    DS Median FirstQ ThirdQ
##   <chr>           <int>   <int>   <dbl>   <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
## 1 Retired            22       0    0.2     8.27  3.23  1.84    3.2   1.82   4.35
## 2 Working, emplo…     2       0    0.4     6.4   3.4   4.24    3.4   1.9    4.9 
## 3 Working, self-…     3       0    4.53    9.2   6.71  2.35    6.4   5.47   7.8 
## # ℹ 1 more variable: IQR <dbl>

Finally, according to the last item asking about the source of information, the results are shown below.

To note, in this item, patients could select more than one option. However, the results demonstrate that the most frequent source was their oncologist.

5.2.2 Analysis of each question across the patients.

Now, results for each question across all the patients are shown. Questions with binary responses are displayed together, then the remaining two questions with multiple options.

From the analysis of this set of 8 questions, some conclusions can be extracted evaluating the frequency of correct and wrong answers. The Q10 has mostly correct answers, thus it not seems to add valuable information. Similarly, Q2 and Q11 have the lowest percentage of correct answers. While for Q2 most of the answers where unknown (85%), for Q11 were wrong answers (63%). The questions with more useful variability are Q1, Q8, and Q5, followed by Q7. Q6 did not have wrong answers, it has a slightly higher percentage of unknown than correct responses. It seems that those who have the knowledge chose the correct option, indeed this item could be study-dependent; since, if the project is focused on liquid biopsy the question will have an obvious response.
Potential questions: Q1, Q8, Q5 and Q7.

These last two items showed some degree of variability, however, Q4 could be affected by the study design and the population included. Similarly to Q6, if the study enrolls patients for liquid biopsy the answers will be biased. Therefore, Q3 could be the most useful of these two items.

After this analysis of the variability of the answers a first conclusion can be drawn. It is desired to select questions with a lower proportion of unknown answers and with a similar proportion of correct and incorrect options selected. A higher proportion of unknown answers possibly reflects a hard-to-know or understand question, and this fact pinpoints a problematic item. A similar proportion between correct and incorrect answers implies a high variability between subjects; thus, it is expected that this item could gather a higher amount of information. Therefore, a score (from 1 to 10) is assigned to each item considering these characteristics in order to organize and visualize this first analysis.

Then, in a second stage, a measure of correlation between each item and the total score is explore. An Spearman correlation is determined for the two items with several options (Q3 and Q4). For the remaining 8 questions with correct/incorrect results cannot be possible to compute a correlation coefficient. Thus, for these items a visual strategy is explore.

After these analyses the results obtained are summarize in the following table (organized by the variability factor). Note that here “median” means the median for the total knowledge score for those who answered the correct options (see also previous box plot panel):

items	variability	Median_pvalue	Sp_correlation
Q1	1	6.4 (p=0.558)	NA
Q8	2	5.1 (p=0.131)	NA
Q5	3	6.4 (p=0.137)	NA
Q7	4	4.6 (p=0.134)	NA
Q3	5	NA	0.132
Q6	6	NA	NA
Q4	7	NA	0.768
Q11	8	8.7 (p=0.115)	NA
Q10	9	3.6 (p=0.49)	NA
Q2	10	7.8 (p=0.046)	NA

5.2.3 Global conclusions about knowledge domain

Despite the fact that only the comparison of item Q2 was statistically significant, for most of the items there is an association between correct answers and higher total scores. Numerically, the highest difference was found for item Q11. Between Q3 and Q4, the former showed no correlation with the total score. On the other hand, while Q4 has a high correlation, it not seem to be relevant for other study design. Items Q2 and Q10 probably are the less useful. Q7 could be relevant in order to explore the idea of incidental findings and it is the only question focused on this topic.
As a conclusion, according this analysis, the meaning of each question, and the theoretical importance of each, a selection is made:

These five could be the most relevant one until this point.
Q1. Todas las alteraciones genéticas dentro de un cáncer son hereditarias (se transmiten dentro de la familia).
Q8. Los análisis genómicos de todos los genes del cáncer es algo que los doctores solicitan rutinariamente para la mayoría de personas con cáncer.
Q5. El análisis de mutaciones hereditarias (en sangre o saliva) es suficiente para estudiar todas las alteraciones genéticas de un tumor.
Q7. El análisis genómico del tumor puede llevar a resultados no esperados o incidentales que pueden ser relevantes para conocer los riesgos para la salud, presente y futuros, en otros miembros de la familia.
Q3. ¿para que pueden usarse los análisis genómicos?
A. Confirmar el tipo de cáncer diagnosticado
B. Predecir el riesgo de cáncer en un individuo, sus hijos o familiares
C. Seleccionar el tratamiento más adecuado para un paciente
D. Ver si la enfermedad ha mejorado o empeorado tras iniciar un tratamiento
E. Predecir la evolución futura de la enfermedad

Then, these other three could be considered according to additional analysis. Q6 showed some variability but without wrong answers and could be study-dependent, Q11 showed low variability but high nominal difference for the total score, and Q2 had statistical difference between correct and wrong answers regarding the total score.
Q6. El análisis de muestras de sangre (biopsia liquida) puede estudiar mutaciones a partir de restos del tumor en la sangre.
Q11. Los doctores buscan cambios genéticos en mi tumor para determinar si miembros de mi familia pueden tener mayor riesgo de desarrollar cáncer.

Finally, these seem to be the less relevant items.
Q2. El cáncer es usualmente causado por un cambio en un solo gen.
Q10. Los doctores buscan cambios genéticos en mi tumor para ayudar en la selección de mi tratamiento.
Q4. Los análisis genómicos se pueden realizar a partir de
A. Muestras de sangre
B. Muestras de tumor
C. Muestras de saliva
D. Ninguna de las anteriores
E. No lo se

5.3 Expectations and concerns domain

In this setting, expectations and concerns are studied. In this domain there are 11 questions, but one (the seventh) has 5 items. Thus, there are a total of 15 items in this domain. These items can be divided into expectations, from 1 to 7.v., and concerns, from 8 to 11. All of them with a likert scale with 5 levels: Totalmente en desacuerdo, Algo en desacuerdo, Ni de acuerdo ni en desacuerdo, Algo de acuerdo, Totalmente de acuerdo.

For expectations items, from 1 to 7.v., while most are expected to be answered with agreement, there are two that can be interpreted in a different way:
- item 4. “Siento que necesito una visita formal con un especialista en consejo genético antes de realizar el análisis genómico”. In this setting, if patients has a correct understanding of the genomic test and its implications it is not expected that he need an especial visit with clinical genetics. In any case, when the results imply something that needs a particular management or evaluation, patients are referred to a clinical genetics specialist.
- item 7.iii. “Mi doctor cambiará mi tratamiento inmediatamente de acuerdo a los resultados.”. It is expected that the treatment does not change upon the results if the patient is under treatment. When the test is performed before a new line, results could guide a new treatment selection. However, if the genomic test is done during a current treatment, that result should not be used to modify that treatment. Instead, the result will be used for the next therapy. The complex issue here is the fact that this is a mixed population because for some patients this test was performed to choose the next therapy while for others do not.

For concerns items, from 8 to 11, all these items imply levels of concerns and they could be interpreted in the same line that were designed without inverting any of these. It is important to note, that higher levels of agreement will be linked with higher levels of concerns.

Considering the number of items in this domain and the objective of this first analysis, the target final number of items is 10, including 8 expectations´ items and 2 concerns´ items.

5.3.1 Visualizing items and responses

From the first visual analysis, all items have a wide majority of agreement answers, with a higher percentage of totally agree answers. Then, items with less variability are 7.ii and 7.i. Those with more variability are, first, item 4, with an inverted design, having 16% of disagreement and 40% of agreement; followed by item 6. Then, there are two groups, one including item 5 and 7.iv with a higher number of agreement (with a relevant percentage of partially agreement), between 12-15% neither agreement nor disagreement, and the highest percentage of disagreement. The second group includes item 3, 2, 1, and 7.v with a similar distribution of answers (65-75% agreement, 25-30% of neither agreement nor disagreement, and a lower percentage of disagreement). It is important to highlight that item 2 and 1 are widely overlapped, thus, it is expected to chose one of them. Additionally, item 7.iv has one of the highest percentage of disagreement but also one of the highest levels of agreement.

With regards to concerns, there are variability in the four answers. Maybe, the 10th could have the lower variability with the higher percentage of totally disagree answers.

As second exploratory approach analysis, the difficulty and discrimination capacity is evaluate for each of these items.

## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.

The highest scores for discrimination are shown by items 7.v, 2, 5, 6, 7.iv, and 7.ii. On the other hand, item 4 shows the lowest discrimination score, being under the specified threshold. Additionally, other items with a lower discrimination score are 1, 3, 7.i, 7.iii.

## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.

Again, concerns items seems to results similarly between them. In this setting item 11 shows the highest discrimination score.

Partial conclusion:
Expectations.
Variability- Highest variability: 4, and 6. Then 5 and 7.iv (less neither agree/disagreement and higher disagreement percentage); and 3, 2, 1, and 7.v (less disagreement and higher neither agree/disagreement percentage). It is important to note that item 1 and 2 are equivalent, only one will be selected.
Lowest variability: 7.ii and 7.i.

Discrimination- Highest discrimination: 7.v, 2, 5, 6, 7.iv, and 7.ii.
Lowest discrimination: item 4. Followed by 1, 3, 7.i, 7.iii.

summary-
The best five elements are: 2, 6, 7.v, 5, and 7.iv.
The worst two: 7.i and 7.iii. Then, if item 2 is selected item 1 could be excluded.
Conflicting results: with high variability and low discrimination item 4, 3, and with low variability and high discrimination 7.ii.

Concerns.
Variability- There is no clear variability. Discrimination- There is not clear discrimination, but item 11 shows the highest discrimination score.

5.3.2 Evaluating the reliability of questions (Cronbach’s α and Omega)

The Cronbach’s α and the Guttman’s lambda_6 (G6) are calculated for expectations and later for concerns.

5.3.2.1 Expectations

Regarding expectations items, the number 4 will be inverted.

Some concepts and explanations regarding the output.
raw.r is the correlation of the item with the entire scale, not correcting for item overlap. std.r is the correlation of the item with the entire scale, if each item were standardized. r.drop is the correlation of the item with the scale composed of the remaining items. Although each of these are conventional statistics, they have the disadvantage that a) item overlap inflates the first and b) the scale is different for each item when an item is dropped. Thus, the fourth alternative, r.cor, corrects for the item overlap by subtracting the item variance but then replaces this with the best estimate of common variance, the smc.
Guttman’s Lambda 6 (G6) considers the amount of variance in each item that can be accounted for the linear regression of all of the other items. Hence, G6 shows how well each question represents the concept it aims to reflect. Low values indicate that the question does not reflect the underlying concept while high values (.7 and higher) indicate that the question captures that concept well (or to an acceptable degree).
Additionally, the alpha can be calculated when an item is excluded in order to determine the impact of such question. Thus, the raw alpha when an item is dropped is another metric available.
Therefore, r.cor, G6 and alpha when an item is dropped measure the association of each item regarding the whole set of questions, while the Cronbach’s coefficient is a metric of the whole questionnaire. The omega (ω) coefficient is also a reliability measure of internal consistency, and represents an estimate of the general factor saturation fo a test. While, Hierarchical ω provides more appropriate estimates of the general factor saturation, total ω is a better estimate of the reliability of the total test compared to both Cronbach’s α.

First, analyzing the reliability of the whole subset of items, i.e., the consistency of the expectations section several measurements are display: the Cronbach’s α, with its IC, and the omega (ω) coefficient.

## 
##    95% confidence boundaries (Feldt)
##  lower alpha upper
##   0.62  0.77  0.88

## 
## Information about this analysis:
## 
##                  Dataframe: pre_test_Q_Expectations_values[, 2:12]
##                      Items: all
##               Observations: 24
##      Positive correlations: 47 out of 55 (85%)
## 
## Estimates assuming interval level:
## 
##              Omega (total): 0.91
##       Omega (hierarchical): 0.45
##    Revelle's omega (total): 0.91
## Greatest Lower Bound (GLB): NA
##              Coefficient H: 0.96
##          Coefficient alpha: 0.81
## 
## (Estimates assuming ordinal level not computed, as the polychoric correlation matrix has missing values.)
## 
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.

Then, in order to inspect the role of each item, the r.cor and the Guttman’s Lambda 6 (G6) are explored.

Dashed line indicates the alpha (Feldt alpha) value for the whole set of items in order to compare this value if the alpha value that results from dropping the corresponding item (yellow bar).

Figure 5.1: Dashed line indicates the alpha (Feldt alpha) value for the whole set of items in order to compare this value if the alpha value that results from dropping the corresponding item (yellow bar).

Items	G6	raw_alpha_itemDropped	r.cor
4.Inverted- Necesito visita formal	0.9405	0.8429	-0.1876
3.Estoy interesado/a en aprender más	0.9118	0.7608	0.4769
7iii.Mi Dr cambiará mi tto de acuerdo a los resultados	0.9044	0.7499	0.5761
7ii.Recibiré informe escrito con el resultado	0.9035	0.7597	0.5429
6.El resultado ayudará a aumentar mi expectativa vida	0.9031	0.7351	0.6882
7iv.Tendré opciones de tratamiento adicionales	0.8993	0.7380	0.6761
1.Tengo suficiente conocimiento de beneficios y riesgos para tomar decisión informada	0.8977	0.7679	0.4526
7v.Podré recibir tratamientos experimentales	0.8972	0.7372	0.6999
5.El resultado ayudará al control de mi cáncer	0.8965	0.7230	0.7796
7i.Mi Dr me explicará resultados y la implicación para mi salud	0.8959	0.7425	0.6987
2.He recibido suficiente información para comprender beneficios y riesgos del análisis genómico	0.8916	0.7455	0.6406

G6 values are very close altogether. Surprisingly, the highest values correspond to item 4 with the lowest r.cor. Then, the best metric combination seems to be for items 5, 7.v, 6, 7.iv, and 7.i. While the worst combination is seen for items 4, and 1, 3, and 7.ii.

Then, as an exploratory analysis, item 7.iii is inverted and results evaluated.

## 
##    95% confidence boundaries (Feldt)
##  lower alpha upper
##   0.41  0.65  0.82

Omega (total)
0.4478

Omega (hierarchical)
0.9121

Items	G6	raw_alpha_itemDropped	r.cor
5.El resultado ayudará al control de mi cáncer	0.8498	0.5411	0.7949
2.He recibido suficiente información para comprender beneficios y riesgos del análisis genómico	0.8392	0.5629	0.7303
7i.Mi Dr me explicará resultados y la implicación para mi salud	0.8514	0.5849	0.6919
7v.Podré recibir tratamientos experimentales	0.8573	0.5879	0.6299
7ii.Recibiré informe escrito con el resultado	0.8581	0.6017	0.6265
6.El resultado ayudará a aumentar mi expectativa vida	0.8680	0.5946	0.5803
1.Tengo suficiente conocimiento de beneficios y riesgos para tomar decisión informada	0.8520	0.6098	0.5183
7iv.Tendré opciones de tratamiento adicionales	0.8659	0.6120	0.5155
3.Estoy interesado/a en aprender más	0.8737	0.6032	0.5033
4.Inverted- Necesito visita formal	0.9168	0.7324	-0.1390
7iii.Inverted- Mi Dr cambiará mi tto inmediatamente	0.9044	0.7499	-0.3942

When this item is inverted, the scores decrease. Alpha value droped from 0.77 to 0.65, while omega remains equal. G6 and correlation for each item decrease in their absolute values except for the new inverted item 7.iii.

Two spheres or general concepts are proposed in order to analyze the granularity of the expectations domain and to find the best or more adequate items inside these spheres:
(1) Expectations of genomic results on treatment impact
(2) Expectations of results communications and the information provided.
The treatment impact sphere is focused on the impact that the genomic results could have on patients´ treatments and its efficacy. For instance, if the treatment will be modified according to the results, if their disease could be handle in a better way, if new treatment possibilities will be available, if this could improve their life expectancy. The information sphere gathers those items related to the idea of an adequate information management, i.e., if patients have previous knowledge about the genomic testing, prior good quality information, if they know its implications, and how the result can affect them in terms of concerns or stressing issues.

(1) Expectations of genomic results on treatment impact
The items proposed are:
item 5: Creo que el resultado del análisis genómico directamente ayudará al control de mi cáncer.
item 6: Creo que el resultado del análisis genómico ayudará a aumentar mi expectativa de años de vida.
item 7.iii: Mi doctor cambiará mi tratamiento inmediatamente de acuerdo a los resultados.
item 7.iv: Tendré opciones de tratamiento adicionales para mi cáncer.
item 7.v: Podré recibir tratamientos experimentales para mi cáncer.

## 
##    95% confidence boundaries (Feldt)
##  lower alpha upper
##   0.78  0.87  0.94

## 
## Information about this analysis:
## 
##                  Dataframe: pre_test_Q_Expectations_values[, c(5:6, 9:11)]
##                      Items: all
##               Observations: 26
##      Positive correlations: 10 out of 10 (100%)
## 
## Estimates assuming interval level:
## 
##              Omega (total): 0.94
##       Omega (hierarchical): 0.68
##    Revelle's omega (total): 0.94
## Greatest Lower Bound (GLB): NA
##              Coefficient H: 0.93
##          Coefficient alpha: 0.88
## 
## (Estimates assuming ordinal level not computed, as the polychoric correlation matrix has missing values.)
## 
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.

	raw_alpha	std.alpha	G6(smc)	average_r	S/N	alpha se	var.r	med.r
7v.Podré recibir tratamientos experimentales	0.8772	0.8740	0.8858	0.6342	6.936	0.0380	0.0298	0.6706
7iii.Mi Dr cambiará mi tto de acuerdo a los resultados	0.8538	0.8510	0.8616	0.5882	5.713	0.0456	0.0244	0.5291
5.El resultado ayudará al control de mi cáncer	0.8620	0.8609	0.8538	0.6074	6.187	0.0430	0.0230	0.5674
7iv.Tendré opciones de tratamiento adicionales	0.8140	0.8127	0.8238	0.5203	4.338	0.0583	0.0184	0.5020
6.El resultado ayudará a aumentar mi expectativa vida	0.8163	0.8153	0.8053	0.5246	4.414	0.0576	0.0162	0.5156

	n	raw.r	std.r	r.cor	r.drop	mean	sd
7iv.Tendré opciones de tratamiento adicionales	26	0.9078	0.8931	0.8889	0.8260	4.308	1.0870
6.El resultado ayudará a aumentar mi expectativa vida	27	0.8780	0.8867	0.8886	0.8170	4.037	1.1260
7iii.Mi Dr cambiará mi tto de acuerdo a los resultados	26	0.7941	0.7928	0.7353	0.6729	4.385	0.9414
5.El resultado ayudará al control de mi cáncer	27	0.8219	0.7645	0.7091	0.6355	4.222	0.9740
7v.Podré recibir tratamientos experimentales	26	0.7168	0.7248	0.6131	0.5650	4.231	0.9511

The capture of the treatment impact concept is adequate with a Cronbach’s α of 0.87 and the omega coefficient of 0.94. Regarding the G6 score, the lowest values are for items 6, and 7.iv; while analyzing r.cor are 7.v, and 5.

(2) Expectations of results communications and the information provided
For this concept, the questions grouped are:
Item 1: Considero que tengo suficiente conocimiento acerca de los potenciales beneficios y riesgos sobre los análisis genómicos para tomar una decisión informada y realizar la prueba.
Item 2: Considero que he recibido suficiente información para comprender los beneficios (aspectos buenos) y riesgos (aspectos malos) acerca de realizar el análisis genómico.
Item 3: Estoy interesado/a en aprender más sobre análisis genómicos en cáncer.
Item 4 (inverted): Siento que necesito una visita formal con un especialista en consejo genético antes de realizar el análisis genómico.
item 7.i: Mi doctor me explicará los cambios en los genes y la implicación para el futuro de mi salud.
item 7.ii: Recibiré un informe por escrito con el resultado del análisis genómico.

First, all these 6 items are analyzed without inverting item 4.

## 
##    95% confidence boundaries (Feldt)
##  lower alpha upper
##    0.3  0.59  0.79

## 
## Information about this analysis:
## 
##                  Dataframe: pre_test_Q_Expectations_values[, c(1:4, 7:8)]
##                      Items: all
##               Observations: 24
##      Positive correlations: 12 out of 15 (80%)
## 
## Estimates assuming interval level:
## 
##              Omega (total): 0.86
##       Omega (hierarchical): 0.42
##    Revelle's omega (total): 0.86
## Greatest Lower Bound (GLB): NA
##              Coefficient H: 1
##          Coefficient alpha: 0.56
## 
## (Estimates assuming ordinal level not computed, as the polychoric correlation matrix has missing values.)
## 
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.

	raw_alpha	std.alpha	G6(smc)	average_r	S/N	alpha se	var.r	med.r
4.Necesito visita formal con especialista consejo genético antes del test	0.7108	0.7167	0.8024	0.3359	2.5293	0.0878	0.0529	0.3168
3.Estoy interesado/a en aprender más	0.5332	0.6051	0.7533	0.2346	1.5322	0.1490	0.1020	0.1800
7i.Mi Dr me explicará resultados y la implicación para mi salud	0.5110	0.5401	0.6641	0.1902	1.1746	0.1519	0.0714	0.1437
7ii.Recibiré informe escrito con el resultado	0.5365	0.5714	0.6512	0.2105	1.3333	0.1454	0.0743	0.2072
1.Tengo suficiente conocimiento de beneficios y riesgos para tomar decisión informada	0.5491	0.6025	0.6365	0.2326	1.5159	0.1413	0.0474	0.2072
2.He recibido suficiente información para comprender beneficios y riesgos del análisis genómico	0.4012	0.4732	0.5537	0.1523	0.8981	0.1873	0.0531	0.1437

	n	raw.r	std.r	r.cor	r.drop	mean	sd
2.He recibido suficiente información para comprender beneficios y riesgos del análisis genómico	27	0.7497	0.8021	0.8383	0.6361	4.148	0.9885
7i.Mi Dr me explicará resultados y la implicación para mi salud	26	0.6802	0.6960	0.6438	0.4396	4.615	0.8038
7ii.Recibiré informe escrito con el resultado	25	0.6160	0.6392	0.6014	0.3913	4.680	0.6904
1.Tengo suficiente conocimiento de beneficios y riesgos para tomar decisión informada	27	0.5729	0.5773	0.5700	0.3266	4.222	0.9740
3.Estoy interesado/a en aprender más	26	0.6459	0.5720	0.4004	0.3624	4.154	1.0077
4.Necesito visita formal con especialista consejo genético antes del test	25	0.4239	0.2884	0.0582	0.0081	3.320	1.2152

The analyses of these 6 items grouped by the information sphere shows an alpha value of 0.59 and G6 scores with some variability. There are some items with values above 0.7 (item 3 and 4) and with high correlation (item 2, followed by 7.i, and 7.ii). Those with the worst combination are item 4, 3, and then item 1.
Then, the same items are evaluated but with item 4 inverted. Results are shown below.

## 
##    95% confidence boundaries (Feldt)
##  lower alpha upper
##   0.29  0.59  0.79

## 
## Information about this analysis:
## 
##                  Dataframe: pre_test_Q_Expectations_values[, c(2:4, 7:8, 12)]
##                      Items: all
##               Observations: 24
##      Positive correlations: 13 out of 15 (87%)
## 
## Estimates assuming interval level:
## 
##              Omega (total): 0.86
##       Omega (hierarchical): 0.42
##    Revelle's omega (total): 0.86
## Greatest Lower Bound (GLB): NA
##              Coefficient H: 1
##          Coefficient alpha: 0.67
## 
## (Estimates assuming ordinal level not computed, as the polychoric correlation matrix has missing values.)
## 
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.

	raw_alpha	std.alpha	G6(smc)	average_r	S/N	alpha se	var.r	med.r
4.Inverted- Necesito visita formal	0.7108	0.7167	0.8024	0.3359	2.5293	0.0878	0.0529	0.3168
3.Estoy interesado/a en aprender más	0.6176	0.6562	0.7729	0.2763	1.9086	0.1197	0.0783	0.2552
7ii.Recibiré informe escrito con el resultado	0.5454	0.5824	0.6568	0.2181	1.3949	0.1402	0.0707	0.1973
7i.Mi Dr me explicará resultados y la implicación para mi salud	0.4865	0.5174	0.6533	0.1766	1.0723	0.1556	0.0769	0.1437
1.Tengo suficiente conocimiento de beneficios y riesgos para tomar decisión informada	0.4334	0.5274	0.5943	0.1825	1.1162	0.1780	0.0705	0.1449
2.He recibido suficiente información para comprender beneficios y riesgos del análisis genómico	0.3725	0.4526	0.5419	0.1419	0.8268	0.1961	0.0565	0.1437

	n	raw.r	std.r	r.cor	r.drop	mean	sd
2.He recibido suficiente información para comprender beneficios y riesgos del análisis genómico	27	0.8045	0.8178	0.8572	0.6666	4.148	0.9885
1.Tengo suficiente conocimiento de beneficios y riesgos para tomar decisión informada	27	0.7352	0.7037	0.7158	0.5539	4.222	0.9740
7i.Mi Dr me explicará resultados y la implicación para mi salud	26	0.6571	0.7202	0.6720	0.4786	4.615	0.8038
7ii.Recibiré informe escrito con el resultado	25	0.5242	0.6035	0.5606	0.3328	4.680	0.6904
3.Estoy interesado/a en aprender más	26	0.5132	0.4401	0.2481	0.1457	4.154	1.0077
4.Inverted- Necesito visita formal	25	0.2853	0.2724	0.0385	-0.0081	2.680	1.2152

When item 4 is considered inverted, the alpha value is the same (0.59), but while the G6 score for item 4 increases the correlation was still lower. The rest of the parameters are similar.

Another approach is considered, excluding Q1 due to the fact that Q1 and Q2 are overlapped and only one of these two items will be preserved. Results are shown below, first with Q4 as it is, and secondly as inverted item.

## 
##    95% confidence boundaries (Feldt)
##  lower alpha upper
##   0.21  0.55  0.77

## 
## Information about this analysis:
## 
##                  Dataframe: pre_test_Q_Expectations_values[, c(1, 3:4, 7:8)]
##                      Items: all
##               Observations: 24
##      Positive correlations: 8 out of 10 (80%)
## 
## Estimates assuming interval level:
## 
##              Omega (total): 0.79
##       Omega (hierarchical): 0.43
##    Revelle's omega (total): 0.79
## Greatest Lower Bound (GLB): NA
##              Coefficient H: 0.81
##          Coefficient alpha: 0.53
## 
## (Estimates assuming ordinal level not computed, as the polychoric correlation matrix has missing values.)
## 
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.

	raw_alpha	std.alpha	G6(smc)	average_r	S/N	alpha se	var.r	med.r
4.Necesito visita formal con especialista consejo genético antes del test	0.6560	0.6790	0.6691	0.3459	2.1157	0.1093	0.0372	0.3631
3.Estoy interesado/a en aprender más	0.4600	0.5613	0.5849	0.2424	1.2795	0.1797	0.0793	0.2262
2.He recibido suficiente información para comprender beneficios y riesgos del análisis genómico	0.4443	0.5057	0.5462	0.2037	1.0232	0.1780	0.0590	0.1449
7i.Mi Dr me explicará resultados y la implicación para mi salud	0.4513	0.4811	0.4669	0.1882	0.9272	0.1726	0.0256	0.1773
7ii.Recibiré informe escrito con el resultado	0.4454	0.4726	0.4604	0.1830	0.8962	0.1765	0.0331	0.2072

	n	raw.r	std.r	r.cor	r.drop	mean	sd
7ii.Recibiré informe escrito con el resultado	25	0.6880	0.7172	0.6840	0.4465	4.680	0.6904
7i.Mi Dr me explicará resultados y la implicación para mi salud	26	0.6803	0.7072	0.6736	0.4019	4.615	0.8038
2.He recibido suficiente información para comprender beneficios y riesgos del análisis genómico	27	0.6155	0.6773	0.5550	0.3889	4.148	0.9885
3.Estoy interesado/a en aprender más	26	0.6862	0.6026	0.4279	0.3638	4.154	1.0077
4.Necesito visita formal con especialista consejo genético antes del test	25	0.5425	0.4026	0.1382	0.0964	3.320	1.2152

## 
##    95% confidence boundaries (Feldt)
##  lower alpha upper
##   0.01  0.43  0.71

## 
## Information about this analysis:
## 
##                  Dataframe: pre_test_Q_Expectations_values[, c(3:4, 7:8, 12)]
##                      Items: all
##               Observations: 24
##      Positive correlations: 8 out of 10 (80%)
## 
## Estimates assuming interval level:
## 
##              Omega (total): 0.79
##       Omega (hierarchical): 0.43
##    Revelle's omega (total): 0.79
## Greatest Lower Bound (GLB): NA
##              Coefficient H: 0.81
##          Coefficient alpha: 0.55
## 
## (Estimates assuming ordinal level not computed, as the polychoric correlation matrix has missing values.)
## 
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.

	raw_alpha	std.alpha	G6(smc)	average_r	S/N	alpha se	var.r	med.r
4.Inverted- Necesito visita formal	0.6560	0.6790	0.6691	0.3459	2.1157	0.1093	0.0372	0.3631
3.Estoy interesado/a en aprender más	0.4621	0.5563	0.5822	0.2386	1.2536	0.1778	0.0815	0.2102
2.He recibido suficiente información para comprender beneficios y riesgos del análisis genómico	0.1751	0.3379	0.4543	0.1132	0.5104	0.2726	0.0934	0.0760
7ii.Recibiré informe escrito con el resultado	0.2884	0.3583	0.3888	0.1225	0.5584	0.2303	0.0553	0.1060
7i.Mi Dr me explicará resultados y la implicación para mi salud	0.2042	0.2890	0.3467	0.0922	0.4064	0.2551	0.0579	0.0679

	n	raw.r	std.r	r.cor	r.drop	mean	sd
7i.Mi Dr me explicará resultados y la implicación para mi salud	26	0.7005	0.7723	0.7691	0.4965	4.615	0.8038
7ii.Recibiré informe escrito con el resultado	25	0.6234	0.7106	0.6868	0.4111	4.680	0.6904
2.He recibido suficiente información para comprender beneficios y riesgos del análisis genómico	27	0.7082	0.7296	0.6251	0.4687	4.148	0.9885
3.Estoy interesado/a en aprender más	26	0.5718	0.4737	0.2564	0.1145	4.154	1.0077
4.Inverted- Necesito visita formal	25	0.2790	0.2548	-0.0701	-0.0964	2.680	1.2152

The global reliability did not change significantly, besides, Q4 remain being a conflicting item.

Moreover, a last analysis is run excluding item 4; this is, including Q2, Q3, Q7i and Q7ii.

## 
##    95% confidence boundaries (Feldt)
##  lower alpha upper
##   0.38  0.66  0.83

## 
## Information about this analysis:
## 
##                  Dataframe: pre_test_Q_Expectations_values[, c(3:4, 7:8)]
##                      Items: all
##               Observations: 25
##      Positive correlations: 6 out of 6 (100%)
## 
## Estimates assuming interval level:
## 
##              Omega (total): 0.76
##       Omega (hierarchical): 0.64
##    Revelle's omega (total): 0.76
## Greatest Lower Bound (GLB): NA
##              Coefficient H: 0.8
##          Coefficient alpha: 0.65
## 
## (Estimates assuming ordinal level not computed, as the polychoric correlation matrix has missing values.)
## 
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.

At this point and considering the first analysis (Cronbach_Exp_2_InfoConcept), while alpha increases (from 0.59 to 0.66), the Omega decreases (0.86 to 0.76).

Partial conclusion:
G6, correlation, alpha when an item was dropped- The best combination of both scores are for the following items: 5, 7.v, 6, 7.iv, and 7.i; and the worst are for items 4, 1, 3, and 7.ii.
From the exploratory analysis splitting this domain into two spheres, the treatment impact sphere seems to be adequate (alpha 0.87) without a clear patter of best or worst items. On the other hand, regarding the information sphere does not seem to work correctly (alpha 0.59). Besides, item 4 continues having problems and conflicting results.

5.3.2.2 Concerns

Regarding concerns, the last 4 questions are focus on this topic. Considering the fact that the focus now is on the concerns these questions are analyzed without reverting the scale (i.e., in the same direction that the patient completed it).

First, analyzing the reliability of the whole subset of items, i.e., the consistency of the concerns section several measurements are display: the Cronbach’s α, with its IC, and the omega (ω) coefficient.

## 
##    95% confidence boundaries (Feldt)
##  lower alpha upper
##   0.74  0.85  0.93

## 
## Information about this analysis:
## 
##                  Dataframe: pre_test_Q_Concerns_values
##                      Items: all
##               Observations: 25
##      Positive correlations: 6 out of 6 (100%)
## 
## Estimates assuming interval level:
## 
##              Omega (total): 0.88
##       Omega (hierarchical): 0.81
##    Revelle's omega (total): 0.88
## Greatest Lower Bound (GLB): NA
##              Coefficient H: 0.89
##          Coefficient alpha: 0.86
## 
## (Estimates assuming ordinal level not computed, as the polychoric correlation matrix has missing values.)
## 
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.

Then, in order to inspect the role of each item, the r.cor and the Guttman’s Lambda 6 (G6) are explored.

Figure 5.2: Dashed line indicates the alpha (Feldt alpha) value for the whole set of items in order to compare this value if the alpha value that results from dropping the corresponding item (yellow bar).

##                                                                                                Items
## 1                                9.Me preocupa que los resultados pueden ser difíciles de comprender
## 2                                            11.Los resultados pueden preocuparme o generar ansiedad
## 3                                    8.Me preocupa que los resultados puedan no guiar mi tratamiento
## 4 10.Me preocupa que los resultados pueden dar información del riesgo de enf que preferiría no saber
##       G6 raw_alpha_itemDropped  r.cor
## 1 0.7932                0.8470 0.6729
## 2 0.7738                0.8301 0.7268
## 3 0.7609                0.8037 0.7752
## 4 0.7023                0.7695 0.8653

Analyzing these items, concerns seem to be well captured with a Cronbach’s α of 0.85 and the omega coefficient of 0.88. Then, the evaluation of each item shows opposite score results for G6, r.cor, and alpha when the item is dropped. If both measurements are considered with the same importance, one strategy could be to choose those two items in the middle; i.e., with the best combination of both scores (item 11 and 8). However, there are still doubts about the scale direction and how to differentiate the real capture from an artefactual effect produced by the scale direction (most of the patients chose the “completely agree” option).

5.3.3 Global conclusions about expectations domain

Considering the aim of the analysis of this pilot study, which is to identify the final set of items that will be chosen for the validation study, there are some items that could be identify as less relevant and other as more relevant when all these prior steps are taken together.
Therefore, for expectations, it is possible to summarize the findings as follows:
Two spheres or topic can be individualized:
1.- Expectations of genomic results on treatment impact.
This topic involves: item 5, 6, 7.iv, 7.v., and 7.iii.
2.- Expectations of results communications and the information provided, however, metrics for this set of items showed lower alpha and omega values.
This topic involves: item 1, 2, 3, 4, 7.i, and 7.ii.

summary variability and discrimination-
The best five elements are: 2, 6, 7.v, 5, and 7.iv. The worst two: 7.i and 7.iii. Then, if item 2 is selected item 1 could be excluded.
Conflicting results: with high variability and low discrimination item 4, 3, and with low variability and high discrimination 7.ii.

Summary G6, alpha, and correlation-
The best items: 5, 7.v, 6, 7.iv, and 7.i.
The worst items 4, 1, 3, and 7.ii.

Best potential items:
6, 7.v, 5, and 7.iv.

Potential items to be excluded according to these two spheres:
- Expectations of genomic results on treatment impact: 7.iii, followed by 7.iv and 7.v. These two last items have adequate values, but they are the less critical ones in this sphere. Item 7.iv has a lower discrimination score than 7.v.
- Expectations of results communications and the information provided: 1 (if item 2 is selected), 4 due to its conflicting results along the analysis. Then item 3 and 7.ii showed conflicting results too. However, item 3 could be relevant if an interventional approach is considered.

With regard to the concerns domain, the there is not a clear difference between them. At least, item 11 seem to be relevant. The second item to be chosen can be identify when the rest of the analysis were completed.

5.4 Attitudes domain

This domain includes 11 items. Out of these, 8 are questions with a Likert scale and 3 are items to select one or more options. These last 3 items are oriented to the patient´s motivation for performing the genomic testing. The target number of attitudes items to be included in the final version of this questionnaire is set in 5, seeking to reach a final number of items of 20 (knowledge 5, expectations 10 (expectations 8 + concerns 2), attitudes 5).

For attitudes items, number 5 should be interpreted in an opposite way:
- item 5. “El análisis genómico parece ser una prueba imprecisa.”. Independently that this questionnaire is focused on patients point of view, clearly, genomic testing are precise assays. Later, for the analysis this question will be inverted.

5.4.1 Visualizing items and responses

Analyzing the variability between answers, items 8, 7, and 1 show the lowest variability. For items 8 and 7 there are more than 90% of agreement. While lower than the previous ones, the item 1 shows a high percentage of agreement too (85% totally agree). However, this item could be relevant to understand the relationship of patients with invasive procedure involved in this kind of tests. Besides, this item is the complement of the other two (item 2, and 3).

Secondly, the difficulty and discrimination capacity is evaluated for each of these items.

## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.

The lowest scores for discrimination are shown by items 5, 6, and 4, followed by items 8 and 7; being 7, 8 and 6 those with the highest difficulty.

Partial conclusion:
Variability- Highest variability: 5, 3, and 2.
Lowest variability: items 8, 7, and 1

Discrimination- Highest discrimination: items 3, 1, and 2.
Lowest discrimination: items 5, 6, and 4, followed by items 8 and 7.

Additionally, there are three last questions focused on motivations with multiple choice options. Two asking about the mains motivations, and one to determine the less relevant motivation. These results are depict graphically below.

According to these results, the key motivation to pursue genomic testing were to improve the journey or evolution of their disease followed by to contribute to cancer research and to help other patients.
Then, the less relevant topic was:

In this setting the less significant motivation was their family recommended it.

5.4.2 Evaluating the reliability of questions (Cronbach’s α and Omega)

The Cronbach’s α and the Guttman’s lambda_6 (G6) are calculated for the attitude domain.

## 
##    95% confidence boundaries (Feldt)
##  lower alpha upper
##   0.63  0.78  0.89

Omega (total)
0.926

Omega (hierarchical)
0.5365

Then, in order to inspect the role of each item, the Guttman’s Lambda 6 (G6),the alpha when the item is dropped, and the r.cor are explored.

Figure 5.3: Dashed line indicates the alpha (Feldt alpha) value for the whole set of items in order to compare this value if the alpha value that results from dropping the corresponding item (yellow bar).

Items	G6	raw_alpha_itemDropped	r.cor
5.Inverted- El análisis genómico parece ser una prueba imprecisa.	0.9197	0.8305	0.1821
6.Mi motivación- ayudar investigadores a comprender mejor cómo tratar mi tipo de cáncer.	0.9056	0.7907	0.4245
1.Dispuesto a que me realicen un procedimiento POCO INVASIVO.	0.9003	0.7440	0.6078
4.El análisis genómico parece algo beneficioso para mi.	0.8928	0.7300	0.7223
8.Mi motivación- siento que ayudo a otros pacientes con cáncer.	0.8633	0.7619	0.7297
3.Dispuesto a que me realicen un procedimiento médico MAYOR.	0.8554	0.6924	0.8272
2.Dispuesto a que me realicen un procedimiento médico MENOR.	0.8549	0.7068	0.7441
7.Mi motivación- contribuir con la investigación en cáncer.	0.8530	0.7566	0.7842

Figure 5.4: Dashed line indicates the alpha (Feldt alpha) value for the whole set of items in order to compare this value if the alpha value that results from dropping the corresponding item (yellow bar).

Items	G6	raw_alpha_itemDropped	r.cor
5.El análisis genómico parece ser una prueba imprecisa.	0.9197	0.8305	-0.0428
1.Dispuesto a que me realicen un procedimiento POCO INVASIVO.	0.8933	0.6603	0.5499
6.Mi motivación- ayudar investigadores a comprender mejor cómo tratar mi tipo de cáncer.	0.8892	0.6616	0.5763
4.El análisis genómico parece algo beneficioso para mi.	0.8838	0.6337	0.6902
2.Dispuesto a que me realicen un procedimiento médico MENOR.	0.8518	0.6645	0.5929
3.Dispuesto a que me realicen un procedimiento médico MAYOR.	0.8456	0.5998	0.7673
8.Mi motivación- siento que ayudo a otros pacientes con cáncer.	0.8444	0.6561	0.8009
7.Mi motivación- contribuir con la investigación en cáncer.	0.8320	0.6480	0.8591

The capture of the attitude domain is adequate with a Cronbach’s α of 0.78 and the omega coefficient of 0.926. In this setting G6 and correlation are totally in opposite directions. Similarly, the other parameters seem to identify different items. The highest values or the best combination of these three metrics are Q3, Q4, Q2, and Q7, followed by followed by Q8 and Q1.
Item 5, was analyzed again without inverting it. In this scenario, results are similar but the correlation for item 5 is still even worst. In both analysis, considering the item as it is or inverting it, the correlation is extremely low. Therefore, this items could be consider as the less related with the attitudes domain, or at least one that can be excluded.

Partial conclusion: G6, alpha, and correlation- In this domain, results are different depending where the focus is posed. Considering the best combination the chosen items are: Q3, Q4, Q2, Q7, and followed by Q8 and Q1. Being the less relevant item Q5 followed by item Q6.

5.4.3 Global conclusions about attitude domain

Considering the aim of the analysis of this pilot study and the task to identify the final set of items that will be chosen for the validation study, there is considerable variability in this domain and its items, leading to a complex scene regarding which strategy and items should be selected.
The highest variability and discrimination were seen for items Q3, Q2, Q1 and Q5. While the lowest were found for Q8, and Q7, followed by Q6 and Q4. Item 5 showed the highest variability and the worst discrimination capacity. Considering the G6, alpha and the correlation score, while the best ones are Q3, Q4, and Q2, the less relevant item Q5 followed by item Q6.

One possible approach is to exclude all the motivation items, Q6, Q7, and Q8; but including Q9 in which the patient select its main motivation. Then, to have the final set of 5 items, an additional one from Q1-Q5 should be excluded. In this line, item 5 showed the lower discrimination score and the worst alpha and correlation.
Therefore, the list includes: Q1, Q2, Q3, and Q4, plus Q9.