Open Access Links to Select Publications
The Response Process Evaluation Method
Abstract
Pretesting survey items for interpretability and relevance is a commonly recommended practice in the social sciences. The goal is to construct items that are understood as intended by the population of interest and to test whether participants use the expected cognitive processes when responding to a survey item. Such evidence forms the basis for a critical source of validity evidence known as the response process, which is often neglected in favor of quantitative methods. This may be because existing methods of investigating item comprehension, such as cognitive interviewing and web probing, lack clear guidelines for retesting revised items and documenting improvements, and can be difficult to implement in large samples. To remedy this, we introduce the Response Process Evaluation (RPE) method, a standardized framework for pretesting multiple versions of survey items and generating individual item validation reports. This iterative, evidence-based approach to item development relies on feedback from the population of interest to quantify and qualify improvements in item interpretability across a large sample. The result is a set of item validation reports that detail the intended interpretation and use of each item, the population it was validated on, the percentage of participants who interpreted the item as intended, examples of participant interpretations, and any common misinterpretations to be cautious of. Researchers may find that they have more confidence in the inferences drawn from survey data after engaging in rigorous item pretesting.
Abstract
In the social sciences, validity refers to the adequacy of a survey (or other mode of assessment) for its intended purpose. Validation refers to the activities undertaken during and after the construction of the survey to evaluate and improve validity. Item validation refers here to procedures for evaluating and improving respondents’ understanding of the questions and response options included in a survey. Verbal probing techniques such as cognitive interviews can be used to understand respondents’ response process, that is, what they are thinking as they answer the survey items. Although cognitive interviews can provide evidence for the validity of survey items, they are time-consuming and thus rarely used in practice. The Response Process Evaluation (RPE) method is a newly developed technique that uses open-ended meta-surveys to rapidly collect evidence of validity across a population of interest, make quick revisions to items, and immediately test these revisions on new samples of respondents. Like cognitive interviews, the RPE method focuses on how participants interpret an item and select a response. The chapter demonstrates the process of validating one survey item taken from the Inventory of Non-Ordinary Experiences (INOE).
Abstract
Model fit assessment is a central component of evaluating confirmatory factor analysis models. Fit indices like RMSEA, SRMR, and CFI remain popular, and researchers often judge fit based on suggestions from Hu and Bentler (1999), who derived cutoffs that distinguish between fit index distributions of true and misspecified models. However, methodological studies note that the location and variability of fit index distributions – and, consequently, cutoffs distinguishing between true and misspecified fit index distributions – are not fixed but vary as a complex interaction of model characteristics like sample size, factor reliability, number of items, and number of factors. Many studies over the last 15 years have cautioned against fixed cutoffs and the faulty conclusions they can trigger. However, practical alternatives are absent, so fixed cutoffs have remained the status quo despite their shortcomings. Criticism of fixed cutoffs stems primarily from the fact that they were derived from one specific confirmatory factor analysis model and lack generalizability. To address this, we propose dynamic cutoffs such that the derivation of cutoffs is adaptively tailored to the specific model and data being evaluated. This creates customized cutoffs that are designed to distinguish between true and misspecified fit index distributions in the researcher’s particular context. Importantly, we show that the method does not require knowledge of the “true” model to accomplish this. As with fixed cutoffs, the procedure requires Monte Carlo simulation, so we provide an open-source, web-based Shiny application that automates the entire process to make the method as accessible as possible.
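As a rough sketch of the quantile-based logic behind such dynamic cutoffs (not the authors’ Shiny implementation, which fits actual CFA models to simulated data), the Python snippet below assumes a hypothetical analysis with 20 model degrees of freedom and N = 400, approximates the test-statistic distributions of a true and a misspecified model, converts them to RMSEA values, and takes a quantile of the true-model distribution as a candidate cutoff. The degrees of freedom, sample size, and noncentrality value are illustrative assumptions, not values from the paper.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical setting (assumed for illustration): model degrees of freedom,
# sample size, number of replications, and a noncentrality parameter that
# stands in for the degree of misspecification.
df, n, reps, ncp = 20, 400, 5000, 60

# Stand-in for refitting the model to simulated datasets: test statistics are
# (approximately) central chi-square under the true model and noncentral
# chi-square under the misspecified model.
chi2_true = stats.chi2.rvs(df, size=reps, random_state=rng)
chi2_miss = stats.ncx2.rvs(df, ncp, size=reps, random_state=rng)

def rmsea(chi2):
    # One common RMSEA formula: sqrt(max(chi2 - df, 0) / (df * (N - 1))).
    return np.sqrt(np.maximum(chi2 - df, 0.0) / (df * (n - 1)))

# "Dynamic" cutoff: a quantile of the fit index distribution under the true
# model, followed by a check of how often it flags misspecified replications.
cutoff = np.quantile(rmsea(chi2_true), 0.95)
flagged = np.mean(rmsea(chi2_miss) > cutoff)
print(f"RMSEA cutoff for this setting: {cutoff:.3f}")
print(f"Share of misspecified replications flagged: {flagged:.1%}")

In the dynamic cutoff approach described in the abstract, the two distributions come from data simulated from the researcher’s fitted model and from a perturbed version of it, so the resulting cutoff adapts to the particular model and data at hand rather than being fixed across studies.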
Abstract
Assessing unidimensionality of a scale is a frequent interest in behavioral research. Often, this is done with approximate model fit indices in a factor analysis framework, such as RMSEA, CFI, or SRMR. These fit indices are continuous measures, so values indicating acceptable fit are up to interpretation. Cutoffs suggested by Hu and Bentler (1999) are a common guideline used in empirical research. However, these cutoffs were derived with intent to detect omitted cross-loadings or omitted factor covariances in three-factor models. These types of misspecifications cannot exist in one-factor models, so the appropriateness of using these guidelines in one-factor models is uncertain. This paper uses a simulation study to address whether traditional fit index cutoffs are sensitive to the types of misspecifications that can occur in one-factor models. The results showed that traditional cutoffs have very poor sensitivity to misspecification in one-factor models and that the traditional cutoffs generalize poorly to one-factor contexts. As an alternative, we investigate the accuracy and stability of the recently introduced dynamic fit cutoff approach for creating fit index cutoffs for one-factor models. Simulation results indicated excellent performance of dynamic fit index cutoffs to classify correct or misspecified one-factor models and that dynamic fit index cutoffs are a promising approach for more accurate assessment of unidimensionality.
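For reference, one common parameterization of the fit indices named above is sketched below (software packages differ slightly, e.g., in using $N$ versus $N-1$); here $\chi^2_M$ and $df_M$ belong to the hypothesized model, $\chi^2_B$ and $df_B$ to the baseline (independence) model, $s_{ij}$ and $\hat{\sigma}_{ij}$ are observed and model-implied covariances, and $p$ is the number of items:

\[
\mathrm{RMSEA} = \sqrt{\frac{\max(\chi^2_M - df_M,\,0)}{df_M\,(N-1)}}, \qquad
\mathrm{CFI} = 1 - \frac{\max(\chi^2_M - df_M,\,0)}{\max(\chi^2_B - df_B,\ \chi^2_M - df_M,\ 0)},
\]
\[
\mathrm{SRMR} = \sqrt{\frac{2}{p(p+1)} \sum_{i \le j} \left(\frac{s_{ij} - \hat{\sigma}_{ij}}{\sqrt{s_{ii}\,s_{jj}}}\right)^{2}}.
\]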
Abstract
A common way to form scores from multiple-item scales is to sum the responses to all items. Though sum scoring is often contrasted with factor analysis as a competing method, we review how factor analysis and sum scoring both fall under the larger umbrella of latent variable models, with sum scoring being a constrained version of factor analysis. Despite these similarities, the reporting of psychometric properties for sum scored and factor analyzed scales is quite different. Further, if researchers use factor analysis to validate a scale but subsequently sum score the scale, this employs a model that differs from the validation model. By framing sum scoring within a latent variable framework, our goal is to raise awareness that (a) sum scoring requires rather strict constraints, (b) imposing these constraints requires the same type of justification as any other latent variable model, and (c) sum scoring corresponds to a statistical model and is not a model-free arithmetic calculation. We discuss how unjustified sum scoring can have adverse effects on validity, reliability, and qualitative classification from sum score cut-offs. We also discuss considerations for how to use scale scores in subsequent analyses and how these choices can alter conclusions. The general goal is to encourage researchers to more critically evaluate how they obtain, justify, and use multiple-item scale scores.
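As a sketch of the “constrained latent variable model” point (a standard formalization, summarized here rather than quoted from the paper), a one-factor model for person $i$ and item $j$ can be written as

\[
x_{ij} = \tau_j + \lambda_j \eta_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \theta_j),
\]

and the sum score $S_i = \sum_{j=1}^{p} x_{ij}$ weights every item equally. Treating $S_i$ as the score for $\eta_i$ therefore implicitly constrains all loadings to be equal (e.g., $\lambda_1 = \dots = \lambda_p = 1$), and stricter, parallel-items versions additionally constrain the error variances to be equal ($\theta_1 = \dots = \theta_p$), whereas a factor analysis estimates these parameters freely.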
Abstract
When validating psychological surveys, researchers tend to concentrate on analyzing item responses instead of the processes that generate them. Thus, the threat that invalid responses pose to validity is neglected. Such invalid responses occur when participants unintentionally or intentionally select response options that are inaccurate. In this paper, we explore the effect of survey use on survey responses under the hypothesis that participants may intentionally give invalid responses if they disagree with the uses of the survey results. Results show that nearly all participants reflect on the intended uses of an assessment when responding to items, and that most decline to respond or modify their responses if they are not comfortable with the way the results will be used. We introduce methods to prevent and detect invalid responses, thus providing researchers with more confidence in the validity of their inferences.