What are four common threats to internal validity when conducting experimental research designs?

Internal Validity

Table of Contents

Internal Validity
3 Threats to Internal Validity
A Framework for Detecting and Diagnosing Configuration Faults in Web Applications
7.6 Threats to Experimental Validity
Securing microservices and microservice architectures: A systematic mapping study
8.2 Internal validity
System security requirements: A framework for early identification, specification and measurement of related software requirements
6.7 Threats to validity
Service oriented architecture maturity models: A systematic literature review
6 Threats to validity
Identifying, categorizing and mitigating threats to validity in software engineering secondary studies
2.1 Threats to validity in empirical software engineering
Success factors influencing requirements change management process in global software development
8 Threats to validity
Software security patch management - A systematic literature review of challenges, approaches, tools and practices
9 Threats to validity
What are the 4 threats to internal validity?
What are the threats to validity experimental design?
What are the 4 threats to external validity?
What are three major threats to the internal validity of experiments and possible solutions?

M.M. Mark, C.S. Reichardt, in International Encyclopedia of the Social & Behavioral Sciences, 2001

3 Threats to Internal Validity

The literature on internal validity consists largely of detailed lists of validity ‘threats.’ Internal validity threats are generic categories of causal forces that may frequently obscure causal inferences. Take as an example, once again, a researcher's efforts to determine whether an anger management program reduces aggressive behavior in a middle school. ‘History’ refers to the possibility that specific events, other than the intended treatment, may have occurred between the pretest and post-test observations and may obscure the true treatment effect. If the researcher observed the level of aggressive behavior on the playground before the anger management program, and again afterward, history would be a problem if a different, stricter teacher became playground monitor in the interim. ‘Maturation’ refers to the possibility that natural processes which occur over time within the study participants, such as growing older, hungrier, more fatigued, wiser, and the like, may create a false treatment effect or mask a real one. Less aggression may occur at the post-test simply because the children are older than at the pretest, for instance. ‘Attrition’ refers to the possible loss of participants in a study. For example, if children from troubled families are more likely to drop out of school or to move away in the middle of the school year, then attrition could cause a decrease in aggression from the pretest to the post-test. ‘Instrumentation’ arises as a validity threat when a change in a measuring instrument causes erroneous conclusions about the effects of an intervention. For instance, if observers' standards shifted over time, such that later incidents had to be more violent to be rated as aggressive, this could cause the appearance of a treatment effect when in fact there is none.
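The attrition example above can be made concrete with a small sketch. All names and scores here are hypothetical and do not come from the chapter; the sketch assumes that no child's behavior changes at all, so any apparent improvement is produced purely by who drops out.

```python
from statistics import mean

# Hypothetical pretest aggression scores (incidents per week); not from the source.
pretest = {"A": 2, "B": 3, "C": 4, "D": 9, "E": 10}
# Assumption: nobody's behavior changes, i.e. the true treatment effect is zero.
posttest = dict(pretest)
# Assumption: the two highest-scoring children leave before the post-test.
dropouts = {"D", "E"}


def naive_change(pre, post, dropped):
    """Post-test mean of the remaining children minus pretest mean of all children."""
    remaining = [post[child] for child in post if child not in dropped]
    return mean(remaining) - mean(list(pre.values()))


def completers_change(pre, post, dropped):
    """Mean individual change, using only children observed at both times."""
    stayed = [child for child in pre if child not in dropped]
    return mean(post[child] - pre[child] for child in stayed)


print(round(naive_change(pretest, posttest, dropouts), 2))
print(completers_change(pretest, posttest, dropouts))
```

The naive comparison reports a drop in aggression even though no individual score changed, while restricting the comparison to children measured twice reports no change. Restricting to completers is only one possible check and does not remove every attrition problem.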

‘Selection’ refers to the possibility that post-test differences between a treatment group and a control group may be due to initial differences between the groups rather than to a treatment effect. Selection problems might occur if a researcher attempted to assess the effectiveness of an anger management program by comparing the level of playground aggression in two middle schools, one of which had implemented the program. In addition, more complex internal validity problems can occur, whereby some threat operates only (or more powerfully) in one group than another. For instance, ‘selection by maturation’ indicates that participants in the treatment condition are maturing at a different rate than those in the control condition. See Cook and Campbell (1979) for additional discussion of internal validity threats, including the threats of testing and regression to the mean.
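The two-school comparison can be sketched in the same way. The scores below are hypothetical, and comparing gains rather than post-test levels is an illustrative choice made for this sketch, not a method prescribed by the chapter.

```python
from statistics import mean

# Hypothetical weekly aggression counts; school_a ran the program, school_b did not.
# Assumption: the program has no effect, and the schools simply differ at baseline.
school_a = {"pre": [4, 5, 6], "post": [4, 5, 6]}
school_b = {"pre": [7, 8, 9], "post": [7, 8, 9]}


def posttest_gap(treated, control):
    """Difference in post-test means only; initial differences are ignored."""
    return mean(treated["post"]) - mean(control["post"])


def gain_gap(treated, control):
    """Difference between the two groups' pretest-to-post-test changes."""
    treated_gain = mean(treated["post"]) - mean(treated["pre"])
    control_gain = mean(control["post"]) - mean(control["pre"])
    return treated_gain - control_gain


print(posttest_gap(school_a, school_b))
print(gain_gap(school_a, school_b))
```

The post-test gap suggests that the program reduced aggression, but the gap was already present at the pretest. Comparing gains removes a constant initial difference; it does not help with selection by maturation, where the groups change at different rates.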

URL: https://www.sciencedirect.com/science/article/pii/B0080430767007294

A Framework for Detecting and Diagnosing Configuration Faults in Web Applications

Cyntrica Eaton, in Advances in Computers, 2012

7.6 Threats to Experimental Validity

7.6.1 Internal Validity

The internal validity of experiments is threatened when results of the dependent variable can be tainted by modeling and measurement errors. In each of the questions we address, accuracy is the primary dependent variable. Hence threats to internal validity, in this context, include possible errors in measuring/designating the training set and modeling/executing both the tag abstraction scheme and tag classification strategy.

Another threat lies in the correctness of the gold standard. The source used as the basis for the gold standard, in some instances, relies on the documentation provided by the browser manufacturer. Since this documentation can be erroneous at times, it can have an undesirable impact on accuracy evaluations.

One final internal validity threat lies in what amounts to varied weighting of false positives and false negatives in our accuracy model. In this case, we have considered the false positives to be more important than false negatives and chose a weighting system that reflects this idea. Perhaps some weight should be given to the false positives as well in order to derive more reflective accuracy values.
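To see how much the choice of error weights can move an accuracy figure, consider the following sketch. The formula, the function name `weighted_accuracy`, and the counts are all assumptions made for illustration; the chapter does not give its actual accuracy model.

```python
def weighted_accuracy(tp, tn, fp, fn, fp_weight=1.0, fn_weight=1.0):
    """Accuracy in which each kind of error is scaled by its own weight.

    Hypothetical formula: correct / (correct + weighted errors).
    With both weights equal to 1 it reduces to ordinary accuracy.
    """
    correct = tp + tn
    weighted_errors = fp_weight * fp + fn_weight * fn
    total = correct + weighted_errors
    if total == 0:
        raise ValueError("at least one weighted count must be non-zero")
    return correct / total


# Hypothetical confusion-matrix counts; not taken from the chapter.
counts = dict(tp=40, tn=40, fp=5, fn=15)

plain = weighted_accuracy(**counts)
one_sided = weighted_accuracy(**counts, fp_weight=2.0, fn_weight=0.0)
both_sided = weighted_accuracy(**counts, fp_weight=2.0, fn_weight=1.0)
print(round(plain, 3), round(one_sided, 3), round(both_sided, 3))
```

With the same classifications, the reported accuracy changes depending on which errors are counted and how heavily, which is why the weighting scheme is itself listed as a threat.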

7.6.2 External Validity

Threats to external validity, on the other hand, limit the ability to generalize experimental results. Several candidates for this constraint apply. For one, we are currently only considering pages in which there are source code-induced faults that can be linked to a certain tag and not, perhaps, JavaScript errors that can be linked to a faulty variable. Other threats include possible misclassification of Web pages by submitters and low usage of a given client configuration platform (resulting in less raw material for the inductive algorithm). We took a great deal of care to ensure that pages were accurately labeled and included a sizable number of positive/negative examples during analysis; these factors may or may not be sustained in the field.

URL: https://www.sciencedirect.com/science/article/pii/B9780123965356000053

Securing microservices and microservice architectures: A systematic mapping study

Abdelhakim Hannousse, Salima Yahiouche, in Computer Science Review, 2021

8.2 Internal validity

An internal validity threat concerns the data extraction from the set of included studies. Both authors took part in this process. Following the guidelines of Petersen et al. [7], a data extraction form was designed by the first author, then discussed in depth with the second author and updated. Each author filled in the form independently; after the disagreements between the two authors were resolved, a single final form was adopted for the rest of the study. Following the guidelines of Kitchenham et al. [6], the kappa coefficient was estimated and found to be 0.94, which is rated as a very good or almost perfect score according to [6].
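For readers unfamiliar with the kappa coefficient, the sketch below computes Cohen's kappa for two raters. It assumes the article used the two-rater Cohen variant, which the excerpt does not state, and the labels are hypothetical rather than the authors' extraction data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for ten extracted items; not the authors' data.
first_author = ["yes"] * 5 + ["no"] * 5
second_author = ["yes"] * 4 + ["no"] * 6
print(round(cohens_kappa(first_author, second_author), 2))
```

In this example the raters agree on nine of ten items (observed agreement 0.9) while chance agreement is 0.5, giving a kappa of 0.8.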

URL: https://www.sciencedirect.com/science/article/pii/S1574013721000551

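System security requirements: A framework for early identification, specification and measurement of related software requirements

Kenza Meridji, ... Sylvie Trudel, in Computer Standards & Interfaces, 2019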

6.7 Threats to validity

The internal validity threat in this example would be any change in the design of the example, such as a lack of explanation for the concepts to be assessed in the running example. To mitigate the risk of this threat, the principal researchers conducted an example of the proposed reference model. An external validity threat is articulated at the level of the results.

The proposed reference model of software security requirements was illustrated using only the requirement specifications of the withdrawal process for an ATM machine system. To mitigate the risk of this validity threat further examples should be conducted using requirement specifications of different types of software products (i.e., real time embedded software, business application software, or even a hybrid of both types).

URL: https://www.sciencedirect.com/science/article/pii/S0920548918301272

Service oriented architecture maturity models: A systematic literature review

Supriya Pulparambil, Youcef Baghdadi, in Computer Standards & Interfaces, 2019

6 Threats to validity

The validation process is essential for any kind of empirical study. Wohlin et al. [70] defined four different types of validity threats: construct, internal, external, and conclusion.

• Construct validity threats: According to [70], construct validity threats are the threats linked to the generalization of the results to the concepts behind the study. During our review, we could see that a few SOAMMs have very detailed documentation whereas others do not. In a few models, they use different terminologies or the details are not explicitly mentioned. In such cases, we concluded with our assessments. In order to minimize this kind of threat, we focused only on SOAMM design and usage aspects and the second author verified the extracted data.

• Internal validity threats: Internal validity threats may affect the results due to incorrect conclusions. In order to mitigate this kind of risk we followed the SLR guidelines provided by Kitchenham [25] and conducted the study systematically. We defined a review protocol to include all relevant SOAMM studies in the search. We included the major digital libraries to cover all the publications in SOA maturity.

• External validity threats: These threats are related to the generalization of results of the review to the real world scenarios [70]. In this review, there may be chances to exclude the SOAMMs, which may not be part of academic publications or related works of selected primary studies. We also defined inclusion/exclusion/quality criteria to mitigate the risks associated with primary study selection.

• Conclusion validity threats: The threats related to the inaccurate conclusions come under this category. This could be the small set of primary studies we have collected. In order to mitigate the risk of excluding primary studies we did the following: (i) the search strings are defined by considering the concepts and their acronyms, (ii) the multiple electronic libraries are used as a search base, (iii) the related works of all the primary studies are carefully analyzed. This way, we could find three more new SOAMMs from the related works of selected models. In addition, to mitigate the risks associated with study duplication we used Microsoft Excel's sorting and filtering options based on the title, author, publisher, and year of publication. The selected papers are carefully read through to mitigate the misinterpretations of title/abstract.
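The authors removed duplicate studies with Microsoft Excel's sorting and filtering. A scripted equivalent might look like the sketch below, which swaps in a Python key-based filter over the same four fields. The record layout, the `normalize` helper, and the sample entries are hypothetical.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences match."""
    return " ".join(text.lower().split())


def deduplicate(records):
    """Keep the first record for each (title, author, publisher, year) key."""
    seen = set()
    unique = []
    for record in records:
        key = (
            normalize(record["title"]),
            normalize(record["author"]),
            normalize(record["publisher"]),
            record["year"],
        )
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Placeholder search hits; none of these entries come from the review.
hits = [
    {"title": "Example Study One", "author": "Doe, J.", "publisher": "Example Press", "year": 2012},
    {"title": "example  study one", "author": "doe, j.", "publisher": "Example Press", "year": 2012},
    {"title": "Example Study Two", "author": "Roe, R.", "publisher": "Example Press", "year": 2015},
]
print(len(deduplicate(hits)))
```

Exact-key matching only catches duplicates whose fields agree after normalization; entries with abbreviated titles or differing author formats would still need the manual reading the authors describe.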

URL: https://www.sciencedirect.com/science/article/pii/S092054891730418X

Identifying, categorizing and mitigating threats to validity in software engineering secondary studies

Apostolos Ampatzoglou, ... Alexander Chatzigeorgiou, in Information and Software Technology, 2019

2.1 Threats to validity in empirical software engineering

Threats to validity have often been categorized into different types in the general research methods literature. Initially, Cook and Campbell [8] recorded four types of validity threats in quantitative experimental analysis: statistical conclusion validity, internal validity, construct validity of putative causes and effects and external validity. Concerning qualitative research, Maxwell [29] provided a general categorization of threats that can be mapped to Cook and Campbell's categorization as follows: theoretical validity (construct validity), generalizability (internal, external validity), and interpretive validity (statistical conclusion validity). An additional threat category, mentioned by Maxwell [29], is descriptive validity, which is relevant only for qualitative studies. Descriptive validity reflects the accuracy and objectivity of the information gathered. For example, when researchers collect statements from participants, threats to validity can be related to the way that researchers recorded or transcribed the statements. Other types of validity threats that are found in literature are: reliability [38,51], transferability, credibility and confirmability [27], uncontrollability, and contingency [14].

In the empirical SE community there are two main schools on reporting threats to validity: (a) Wohlin et al. [47] who adopted Cook and Campbell's [8] categorization of validity threats and presented four main types of threats to validity for quantitative research within software engineering: conclusion, internal, construct, and external validity; and (b) Runeson et al. [38] who discussed four main types of validity threats for case studies within software engineering: reliability, internal, construct, and external validity. The threats of Runeson et al. [38] are similar to those of Wohlin et al. [47] with the exception of reliability replacing conclusion validity.

Biffl et al. [4] argue that researchers should also consolidate actual experimental research on a specific topic to complement existing generic threats and guidelines when performing their research. The tradeoff between internal and external validity has been addressed by Siegmund et al. [40], where the authors performed a survey and concluded that externally valid papers are of greater practicality while internally valid studies seem to be unrealistic. Additionally, the study examined the impact of replication studies and found that although researchers realize the necessity of such studies, they are reluctant to conduct or review them, mainly because there are no guidelines for performing them [40]. A list of definitions of the union of the aforementioned categories of threats to validity (i.e. from [38] and [47]) is presented in Table 1.

Table 1. Categories of Threats to Validity in ESE Research.

Conclusion validity: Originally called “statistical conclusion validity”, this aspect deals with the degree to which conclusions reached (e.g. about relationships between factors) are reasonable within the data collected. Researcher bias, for example, can greatly impact conclusions reached and can be considered to be a threat to conclusion validity. Similarly, statistical analysis may lead to weak results that can be interpreted in different ways according to the bias of the researcher. In either case the researcher may reach the wrong conclusion [47].
Reliability: This aspect is concerned with the extent to which the data and the analysis depend on the specific researchers. An example of this type of threat is the unclear coding of collected data. If a researcher produces certain results, then other researchers should be able to reproduce identical results by following the same methodology [38].
Internal validity: This aspect relates to the examination of causal relations. Internal validity examines whether an experimental treatment/condition makes a difference or not, and whether there is evidence to support the claim [47].
Construct validity: Defines how effectively a test or experiment measures up to its claims. This aspect deals with whether or not the researcher measures what is intended to be measured [47].
External validity: The concern of this aspect is whether the results can be generalized. During the analysis of this validity, the researcher attempts to see if findings of the study are of relevance for others. In the case of quantitative research (experiments), this primarily relies on the chosen sample size. In contrast, case studies have normally a low sample size, so the researcher has to try and analyze to what extent the findings can be related to other cases [47].

Petersen et al. [35], building on the categorization of threats to validity suggested by Maxwell, proposed a checklist that can help researchers identify the threats applicable to the type of research performed by reporting first their world-view and then the research method applied. A secondary study attempting to assess the practices in reporting validity threats in ESE [12] concluded that more than 20% of the studied papers contain no discussion of validity threats, and that the ones that do discuss validity threats contain 5.44 threats on average.

Regarding threats to validity for secondary studies in software engineering, we have been able to identify only one related work. In particular, Zhou et al. [53] have performed a tertiary study on more than 300 secondary studies until 2015. The authors have identified 23 threats to validity for secondary studies, and organized the consequences of these studies into four categories: internal, external, conclusion, and construct validity. To alleviate these threats, the authors map the threats and possible consequences to 24 mitigation strategies. This paper shares common goals with our study; however, ours is broader in the sense that: (a) it covers a wider timeframe (until 2017 instead of middle of 2015); (b) it focuses only on top-quality venues, which are expected to pay special attention to the proper application of methodological guidelines, such as the proper reporting of threats to validity, a fact that increases the quality of the obtained data; and most importantly (c) our study answers two additional RQs, providing a classification schema and a checklist for identifying, mitigating, and reporting threats to validity. In addition to this, as indirect related work (especially in terms of mitigation actions), in Section 2.3 we present a review of guidelines on secondary studies in software engineering.

URL: https://www.sciencedirect.com/science/article/pii/S0950584918302106

Success factors influencing requirements change management process in global software development

Muhammad Azeem Akbar, ... Hong Xiang, in Journal of Computer Languages, 2019

8 Threats to validity

Most of the SLR results were extracted by the first author of this study. This might be a threat to the validity of the study, because a single researcher could be biased and may consistently extract the wrong data. However, we tried to reduce this threat by having the other authors examine arbitrarily selected SLR results in order to find any issues that might exist.

Most of the selected primary studies have not discussed the key causes of the identified success factors, which could be an internal validity threat to this study. It is possible that in certain studies there is a tendency to report a particular type of factor. Furthermore, the majority of the researchers of the 54 selected primary studies are in the academic field; therefore, they might lack knowledge about the current practices of RCM processes in the software development industry.

We have noticed that only 34 primary studies out of 107 have provided organization size information and due to this limitation, we were not able to generalize the organizational size information of all the selected primary studies. It might be a possible threat for the results of RQ2 that focused on the size of the organizations.

With the increasing number of studies on RCM, certain relevant studies may have been inadvertently omitted from our SLR process. However, similar to other SLR researchers, this omission was not systematic [16,18,23].

URL: https://www.sciencedirect.com/science/article/pii/S1045926X18301411

Software security patch management - A systematic literature review of challenges, approaches, tools and practices

Nesara Dissanayake, ... M. Ali Babar, in Information and Software Technology, 2022

9 Threats to validity

In this section, we report the validity threats of our study and the corresponding mitigation strategies following the guidelines proposed by [42,58,59].

9.1 Internal validity

Bias in study selection (i.e., study filtering) and data extraction represent standard threats to all SLRs [59]. To address this, we defined a review protocol with explicit details about the search string construction, search process, study inclusion/exclusion and data extraction strategy [39,58,60]. Following a well-defined protocol helps achieve consistency in the study selection and data extraction, particularly, if multiple researchers are involved in the process [60]. We iteratively developed and improved the protocol, particularly the inclusion/exclusion criteria, after conducting a staged study selection process and pilot data extraction. Further, two authors selected the studies while the other authors cross-checked the outcomes and appropriateness of the selection criteria using randomly selected papers.

Concerning data extraction bias, we executed a pilot data extraction on a randomly selected sample of five studies to ensure the data extraction form captures all the required data to answer the RQs. We used a data extraction form (adapted from [30,41]) which was reviewed by all authors through the pilot data extraction. The first author extracted the data which was cross-checked by the other authors for accuracy. Throughout study selection and data extraction phases, weekly detailed discussions were held between all authors to resolve the disagreements.

Additionally, publication bias is acknowledged as an internal validity threat; it refers to the higher likelihood of publishing positive results than negative ones [30]. However, we have reported the negative results captured in the primary studies (e.g., challenges in software security patch management (RQ1)) and the challenges have been mapped against the reported solutions (RQ2), i.e., the positive results, when identifying the gaps in Section 8, moderating the effect of unreported negative results. Further, using snowballing to increase the time and publication coverage has helped mitigate the publication bias of outcomes [58].

9.2 External validity

Generalisability, referring to the likelihood of not being able to generalise the results, presents an important threat to overcome in SLRs. To address this, we conducted broad searches using one of the most well-known digital libraries (Scopus) to increase the identification of the related primary studies with broad time and publication coverage [58]. However, we acknowledge that our findings may not necessarily generalise to grey literature and studies outside the review period.

9.3 Construct validity

We are unable to guarantee that we have captured all the relevant primary studies in our SLR. The possibility of missing primary studies is an inevitable limitation in an SLR due to limitations in the search string construction and selection of non-comprehensive digital libraries (DL) [39,58]. However, to minimise the effects of this, we used several strategies which are described below.

We executed several pilot searches through which we systematically improved the search string to retrieve as many relevant papers as possible. An important point to note is that although the term “software security patch management” is widely used in the industry, this is still a new and emerging topic in research. Thus the use of inconsistent or different terminology in research papers, in particular, the term “management”, resulted in a large number of irrelevant studies after its inclusion in the search string. Therefore, we have excluded it from the search string. Although this keyword was not included, the structure of the search string (i.e., broad and not time-bounding) was capable of finding patch management papers, but we had to identify these papers through the study inclusion/exclusion phases. In addition, we used snowballing (i.e., forward and backward search on references of the selected studies) to mitigate the threat of missing relevant primary studies from the exclusion of this term.

Regarding the selection of DLs, while using only Scopus to identify studies may present a limitation of this study, this decision has enabled us to increase the coverage of the relevant studies, since Scopus is considered the most comprehensive search engine among other DLs with the largest indexing system [31,36]. We also did a pilot search on ACM Digital Library to compare and confirm the coverage of results from Scopus. To further mitigate this threat, we made our search string very broad by including the most common keywords to capture as many potentially relevant studies as possible.

9.4 Conclusion validity

Researcher bias or the potential bias of authors while interpreting or synthesising the data can impact the conclusions reached [58]. To reduce this impact, we adopted the recommended best practices for qualitative data analysis and research synthesis [58]. The first author led the data analysis and synthesis and the codebooks were shared with all authors every week where the second and third authors went through all the emergent codes, themes and synthesis results in detail. Disagreements between authors were discussed in detail in weekly meetings until an agreement was reached between all authors.

URL: https://www.sciencedirect.com/science/article/pii/S0950584921002147

What are the 4 threats to internal validity?

What are threats to internal validity? There are eight threats to internal validity: history, maturation, instrumentation, testing, selection bias, regression to the mean, social interaction and attrition.

What are the threats to validity experimental design?

History, maturation, selection, mortality and interaction of selection and the experimental variable are all threats to the internal validity of this design.

What are the 4 threats to external validity?

What are threats to external validity? There are seven threats to external validity: selection bias, history, experimenter effect, Hawthorne effect, testing effect, aptitude-treatment and situation effect.

What are three major threats to the internal validity of experiments and possible solutions?

Threats to Internal Validity.

Attrition: Attrition is bad for your research because it leads to a bias. ... .

Confounding variables: When your research has an extra variable related to the treatment you applied to your sample group that affects your results, then that leads to confusion. ... .

Diffusion: This is a tricky one..