Did you know that a Cronbach’s alpha threshold of .70 has been called an “urban legend” (Lance et al., 2006, p. 206)? Have you noticed the different words and phrases used for testing the reliability of an instrument? The most common is Cronbach’s alpha, but did you know that the name of this test is often misquoted? This article will teach you what you need to know so your paper stands out as one whose author knows the different types of reliability tests, the appropriate threshold for the field of study, and how to word the findings.
Not All Reliability Tests Are the Same
While researching a chapter on validity and reliability for an upcoming book, I set out to understand more about the different words and phrases I see when scholars write about reliability (Bocarnea, Winston, & Dean, 2021). Little did I know that it was such a controversial topic, or that so many people do not understand what to do and how to say it. It truly is a case of doing the same thing over and over again because everyone else is doing it. This article encourages you to stop and think about reliability as more than just a sentence you need to include in your dissertation. It is much more valuable and important than that, and handling it well can clearly separate you from the crowd.
Defining the Scale and Subscale
First, to reduce confusion, let us clarify what a scale is. A scale is an instrument. It could also be called an assessment, a questionnaire, an index, a survey, or a test. In this article these words are used interchangeably. Examples of scales include the Authentic Leadership Inventory (ALI) by Neider and Schriesheim (2011) and the Servant Leadership Assessment Instrument (SLAI) by Dennis and Bocarnea (2005). They are surveys used to answer a research question and test hypotheses. Within these instruments there are likely subscales. For example, the ALI measures Self-Awareness, Relational Transparency, Balanced Processing, and Internalized Moral Perspective with 16 items (4 items per subscale). The SLAI measures Agapao Love, Altruism, Empowerment, Humility, Serving, Trust, and Vision with 42 items (6 items per subscale). It is important to know that you should not compute Cronbach’s alpha for the entire instrument, because the instrument as a whole is multi-dimensional. Instead, Cronbach’s alpha should be computed for each individual subscale, because each subscale is intended to be uni-dimensional. Therefore, when running this test, use only the questions that belong to one subscale at a time, not all of the questions at once.
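The advice above (one alpha per subscale, never one alpha for the whole instrument) can be sketched in a few lines of Python. This is a minimal illustration using the standard coefficient alpha formula, not the procedure used by any of the cited authors. The response data, the subscale names, and the column layout are all made up for the example.

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for a respondents-by-items matrix of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                           # number of items in the subscale
    item_vars = items.var(axis=0, ddof=1)        # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 6 respondents answering 8 Likert-type items.
# Assumption for this sketch: columns 0-3 form one subscale, columns 4-7 another.
responses = np.array([
    [5, 4, 5, 4, 2, 3, 2, 2],
    [3, 3, 4, 3, 4, 4, 5, 4],
    [4, 4, 4, 5, 3, 3, 3, 2],
    [2, 2, 3, 2, 5, 4, 4, 5],
    [5, 5, 4, 5, 1, 2, 2, 1],
    [3, 2, 3, 3, 4, 5, 4, 4],
])
subscales = {"subscale_a": [0, 1, 2, 3], "subscale_b": [4, 5, 6, 7]}

# Compute alpha separately for each subscale, using only that subscale's items.
alphas = {name: cronbach_alpha(responses[:, cols]) for name, cols in subscales.items()}
for name, value in alphas.items():
    print(f"{name}: alpha = {value:.2f}")
```

Passing all eight columns to the function at once would mix two dimensions into a single coefficient, which is exactly the mistake described above.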
When testing the reliability of an instrument, the ultimate goal is to know whether the instrument performs in consistent and predictable ways. In general, there are four types of reliability: internal consistency, interrater reliability, parallel forms reliability, and test-retest reliability. Cronbach’s alpha falls under internal consistency reliability, which asks whether the individual items (the questions) on the survey are consistent with one another. It measures the interrelatedness of the items in the instrument. Interrater reliability concerns whether different raters scoring the same responses or behaviors reach similar results. Parallel forms reliability evaluates whether different versions of the test yield equivalent results. Test-retest reliability concerns whether the same test, given to the same people more than once, yields similar results.
A Name is Not Just a Name – The Wording Matters
Cronbach never intended for his name to be used with the test. He even said “it is an embarrassment to me that the formula became conventionally known as Cronbach’s α” (Cronbach & Shavelson, 2004, p. 397). Instead, the name could be just “α” or “coefficient alpha.” I’ll continue to use Cronbach’s alpha in this article, but for future publications we should all use “α” or “coefficient alpha” instead of including Cronbach in the name.
Another common mistake is when scholars say the instrument is reliable. Reliability is a characteristic of the data collected by an instrument and not the instrument itself. Therefore, instead of stating the instrument is reliable, we should say “the data collected by the instrument is reliable” or that “reliable data has been collected using the instrument in the past.”
Example: The data collected by the Authentic Leadership Inventory is reliable. The Self-Awareness subscale consisted of 4 items (α = .85), the Relational Transparency subscale consisted of 4 items (α = .82), the Balanced Processing subscale consisted of 4 items (α = .85), and the Internalized Moral Perspective subscale consisted of 4 items (α = .84).
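A write-up like the example above can be generated consistently with a small helper. This is only a sketch: the function names are hypothetical, and the formatting simply mirrors the style of the example (two decimals, no leading zero).

```python
def format_alpha(value):
    """Format a coefficient with two decimals and no leading zero, e.g. .85."""
    text = f"{value:.2f}"
    if text.startswith("0."):
        return text[1:]            # "0.85" -> ".85"
    if text.startswith("-0."):
        return "-" + text[2:]      # "-0.12" -> "-.12"
    return text                    # e.g. "1.00"

def report_subscale(name, n_items, alpha):
    """Build one clause of the reliability write-up for a single subscale."""
    return f"the {name} subscale consisted of {n_items} items (α = {format_alpha(alpha)})"

print(report_subscale("Self-Awareness", 4, 0.85))
```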
What Is the Threshold?
If I ask you what the Cronbach’s alpha score should be, what would you say? Most people say it should be at least .70, and most attribute this number to Nunnally. However, if you read Nunnally (1975), you will find that the .70 threshold is not the whole story. Let me explain. Cronbach’s alpha normally ranges from .00 to 1.0, and the higher the score the better. Nunnally actually said that a score of .70 is “miserably low” and that, depending on the purpose of the test (think of an IQ test), a reliability of .90 may not be high enough (1975, p. 10). Gignac (2015) notes that the minimally acceptable level is .70 for exploratory research, .80 for basic research, and .90 for applied scenarios. This argument is reinforced by Lance, Butts, and Michels (2006), who refer to the .70 threshold as an “urban legend” and state that .70 is only a modest reliability, arguing that “Nunnally clearly recommended a reliability standard of .80” (p. 206). Cho and Kim (2015) argue that the accepted minimum of .70 functions as “an immunity standard, which ‘legally’ excuses [researchers] from having to think further about reliability when α values above .70 or .80 are obtained” (p. 217).
To get back to the source, Nunnally (1978) confirmed that “what a satisfactory level of reliability is depends on how a measure is being used.” For a modest reliability, “.70 or higher will suffice,” yet in applied settings “.80 is not nearly high enough”; there, “a reliability of .90 is the minimum that should be tolerated, and a reliability of .95 should be considered the desirable standard” (Nunnally, 1978, pp. 245-246). In other words, we need to raise our standards for reliability.
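The purpose-dependent cutoffs discussed above can be summarized as a simple lookup. This is a sketch that encodes only the levels quoted in this section (.70, .80, .90, and .95). The function name and the wording of the labels are my own, and the right cutoff for your study still depends on your field and purpose.

```python
def interpret_alpha(alpha):
    """Label a coefficient using the purpose-dependent levels quoted above."""
    if alpha >= 0.95:
        return "desirable standard for applied settings"
    if alpha >= 0.90:
        return "minimum tolerable for applied settings"
    if alpha >= 0.80:
        return "acceptable for basic research"
    if alpha >= 0.70:
        return "modest; exploratory research only"
    return "below the commonly cited minimum"

for value in (0.65, 0.72, 0.85, 0.91, 0.96):
    print(f"{value:.2f}: {interpret_alpha(value)}")
```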
Assumptions of Cronbach’s Alpha
Before using Cronbach’s alpha to check the reliability of an instrument, several assumptions must be met. A few of them are listed below:
- First, it assumes that the instrument is uni-dimensional, meaning it measures one single dimension or one common thing. Alpha does not prove this; it assumes it.
- Second, the items in the scale are correlated with one another, meaning they are similar but not the same. If they are measuring the same thing, that would be homogeneity.
- Third, the items in the instrument are either added together for a total score or averaged.
If these assumptions are not met, a different test should be used to evaluate reliability.
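As a rough screen for the second assumption, the inter-item correlations of a subscale can be inspected before alpha is computed. The sketch below uses made-up data and a hypothetical helper name. It is not a test of uni-dimensionality, which would call for something like factor analysis.

```python
import numpy as np

def inter_item_summary(items):
    """Summarize the off-diagonal inter-item correlations of one subscale."""
    items = np.asarray(items, dtype=float)
    corr = np.corrcoef(items, rowvar=False)      # items are columns
    k = corr.shape[0]
    off_diag = corr[~np.eye(k, dtype=bool)]      # drop the 1.0 diagonal
    return {
        "mean_r": float(off_diag.mean()),
        "min_r": float(off_diag.min()),
        "has_negative": bool((off_diag < 0).any()),
    }

# Hypothetical subscale: 4 respondents x 3 items.
# The third item runs in the opposite direction to the other two.
example = np.array([
    [1, 2, 5],
    [2, 3, 4],
    [3, 4, 3],
    [4, 5, 2],
])
summary = inter_item_summary(example)
print(summary)
```

A negative correlation like the one in this made-up data often points to an item that needs reverse scoring, or to an item that does not belong in the subscale.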
Other Options for Reliability Tests
Some articles on this topic state plainly that “even among experts, there is no consensus on which methodology is superior to others” (Cho & Kim, 2015, p. 218). There are many other options for testing reliability. Cronbach’s alpha is the most popular; however, some scholars believe the omega coefficient is just as good, if not better. The downside to omega is that, at the time of writing, it can only be computed in R, not in SPSS. Other approaches to reliability include bootstrapping, factor analysis, and structural equation modeling. Precursors to Cronbach’s alpha include the Spearman-Brown Split-Half Reliability and the Kuder-Richardson Coefficient of Equivalence. All of these tests are options in lieu of Cronbach’s alpha.
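To make one of these alternatives concrete, here is a minimal sketch of split-half reliability with the Spearman-Brown correction. It splits the items into odd- and even-numbered halves, which is only one of several possible splits. The function name and the data are illustrative.

```python
import numpy as np

def split_half_reliability(items):
    """Split-half reliability, stepped up with the Spearman-Brown formula."""
    items = np.asarray(items, dtype=float)
    first_half = items[:, 0::2].sum(axis=1)    # 1st, 3rd, 5th, ... items
    second_half = items[:, 1::2].sum(axis=1)   # 2nd, 4th, 6th, ... items
    r = np.corrcoef(first_half, second_half)[0, 1]
    return 2 * r / (1 + r)                     # Spearman-Brown correction

# Hypothetical data: 4 respondents x 2 items.
scores = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(split_half_reliability(scores), 2))
```

The correction is needed because the raw correlation describes a test only half as long as the full subscale.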
In summary, there is more to Cronbach’s alpha than meets the eye. While it is the most popular method for measuring reliability, there is a history behind how to use it, when to use it, and how to write up the findings. This article has given you a quick glimpse of some of the arguments surrounding this popular tool and a few quick tips to make your paper stand out as one that is “in the know.”
Bocarnea, M., Winston, B., & Dean, D. (2021). Advancements in organizational data collection and measurements: Strategies for addressing attitudes, beliefs, and behaviors. Hershey, PA: Business Science Reference.
Cho, E., & Kim, S. (2015). Cronbach’s coefficient alpha: Well known but poorly understood. Organizational Research Methods, 18(2), 207-230. https://doi.org/10.1177/1094428114555994
Cronbach, L. J., & Shavelson, R. J. (2004). My current thoughts on coefficient alpha and successor procedures. Educational and Psychological Measurement, 64(3), 391-418. https://doi.org/10.1177/0013164404266386
Dennis, R. S., & Bocarnea, M. (2005). Development of the servant leadership assessment instrument. Leadership & Organization Development Journal, 26(8), 600-615. https://doi.org/10.1108/01437730510633692
Gignac, G. (2015). What is Cronbach’s Alpha. Retrieved February 18, 2021, from https://youtu.be/PCztXEfNJLM
Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff criteria: What did they really say? Organizational Research Methods, 9(2), 202-220. https://doi.org/10.1177/1094428105284919
Neider, L. L., & Schriesheim, C. A. (2011). The authentic leadership inventory (ALI): Development and empirical tests. The Leadership Quarterly, 22(6), 1146-1164. https://doi.org/10.1016/j.leaqua.2011.09.008
Nunnally, J. (1975). Psychometric theory: 25 years ago and now. Educational Researcher, 4(10), 7-21. https://doi.org/10.2307/1175619
Nunnally, J. (1978). An overview of psychological measurement. In B. B. Wolman (Ed.), Clinical diagnosis of mental disorders. Boston, MA: Springer. https://doi.org/10.1007/978-1-4684-2490-4_4