Background
The Army General Classification Test (AGCT) was a World War II intelligence test used by the U.S. Army to classify recruits according to their "ability to learn quickly the duties of a soldier." [1] It remains one of the largest normed cognitive ability tests ever administered and shows strong g-loading and predictive validity.
After widespread concern during World War II that soldiers were being misassigned to unsuitable roles and that more capable soldiers were being underutilized, the U.S. Army invested heavily in commissioning an intelligence and aptitude test, which resulted in the early forms of the AGCT. After the war, the AGCT continued to be revised and improved to maintain its accuracy. It amassed an enormous sample of more than 12 million soldiers, over five thousand times larger than the samples of modern professional tests. [2]
Because the drafted soldiers spanned a wide range of ages, the test was tailored to provide accurate scores for everyone from teenagers to middle-aged adults. Accordingly, performance on the AGCT has been found to be unrelated to age, as shown by a correlation of .02 for 4,330 inductees. [1] Furthermore, because the intended test-takers were drafted soldiers of all classes and lifestyles, the test was designed with questions that minimized reliance on prior knowledge from education and culture.
A test of g
In order to rehabilitate this test for modern use, a few things had to be done:
- The original score distribution had to be re-normalized by correcting for skew.
- Norm obsolescence, if any, had to be ascertained and accounted for.
- The g-loading had to be estimated.
Original distribution
The original distribution is highly left-skewed. [1][3] This is because those charged with the norming underestimated the number of easy questions on the test. This resulted in a test that discriminates well in the low range (you don’t want to draft morons), but not as effectively in the higher range.
In order to correct for this flaw, the test had to be re-normalized. With percentile rank-equating, it is possible to generate new aligned norms.
[Figure: the original AGCT score distribution]
[Figure: the re-normalized AGCT score distribution]
Overall, most of the changes happened in the low range; however, this step was necessary for psychometric rigor.
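The percentile-rank renorming described in this section can be sketched in a few lines. This is only an illustration under stated assumptions: the frequency counts are made up (they are not AGCT data), the function name `renorm_by_percentile_rank` is hypothetical, and the target scale of mean 100 and SD 15 is assumed. The actual procedure used for the AGCT norms may differ in its details.

```python
from statistics import NormalDist

def renorm_by_percentile_rank(freqs, mean=100.0, sd=15.0):
    """Map each raw score to a normalized standard score.

    freqs: dict {raw_score: count} from a norming sample (hypothetical here).
    Each raw score gets its mid-percentile rank,
    (count below + half the count at that score) / N,
    which is then converted to the score at the same percentile
    of a normal distribution with the given mean and SD.
    """
    total = sum(freqs.values())
    target = NormalDist(mu=mean, sigma=sd)
    below = 0
    table = {}
    for raw in sorted(freqs):
        pr = (below + 0.5 * freqs[raw]) / total
        table[raw] = round(target.inv_cdf(pr), 1)
        below += freqs[raw]
    return table

# Made-up, left-skewed frequency counts (NOT AGCT data).
toy_freqs = {0: 1, 1: 2, 2: 4, 3: 8, 4: 15, 5: 30, 6: 40}
norms = renorm_by_percentile_rank(toy_freqs)
for raw, score in norms.items():
    print(raw, score)
```

Because the mapping depends only on rank order, the renormed scores follow a normal shape regardless of how skewed the raw-score distribution was.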
Norm obsolescence
It is normal to wonder if a test from 1941, 82 years ago, is still valid today.
Consider this:
In 1980, during the renorming of the ASVAB, the AGCT was compared against it. [4] The percentiles were found to match closely at all ranges. At that point, 39 years after the AGCT was normed, Flynn effects would have predicted a systematic inflation of nearly 12 points; what was found instead was a difference between the tests that simply fluctuated in sign throughout the range. This can easily be attributed to sampling or measurement error. No Flynn effect is evident for this test. This makes sense, given the absence of the particular test elements that are responsible for the majority of the Flynn effect.
Before it was released on the subreddit, the test was given to dozens of people within the community with known scores from professional tests. More often than not, the AGCT ended up being one of their lower rather than higher scores. This supports the conclusion that the AGCT is not an obsolete test.
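The "nearly 12 points" figure can be checked with simple arithmetic. The sketch below assumes the commonly cited Flynn rate of roughly 0.3 IQ points per year; that rate is an assumption that does not appear in the text, and the function name is hypothetical.

```python
# Assumption: a constant Flynn rate of about 0.3 IQ points per year.
FLYNN_POINTS_PER_YEAR = 0.3

def expected_flynn_drift(norm_year, comparison_year, rate=FLYNN_POINTS_PER_YEAR):
    """Score inflation expected if norms aged at a constant yearly rate."""
    return (comparison_year - norm_year) * rate

drift = expected_flynn_drift(1941, 1980)
print(f"{drift:.1f} points over {1980 - 1941} years")
```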
Construct Validity
The ‘g-loading’ is the degree to which a test correlates with the ‘g factor’, or general intelligence. The higher the g-loading, the better the test measures general intelligence, and figures above 0.8 are generally considered excellent. These correlations are usually derived through factor analysis. Because item data for this test is impossible to come by, we can first estimate the test’s accuracy from proxy g-loadings taken from its successors, the ASVAB and AFOQT. Factor analyzing these two batteries and deriving composites from the subtests that most closely resemble the AGCT in content was the only way to assess its construct validity.
From the ASVAB, the pseudo-AGCT composite yielded a g-loading of .92, whereas the AFOQT pseudo-AGCT composite had a g-loading of .90. Averaging the two gives an estimate of ~.91.
Furthermore, the g-loading of the AGCT can be calculated from data collected with the automated AGCT form at CognitiveMetrics. In a sample of 1,734 (M = 121.7, SD = 12.95), the reliability is 0.941, or 0.956 after correction for range restriction.
The g-loading in this sample is 0.816; after correction for range restriction and SLODR, it is 0.925, which agrees with the estimates above. The unadjusted subtest g-loadings are 0.535 for V, 0.733 for Q, and 0.597 for S. It isn’t possible to correct the subtests for SLODR due to a lack of individual norms, but after correcting for range restriction, the g-loadings are 0.659 for V, 0.733 for Q, and 0.646 for S.
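The text does not say which formula was used for the range correction of the reliability. One standard choice, which assumes that the error variance is the same in the restricted sample and in the population, happens to reproduce the reported figure when the population SD is taken to be 15. The sketch below shows that calculation; the function name is hypothetical.

```python
def reliability_corrected_for_range(r_restricted, sd_restricted, sd_population=15.0):
    """Correct a reliability coefficient for range restriction.

    Assumes the error variance is identical in both groups, so only the
    true-score variance grows when moving to the wider population:
        r_pop = 1 - (sd_restricted / sd_population)^2 * (1 - r_restricted)
    """
    variance_ratio = (sd_restricted / sd_population) ** 2
    return 1.0 - variance_ratio * (1.0 - r_restricted)

corrected = reliability_corrected_for_range(0.941, 12.95)
print(round(corrected, 3))  # 0.956
```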
A g-loading of 0.925 is highly impressive for an 82-year-old test. Factorial validity is manifest.
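For readers who want to see what a range-restriction correction does to a correlation, here is a minimal sketch using Thorndike's Case II formula for direct restriction. It is an assumption that this formula resembles what was applied here; the text does not name the method, and the function name is hypothetical. Applied to the sample g-loading of 0.816 with SD = 12.95 and an assumed population SD of 15, it gives roughly 0.85, which is lower than the reported 0.925 because that figure also includes the SLODR adjustment, which is not reproduced here.

```python
import math

def thorndike_case2(r, sd_restricted, sd_population=15.0):
    """Thorndike Case II correction for direct range restriction.

    u is the ratio of the unrestricted SD to the restricted SD:
        r_corrected = r * u / sqrt(1 - r^2 + r^2 * u^2)
    """
    u = sd_population / sd_restricted
    return r * u / math.sqrt(1.0 - r ** 2 + (r * u) ** 2)

range_only = thorndike_case2(0.816, 12.95)
print(round(range_only, 3))
```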
Convergent Validity
Of the AGCT's convergent validity, the Examiner Manual for the AGCT states [1]:
The relationship of the AGCT to other well-known tests of general learning ability is high (7). [3] Correlations of the AGCT with three intelligence tests were: .90 with the Army Alpha, .83 with the Otis Higher Mental Ability Examination, and .79 with the American Council on Education Psychological Examination. (p. 2)
Additionally, Jensen states that the AGCT is "as highly correlated (r≈.80) with various IQ tests as the IQ tests are correlated with each other" (p. 376). [5] We tested whether the convergent validity of the AGCT holds up today.
In 2023, the AGCT was administered to 58 individuals with verified scores on professional tests. The correlation between the AGCT and the professional test composite was r = 0.7219, and the average estimated g-loading of the professional test composite was 0.924.
Here are the correlations for some of the most prominent tests directly:
| Test | n | r | r_RR | Mean (Test) | SD (Test) | Mean (AGCT)* | SD (AGCT)* |
|---|---|---|---|---|---|---|---|
| Composite | 58 | 0.7219 | 0.8621 | 132.06 | 9.98 | 128.48 | 9.20 |
| Old SAT | 29 | 0.6964 | 0.8477 | 133.21 | 10.33 | 130.72 | 9.11 |
| Old GRE | 20 | 0.7623 | 0.8755 | 134.25 | 10.10 | 131.55 | 9.75 |
| CAIT | 37 | 0.6957 | 0.8378 | 135.49 | 11.76 | 129.49 | 9.47 |
| WAIS-IV | 14 | 0.6484 | 0.7515 | 132.07 | 12.47 | 130.64 | 11.22 |
| SB-V | 9 | 0.7435 | 0.8644 | 127.33 | 9.11 | 127.11 | 9.70 |
Note. r_RR: Because these correlations were calculated on a sample drawn from a restricted portion of the general population, they were corrected for indirect range restriction so that they better approximate the correlations that would be observed in an unrestricted population. * Because not every participant took every test, the AGCT columns show the AGCT scores of only those participants who took the test in that row.
Ultimately, such similar and strong convergent validity eighty years later using modern batteries shows how robust the AGCT is to the Flynn Effect and how it remains a great test of intelligence.
Note. The reliability of the AGCT under good administration is not less than .95. [3] The test-retest reliability for the AGCT was .82. [3] The actual gain in the AGCT score upon retesting was only 1.3 points. [1]
Predictive Validity
The Examiner Manual [1] for the AGCT states that, even with substantial restriction of range, the predictive validity of the AGCT is high. For 3,000 clerical trainees, a validity of .40 was reported between AGCT score and success in clerical school. A validity of .40 meant that the chances that a subject would do average or better in the course were only 1 in 100 if his AGCT score corresponded to 74 IQ (SD = 15); 5 in 100 for 85 IQ; 20 in 100 for ~100 IQ; 47 in 100 for 113 IQ; and 76 in 100 for 125 IQ. [6] Moreover, in officer candidate schools for the various arms and services, the AGCT usually correlated with academic grades at around .40 as well, despite the rigid selection. [1]
The tables below are from Army studies conducted since World War II. [1][6] The AGCT scores are on the original scale, which has a standard deviation of 20.
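Because the tables below use the SD-20 scale, a reader may want a rough equivalent on the familiar SD-15 IQ scale. The sketch below is a plain linear rescaling and rests on two assumptions: that the original scale has a mean of 100, and that the skew of the original distribution can be ignored. It is therefore only an approximation and not a substitute for the renormed scores; the function name is hypothetical.

```python
def agct_to_iq_scale(agct_score, agct_mean=100.0, agct_sd=20.0,
                     iq_mean=100.0, iq_sd=15.0):
    """Linearly rescale an SD-20 AGCT score to an SD-15 scale.

    Assumes a mean of 100 on both scales and ignores distribution skew.
    """
    z = (agct_score - agct_mean) / agct_sd
    return iq_mean + z * iq_sd

# Two means taken from the occupation table below.
for job, score in [("Accountant", 121.1), ("Teamster", 90.8)]:
    print(job, round(agct_to_iq_scale(score), 1))
```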
Mean AGCT Scores per Individual Occupation
| Occupation | Mean AGCT Score |
|---|---|
| Accountant | 121.1 |
| Lawyer | 120.7 |
| Public Relations Man | 119.5 |
| Auditor | 119.4 |
| Chemist | 118.6 |
| Reporter | 118.4 |
| Chief Clerk | 118.2 |
| Teacher | 117.1 |
| Draftsman | 116.5 |
| Stenographer | 115.8 |
| Pharmacist | 115.4 |
| Tabulating Machine Operator | 115.1 |
| Bookkeeper | 115.0 |
| Manager, Sales | 114.3 |
| Purchasing Agent | 114.0 |
| Production Manager | 113.6 |
| Photographer | 113.2 |
| Clerk, General | 113.1 |
| Clerk, Typist | 112.6 |
| Installer, Telephone and Telegraph | 111.9 |
| Cashier | 111.9 |
| Instrument Repairman | 111.6 |
| Radio Repairman | 111.5 |
| Artist | 111.2 |
| Manager, Retail Store | 110.5 |
| Laboratory Assistant | 110.1 |
| Tool Maker | 109.4 |
| Stock Clerk | 108.9 |
| Musician | 108.2 |
| Machinist | 107.6 |
| Watchmaker | 107.4 |
| Airplane Mechanic | 107.0 |
| Sales Clerk | 106.9 |
| Electrician | 106.8 |
| Lathe Operator | 106.4 |
| Receiving and Shipping Checker | 105.7 |
| Sheet Metal Worker | 105.6 |
| Lineman, Power and Tel. & Tel. | 105.3 |
| Auto Service Man | 103.2 |
| Riveter | 103.1 |
| Cabinetmaker | 102.6 |
| Upholsterer | 102.5 |
| Butcher | 102.2 |
| Plumber | 102.0 |
| Bartender | 101.7 |
| Carpenter, Construction | 101.6 |
| Pipe Fitter | 101.4 |
| Welder | 101.4 |
| Auto Mechanic | 101.0 |
| Molder | 100.8 |
| Chauffeur | 100.6 |
| Tractor Driver | 99.6 |
| Painter, General | 98.7 |
| Crane Hoist Operator | 98.4 |
| Weaver | 97.8 |
| Barber | 96.5 |
| Farmer | 94.5 |
| Farmhand | 93.6 |
| Miner | 92.9 |
| Teamster | 90.8 |
Mean AGCT Scores per Major Occupational Group
| Major Occupational Group | Mean AGCT Score |
|---|---|
| Professional | 117.2 |
| Managerial | 114.1 |
| Semiprofessional | 113.2 |
| Sales | 109.1 |
| Clerical | 103.3 |
| Skilled | 101.3 |
| Semiskilled | 99.7 |
| Personal Service | 99.0 |
| Agricultural | 94.0 |
Mean AGCT Scores per Type of Work
| Type of Work | Mean AGCT Score |
|---|---|
| Literary Work | 118.9 |
| Technical Work | 117.3 |
| Public Service | 117.1 |
| Managerial Work | 112.8 |
| Artistic Work | 112.2 |
| Recording Work | 111.8 |
| Public Contact Work | 109.1 |
| Musical Work | 108.2 |
| Manipulative Work | 104.5 |
| Crafts | 103.8 |
| Machine Trades | 102.6 |
| Observational Work | 100.2 |
| Personal Service Work | 99.0 |
| Farming | 92.9 |
AGCT Scores per Field of Specialization & Respective Degree Level
| Field of Specialization | Degree Level | 10th | 25th | 50th | 75th | 90th |
|---|---|---|---|---|---|---|
| Natural Sciences | AB | 111 | 116 | 121 | 126 | 132 |
| Natural Sciences | Graduate students | 114 | 119 | 125 | 130 | 135 |
| Natural Sciences | PhD | 117 | 123 | 129 | 136 | 144 |
| Chemistry | AB | 112 | 117 | 123 | 128 | 134 |
| Chemistry | Graduate students | 114 | 120 | 126 | 132 | 136 |
| Chemistry | PhD | 119 | 124 | 130 | 136 | 143 |
| Physical Sciences, other | AB | 112 | 117 | 124 | 129 | 137 |
| Physical Sciences, other | Graduate students | 117 | 122 | 127 | 132 | 136 |
| Physical Sciences, other | PhD | 117 | 126 | 132 | 141 | 146 |
| Earth Sciences | AB | 111 | 115 | 120 | 126 | 129 |
| Earth Sciences | Graduate students | 111 | 116 | 122 | 128 | 133 |
| Earth Sciences | PhD | 120 | 125 | 129 | 137 | 145 |
| Biological Sciences | AB | 109 | 114 | 120 | 125 | 130 |
| Biological Sciences | Graduate students | 113 | 117 | 123 | 129 | 134 |
| Biological Sciences | PhD | 115 | 120 | 126 | 132 | 138 |
| Psychology | AB | 110 | 114 | 121 | 126 | 132 |
| Psychology | Graduate students | 117 | 123 | 128 | 132 | 137 |
| Psychology | PhD | 119 | 125 | 132 | 141 | 147 |
| Social Sciences | AB | 108 | 113 | 120 | 124 | 129 |
| Social Sciences | Graduate students | 111 | 116 | 122 | 129 | 134 |
| Economics | AB | 111 | 115 | 120 | 126 | 132 |
| Economics | Graduate students | 111 | 116 | 123 | 129 | 134 |
| History | AB | 108 | 114 | 119 | 124 | 129 |
| History | Graduate students | 111 | 116 | 122 | 127 | 133 |
| Other Social Sciences | AB | 106 | 111 | 117 | 123 | 128 |
| Other Social Sciences | Graduate students | 111 | 116 | 122 | 129 | 134 |
| Humanities and Arts | AB | 110 | 115 | 120 | 126 | 131 |
| Humanities and Arts | Graduate students | 111 | 117 | 123 | 129 | 135 |
| English | AB | 111 | 116 | 121 | 127 | 132 |
| English | Graduate students | 115 | 120 | 126 | 131 | 135 |
| Languages | AB | 111 | 116 | 121 | 126 | 132 |
| Languages | Graduate students | 111 | 117 | 123 | 130 | 136 |
| Philosophy and other Humanities | AB | 107 | 114 | 117 | 125 | 129 |
| Philosophy and other Humanities | Graduate students | 113 | 120 | 126 | 132 | 136 |
| Fine Arts | AB | 109 | 114 | 120 | 124 | 130 |
| Fine Arts | Graduate students | 109 | 114 | 120 | 126 | 132 |
| Engineering | AB | 111 | 117 | 122 | 128 | 134 |
| Engineering | Graduate students | 114 | 117 | 123 | 129 | 134 |
| Engineering | PhD | 116 | 123 | 129 | 137 | 140 |
| Applied Biology | AB | 105 | 111 | 116 | 120 | 126 |
| Applied Biology | Graduate students | 113 | 117 | 129 | 126 | 131 |
| Agriculture | AB | 111 | 114 | 118 | 123 | 128 |
| Agriculture | Graduate students | 116 | 120 | 124 | 129 | 133 |
| Agriculture | PhD | 110 | 116 | 123 | 128 | 133 |
| Home Economics | AB | 100 | 108 | 114 | 118 | 123 |
| Home Economics | Graduate students | 108 | 112 | 116 | 120 | 123 |
| Health Fields | Graduate students | 112 | 117 | 123 | 128 | 133 |
| Medicine | Medical school students | 114 | 119 | 124 | 129 | 134 |
| Dentistry | Dental school students | 109 | 114 | 120 | 126 | 132 |
| Nursing | AB | 110 | 114 | 119 | 126 | 132 |
| Other | Graduate students | 112 | 117 | 123 | 129 | 134 |
| Business and Commerce | AB | 108 | 113 | 118 | 123 | 128 |
| Business and Commerce | Graduate students | 109 | 114 | 120 | 125 | 130 |
| Education | AB | 104 | 111 | 117 | 122 | 126 |
| Education | Graduate students | 109 | 114 | 120 | 125 | 129 |
| Education, general | AB | 105 | 112 | 117 | 123 | 127 |
| Education, general | Graduate students | 110 | 114 | 120 | 126 | 129 |
| Physical Education | AB | 99 | 108 | 113 | 118 | 126 |
| Physical Education | Graduate students | 106 | 111 | 115 | 119 | 122 |
| Other Fields | | | | | | |
| Law | Law school graduates | 113 | 115 | 122 | 125 | 130 |
| Social Work | Graduate students | 109 | 114 | 120 | 124 | 129 |
| All Fields Combined (weighted averages) | AB | 109 | 114 | 120 | 125 | 130 |
| All Fields Combined (weighted averages) | Graduate students | 111 | 116 | 122 | 128 | 133 |
References
Related reading: "Stephen Jay Gould’s Analysis of the Army Beta Test in The Mismeasure of Man: Distortions and Misconceptions Regarding a Pioneering Mental Test" (2019).
[1] Science Research Associates. (1947). Examiner manual for the Army General Classification Test: First civilian edition. Author. https://clearinghouse-umich-production.s3.amazonaws.com/media/doc/79410.pdf
[2] Harrell, T. W. (1992). Some history of the Army General Classification Test. Journal of Applied Psychology, 77(6), 875–878. https://doi.org/10.1037/0021-9010.77.6.875
[3] The Army General Classification Test. (1945). Psychological Bulletin, 42, 760–768. https://doi.org/10.1037/h0056195
[4] Maier, M. H., & Sims, W. H. (1986, July). The ASVAB score scales: 1980 and World War II (CNR-116). Center for Naval Analyses. https://www.yumpu.com/en/document/read/15323423/the-asvab-score-scales-1980-and-world-war-ii-cna
[5] Jensen, A. R. (1998). The g factor: The science of mental ability. Praeger. https://arthurjensen.net/
[6] U.S. War Department. (1946). Personnel classification tests (Technical Manual TM 12-260, rev.). U.S. Government Printing Office. https://archive.org/details/TM12260