IQ Test Career Recommendations: What They Predict, and the Limits in 2026

This article is the audit Lindsey wished she'd had. IQ Career Lab is a cognitive assessment platform that measures intelligence across five domains and matches your cognitive profile to high-fit career paths, and yes, that makes us a competitor to every tool we're about to grade. We've disclosed that conflict at the top, applied the same rubric to ourselves, and graded ourselves no more generously than the others. The audit's question is not which tool is best. It's whether any of them tell you the truth about what an IQ score can and cannot predict about your career.
Key Takeaways
- Of 7 popular IQ-to-career tools, just 1 publishes a validity coefficient at all, and 1 of 7 cites peer-reviewed sources by name (audit conducted May 2026)
- The field's headline number has shifted under more conservative correction frameworks: Schmidt and Hunter (1998) reported operational ρ = 0.51; Sackett et al. (2022) re-estimated ρ = 0.31 after fixing range-restriction overcorrection; and Sackett et al. (2023) reported observed r = 0.16, corrected ρ = 0.22, across N = 40,740 workers
- Effect-size translation: r = 0.22 means roughly 61% of higher-IQ workers exceed the median performance of lower-IQ workers (Funder & Ozer, 2019), comparable in magnitude to many established medical effects
- Career fit is a different question than career performance: interest congruence (RIASEC) shows validity in the r = 0.15–0.36 range across studies, comparable to, not clearly superior to, cognitive ability
- The honest answer to "what should I do with this?": ask any career tool which validity estimate it's grounded in, then triangulate across at least one cognitive measure, one interest inventory, and one structured conversation with someone in the role
- Lindsey's decision rule: if a tool can't tell her which paper its recommendation comes from, the recommendation is decoration
What This Audit Is (and Isn't)
This is a methodology audit of seven IQ-to-career tools (123Test Career Test, 123Test Classical IQ, IQ-Brain, Truity, Mensa Workout, IQ Test.com, plus IQ Career Lab itself) graded on 4 dimensions: validity-coefficient disclosure, peer-reviewed source citation, scoring framework (RIASEC, multi-factor cognitive), and what they claim to predict. We then walk through the published evidence on what cognitive ability does and does not predict, surface the live methodological debate from Schmidt-Hunter (1998) to Sackett (2023), and translate the findings into decisions you can make on a Tuesday afternoon.
This is not a takedown of IQ tests. It is a takedown of opaque scoring. The audit takes no position on whether ρ = 0.22 (Sackett 2023) or ρ = 0.51 (Schmidt & Hunter 1998) is the right benchmark. It asks only whether tools disclose which estimate they're relying on.

The tools-versus-evidence gap matters most for people in Lindsey's position: career pivoters between thirty-five and fifty-five, often earning a mid-six-figure income, who are choosing between two or three plausible second acts. A twenty-three-year-old can afford to test a hypothesis with a year of their life. A forty-five-year-old usually cannot.
That is the audience for which an opaque "your IQ recommends X" answer is most dangerous: the cost of being wrong is measured in nearly a decade of compounding salary, equity, and skill investment.
For Lindsey, the right question is not "what does this tool say?" It is "what does this tool know, and how does it know it?"
Does IQ Predict Career Success?
Yes, but with caveats most tools omit. The current best meta-analytic estimate (Sackett et al., 2023; https://doi.org/10.1037/apl0001159) pools N = 40,740 workers and puts corrected ρ at 0.22, well below the 0.51 figure many career tools still cite. The gap matters when a tool sells "your IQ recommends X."
That coefficient is real, replicable, and decision-useful for hiring and salary growth across a career, but small enough that it cannot mechanically pick a specific job. The 0.51 number, anchored in Schmidt & Hunter (1998), reflects a different correction philosophy, not a different reality. Career tools quoting one without the other are doing rhetorical work, not statistical work.
Of the seven tools we audited (123Test Career Test, 123Test Classical IQ, IQ-Brain, Truity, Mensa Workout, IQ Test.com, and IQ Career Lab), six fail to disclose any validity coefficient. Here is the full grade card.
| Tool | Discloses validity coefficient | Cites peer-reviewed source | Names scoring framework | Predicts what (claimed) |
|---|---|---|---|---|
| 123Test Career Test | No | No | RIASEC | Career fit |
| 123Test Classical IQ Test | No | No | Single g (implicit) | IQ score; career-relevance unspecified |
| IQ-Brain | No | No | Undisclosed | IQ score; career-relevance unspecified |
| Truity Career Profiler | No | No | RIASEC + Big Five | Career fit |
| Mensa Workout | Self-disclaims as entertainment | No | n/a | Entertainment only |
| IQ Test.com | No | No | Undisclosed | Academic / financial success |
| IQ Career Lab (us) | Yes ([Sackett 2023](https://doi.org/10.1037/apl0001159), corrected ρ = 0.22) | Yes | Multi-factor cognitive tilt + RIASEC | Career fit |
The pattern is the finding. None of the six other tools names a coefficient or a paper. Three of them (123Test Classical IQ, IQ-Brain, IQ Test.com) present an IQ score without backing any career-relevance claim with either. The one that opts out (Mensa Workout) does so openly. The two RIASEC tools at least name their construct. Disclosure is the differentiator, not score quality.
A few honest caveats on the audit itself. First, this was a primary-page scrape — /science, /methodology, /research, and /about subpages were checked, but a buried PDF or footer link could change the headline. Second, "self-disclaims as entertainment" is a real disclosure choice, and Mensa Workout does it more honestly than most. Third, "predicts what" is measured against the tool's own claim, not against any ground truth.
The rest of this article walks through the evidence underneath that grade card, starting with the three Schmidt-Hunter-Sackett validity estimates from 1998 to 2023.
The Validity-Coefficient Timeline (Read the Footnotes)
The academic field has not agreed on a single number. It has produced three best-estimates from 1998 to 2023 under increasingly conservative correction frameworks: Schmidt & Hunter's ρ = 0.51, Sackett et al.'s 2022 ρ = 0.31, and Sackett et al.'s 2023 ρ = 0.22. Each estimate is honest within its own framework. The estimates are not, however, apples-to-apples, and any career tool that quotes one without disclosing the framework is making the data work harder than it should.
| Study | Estimate (corrected) | Sample | Correction framework | Effect-size translation |
|---|---|---|---|---|
| Schmidt & Hunter (1998) | ρ = 0.51 | Meta-of-meta-analyses | Full corrections (criterion unreliability + aggressive range restriction) | Large; ~76% above-median (BESD) |
| Sackett, Zhang, Berry & Lievens (2022) | ρ = 0.31 | Re-analysis | Fixes IRR overcorrection | Moderate; ~66% above-median |
| Sackett, Demeke et al. (2023) | r = 0.16 obs / ρ = 0.22 | N = 40,740, 153 samples, 21st-century | Criterion-unreliability correction only | Small-to-moderate; ~61% above-median |
The 80% credibility intervals around these estimates (Sackett, 2023) are wide. Sackett et al. (2023) report a residual SD of 0.11 around the corrected estimate (the observed-r SD is smaller, at about 0.09), meaning the true validity for any given job-and-context combination plausibly varies from below 0.10 to above 0.30. That is not a contradiction; it is a feature of how meta-analyses work. It does, however, undermine any tool that prints a single point estimate as if it were settled.
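To make the "below 0.10 to above 0.30" range concrete, here is a minimal sketch of how an 80% credibility interval can be derived from a mean validity and its residual SD. It assumes true validities are roughly normally distributed around the mean, which is a common meta-analytic simplification and not necessarily the exact procedure used in the paper; the function name `credibility_interval` is illustrative.

```python
from statistics import NormalDist


def credibility_interval(rho: float, sd_rho: float, level: float = 0.80) -> tuple[float, float]:
    """Return (lower, upper) bounds of a symmetric credibility interval.

    Assumption: true validities are normally distributed around `rho`
    with standard deviation `sd_rho`.
    """
    # Two-sided interval: an 80% level leaves 10% in each tail (z is about 1.28).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return rho - z * sd_rho, rho + z * sd_rho


# Figures quoted in the text: corrected rho = 0.22, residual SD = 0.11.
low, high = credibility_interval(rho=0.22, sd_rho=0.11)
print(f"80% credibility interval: {low:.2f} to {high:.2f}")
```

Under that normality assumption, the interval runs from roughly 0.08 to 0.36, which is why a single printed coefficient hides most of what a given job-and-context combination might actually look like.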

Report r, Not r² (and Benchmark Honestly)
You will sometimes see a 0.22 correlation translated into "5% of variance explained," calculated as r² (0.22² ≈ 0.048), and used to dismiss IQ as a near-useless predictor. That framing is misleading. Funder and Ozer (2019) argued in Advances in Methods and Practices in Psychological Science that r² is a poor public-communication tool because it makes meaningful effects look trivial. Reporting r directly is the convention they recommend.
As a benchmark, aspirin reduces heart attack risk at about r = 0.034, and AZT extended HIV survival at effect sizes in the same range as small-to-moderate behavioral correlations (Meyer et al., 2001). At r = 0.22, cognitive ability sits well above the aspirin figure: large enough to matter for hiring decisions and salary growth over a career, small enough that it cannot mechanically pick a job for you.
The Binomial Effect-Size Display translation makes this concrete. At r = 0.22, if you split workers into "above-median IQ" and "below-median IQ," roughly 61% of the higher group will exceed the median performance of the lower group (Sackett, 2023). That is meaningful. It is not "your IQ recommends architect."
This is a much smaller estimate than the .51 value offered by Schmidt and Hunter (1998).
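The two translations discussed in this section, variance explained and the Binomial Effect-Size Display, are each a one-line formula. The sketch below applies both to the three corrected estimates from the timeline table; the function names are illustrative, and the BESD formula used is the standard 0.5 + r/2 for a median split on both variables.

```python
def variance_explained(r: float) -> float:
    """Share of criterion variance accounted for by the predictor (r squared)."""
    return r ** 2


def besd_success_rate(r: float) -> float:
    """Binomial Effect-Size Display: expected share of the above-median
    predictor group that also lands above the criterion median."""
    return 0.5 + r / 2


# Corrected estimates quoted in the validity-coefficient timeline.
estimates = {
    "Schmidt & Hunter (1998)": 0.51,
    "Sackett et al. (2022)": 0.31,
    "Sackett et al. (2023)": 0.22,
}

for label, r in estimates.items():
    print(
        f"{label}: r = {r:.2f}, "
        f"r^2 = {variance_explained(r):.1%}, "
        f"BESD = {besd_success_rate(r):.1%}"
    )
```

The same coefficient reads as "under 5% of variance" in one framing and "61 in 100 rather than 50 in 100" in the other, which is the gap between the dismissive reading and the decision-useful one.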
The Tool Audit: Reading the Grade Card
The audit grade card sits above. The grading rubric is simple: does the tool tell you what cognitive construct it measures, what criterion it claims to predict, what coefficient backs the claim, and which peer-reviewed source the coefficient comes from. A "yes" required the disclosure to be findable on the public site without an account or a paid upgrade.
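For readers who want to re-run the tally, the grade card can be encoded as plain data. This is a sketch only: the `ToolGrade` class and `passes_disclosure` rule are illustrative names, the values are transcribed from the table in this article, and Mensa Workout's "self-disclaims as entertainment" is recorded here as not disclosing a coefficient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolGrade:
    name: str
    discloses_coefficient: bool
    cites_peer_reviewed: bool
    framework: str
    claimed_prediction: str


# Values transcribed from the grade card in this article (audit of May 2026).
GRADE_CARD = [
    ToolGrade("123Test Career Test", False, False, "RIASEC", "Career fit"),
    ToolGrade("123Test Classical IQ Test", False, False, "Single g (implicit)", "IQ score"),
    ToolGrade("IQ-Brain", False, False, "Undisclosed", "IQ score"),
    ToolGrade("Truity Career Profiler", False, False, "RIASEC + Big Five", "Career fit"),
    ToolGrade("Mensa Workout", False, False, "n/a", "Entertainment only"),
    ToolGrade("IQ Test.com", False, False, "Undisclosed", "Academic / financial success"),
    ToolGrade("IQ Career Lab", True, True, "Multi-factor cognitive tilt + RIASEC", "Career fit"),
]


def passes_disclosure(tool: ToolGrade) -> bool:
    """A tool passes only if it names both a coefficient and a peer-reviewed source."""
    return tool.discloses_coefficient and tool.cites_peer_reviewed


disclosing = [t.name for t in GRADE_CARD if t.discloses_coefficient]
citing = [t.name for t in GRADE_CARD if t.cites_peer_reviewed]
passing = [t.name for t in GRADE_CARD if passes_disclosure(t)]

print(f"{len(disclosing)} of {len(GRADE_CARD)} disclose a validity coefficient")
print(f"{len(citing)} of {len(GRADE_CARD)} cite a peer-reviewed source")
print(f"Pass both checks: {passing}")
```

Keeping the rubric as data makes a re-grade cheap: if a tool later publishes a methodology page, one field changes and the counts update.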

The disclosure question is not pedantic. If a tool tells Lindsey her IQ predicts "research scientist," the unstated dependency chain has 4 steps: her test score is a valid measure of g; g predicts performance at coefficient X; "research scientist" is a category where g is more predictive than for other roles; and the tool's score-to-role mapping actually reflects the first three links.
If the tool fails to show its work on any of those 4 steps, the recommendation is doing more rhetorical work than statistical work. That can still nudge a stuck career-pivoter, but it is not the same product as a transparent recommendation.
The tools that do best are the ones that grade themselves the way a peer reviewer would: name the construct, cite the validity (e.g., Sackett 2023, Hoff 2020), disclose the residual uncertainty.
How Reliable Are IQ-Based Career Recommendations?
Modestly. The best 21st-century estimate of IQ-to-job-performance validity is corrected ρ = 0.22 (Sackett et al., 2023), with residual SD = 0.11 across jobs. That is genuinely predictive, but no single number should be treated as approval for a specific career.
A test can be reliable (consistent across retakes) and still have weak validity for a specific outcome. The between-job variation matters: under the residual SD, true validity for any given role-and-context combination plausibly ranges from below 0.10 to above 0.30. Tools that print one point estimate, with no uncertainty, are flattening that range into false confidence — see our validity and reliability methodology for the disclosure standard we hold ourselves to.
The Criterion-Mismatch Problem
There is a second issue most career tools sidestep: they sell IQ as career-relevant, but the construct most directly validated for "career fit" is not cognitive ability. It is interest congruence, operationalized via Holland's 6-factor RIASEC framework. Yet just 3 of the 7 audited tools (123Test Career Test, Truity Career Profiler, and IQ Career Lab) name a RIASEC framework at all. The career-fit question and the career-performance question are not the same question, and answering one when the user came in asking the other is a form of construct mismatch.
Even on its own ground, the interest-congruence literature is contested. Nye and colleagues (2012) reported corrected r = 0.36 for congruence-to-performance and r = 0.30 for congruence-to-persistence across N = 9,472. Their 2017 follow-up (Nye et al., 2017) reported similar corrected values, around r = 0.32. However, Van Iddekinge and colleagues (2011) found uncorrected r in the 0.07 to 0.12 range, with credibility intervals including zero, and Hoff and colleagues (2020) found congruence operational validity around 0.15 to 0.20 once interest level is partialled out.
| Study | Validity estimate | Effect-size translation | Notes |
|---|---|---|---|
| Nye et al. (2012) | r = 0.36 (congruence → performance, corrected) | Moderate (high-water mark) | Often the cited number; contested |
| Nye et al. (2017) | r ≈ 0.32 (corrected) | Moderate | Re-affirms the 2012 estimate |
| Hoff et al. (2020) | r ≈ 0.15–0.20 (operational, partialled) | Small-to-moderate | Counter-evidence; level effects partialled |
| Van Iddekinge et al. (2011) | r = 0.07–0.12 (uncorrected) | Small / null | Counter-evidence; CIs include zero |
Honest framing: under the same conservative correction philosophy the most recent Sackett meta-analysis applies to general mental ability (GMA), congruence-criterion validity sits in the r = 0.15 to 0.36 range. That is comparable to, not better than, GMA-to-performance, and the two predict different things in any case. A career tool that switches from "IQ predicts performance" to "IQ predicts fit" without telling you is doing a similar disclosure dodge to a tool that quotes ρ = 0.51 without saying it came from 1998.

What Careers Match Different IQ Levels?
The question is shaped wrong. No peer-reviewed evidence supports a clean "if your IQ is X, do Y" mapping, though some researchers argue cognitive thresholds for elite roles exist. Use IQ scores to widen the list of careers worth investigating, not narrow it; treat each output as a hypothesis to verify, never a verdict.
What the evidence does support: complex roles tend to have higher average cognitive demands; cognitive tilt (math-strong vs verbal-strong vs spatial-strong) differentially predicts which kinds of creative output you'll produce decades later, in a sample of N ≈ 2,400 tracked over 25 years (Park, 2007); and spatial ability is an under-recognized independent predictor of STEM outcomes (Wai, 2009).
Steelman: The Case That Newer Estimates Undercorrect
It would be intellectually lazy to use the newer, lower validity coefficients without surfacing the live debate. Several respected I/O psychologists, notably Oh, Le, and Roth (Oh, 2024), argue that Sackett et al. (2022/2023) themselves undercorrect.
This is the single most important steelman to take seriously if you came in already believing IQ is a strong career predictor. The newer numbers are not a refutation of the older numbers. They are a different set of choices about how to correct raw data, and the choices are defensible in both directions. What is not defensible is a tool quoting ρ = 0.51 in 2026 without saying it comes from 1998 and a different correction philosophy.

If you are a reader who already distrusts IQ tests, the steelman matters too: the lower estimates do not validate the dismissal. Even the most conservative published number, Sackett's corrected ρ = 0.22, is a reliable, replicable, decision-useful effect. The honest position is in the middle. IQ predicts job performance enough to matter and not enough to overrule everything else you know about yourself.
The takeaway is that any career tool quoting one side of this debate is being lazy with the evidence. A transparent tool tells you which framework it relies on (Schmidt-Hunter 1998 or Sackett 2023), what the residual uncertainty looks like (e.g., SD = 0.11 around ρ = 0.22), and what the Oh-Le-Roth interpretation would suggest. Most don't.
What Jobs Are Good for High IQ?
Many complex roles reward cognitive ability, but the framing obscures a more useful question: what does cognitive tilt (math, verbal, spatial balance) suggest about which roles fit your strengths? A multi-factor profile naming your relative strengths is more decision-useful than a single composite "high IQ" number.
Park and colleagues (2007) tracked roughly 2,400 high-ability adolescents for more than 25 years and found math-tilted profiles produced STEM-creative output while verbal-tilted profiles produced humanities-creative output, controlling for overall ability level. The decision-useful insight: a 130 verbal / 105 spatial profile points to different careers than a 105 verbal / 130 spatial profile, even at the same composite IQ.
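The 130/105 versus 105/130 example can be sketched in a few lines. This is an illustration under loud assumptions: the composite is taken as an unweighted mean of two subscores and tilt as their simple difference, neither of which is claimed to be how any real test (including ours) computes a composite, and `describe_tilt` is a hypothetical helper.

```python
def describe_tilt(verbal: float, spatial: float) -> dict:
    """Summarize a two-factor profile as a composite plus a tilt direction.

    Assumptions (illustrative only): the composite is the unweighted mean
    of the two subscores, and tilt is verbal minus spatial.
    """
    composite = (verbal + spatial) / 2
    tilt = verbal - spatial
    if tilt > 0:
        direction = "verbal-tilted"
    elif tilt < 0:
        direction = "spatial-tilted"
    else:
        direction = "balanced"
    return {"composite": composite, "tilt": tilt, "direction": direction}


profile_a = describe_tilt(verbal=130, spatial=105)
profile_b = describe_tilt(verbal=105, spatial=130)

print(profile_a)
print(profile_b)
print("Same composite:", profile_a["composite"] == profile_b["composite"])
```

Both profiles collapse to the same composite under this toy scoring, while the tilt points in opposite directions. That is the information a single-score report throws away.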
What Lindsey Should Do (and What You Should)
Six decision-shaped rules for any 35-to-55 career pivoter who, like Lindsey, is looking at an IQ-to-career tool:
- Ask any tool which validity coefficient it relies on and which paper that coefficient comes from. If it cannot answer in one click, downgrade the recommendation accordingly. The disclosure question is the cheapest signal of methodological seriousness.
- Triangulate across constructs, not within a construct. A single IQ score is one input. An interest inventory (RIASEC or similar) is a second input that measures something different. A structured conversation with two people currently doing the role you're considering is a third. The three together are more reliable than any one alone.
- Use cognitive tilt, not single-axis IQ. A 130 in verbal and 105 in spatial implies different career terrain than a 105 in verbal and 130 in spatial, even though both might show as "120 IQ" on a single-score test.
- Treat any specific job recommendation as a hypothesis, not a verdict. Tools that print "you should be a research scientist" with no uncertainty are doing rhetorical work, not statistical work. The career fits the candidate, but only after the candidate tests the fit in cheap, fast, real-world ways (informational interviews, side projects, contract work).
- For the cognitive-measurement step alone, look for a transparent multi-factor assessment that publishes its scoring methodology. If you want a worked example of what that disclosure looks like, our IQ test names its constructs and references the most recent Sackett validity literature.
- For the interest-fit step alone, an inventory grounded in a named framework (RIASEC, Big Five) is more validated for career fit than IQ alone. Our Personality Test covers this side of the triangulation.
Borsboom (2006) made the foundational distinction in Psychometrika that we keep returning to: a test's score predicts nothing on its own. What predicts is the construct the score measures, in the context the construct was validated. When a tool says "your IQ score recommends X career," the unspoken question is: which construct, validated against which criterion, in which context? If the tool cannot answer, the recommendation is decoration.
Methodology and Conflict-of-Interest Box
IQ Career Lab is a competing tool. We sell a multi-factor cognitive assessment. Grading other tools without grading ourselves the same way would be self-dealing. Here is the audit applied to us, with our own self-criticism:
- Validity coefficient disclosure: We cite the corrected ρ = 0.22 from the 2023 Sackett meta-analysis (Sackett, 2023) as our headline GMA-to-performance estimate, with the residual SD of 0.11 noted. Self-criticism: we should also cite the Oh-Le-Roth counterargument (Oh, 2024) and the wider 80% credibility band on every page that prints a single coefficient.
- Peer-reviewed source citation: We name Sackett, Erdogan & Bauer, and Koopmans by author. Self-criticism: a more rigorous standard would link directly to DOIs from product copy, not just from this article.
- Scoring framework: We use multi-factor cognitive tilt plus a RIASEC-style interest layer. Self-criticism: we have not published a between-factor weighting transparency document; that is on the roadmap.
- What we claim to predict: career fit, with explicit uncertainty bands and a "this is a hypothesis to test, not a verdict" framing in the result page. Self-criticism: a few onboarding screens still over-promise; in revision.
- Audit method: primary-page scrape of /science, /methodology, /research, and /about. Self-criticism: a more rigorous version would email each tool's support team to ask for buried methodology documents, give them 14 days to respond, and re-grade.
The honest position: IQ Career Lab passes the disclosure rubric we built. It does not pass every rubric we could have built. We disclose this trade-off rather than score ourselves leniently.

Where to Go from Here
Take one assessment that publishes its methodology, then triangulate against an interest inventory and a 20-minute conversation with someone currently in the role you're weighing. The cognitive measurement is the input you have least control over; the interest layer adds construct diversity; the structured conversation captures context no test can.
For deeper reading on the cognitive side, see our companion pieces on cognitive job fit when IQ outpaces the role, cognitive tilt and elite career paths, and whether IQ predicts job satisfaction in addition to performance. For the methodology side, our scoring methodology walks through the construct-validation tradeoffs in plain English.
The audit does not end with a leaderboard. It ends with a request: when a career tool tells you what to do, make it show its work. Six of seven won't. The seventh might be us, but the right standard is the principle, not the product. Choose the tool that names the coefficient, cites the paper, and discloses what it doesn't know, including, especially, when that tool is competing for your attention.



