IQ by Medical Specialty: Which Doctors Score Highest

That question sends thousands of people to search engines every match season, and the honest answer surprises most of them. At IQ Career Lab we measure cognitive ability head-on, so we care about one distinction above all others: the gap between a real cognitive measurement and a number that only looks like one. The "IQ by medical specialty" ranking everyone shares is the second kind. It is a real, rankable order. It is just not an IQ ranking, and the distance between those two ideas is the whole story.
Key Takeaways
- Dermatology tops the ranking at a 257 mean Step 2 CK score, while Family Medicine and Psychiatry sit at the bottom at 244 (NRMP Charting Outcomes 2024, matched applicants alone)
- The whole top-to-bottom spread is 13 points, equal to just 0.87 of an exam standard deviation, so the score curves overlap far more than they separate
- No published study we could find ties USMLE scores to a validated IQ test in medical students, so any per-specialty "IQ = X" claim is editorial inference, not measurement
- The admissions aptitude test explains roughly 10% of the variance in the Step score we rank doctors by (MCAT composite x Step 2 CK, r = 0.31; Gauer, 2016), making the ranking a competitiveness order, not a cognitive verdict
- Bioethicist Ezekiel Emanuel argues that past a cognitive threshold, emotional intelligence matters more than raw IQ (JAMA, 2018), a reframing that fits how weakly Step scores track real-world physician performance
Which Medical Specialty Has the Highest IQ?
Here is the headline answer first, with its asterisk. Dermatology posts the highest mean USMLE Step 2 CK score among matched applicants, at 257. Plastic surgery, orthopaedics, and radiology sit close behind at 256 (NRMP, 2024). But that is a board-exam ranking, not an IQ ranking, and no validated conversion between the two exists.
Most articles bury that caveat. That is why skeptical readers assume the genre is SEO filler. We put it up front. The competitiveness order, for example the 257-to-244 spread, is genuine and worth knowing. The "IQ" label pinned on the NRMP Charting Outcomes data is the problem.

Here is the chain you are reading. A medical student sits the USMLE Step 2 CK. That is a clinical-knowledge licensing exam. Programs use those scores, among many factors, to decide who they interview and rank. The most selective fields end up with the highest mean scores among the people who match.
That mean becomes the "smartest specialty" list. It reflects how hard a field like dermatology is to enter. It reflects how much its applicants studied. It reflects how programs weighted a single exam, the Step 2 CK. What it does not reflect, in any validated way, is the general cognitive ability we call IQ.
So the NRMP ranking is honest about exactly one thing: it ranks how competitive the matched applicant pool was in 2024. About the intelligence of the doctors inside that pool, it says nothing measurable at all.
A Proxy of a Proxy
Before the full table, a single finding deserves to stand alone. The viral ranking does not sit one step from intelligence; it sits two or three correlational rungs away.
Read that closely. It answers the question most "smartest specialty" lists never ask. Some readers will object that a proxy can still carry real signal, and that is fair. But the proxy chain leaks most of the variance at every rung (Gauer, 2016). By the time you reach a specialty label, almost nothing of a measured trait survives.
Put the leak in perspective. The SAT, a close cousin of the MCAT, correlates with measured general intelligence at about r = 0.82, or roughly 67% of shared variance in a 917-student sample (Frey, 2004). The MCAT-to-Step link, at r = 0.31 and roughly 10% of shared variance (Gauer, 2016), is a fraction of that. So even the aptitude test that feeds Step scores passes along far less of its signal than people assume.
What Is the Smartest Medical Specialty?
By the single number people cite, dermatology is the "smartest" specialty: its mean Step 2 CK score is 257 among 2024 matched applicants (NRMP, 2024). But that label swaps competitiveness for cognition. Dermatology is the hardest field to match into, and that selectivity inflates the scores of everyone who succeeds. It is the most selective specialty, not the brightest one.

Here the status-driven framing breaks down. People want a clean pecking order that runs dermatology, then neurosurgery, then psychiatry, and the numbers seem to deliver just that: dermatology 257, neurosurgery 254, psychiatry 244 (NRMP, 2024). Surely that settles it.
It does not. The reason is statistical. A 13-point gap sits across a distribution with a 15-point standard deviation. So the score curves overlap heavily (NRMP, 2024). Plenty of psychiatry applicants scored above the dermatology mean, and plenty of dermatology applicants scored below the psychiatry mean. The mean tells you about the pool's center. It tells you nothing about a single person inside it.
That is the base-rate error inside every "smartest doctor" claim. You cannot read a physician's mind from their specialty's average, because a single dermatologist could easily sit below the median pediatrician while the field-level mean still points the other way. It is like guessing one player's height from a team's roster mean. The shortcut is tempting, but the math forbids it.
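The overlap can be made concrete with a short sketch. It assumes, purely for illustration, that each specialty's scores are normally distributed and that both pools share the exam-wide 15-point standard deviation; the NRMP data quoted here gives only means, so the within-specialty spread is our assumption, not a reported figure.

```python
from statistics import NormalDist

# ASSUMPTION: normal curves with the exam-wide SD of 15 in both pools.
# Only the two means (257 and 244) come from the article.
EXAM_SD = 15.0
dermatology = NormalDist(mu=257, sigma=EXAM_SD)
psychiatry = NormalDist(mu=244, sigma=EXAM_SD)

# Share of the psychiatry pool scoring above the dermatology mean.
psych_above_derm_mean = 1 - psychiatry.cdf(dermatology.mean)

# Share of the dermatology pool scoring below the psychiatry mean.
derm_below_psych_mean = dermatology.cdf(psychiatry.mean)

# Overlapping area of the two curves (0 = disjoint, 1 = identical).
overlap = dermatology.overlap(psychiatry)

# Chance that one random psychiatry applicant outscores one random
# dermatology applicant: the difference of two independent normals.
difference = NormalDist(
    mu=psychiatry.mean - dermatology.mean,
    sigma=(2 * EXAM_SD**2) ** 0.5,
)
psych_outscores_derm = 1 - difference.cdf(0)

print(f"Psychiatry above dermatology mean: {psych_above_derm_mean:.1%}")
print(f"Dermatology below psychiatry mean: {derm_below_psych_mean:.1%}")
print(f"Overlap of the two curves:         {overlap:.1%}")
print(f"Random psych beats random derm:    {psych_outscores_derm:.1%}")
```

Under those assumptions, roughly a fifth of each pool sits on the "wrong" side of the other pool's mean, the curves share about two-thirds of their area, and a randomly chosen psychiatry applicant outscores a randomly chosen dermatology applicant a little more than a quarter of the time. A narrower within-specialty spread would shrink those figures, which is why the sketch should be read as an illustration and not an estimate.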
So a list that crowns dermatology reports a competitiveness fact dressed in cognitive language. The accurate sentence is narrow: dermatology's 2024 matched applicants posted the highest mean licensing score. Everything past that sentence is interpretation. The most popular interpretation happens to have the least support behind it.
The Full Specialty Ranking, Honestly Labeled
Here is the complete order from the 2024 NRMP Charting Outcomes data. Read the percentile column with particular care, because those are percentiles within the population of Step 2 CK examinees, not general-population IQ percentiles. Converting them to IQ percentiles is invalid. No published USMLE-to-IQ conversion exists.
| Specialty | Step 2 CK mean | Matched applicants (n) | Percentile vs. examinees |
|---|---|---|---|
| Dermatology | 257 | 424 | 68th |
| Plastic Surgery (Integrated) | 256 | 188 | 66th |
| Orthopaedic Surgery | 256 | 726 | 66th |
| Diagnostic Radiology | 256 | 777 | 66th |
| Otolaryngology (ENT) | 255 | 339 | 63rd |
| Vascular Surgery (Integrated) | 255 | 77 | 63rd |
| Neurosurgery | 254 | 204 | 61st |
| Interventional Radiology | 253 | 144 | 58th |
| Anesthesiology | 252 | 1387 | 55th |
| Internal Medicine | 251 | 3699 | 53rd |
| Emergency Medicine | 248 | 1246 | 45th |
| Pediatrics | 247 | 1438 | 42nd |
| Family Medicine | 244 | 1427 | 35th |
| Psychiatry | 244 | 1304 | 35th |
A few things jump out once the numbers sit in a table (NRMP, 2024). Sample sizes vary widely, from 77 matched vascular surgery applicants to 3,699 internal medicine matches, so the means at the top rest on thinner data than the means in the middle. And the whole vertical distance covers 13 points, the same gap that equals 0.87 of an exam standard deviation.
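To see what "thinner data" means in practice, here is a minimal sketch of the standard error of each mean. The sample sizes come from the table; the 15-point spread is the exam-wide standard deviation, used here as a stand-in because the within-specialty spread is not reported, so treat the outputs as rough upper-end illustrations.

```python
from math import sqrt

# ASSUMPTION: the exam-wide SD of 15 stands in for the unreported
# within-specialty SD. Sample sizes are taken from the table above.
ASSUMED_SD = 15.0
MATCHED_N = {
    "Vascular Surgery (Integrated)": 77,
    "Dermatology": 424,
    "Internal Medicine": 3699,
}


def standard_error(sd: float, n: int) -> float:
    """Standard error of a sample mean: SD divided by the square root of n."""
    return sd / sqrt(n)


for specialty, n in MATCHED_N.items():
    se = standard_error(ASSUMED_SD, n)
    print(f"{specialty}: n = {n}, standard error of the mean = {se:.2f} points")
```

On those assumptions the vascular surgery mean carries an uncertainty of about 1.7 points, while the internal medicine mean is pinned down to about a quarter of a point. A one-point difference between two small fields is therefore well inside the noise.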

A single caveat matters more than any row. These are matched applicants alone (NRMP, 2024). They are not specialty incumbents. They are not the general population. The figures describe the people who entered each field in a given cycle. That is a selected slice. Range restriction of this kind compresses the true spread between specialties, so the real cognitive differences are smaller than the 13-point gap suggests.
There is also a circularity worth naming. The NRMP match itself shapes the competitiveness ranking, and that ranking then feeds back in as the input for the next cycle. Dermatology is competitive in part because it is already known to be competitive, which draws strong applicants, which keeps the 257 mean high, which reinforces the reputation all over again. The ranking is as much a social process as a fixed measurement.
None of this makes the table useless. Are you a medical student weighing your scores against those of the 204 applicants who matched into neurosurgery? Then it is genuinely useful. It just is not intelligence about intelligence.
Want a wider view of how cognitive demands stack up beyond medicine? Our profession-by-profession IQ ranking covers 360 occupations with the same insistence on honest labels.
Do USMLE Scores Measure Intelligence?
Not directly, and not well by proxy. USMLE Step exams are clinical-knowledge tests built to license physicians, not to gauge general cognitive ability. They do tap reasoning, but they reward recent, coachable disease knowledge far more than any stable trait, which is why focused study moves a score.

The evidence for that reading is striking. In a single cohort, a practice exam of disease knowledge (the NBME CBSE) predicted Step 1 well. The correlation was r = 0.711, about 51% of variance in 81 students (Giordano, 2016). In that same sample, the broad-aptitude MCAT did not predict Step 1 at all (Giordano, 2016).
Read those two facts side by side and the framing collapses. In that cohort the best predictor of a Step 1 score was recent practice on medical content, not a general aptitude measure. That is the signature of an achievement test. The score rises with effort. It does not behave like a stable cognitive trait, and the same logic carries to Step 2 CK, the clinical-knowledge exam that replaced Step 1 at the top of these rankings.
Where Step 2 CK predicts well, it predicts other knowledge outcomes. A meta-analysis of 43 studies tracked this. Step 2 CK correlated with in-training and board-exam scores at r = 0.52 (95% CI 0.45 to 0.59), about 27% of variance (Shirkhodaie, 2023). But its link to subjective resident performance ratings was just r = 0.19 (95% CI 0.13 to 0.25), about 4% of variance (Shirkhodaie, 2023). The exam predicts the next exam well. It tracks the day-to-day work of a doctor badly.
MCAT composite x Step 2 CK: r = 0.31, about 9.6% of shared variance in a 1,065-student cohort (Gauer, 2016). The aptitude test that gates medical school explains roughly a tenth of the Step score we rank doctors by.
Source: Gauer & Jackson 2016, Medical Education Online
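Every "percent of variance" figure in this article comes from squaring a correlation. The sketch below does only that, using the r values quoted in the text; the labels are our shorthand for the cited studies, not their official titles.

```python
# Correlations quoted in this article, strongest first.
# Labels are shorthand for the cited sources.
CORRELATIONS = {
    "SAT vs. measured general intelligence (Frey, 2004)": 0.82,
    "NBME CBSE vs. Step 1 (Giordano, 2016)": 0.711,
    "Step 2 CK vs. later exam scores (Shirkhodaie, 2023)": 0.52,
    "MCAT composite vs. Step 2 CK (Gauer, 2016)": 0.31,
    "Step 2 CK vs. resident ratings (Shirkhodaie, 2023)": 0.19,
}


def shared_variance_pct(r: float) -> float:
    """Percent of variance two measures share: r squared, times 100."""
    return 100 * r**2


for label, r in CORRELATIONS.items():
    print(f"{label}: r = {r:.3f}, shared variance = {shared_variance_pct(r):.1f}%")
```

Squaring is what makes modest correlations look so small: an r of 0.31 sounds respectable, yet it leaves more than 90% of the variance in Step 2 CK scores unaccounted for by the MCAT.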
This is the honest place to draw a contrast. A licensing exam like the Step 2 CK is a proxy. Want a cognitive-ability estimate rather than a credentialing score? That is what a validated IQ test measures. It works on a population-normed scale. It does not infer ability backward through admissions and board exams. The two answer different questions. Conflating them is the whole mistake the "smartest specialty" genre makes. For more on how admissions tests relate to measured IQ, our LSAT, MCAT, and GRE correlation guide traces the same chain.
Do Surgeons Have the Highest IQ?
No. Surgical specialties cluster near the top of the Step 2 CK ranking. Plastic surgery and orthopaedics sit at 256. Neurosurgery sits at 254 (NRMP, 2024). But they trail dermatology's 257. And the spread is small. Surgeons score high because surgical fields are brutal to enter, not because surgery selects for sharper general intelligence.
It helps to see why a precise-looking IQ gap misleads. Take the 13-point Step spread. Run it through three guesses about how much that gap reflects general intelligence. Assuming a conversion that does not exist, you get a "g-attributable" difference between 6.5 and 10.7 IQ-equivalent points (IQ Career Lab analysis of NRMP, 2024). That is a scenario band, honest as a band and nothing tighter. Treating any single figure inside it as the answer would demand the conversion the literature has never established.
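One way to sketch the arithmetic behind such a band: express the Step gap in standard-deviation units, rescale it to the IQ metric (SD 15), then multiply by an assumed share of the gap that reflects general intelligence. The shares of 0.50 and 0.82 below are our illustrative assumptions, chosen because they reproduce the band's endpoints; the intermediate guess is not specified in the text, so it is left out, and the rescaling step itself is the conversion that no study has validated.

```python
STEP_GAP = 13.0  # dermatology mean minus psychiatry mean (NRMP, 2024)
STEP_SD = 15.0   # exam standard deviation used throughout the article
IQ_SD = 15.0     # conventional IQ scale standard deviation

# ASSUMPTION: hypothetical shares of the gap attributed to general
# intelligence. They are illustrative guesses, not measured quantities.
ASSUMED_G_SHARES = (0.50, 0.82)


def g_attributable_gap(step_gap: float, g_share: float) -> float:
    """Rescale a Step gap to IQ-equivalent points, then keep only g_share of it.

    The rescaling treats one exam SD as one IQ SD, which is exactly the
    unvalidated conversion the article warns about.
    """
    return (step_gap / STEP_SD) * IQ_SD * g_share


band = [round(g_attributable_gap(STEP_GAP, share), 1) for share in ASSUMED_G_SHARES]
print(f"Scenario band: {band[0]} to {band[1]} IQ-equivalent points")
```

The output moves by more than four points purely because the assumed share changed, with no new data involved. That sensitivity is why the result is only defensible as a band.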
The Step 1 exam was once the headline number in these rankings. It went Pass/Fail in January 2022. That is why current rankings use Step 2 CK (Pereira, 2023). Older Step 1 means describe a system that no longer reports comparable scores. So a chart that ranks specialties by a three-digit Step 1 mean is using stale data.
Which Doctor Specialty Is the Hardest to Get Into?
By matched-applicant scores and match rates, four fields stand out. Dermatology, plastic surgery, neurosurgery, and orthopaedic surgery are the hardest to enter (NRMP, 2024). They combine few residency slots, elevated applicant Step 2 CK averages, and strong research expectations. Difficulty here means selectivity. That is a separate question from how bright the matched residents are.
The ranking honestly measures how hard a specialty is to enter. It says nothing measurable about the intelligence of the individual doctors inside it.
This is the version of the question the NRMP data answers well. It beats the IQ framing for almost anyone asking. Choosing a path? You want to know which doors are narrow. You want to know how the applicants who passed through them performed on Step 2 CK. You want to know how many slots existed. The table above is a competitiveness map. Read that way, it is reliable.
The mistake lives in the relabeling. "Hardest to match" is a sourced claim; "Highest IQ" is that same data wearing a costume.
What Is the Average IQ of Doctors?
The honest best estimate is a 115 to 130 band, well above the population mean of 100 on the bell-curve distribution. That is the answer most searchers came for. The caveat is that no one measured it head-on. Physicians almost never sit a test like the WAIS, so the band is an inference, not a measurement.
The deeper point is that the question may matter less than it seems. Here the most credible voice in the literature, bioethicist Ezekiel Emanuel, pushes back on the premise itself.
“In medicine, IQ is necessary to master and critically assess the volume and complexity of information integral to contemporary medical education. But past this threshold, success in medicine is ultimately more about emotional intelligence.”

Emanuel's argument appeared in JAMA in 2018. It reframes the exercise. Medicine has a cognitive entry threshold. Clearing it matters. But above that threshold, extra IQ points buy diminishing returns. The variables that separate good doctors from great ones shift toward temperament, communication, and emotional intelligence (Emanuel, 2018).
That reframing fits the data we have seen. Recall that Step 2 CK predicted the next exam at r = 0.52, yet predicted subjective resident performance at just r = 0.19 (Shirkhodaie, 2023). The exam captures knowledge well. It captures the human practice of medicine badly. That is what you would expect if Emanuel is right. The decisive variables live above the threshold, not below it.
What separates physicians past that threshold is closer to temperament and style, the domain a personality assessment maps rather than a cognitive test. A surgeon's steadiness under pressure is not a Step-score variable. Neither is a psychiatrist's capacity for empathy. Yet both of those traits shape a career far more than a 3-point exam gap ever could.
If anything, even the gentler claim that cognitive scores gate the specialty is shakier than it looks: a 2022 reanalysis by Sackett and colleagues found that earlier corrections for range restriction had overstated the link between cognitive scores and job performance (Sackett, 2022). Test scores do not assign you to a tier of medicine.
So What Should You Take From the Ranking?
Treat the NRMP ranking as a competitiveness map and nothing more. It reliably tells you which fields are hardest to enter and how their matched applicants performed on Step 2 CK. What it cannot tell you is which doctors are smartest. The number being ranked is a proxy of a proxy of an intelligence it never measured.
That distinction is not pedantic, and it compounds the further down the chain you read. The MCAT captures a thin slice of Step performance, Step performance captures a thin slice of real-world doctoring, and the specialty mean captures only where a self-selected pool happened to land. Each rung leaks. Critics argue the 2024 data still gestures at a real cognitive floor for medicine, and they have a point, but there is a line between using data well and laundering it into a conclusion. Came here for a status verdict? The data declines to issue one. Came to understand what the numbers really show? They show something more useful than a pecking order: a system that ranks competition with precision and intelligence not at all.
Curious why a 13-point gap across a 15-point standard deviation produces so much overlap? Our explainer on standard deviation in intelligence testing shows how heavily distributions blur the lines between any two groups whose means sit less than a full deviation apart. For the concept beneath all of this, the g-factor explainer covers what general intelligence really is, and why a licensing exam is such a leaky proxy for it.
Erica reached the same conclusion. Her 257 Step score told a real story. It showed how competitive dermatology was and how hard she worked to enter it. It told no story about whether she was the smartest person in the room. Those were never the same question. The ranking everyone shares only ever answered the first.



