Voice doesn't have a gender — voices have acoustic properties (pitch, resonance, intonation) that listeners interpret as masculine, feminine, or androgynous. Three signals dominate: average pitch (F0), vocal tract resonance, and speech patterns. Most voices land clearly in one zone; many sit in the overlap. Here's how listeners and AI actually tell — and why your voice might not match what you expected.
Below: the three acoustic levers, the overlap zone where stereotype breaks down, how AI puts a number on perceived gender, what voice training can and can't do, and why none of this defines you.
Pitch is the loudest cue, but it's not the whole story. A 2024 PLOS One analysis of 47 speakers across five gender categories found that fundamental frequency dominates simple judgments, but as speech tasks get more complex (reading versus a sustained vowel), resonance, breathiness, speech rate, and spectral emphasis all start carrying weight. Listeners use a stack of cues, not one.
F0 is the rate at which your vocal folds vibrate, which most people call "pitch." Adult cisgender male F0 averages around 100–130 Hz; adult cisgender female F0 averages around 165–220 Hz, per Hollien's long-standing speaking-frequency norms. The PLOS One study cited above measured cisgender women at a median of 213.9 Hz and cisgender men at a median of 124.2 Hz, close to Hollien's ranges half a century later.
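For readers curious how an analyzer gets from a waveform to an F0 number, here is a minimal sketch using plain autocorrelation on a synthetic tone. It is an illustration under stated assumptions, not a description of how Praat or any particular tool works: the names `estimate_f0` and `synthetic_voice`, the 60–400 Hz search range, and the test signal are all invented for this example, and real speech needs voicing detection and smoothing on top.

```python
import math

def estimate_f0(samples, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate F0 (Hz) of one voiced frame with plain autocorrelation.

    Hypothetical helper: the 60-400 Hz search range is an assumption.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]                  # remove any DC offset
    lag_min = int(sample_rate / f0_max)              # shortest period searched
    lag_max = min(int(sample_rate / f0_min), n - 1)  # longest period searched
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised sum: shorter lags overlap more samples, which nudges
        # the search toward the true period rather than a multiple of it.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

def synthetic_voice(f0_hz, sample_rate=16000, duration_s=0.05):
    """Stand-in for a voiced frame: a fundamental plus two weaker harmonics."""
    count = int(sample_rate * duration_s)
    return [
        math.sin(2 * math.pi * f0_hz * t / sample_rate)
        + 0.5 * math.sin(2 * math.pi * 2 * f0_hz * t / sample_rate)
        + 0.25 * math.sin(2 * math.pi * 3 * f0_hz * t / sample_rate)
        for t in range(count)
    ]

# The two medians quoted above, used here only as test tones.
low_estimate = estimate_f0(synthetic_voice(124.2), 16000)
high_estimate = estimate_f0(synthetic_voice(213.9), 16000)
print(round(low_estimate, 1), round(high_estimate, 1))
```

Because the search steps through whole-sample lags, the estimate lands within a few hertz of the true value rather than exactly on it. Production pitch trackers interpolate between lags and average over many frames.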
Listener perception of gender from pitch alone isn't a hard cutoff. Recognition rates stay above 80% when F0 is below roughly 138 Hz (read masculine) or above roughly 163 Hz (read feminine), and drop sharply in between. The zone between those values is where pitch alone stops disambiguating.
Formants are the resonances of your vocal tract — the frequencies your throat and mouth amplify on top of the F0 your folds produce. Longer vocal tracts (typically associated with male anatomy) produce lower formants and a "darker," chestier sound. Shorter tracts produce higher formants and a brighter, more head-forward sound.
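The link between tract length and formants can be sketched with the textbook uniform-tube approximation, in which a tube closed at one end (the glottis) and open at the other (the lips) resonates at odd quarter-wavelength frequencies. This is a deliberately crude model of a neutral vowel. The function name, the two tract lengths, and the speed of sound below are illustrative assumptions, not measurements from any study cited here.

```python
def tube_formants(tract_length_cm, count=3, speed_of_sound_cm_s=35000.0):
    """Resonances (Hz) of a uniform tube closed at one end.

    Quarter-wavelength model: F_n = (2n - 1) * c / (4 * L).
    The 35,000 cm/s speed of sound is an assumed round figure.
    """
    return [
        (2 * n - 1) * speed_of_sound_cm_s / (4.0 * tract_length_cm)
        for n in range(1, count + 1)
    ]

# Illustrative lengths only: one longer tract, one shorter tract.
longer_tract = tube_formants(17.5)
shorter_tract = tube_formants(14.5)

for label, formants in (("17.5 cm", longer_tract), ("14.5 cm", shorter_tract)):
    print(label, [round(f) for f in formants])
```

Every resonance of the shorter tube sits above the matching resonance of the longer one, which is the "brighter" shift described above. Real vocal tracts are not uniform tubes, so actual vowel formants vary widely around these values.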
Research on formant biofeedback in voice feminization shows that raising the second formant (F2) increases perceived femininity independently of F0. A 2024 integrative review of 45 voice-gender studies calls formants "crucial secondary cues," with higher average formant frequencies acting as strong predictors of perceived femininity in transgender women's voices. In simpler terms: if your formants read masculine, raising your pitch alone won't get listeners to read your voice as feminine.
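This is why basic pitch-shifting plugins sound off. They scale the whole spectrum, so the formants get dragged along by the same ratio as F0 instead of being set independently, producing the "chipmunk" or "Darth Vader" effect: the resonance no longer sounds like a plausible vocal tract for that pitch.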
How pitch moves through a sentence matters as much as where it sits. More dynamic pitch variation, rising terminal patterns, and wider melodic contours read as more feminine. Flatter, falling patterns read as more masculine. Research on transmasculine voices found that even after testosterone therapy successfully lowers F0 into the cisgender male range, 23% of trans men still showed F0 standard deviation values above the highest cisgender male values — i.e. their pitch variability still patterned as feminine despite their average pitch reading masculine. Prosody is a separate dial.
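One common way to put a number on pitch variability is the standard deviation of the F0 contour, expressed in semitones so that voices with different average pitches can be compared. The sketch below assumes you already have a frame-by-frame F0 track in Hz with unvoiced frames marked as 0. The function name and both example contours are made up for illustration and are not data from the studies above.

```python
import math
import statistics

def f0_variability_semitones(f0_track_hz):
    """Standard deviation of voiced F0 frames, in semitones.

    Frames are measured relative to the track's own median, so the result
    reflects how much pitch moves, not where it sits.
    """
    voiced = [f for f in f0_track_hz if f and f > 0]  # drop unvoiced frames
    reference = statistics.median(voiced)
    semitones = [12.0 * math.log2(f / reference) for f in voiced]
    return statistics.stdev(semitones)

# Invented contours: one nearly monotone, one with wide melodic movement.
flat_contour = [118, 120, 0, 121, 119, 120, 122, 0, 119]
dynamic_contour = [105, 128, 0, 150, 112, 135, 98, 0, 142]

flat_sd = f0_variability_semitones(flat_contour)
dynamic_sd = f0_variability_semitones(dynamic_contour)
print(round(flat_sd, 2), round(dynamic_sd, 2))
```

Two voices with a similar average pitch can produce very different values here, which is the sense in which prosody is a separate dial.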
Articulation, breathiness, and vocal weight (chest-heavy versus head-light) all layer on top. The key insight: F0 alone doesn't determine perceived gender. A voice with high F0 but masculine resonance still reads masculine. This is why pitch-only training plateaus quickly.
Speaking pitch is roughly normally distributed within each group, and the male and female distributions overlap significantly. Voice Science's review of average speaking frequencies gives Hollien's ranges as 90.5–165.2 Hz for adult males and 165–294 Hz for adult females. The top of the male range and the bottom of the female range meet at around 165 Hz.
That overlap is real, common, and not pathological:
Deeper voices in cis women and higher voices in cis men are normal variation, not defects. The distributions overlap. That's the whole story for most people asking the question.
No. Voice is acoustic data — pitch, formants, prosody, vocal weight. Gender perception is the listener's interpretation of that data, shaped by cultural training. The same acoustic features that read as "masculine" in one cultural context might read neutrally in another, and the boundary between "feminine" and "masculine" voices shifts across decades and across languages.
The 2024 integrative review makes this concrete: listener identity changes the perception. Gender-diverse listeners rated voices on a non-binary scale and showed distinct perception patterns from cisgender listeners. Voice gender isn't a property of the voice — it's an event in the listener.
That matters because it means there's no "true" gender of your voice waiting to be uncovered. There's how listeners (and models trained on listener judgments) tend to read it.
A practical, four-step read you can do in about ten minutes:
Step 1 — F0 check. Record yourself reading a paragraph in your natural voice (no podcast voice, no performance). Drop the file into a free analyzer — Praat works, a smartphone tuner app works, our Voice Depth Analyzer works. Compare your average F0 against the perceptual zones:
| F0 range | Perceptual read |
|---|---|
| Below ~138 Hz | Reads masculine (>80% of listeners) |
| 138–163 Hz | Androgynous / depends on other cues |
| Above ~163 Hz | Reads feminine (>80% of listeners) |
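The table above translates directly into a small lookup. This is a minimal sketch: the function name is hypothetical, the 138 Hz and 163 Hz cutoffs are the approximate values from the table, and the result is a pitch-only read that ignores resonance and prosody.

```python
def perceptual_zone(mean_f0_hz, low_hz=138.0, high_hz=163.0):
    """Map an average speaking F0 to the pitch-only perceptual zone.

    The cutoffs are approximate listener-agreement boundaries, not hard lines.
    """
    if mean_f0_hz < low_hz:
        return "reads masculine"
    if mean_f0_hz > high_hz:
        return "reads feminine"
    return "androgynous / depends on other cues"

for f0 in (124.2, 150.0, 213.9):
    print(f0, "->", perceptual_zone(f0))
```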
Step 2 — Resonance check. Listen to your recording. Does the voice feel like it's coming from your chest (darker, fuller, "heavier") or from the front of your face (brighter, lighter)? Chest-heavy resonance reads masculine; forward, head-heavy resonance reads feminine. This is independent of pitch.
Step 3 — Prosody check. Record yourself describing your weekend casually. Listen back. Does pitch move dramatically up and down, or stay relatively flat? Do statements tend to end with rising pitch (more feminine pattern) or falling pitch (more masculine pattern)?
Step 4 — Stranger test. Send the recording to someone who hasn't heard you before and ask what they perceive. The naive listener is the closest thing to a ground truth, because gender perception lives in the listener.
If steps 1–3 disagree (e.g. your pitch reads masculine but your resonance and prosody read feminine), your voice is probably perceived as androgynous or mixed. That's a real outcome — it doesn't mean any of the measurements are wrong.
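The rule for reconciling the first three steps can be written down in a few lines. This is a sketch of the rule of thumb stated above, with a hypothetical function name and label strings; it treats the three cues as equally weighted, which is a simplification.

```python
def combined_read(pitch_read, resonance_read, prosody_read):
    """Combine the three self-check reads into one overall impression.

    If all three cues agree, return that read; any disagreement is
    reported as a mixed result rather than forced into one category.
    """
    reads = {pitch_read, resonance_read, prosody_read}
    if len(reads) == 1:
        return pitch_read
    return "androgynous or mixed"

print(combined_read("masculine", "masculine", "masculine"))
print(combined_read("masculine", "feminine", "feminine"))
```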
Gender-affirming voice and communication training is a recognized clinical specialty. ASHA's practice portal lists it as a service speech-language pathologists provide, covering pitch, resonance, intonation, articulation, and nonverbal communication. The research base is large and growing — there are decades of peer-reviewed work on outcomes, methods, and timelines.
The clinical evidence is consistent: pitch alone isn't enough. The Journal of Voice study on formant biofeedback found that successful feminization training raises F0 and formants together, with F3 in particular shifting significantly post-treatment. A 2024 Journal of Voice acoustic outcomes study followed trans women through a 10-week training program covering pitch elevation and articulation-resonance work — F0 stayed elevated at 3 months and 1 year post-training, with a modest 16 Hz drift back over the year that's worth knowing about.
Typical components of an evidence-based feminization protocol:
Testosterone HRT does most of the F0 work for trans men. Brigham Young's longitudinal study of trans men on testosterone found post-therapy average F0 of 116.8 Hz versus 110.6 Hz in cis men and 192.5 Hz in cis women — almost all participants landed inside the cis male range within 9–12 months, with the biggest changes in months 3–6.
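Differences in hertz are easier to judge when converted to semitones, because pitch perception is roughly logarithmic. The sketch below applies the standard conversion, 12 times the base-2 logarithm of the frequency ratio, to the averages quoted above; the function name is hypothetical.

```python
import math

def semitone_gap(f_a_hz, f_b_hz):
    """Signed distance from f_b to f_a in semitones (12 per octave)."""
    return 12.0 * math.log2(f_a_hz / f_b_hz)

trans_men_post_t = 116.8  # Hz, post-therapy average quoted above
cis_men = 110.6           # Hz
cis_women = 192.5         # Hz

gap_to_cis_men = semitone_gap(trans_men_post_t, cis_men)
gap_to_cis_women = semitone_gap(cis_women, trans_men_post_t)
print(round(gap_to_cis_men, 2), round(gap_to_cis_women, 2))
```

On these figures the post-therapy average sits less than one semitone above the cis male average and between eight and nine semitones below the cis female average.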
But hormones don't change everything. The same study found 23% of participants had pitch variability outside the cis male range, and vocal tract length stayed shorter than cis male averages. Resonance and prosody don't shift automatically with HRT — they need behavioral training.
Community-developed resources fill in where peer-reviewed training protocols are sparse. Romeo's Trans Masculine Voice Training Guide on r/transvoice is the most-cited peer-developed reference in transmasc training communities. It walks through weight (chest-anchored versus head-anchored production), resonance lowering via larynx position, and flatter prosody. To be clear about what it is: it's a community resource, not peer-reviewed clinical literature, and it frames things in language transmasc trainees actually use rather than in SLP jargon. It's worth reading as a complement to clinical training, not as a substitute for working with an SLP if you can access one.
Typical components of an evidence-based masculinization protocol (with or without HRT):
A note on safety: pushing pitch beyond what's comfortable can strain the vocal folds and, over time, contribute to nodules and polyps, per ASHA's voice disorder guidance. Working with an SLP trained in gender-affirming voice is the gold standard. Self-practice with AI feedback can supplement that work but not replace it, particularly for high-intensity training.
You don't need to train your voice to be valid. Some trans people pursue voice training and many don't; both are normal. If voice training serves your goals, the techniques above work. If it doesn't, your voice is fine as it is.
The acoustic pipeline mirrors what listeners do, with more precision and less cultural noise. Modern systems extract:
These features feed into either a classifier (binary or scored on a continuous masculine-to-feminine axis) or a regression model. Recent work has moved toward continuous scoring on a perceived masculinity/femininity scale rather than binary classification, which matches how human listeners actually hear voices — most people aren't 100% one or the other.
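To make the difference between continuous scoring and binary classification concrete, here is a toy logistic score over two features. Everything numeric in it is a placeholder chosen for illustration: the midpoints, the weights, and the choice of F0 plus F2 as the only inputs are assumptions, not the parameters of any real model or of the tools mentioned in this article.

```python
import math

def femininity_score(f0_hz, f2_hz, f0_mid_hz=150.0, f2_mid_hz=1600.0,
                     f0_weight=0.6, f2_weight=0.4):
    """Toy continuous score in (0, 1); 0.5 means perceptually ambiguous.

    All midpoints and weights are hypothetical placeholders. A trained
    model would learn them from listener ratings.
    """
    f0_offset = 12.0 * math.log2(f0_hz / f0_mid_hz)  # semitones from midpoint
    f2_offset = 12.0 * math.log2(f2_hz / f2_mid_hz)
    z = f0_weight * f0_offset + f2_weight * f2_offset
    return 1.0 / (1.0 + math.exp(-z))

def binary_label(score, threshold=0.5):
    """Collapse the continuous score into a two-way label."""
    return "feminine" if score >= threshold else "masculine"

for f0, f2 in ((110.0, 1300.0), (155.0, 1580.0), (220.0, 1900.0)):
    score = femininity_score(f0, f2)
    print(f0, f2, round(score, 2), binary_label(score))
```

The middle example shows what the binary label throws away: a score close to 0.5 still gets forced into one category, while the continuous value keeps the ambiguity visible.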
To be honest about the limits: AI is good at clear-cut cases (F0 well outside the overlap zone, consistent resonance) and genuinely uncertain in the overlap zone, just as humans are. Phone-mic recordings often roll off below 80 Hz and above 8 kHz, which can clip formant information. Background noise pushes models toward octave errors on F0, where the tracker reports double or half the true pitch. A 5-second clip carries less prosody signal than a 30-second one.
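An octave error shows up in an F0 track as an isolated frame at roughly double or half the surrounding pitch. One simple cleanup heuristic, sketched below under assumptions, folds such frames back toward the track's median. The function name and the 10% tolerance are invented for this example, and the approach is not a description of what any particular analyzer does.

```python
import statistics

def fold_octave_errors(f0_track_hz, tolerance=0.1):
    """Fold frames near 2x or 0.5x the median F0 back by one octave.

    Unvoiced frames (0) pass through untouched. The tolerance is relative:
    0.1 treats ratios of 1.8-2.2 and 0.45-0.55 as octave errors.
    """
    voiced = [f for f in f0_track_hz if f > 0]
    reference = statistics.median(voiced)
    cleaned = []
    for f in f0_track_hz:
        if f > 0:
            ratio = f / reference
            if abs(ratio - 2.0) <= tolerance * 2.0:
                f = f / 2   # tracker locked onto the second harmonic
            elif abs(ratio - 0.5) <= tolerance * 0.5:
                f = f * 2   # tracker locked onto a subharmonic
        cleaned.append(f)
    return cleaned

# Invented track with one doubled frame (241) and one halved frame (60.5).
noisy_track = [120, 122, 241, 119, 60.5, 121, 0, 123]
clean_track = fold_octave_errors(noisy_track)
print(clean_track)
```

The heuristic assumes most frames are correct, so it breaks down on very noisy recordings where errors outnumber good frames, and it can misfire on speakers whose pitch genuinely spans close to an octave.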
A critical framing: AI doesn't know your gender. It estimates how listeners would perceive your voice. Those are different things. The same applies to clinical voice assessment — you're measuring perception, not identity.
Want to hear what AI reads your voice as? Try our Gender Voice Analysis — upload a recording, get an estimate on the masculine-feminine axis with the acoustic features behind it. Useful as a self-check, for voice acting practice, and as a feedback signal during gender-affirming voice training. Free, no signup, instant. The estimate reflects how listeners are likely to perceive your voice — not anything about you as a person.
For a fuller acoustic read, layer on Vocal Analysis for tone and breath support, Voice Depth Analyzer for F0 with norms, or Voice Health Analyzer for fatigue markers if you're training intensively.
Cross-gender voice acting works using the same toolkit as gender-affirming voice training, applied temporarily for performance. Voice actors and narrators describe the toolkit as pitch, placement, pacing, accent, and attitude — with placement (where the voice resonates) often doing more work than pitch.
Nancy Cartwright voices Bart Simpson with a raspy, adolescent timbre that lives in the male-child resonance range — achieved through technique, not raw pitch shifting. Female voice actors voicing young male characters is so common in animation that Fox initially asked Cartwright not to do interviews to avoid publicizing it. The technique is the technique — pitch, resonance, prosody, character — and it works in both directions.
For voice actors training cross-gender voices, the Gender Voice Analysis tool works as fast feedback: try a take, hear how it reads, adjust.
Voice type (soprano, mezzo-soprano, alto, tenor, baritone, bass: the six vocal range categories covered in our vocal range article) is about your singing range and tessitura. Voice gender is about how listeners perceive your speaking voice. They overlap but aren't the same axis.
If you want a singing-range read, use the Voice Type Classifier for Fach placement. If you want a perceived-gender read on your speaking voice, use the Gender Voice Analysis. They answer different questions.
For the AI read on your own voice: Gender Voice Analysis places your voice on the masculine-feminine perceptual axis with the acoustic features behind it. The Voice Type Classifier handles singing-voice Fach placement. Vocal Analysis covers tone, breath support, and pitch stability. The Voice Depth Analyzer returns your F0 with norms and percentiles. The Voice Health Analyzer flags fatigue markers. All free, no signup.