Saturday, October 4th, 2003 01:52 pm
As I mentioned in the previous post, I had to come up with a list of irregular plurals in English. This is what I have so far. I'm posting it here because 1) I think they're very pretty words, and 2) if you see any that I'm missing, please post.


calves - calf
elves - elf
halves - half
hooves - hoof
knives - knife
leaves - leaf
lives - life
loaves - loaf
scarves - scarf
selves - self
sheaves - sheaf
shelves - shelf
staves - staff
thieves - thief
wives - wife
wolves - wolf
feet - foot
geese - goose
lice - louse
dice - die
men - man
mice - mouse
teeth - tooth
women - woman
people - person
brethren - brother
children - child
oxen - ox
algae - alga
alumnae - alumna
amoebae - amoeba
antennae - antenna
coronae - corona
faunae - fauna
florae - flora
formulae - formula
larvae - larva
nebulae - nebula
novae - nova
placentae - placenta
pupae - pupa
retinae - retina
supernovae - supernova
vertebrae - vertebra
alumni - alumnus
bacilli - bacillus
cacti - cactus
foci - focus
fungi - fungus
hippopotami - hippopotamus
magi - magus
nuclei - nucleus
octopi - octopus
radii - radius
stimuli - stimulus
syllabi - syllabus
termini - terminus
thesauri - thesaurus
addenda - addendum
bacteria - bacterium
curricula - curriculum
data - datum
errata - erratum
genera - genus
media - medium
memoranda - memorandum
millennia - millennium
ova - ovum
strata - stratum
symposia - symposium
stadia - stadium
apices - apex
appendices - appendix
cervices - cervix
indices - index
matrices - matrix
vortices - vortex
analyses - analysis
axes - axis
bases - basis
crises - crisis
diagnoses - diagnosis
ellipses - ellipsis
emphases - emphasis
hypotheses - hypothesis
metamorphoses - metamorphosis
neuroses - neurosis
oases - oasis
paralyses - paralysis
parentheses - parenthesis
synopses - synopsis
syntheses - synthesis
theses - thesis
criteria - criterion
phenomena - phenomenon
automata - automaton
schemata - schema
stigmata - stigma
cherubim - cherub
seraphim - seraph
beaux - beau
tableaux - tableau
Saturday, October 4th, 2003 04:51 pm (UTC)
So if people use the wrong pluralization (which they definitely do), will it still be recognized as a plural word if it follows the standard English pattern of -s or -es?

'brothers', for example.
Saturday, October 4th, 2003 05:27 pm (UTC)
That is the idea, and it works about as well as any algorithmic approach to natural language can.

De-morphemization happens in two steps. First, words are de-pluralized (check for irregulars, then -ies, then -es, then -s). After that, it checks for various other suffixes (-ing, -ed, -er, -est, -ly, -ize, -less, -ness, -able). The code has no sense of meaning, so if a word matches the check for a plural word, it gets de-pluralized, whether or not that is the 'correct' pluralization.

However, the standard rules fail pretty often, English being a complete whore of a language. 'Vignettes', for instance, will de-pluralize to 'vignett'. 'Brother' ends up as 'broth' (as do 'brothers' and 'brethren'). So you might get some soup-related false positives in this case, but no false negatives. And you'll only get the false positives when the output happens to be a word, which is moderately rare.
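
For anyone who wants to poke at it, here is a rough sketch of the two steps described above, written in Python. It is not the actual code: the function names (`depluralize`, `strip_suffix`, `demorph`), the choice to turn -ies into -y, the minimum stem length, and stopping at the first matching suffix are all assumptions made for illustration, and the irregulars table holds only a handful of entries from the list.

```python
# Small sample of the irregular plurals list (plural -> singular).
IRREGULAR_PLURALS = {
    "brethren": "brother",
    "children": "child",
    "geese": "goose",
    "knives": "knife",
    "mice": "mouse",
}

# Step 2 suffixes, in the order given in the comment above.
SUFFIXES = ("ing", "ed", "er", "est", "ly", "ize", "less", "ness", "able")

# Assumption: never strip a word down to fewer than 3 letters.
MIN_STEM = 3


def depluralize(word):
    """Step 1: check irregulars, then -ies, then -es, then -s."""
    if word in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[word]
    # Assumption: -ies becomes -y; -es and -s are simply dropped.
    for ending, replacement in (("ies", "y"), ("es", ""), ("s", "")):
        stem = word[: len(word) - len(ending)]
        if word.endswith(ending) and len(stem) >= MIN_STEM:
            return stem + replacement
    return word


def strip_suffix(word):
    """Step 2: drop the first matching suffix, with no sense of meaning."""
    for suffix in SUFFIXES:
        stem = word[: len(word) - len(suffix)]
        if word.endswith(suffix) and len(stem) >= MIN_STEM:
            return stem
    return word


def demorph(word):
    """Run both steps on a lower-cased word."""
    return strip_suffix(depluralize(word.lower()))


for w in ("vignettes", "brother", "brothers", "brethren", "ponies"):
    print(w, "->", demorph(w))
# vignettes -> vignett
# brother -> broth
# brothers -> broth
# brethren -> broth
# ponies -> pony
```

Even this toy version reproduces the soup problem: 'brothers' loses its -s in step 1, and then 'brother' loses its -er in step 2.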