Idiolect is an individual's unique use of language, including speech. This unique usage encompasses vocabulary, grammar, and pronunciation.

An idiolect is the variety of language unique to an individual. This differs from a dialect, a common set of linguistic characteristics shared among a group of people.

The term is etymologically related to the Greek prefix idio- (meaning "own, personal, private, peculiar, separate, distinct") and -lect, abstracted from dialect,[1] and ultimately from Ancient Greek λέγω, légō, 'I speak'.

Language consists of sentence constructs, word choice, and expression of style; an idiolect is an individual's personal use of these facets. Every person has a unique idiolect shaped by their language, socioeconomic status, and geographical location. Forensic linguistics includes the psychological analysis of idiolects.[2]

The notion of a language is used as an abstract description of language use and of the abilities of individual speakers and listeners.[3] According to this view, a language is an "ensemble of idiolects... rather than an entity per se".[3] Linguists study particular languages by examining the utterances produced by native speakers.

This contrasts with a view among non-linguists, at least in the United States, that languages as ideal systems exist outside the actual practice of language users. Based on work done in the US, Nancy Niedzielski and Dennis Preston describe a language ideology that appears to be common among American English speakers. According to Niedzielski and Preston, many of their subjects believe that there is one "correct" pattern of grammar and vocabulary that underlies Standard English, and that individual usage derives from this external system.[4]

Linguists who understand particular languages as a composite of unique, individual idiolects must nonetheless account for the fact that members of large speech communities, and even speakers of different dialects of the same language, can understand one another. All human beings seem to produce language in essentially the same way.[5] This has led to searches for universal grammar, as well as attempts to further define the nature of particular languages.

Forensic linguistics includes attempts to identify whether a person produced a given text by comparing the style of the text with the idiolect of the individual in question. The forensic linguist may conclude that the text is consistent with the individual, rule out the individual as the author, or deem the comparison inconclusive.[6]

In 1995 Max Appedole relied in part on an analysis of Rafael Sebastián Guillén Vicente's writing style to identify him as Subcomandante Marcos, a leader of the Zapatista movement. Although the Mexican government regarded Subcomandante Marcos as a dangerous guerilla, Appedole convinced the government that Guillén was a pacifist. Appedole's analysis is considered an early success in the application of forensic linguistics to criminal profiling in law enforcement.[7][8]

In 1996 Ted Kaczynski was identified as the "Unabomber" with the help of forensic linguistics. The FBI and Attorney General Janet Reno pushed for the publication of an essay of Kaczynski's, which led to a tip-off from Kaczynski's brother, who recognized the writing style, that is, his idiolect.[9]

Four men were accused and convicted of the 1978 murder of Carl Bridgewater. No forensic linguistics was involved in their case at the time. Later forensic linguistic analysis found that the idiolect in the record of one man's interview was very similar to that in his reported statement. Since idiolects are unique to an individual, this suggests that one of the documents was very likely created using the other.[10]

Idiolect analysis differs depending on whether the corpus being analyzed consists of written texts or audio recordings. Written work is more carefully planned and more precisely worded than spontaneous speech, in which informal language and conversational fillers (e.g. "umm" and "you know") fill corpus samples.[11] Corpora with large amounts of input data allow for the generation of word frequency and synonym lists, normally using the top ten bigrams derived from the corpus; the context of word usage is taken into account when determining whether a bigram is legitimate in given circumstances.[12]
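
As a rough illustration of how a top-ten bigram list might be produced from a corpus, the following Python sketch counts adjacent word pairs. The function names `tokenize` and `top_bigrams`, the simple tokenization rule, and the sample sentence are assumptions made for this example and are not taken from the cited studies; real analyses would also weigh the context in which each bigram occurs.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def top_bigrams(text, n=10):
    """Return the n most frequent adjacent word pairs together with their counts."""
    tokens = tokenize(text)
    pairs = zip(tokens, tokens[1:])  # each token paired with its successor
    return Counter(pairs).most_common(n)


# Hypothetical sample of transcribed speech containing a conversational filler.
sample = "you know I was going to say, you know, that I was late"
for pair, count in top_bigrams(sample):
    print(pair, count)
```

In this toy sample the filler "you know" surfaces among the most frequent bigrams, which is the kind of recurring pattern such a list is meant to expose.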

Whether a word or phrase is part of an idiolect is determined by where it sits relative to the window's head word and to the edge of the window. The window is kept to 7–10 words, and a sample being considered as a feature of the idiolect may be up to five words before or after the "head" word of the window (which is normally in the middle).[13] Data in a corpus pertaining to idiolect are sorted into three categories: irrelevant, personal discourse marker(s), and informal vocabulary.[14] Samples that are at the end of the frame and far from the head word are often ruled to be superfluous. Superfluous data are then run through different functions than non-superfluous data to determine whether the word or phrase is part of an individual's idiolect.[15]
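
The windowing idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `context_window` and `split_by_distance`, the `near` cutoff of three positions, and the sample utterance are hypothetical, and the source does not specify how "far from the head word" is measured or which functions are applied afterwards.

```python
def context_window(tokens, head_index, radius=5):
    """Return (offset, token) pairs within `radius` positions of the head word.

    The default radius of 5 mirrors the +5/-5 range mentioned in the text.
    """
    start = max(0, head_index - radius)
    end = min(len(tokens), head_index + radius + 1)
    return [(i - head_index, tokens[i]) for i in range(start, end) if i != head_index]


def split_by_distance(window, near=3):
    """Separate tokens close to the head word from those near the window's edge.

    `near` is a hypothetical cutoff chosen for this example, not a value from the source.
    """
    close = [(off, tok) for off, tok in window if abs(off) <= near]
    peripheral = [(off, tok) for off, tok in window if abs(off) > near]
    return close, peripheral


# Hypothetical utterance; "know" is treated as the head word.
tokens = "well I mean you know it was basically fine in the end".split()
head = tokens.index("know")
window = context_window(tokens, head)
close, peripheral = split_by_distance(window)
print(close)
print(peripheral)
```

Tokens in `peripheral` correspond loosely to the samples at the end of the frame that the text describes as often being ruled superfluous; how they are then processed is left open here.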