Syntactic category

A syntactic category is a syntactic unit that theories of syntax assume.[1] Word classes, largely corresponding to traditional parts of speech (e.g. noun, verb, preposition), are syntactic categories. In phrase structure grammars, the phrasal categories (e.g. noun phrase, verb phrase, prepositional phrase) are also syntactic categories. Dependency grammars, however, do not acknowledge phrasal categories (at least not in the traditional sense).[2]

Word classes considered as syntactic categories may be called lexical categories, as distinct from phrasal categories. The terminology is somewhat inconsistent and depends on the theory of grammar in question.[2] However, many grammars also draw a distinction between lexical categories (which tend to consist of content words, or phrases headed by them) and functional categories (which tend to consist of function words or abstract functional elements, or phrases headed by them). The term lexical category therefore has two distinct meanings. Moreover, syntactic categories should not be confused with grammatical categories (also known as grammatical features), which are properties such as tense and gender.

At least three criteria are used in defining syntactic categories: the type of meaning a unit expresses, the type of affixes it takes, and the structure in which it occurs. For instance, many nouns in English denote concrete entities, are pluralized with the suffix -s, and occur as subjects and objects in clauses. Many verbs denote actions or states, are conjugated with agreement suffixes (e.g. -s of the third person singular in English), and in English tend to show up in medial positions of the clauses in which they appear.

The third criterion, the structure in which a unit occurs, is also known as distribution. The distribution of a given syntactic unit determines the syntactic category to which it belongs. The distributional behavior of syntactic units is identified by substitution.[3] Syntactic units of the same category can be substituted for each other.
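The substitution idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the four-sentence corpus is invented, a "frame" is taken to be the pair of immediately neighbouring words, and the names contexts and substitutable are hypothetical. Real distributional analysis uses far richer contexts.

```python
from collections import defaultdict

# Toy corpus (hypothetical sentences, chosen only for illustration).
SENTENCES = [
    "the dog sleeps",
    "the cat sleeps",
    "the dog eats",
    "a cat eats",
]

def contexts(sentences):
    """Map each word to the set of (left neighbour, right neighbour) frames it occurs in."""
    frames = defaultdict(set)
    for sentence in sentences:
        # Pad with boundary markers so first and last words also get a frame.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            frames[tokens[i]].add((tokens[i - 1], tokens[i + 1]))
    return frames

def substitutable(word_a, word_b, frames):
    """True if the two words share at least one frame, i.e. one can replace the other there."""
    return bool(frames.get(word_a, set()) & frames.get(word_b, set()))

frames = contexts(SENTENCES)
print(substitutable("dog", "cat", frames))     # both occur in the frame "the _ sleeps"
print(substitutable("dog", "sleeps", frames))  # no shared frame in this corpus
```

In this toy corpus, dog and cat pattern together, as do sleeps and eats, while dog and sleeps never share a frame, which is the kind of evidence the substitution test relies on.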

There are also informal criteria for determining syntactic categories. For example, one informal means of determining whether an item is lexical, as opposed to functional, is to see whether it is left behind in "telegraphic speech" (that is, the way a telegram would be written; e.g., Pants fire. Bring water, need help.)[4]
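The telegraphic-speech heuristic can be mimicked with a simple filter. The sketch below is an illustration only: the set FUNCTION_WORDS is a small hand-picked assumption, not an established inventory, the full sentences fed to it are invented expansions of the telegram example, and treating pronouns such as "I" as functional is itself a theory-dependent choice.

```python
# Hypothetical, deliberately tiny list of function words (an assumption for illustration).
FUNCTION_WORDS = {"my", "are", "on", "i", "the", "a", "some"}

def telegraphic(sentence):
    """Drop words found in FUNCTION_WORDS, keeping the rest with their punctuation."""
    kept = [
        token
        for token in sentence.split()
        # Compare case-insensitively and ignore trailing punctuation when looking up.
        if token.lower().strip(".,") not in FUNCTION_WORDS
    ]
    return " ".join(kept)

print(telegraphic("My pants are on fire."))
print(telegraphic("Bring some water, I need help."))
```

The words that survive the filter are the ones the heuristic would classify as lexical; the ones removed are candidates for functional status.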

The traditional parts of speech are lexical categories, in one meaning of that term.[5] Traditional grammars tend to acknowledge approximately eight to twelve lexical categories, e.g.

The lexical categories that a given grammar assumes will likely vary from this list. Certainly numerous subcategories can be acknowledged. For instance, one can view pronouns as a subtype of noun, and verbs can be divided into finite verbs and non-finite verbs (e.g. gerund, infinitive, participle, etc.). The central lexical categories give rise to corresponding phrasal categories:[6]

In terms of phrase structure rules, phrasal categories can occur to the left of the arrow while lexical categories cannot, e.g. NP → D N. Traditionally, a phrasal category should consist of two or more words, although conventions vary in this area. X-bar theory, for instance, often sees individual words corresponding to phrasal categories. Phrasal categories are illustrated with the following trees:

The lexical and phrasal categories are identified according to the node labels, phrasal categories receiving the "P" designation.
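The division of labour between the two kinds of category can be sketched with a toy grammar. Everything here is an assumption made for illustration: the three rules, the five-word lexicon, the root symbol S, and the helper names parse and labels. The parser is a naive top-down recognizer that takes the first expansion that succeeds, not a general parsing algorithm.

```python
# Phrase structure rules: only phrasal categories appear to the left of the arrow.
RULES = {
    "S": [["NP", "VP"]],   # S  -> NP VP
    "NP": [["D", "N"]],    # NP -> D N
    "VP": [["V", "NP"]],   # VP -> V NP
}

# Lexical categories are filled directly by words (toy lexicon, hypothetical).
LEXICON = {
    "D": {"the", "a"},
    "N": {"dog", "bone"},
    "V": {"chews"},
}

def parse(category, tokens, start=0):
    """Try to derive `category` from tokens[start:]; return (tree, next_index) or None."""
    if category in LEXICON:
        if start < len(tokens) and tokens[start] in LEXICON[category]:
            return (category, tokens[start]), start + 1
        return None
    for expansion in RULES.get(category, []):
        children, position = [], start
        for child_category in expansion:
            result = parse(child_category, tokens, position)
            if result is None:
                break
            subtree, position = result
            children.append(subtree)
        else:
            # Every child of this expansion was derived successfully.
            return (category, *children), position
    return None

def labels(tree):
    """Collect node labels in preorder."""
    label, *rest = tree
    found = [label]
    for child in rest:
        if isinstance(child, tuple):
            found.extend(labels(child))
    return found

tree, end = parse("S", "the dog chews a bone".split())
print(tree)
print([label for label in labels(tree) if label in RULES])  # phrasal node labels only
```

Because RULES and LEXICON share no keys, the sketch makes the asymmetry explicit: a lexical category such as N never heads a rule, while NP and VP do.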

Dependency grammars do not acknowledge phrasal categories in the way that phrase structure grammars do.[2] This means that the interaction between lexical and phrasal categories disappears, so that only the lexical categories are acknowledged.[7] The tree representations are simpler because the number of nodes and categories is reduced, e.g.

The distinction between lexical and phrasal categories is absent here. The number of nodes is reduced by removing all nodes marked with "P". Note, however, that phrases can still be acknowledged insofar as any subtree that contains two or more words will qualify as a phrase.
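A dependency structure of this kind can be sketched as a tree whose nodes are words labelled only with lexical categories. The example sentence, the Node class, and the helper names are assumptions made for illustration, as is the choice to treat each determiner as a dependent of its noun; dependency analyses differ on such details.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    category: str                      # lexical category only; no phrasal labels
    dependents: list = field(default_factory=list)

def words(node):
    """All words in the subtree rooted at `node` (head first, not in surface order)."""
    collected = [node.word]
    for dependent in node.dependents:
        collected.extend(words(dependent))
    return collected

def phrases(node):
    """Every subtree that contains two or more words counts as a phrase."""
    found = []
    subtree_words = words(node)
    if len(subtree_words) >= 2:
        found.append(subtree_words)
    for dependent in node.dependents:
        found.extend(phrases(dependent))
    return found

# "the dog chews a bone": the verb is the root, the nouns depend on it.
tree = Node("chews", "V", [
    Node("dog", "N", [Node("the", "D")]),
    Node("bone", "N", [Node("a", "D")]),
])

print(len(words(tree)))  # one node per word
print(phrases(tree))     # subtrees with at least two words
```

The tree has exactly one node per word, yet three of its subtrees still qualify as phrases, even though no node carries a "P" label.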

Many grammars draw a distinction between lexical categories and functional categories.[8] This distinction is orthogonal to the distinction between lexical categories and phrasal categories. In this context, the term lexical category applies only to those parts of speech and their phrasal counterparts that form open classes and have full semantic content. The parts of speech that form closed classes and have mainly just functional content are called functional categories:

Lexical categories: adjective (A) and adjective phrase (AP), adverb (Adv) and adverb phrase (AdvP), noun (N) and noun phrase (NP), verb (V) and verb phrase (VP), preposition (P) and prepositional phrase (PP)
Functional categories: coordinate conjunction (C), determiner (D), negation (Neg), particle (Par), preposition (P) and prepositional phrase (PP), subordinate conjunction (Sub), etc.

There is disagreement in certain areas, for instance concerning the status of prepositions, which some accounts treat as lexical and others as functional. The distinction between lexical and functional categories is central to Chomskyan grammars (Transformational Grammar, Government and Binding Theory, Minimalist Program), which rely heavily on functional categories. Many phrasal categories are assumed that do not correspond directly to a specific part of speech, e.g. inflection phrase (IP), tense phrase (TP), agreement phrase (AgrP), focus phrase (FP), etc. (see also Phrase → Functional categories). In order to acknowledge such functional categories, one has to assume that the constellation is a primitive of the theory and that it exists separately from the words that appear. As a consequence, many grammar frameworks do not acknowledge such functional categories, e.g. Head-Driven Phrase Structure Grammar, Dependency Grammar, etc.

Note: The abbreviations for these categories vary across systems; see Part-of-speech tagging § Tag sets.

Early research suggested shifting away from the use of labels, which were considered non-optimal for the analysis of syntactic structure and were therefore to be eliminated.[9] Collins (2002) argued that, although labels such as Noun, Pronoun, Adjective and the like were unavoidable and undoubtedly useful for categorizing syntactic items, providing labels for the projections of those items was not useful and was in fact detrimental to structural analysis, since there was disagreement about how exactly to label these projections. The labelling of projections such as noun phrases (NP), verb phrases (VP), and others has since been a topic of discussion among syntacticians, who have been working on labelling algorithms to solve the problem raised by Collins.

In line with both Phrase Structure Rules and X-bar theory, syntactic labelling plays an important role within Chomsky's Minimalist Program (MP). Chomsky first developed the MP as a theoretical framework for generative grammar that can be applied universally across languages. In contrast to Phrase Structure Rules and X-bar theory, much of the research on labels, and many of the theories proposed, are fairly recent and still ongoing.