Indo-Aryan languages

Indo-Aryan
	Indic
Geographic distribution	South Asia
Linguistic classification	Indo-European; Indo-Iranian; Indo-Aryan
Proto-language	Proto-Indo-Aryan
ISO 639-2 / 5	inc
Linguasphere	59= (phylozone)
Glottolog	indo1321
	Present-day geographical distribution of the major Indo-Aryan language groups. Romani, Domari, Kholosi and Lomavren are outside the scope of the map. Pashai (Dardic) Chitrali (Dardic) Shina (Dardic) Kohistani (Dardic) Kashmiri (Dardic) Punjabi (Northwestern) Sindhi (Northwestern) Rajasthani (Western) Gujarati (Western) Bhili and Khandeshi (Western) Himachali (= W. Pahari, Northern) Garhwali-Kumaoni (= C. Pahari, Northern) Nepali (= E. Pahari, Northern) Western Hindi (Central) Eastern Hindi (Central) Bihari (Eastern) Bengali-Assamese (Eastern) Oriya (Eastern) Halbi (Eastern) Marathi-Konkani (Southern) Sinhala-Maldivian (Southern) (not shown: Kunar (Dardic), Chinali-Lahuli)

The Indo-Aryan or Indic languages form a major language family of South Asia. They constitute a branch of the Indo-Iranian languages, themselves a branch of the Indo-European language family. As of the early 21st century more than 800 million people speak Indo-Aryan languages, primarily in India, Bangladesh, Nepal, Pakistan and Sri Lanka.^[2] Moreover, large immigrant and expatriate Indo-Aryan-speaking communities live in Northwestern Europe, Western Asia, North America, Southeast Africa and Australia. There are well over 200 known Indo-Aryan languages.^[3]

Modern Indo-Aryan languages descend from Old Indo-Aryan languages such as early Sanskrit, through Middle Indo-Aryan languages (or Prakrits).^[4]^[5]^[6]^[7] The largest such languages in terms of L1 speakers are Hindi–Urdu (about 329 million),^[8] Bengali (242 million),^[9] Punjabi (about 120 million),^[10] Marathi (112 million), Gujarati (60 million), Rajasthani (58 million), Bhojpuri (51 million), Odia (35 million), Maithili (about 34 million), Sindhi (25 million), Nepali (16 million), Assamese (15 million), and Chhattisgarhi (18 million). A 2005 estimate placed the total number of native speakers of Indo-Aryan languages at nearly 900 million.^[11]

Classification

Theories

The Indo-Aryan family as a whole is thought to represent a dialect continuum, where languages are often transitional towards neighboring varieties.^[12] Because of this, the division into languages vs. dialects is in many cases somewhat arbitrary. The classification of the Indo-Aryan languages is controversial, with many transitional areas that are assigned to different branches depending on classification.^[13] There are concerns that a tree model is insufficient for explaining the development of New Indo-Aryan, with some scholars suggesting the wave model.

Subgroups

There has been great difficulty in forming clear subfamilies within Indo-Aryan. Some of the issues are whether Gujarati is more affiliated with Marathi–Konkani or with Hindi (and Rajasthani); whether Dardic constitutes an areal zone, an Indo-Aryan genetic group, or a separate Indo-Iranian subfamily; whether Eastern Hindi is allied with Western Hindi or Bihari; and whether a Northwestern zone encompasses Punjabi or not. Ultimately, in an area with high linguistic contact such as South Asia, the deficiencies of the tree model are apparent.

The following table of proposals is expanded from Masica (1991).

Indo-Aryan subgroups
Model	Odia	Bihari	E. Hindi	W. Hindi	Rajasthani	Gujarati	Pahari	E. Punjabi	W. Punjabi	Sindhi	Dardic	Marathi–Konkani	Sinhala–Dhivehi	Romani
Hoernlé (1880)	E		E~W	W			N	W	?	W	?	S	?	?
Grierson (–1927)	E		C~E	C					NW		non-IA	S		non-IA
Chatterji (1926)	E			Midland	SW		N	NW			non-IA	S		NW
Grierson (1931)	E		Inter.	Midland	Inter.				NW		non-IA	S		non-IA
Katre (1968)	E		C						NW		Dardic	S		?
Nigam (1972)	E		C	C (+NW)	C		?	NW			N	S		?
Cardona (1974)	E	C				(S)W	NW					(S)W		?
Turner (–1975)	E	C				SW	C (C.)~NW (W.)	NW				SW		C
Kausen (2006)	E		C		W		N	NW			Dardic	S		Romani
Kogan (2016)	E	?	C	C~NW	NW		C~NW	C	NW		non-IA	S	Insular	C
Ethnologue (2020)^[14]	E		EC	C	W		EC (E.)~W (C., W.)	W		NW		S		W
Glottolog (2020)^[15]	E	Bihari	C				N	NW				S	Dhivehi-Sinhala	C
English Wikipedia	E		C		W		N	NW			Dardic	S		W

Anton I. Kogan, in 2016, conducted a lexicostatistical study of the New Indo-Aryan languages based on a 100-word Swadesh list, using techniques developed by the glottochronologist and comparative linguist Sergei Starostin.^[16] That grouping system is notable for Kogan’s exclusion of Dardic from Indo-Aryan on the basis of his previous studies showing low lexical similarity to Indo-Aryan (43.5%) and negligible difference with similarity to Iranian (39.3%).^[17] He also calculated Sinhala–Dhivehi to be the most divergent Indo-Aryan branch. Nevertheless, the modern consensus of Indo-Aryan linguists tends towards the inclusion of Dardic based on morphological and grammatical features.
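The similarity percentages quoted above are lexicostatistical scores: the share of word-list meanings for which two languages use cognate words. The Python sketch below shows only the basic arithmetic. The five-item list and the cognate-class codes are invented placeholders, not Kogan’s data, and the function name `lexical_similarity` is an assumption made for illustration.

```python
def lexical_similarity(list_a, list_b):
    """Percentage of comparable meanings whose words share a cognate class.

    Each list maps a meaning to a cognate-class code; None marks a gap.
    """
    comparable = [
        meaning
        for meaning in list_a
        if list_a[meaning] is not None and list_b.get(meaning) is not None
    ]
    if not comparable:
        raise ValueError("no meanings attested in both lists")
    matches = sum(1 for m in comparable if list_a[m] == list_b[m])
    return 100.0 * matches / len(comparable)


# Hypothetical cognate-class codes for two unnamed languages.
lang_x = {"one": 1, "two": 1, "water": 1, "fire": 2, "dog": 1}
lang_y = {"one": 1, "two": 1, "water": 2, "fire": 2, "dog": None}

score = lexical_similarity(lang_x, lang_y)
print(f"{score:.1f}% shared cognates")
```

Meanings missing from either list are skipped rather than counted as mismatches, which is one common convention; a real study would also need explicit cognacy judgements for every item.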

Inner-outer hypothesis

The inner-outer hypothesis argues for a core and periphery of Indo-Aryan languages, with Outer Indo-Aryan (generally including Eastern and Southern Indo-Aryan, and sometimes Northwestern Indo-Aryan, Dardic and Pahari) representing an older stratum of Old Indo-Aryan that has been mixed to varying degrees with the newer stratum that is Inner Indo-Aryan. It is a contentious proposal with a long history, with varying degrees of claimed phonological and morphological evidence.

Rudolf Hoernlé in 1880 first proposed a binary grouping of Indo-Aryan languages, between an “outer” Magadhi-descended group containing Odia, Bengali, Eastern Hindi, and Marathi and an “inner” Sauraseni-descended group comprising Western Hindi, Punjabi, Sindhi, and Nepali.^[18] George Abraham Grierson further developed this theory into what is known as the Inner-outer hypothesis of Indo-Aryan languages, reassigning the Northwest zone (without Punjabi) to the Outer branch and further introducing a mediate sub-branch for Eastern Hindi.^[19] The grouping was postulated to be the result of two waves (by Hoernlé) or two periods (by Grierson) of the Indo-Aryan migration reflecting different dialect groups of Old Indo-Aryan. Masica lists some of the features that were seen as evidence for the separation:

Preservation of /s/ (inner) vs. other substitutions such as /ʃ/, /ɦ/, /x/ (outer)
Loss of short vowels (inner) vs. retention (outer)
Past in -i- (inner) vs. -l- (outer)
Analytic (inner) vs. synthetic (outer)

The earliest refutations of this hypothesis were put forth by Suniti Kumar Chatterji, invoking many instances of language contact and influence that have led to the concerns about the practicality of the tree model for Indo-Aryan.^[20] Some of Chatterji’s evidence was based on the lack of uniformity in many of these features within the supposed branches; e.g. /ɦ/ is found as a development of /s/ in the “inner” Rajasthani, Punjabi, and Bhili, as well as universally in the numerals, reflecting more general rules of phonological change rather than a division. Masica (1991) enumerates more objections. Grierson did rework the hypothesis to counter these criticisms after conducting the Linguistic Survey of India, but Chatterji remained unconvinced.

Modern discussion had largely ignored the inner-outer hypothesis, and instead examined isoglosses and regional groupings (leading to some clear subgroupings such as Eastern Indo-Aryan) until work by Franklin Southworth and more recent examinations by Claus Peter Zoller. Southworth put forth a crucial necessity which had been hitherto ignored in Indo-Aryan classification: “exclusively shared innovations” as a diagnostic for grouping. To that end, Southworth found the following features that distinguish Outer IA:^[21]

Past indicative and perfective participle in -l- < MIA -illa/-ulla/-alla (the primary innovation)
Gerundive from OIA -(i)tavya
r̥ > a (subject to lexical diffusion from Inner IA)
Loss of phonemic vowel length in i and u, instead determined positionally
Word-initial stress (evidenced by vowel length, subject to lexical diffusion in Gujarati and Sindhi)
l > n
Loss of non-initial post-consonantal h

Not all of these correspond perfectly to an Outer IA grouping (e.g. some Rajasthani lects and Haryanvi continue OIA -(i)tavya) but do constitute shared innovation. For Southworth, Outer encompasses Bengali, Assamese, Odia, Bihari, Marathi, Konkani, Gujarati, and Sindhi. George Cardona and Dhanesh Jain responded, “I think it fair to say that these conclusions are not sufficiently backed up by detailed facts about the chronology of changes to merit their being accepted as established”; that is to say, for many of the supposed differences between Inner and Outer, there is no clear historical evidence attesting that they reflect OIA divisions and not more recent changes or areal diffusions.^[22]^:22

Claus Peter Zoller also investigated shared innovations in Outer IA, but with a different historical explanation for the differences. He limited examination to d~ḍ alternation; c, j > ċ, (d)z; and the –l(l)– past, finding the rest of Southworth’s evidence unconvincing. He argued for a more wave model-type conception of Inner IA, imposing itself on the older OIA layer that is preserved in Outer IA, with greater retention radiating outwards from Inner IA to Outer IA, Dardic, and Nuristani. “[A]n individual language is either more Outer and less Inner Language or vice versa, depending on the amount of typical Outer Language features characterizing that individual language.”^[23]

Chundra A. Cathcart at the University of Zurich in 2019 conducted a probabilistic assessment of the Inner-outer hypothesis using various statistical approaches to modelling sound change (adopting the suggestion of phonology-first analysis put forth by Masica) based on data from the Comparative Dictionary of the Indo-Aryan Languages compiled by Ralph Lilley Turner. His logistic normal distribution model found evidence for a core-periphery distinction, while the Dirichlet distribution model was less convincing. Cathcart concluded that “neither model provides full support for” the Inner-outer hypothesis, but there is “at least vague support for an areal core and periphery” that could be in line with Zoller’s model but not with Southworth’s.^[24]
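The two distributions named above are both ways of generating probability vectors, for instance over the competing outcomes of a single sound. The Python sketch below is illustrative only and is not Cathcart’s implementation: the outcome labels, the parameter values, and the function names are assumptions, and the logistic normal is simplified here to independent Gaussian components.

```python
import math
import random


def sample_dirichlet(alphas, rng):
    """Draw one probability vector by normalising independent gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_logistic_normal(means, sds, rng):
    """Draw Gaussian values and map them to probabilities with a softmax."""
    zs = [rng.gauss(mean, sd) for mean, sd in zip(means, sds)]
    peak = max(zs)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical outcomes of Old Indo-Aryan /s/ in some language.
outcomes = ["s", "ʃ", "ɦ"]
rng = random.Random(0)

dirichlet_probs = sample_dirichlet([1.0, 1.0, 1.0], rng)
logistic_probs = sample_logistic_normal([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], rng)

for name, probs in (("Dirichlet", dirichlet_probs), ("logistic normal", logistic_probs)):
    print(name, dict(zip(outcomes, (round(p, 3) for p in probs))))
```

In its general form the logistic normal can encode correlations between outcomes through the covariance of the underlying Gaussian, which a Dirichlet cannot; the sketch omits that covariance for brevity.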

A comparison of what languages constitute Outer IA in the various proposals put forth historically is shown below.

Outer Indo-Aryan languages
Theory	Bengali–Assamese–Odia	Bihari	E. Hindi	Marathi–Konkani	Gujarati	Sindhi	W. Punjabi	Rajasthani	Pahari	Dardic	Sinhala–Dhivehi
Hoernlé (1880)	Yes	Yes	Yes	Yes	No	No	N/A	N/A	No	N/A	N/A
Grierson (1927)	Yes	Yes	Mediate	Yes	No	Yes	Yes	No	No	N/A	N/A
Southworth (2005)	Yes (core)	Yes (core)	No	Yes (core)	Yes	Yes	No	No	No	Maybe	No
Zoller (2016)	A. O. > B.	No	No	Yes (K. > M.)	Yes	Yes	Yes	Partially	Yes	Yes	Yes
Cathcart (2019, logistic)	Yes	No	Maybe	No	No	No	No	No	Mostly yes	N/A	Yes
Ethnologue (2020)^[14]	Yes	Yes	No	Yes	No	Yes	No	No	No	Yes	Yes

Groups

The classification below follows Masica (1991) and Kausen (2006).

Percentage of Indo-Aryan speakers by native language:

Hindustani (including Hindi and Urdu) (25.4%)
Bengali (20.7%)
Punjabi (9.4%)
Marathi (5.6%)
Gujarati (3.8%)
Bhojpuri (3.1%)
Maithili (2.6%)
Odia (2.5%)
Sindhi (1.9%)
Others (25%)

Dardic

The Dardic languages (also Dardu or Pisaca) are a group of Indo-Aryan languages largely spoken in the northwestern extremities of the Indian subcontinent. Dardic was first formulated by George Abraham Grierson in his Linguistic Survey of India but he did not consider it to be a subfamily of Indo-Aryan. The Dardic group as a genetic grouping (rather than areal) has been scrutinised and questioned to a degree by recent scholarship: Southworth, for example, says “the viability of Dardic as a genuine subgroup of Indo-Aryan is doubtful” and “the similarities among [Dardic languages] may result from subsequent convergence”.^[21]^:149

The Dardic languages are thought to be transitional with Punjabi and Pahari (e.g. Zoller describes Kashmiri as “an interlink between Dardic and West Pahāṛī”)^[23]^:83, as well as non-Indo-Aryan Nuristani; and are renowned for their relatively conservative features in the context of Proto-Indo-Aryan.

Kashmiri: Kashmiri, Kishtwari;
Shina: Brokskad, Kundal Shahi, Shina, Ushojo, Kalkoti, Palula, Savi;
Chitrali: Kalasha, Khowar;
Kohistani: Bateri, Chilisso, Gowro, Indus Kohistani, Kalami, Tirahi, Torwali, Wotapuri-Katarqalai;
Pashayi;
Kunar: Dameli, Gawar-Bati, Nangalami, Shumashti.

Northern Zone

The Northern Indo-Aryan languages, also known as the Pahari (‘hill’) languages, are spoken throughout the Himalayan regions of the subcontinent. They are thought to be transitional with Dardic, Punjabi, Bihari, and the Hindi languages, among others. The official language of Nepal, Nepali, is a Pahari language; Nepali is also one of India’s scheduled languages.

Eastern Pahari: Nepali, Jumli, Doteli;
Central Pahari: Garhwali, Kumaoni;
Western Pahari (Himachali): Dogri, Kangri, Bhadarwahi, Churahi, Bhateali, Bilaspuri, Chambeali, Gaddi, Pangwali, Mandeali, Mahasu Pahari, Jaunsari, Kullu, Pahari Kinnauri, Hinduri, Sirmauri.

Northwestern Zone

Northwestern Indo-Aryan languages are spoken throughout the northwestern regions of the Indian subcontinent. Punjabi is spoken predominantly in the Punjab region and is the official language of Punjab, in addition to being the most widely spoken language in Pakistan. To the south, Sindhi and its variants are spoken, primarily in Sindh province. Northwestern languages are ultimately thought to be descended from Shauraseni Prakrit.

Punjabi
- Eastern Punjabi: Punjabi, Doabi, Majhi, Malwai, Puadhi, Sansi;
- Western Punjabi (Lahnda): Saraiki, Hindko, Pahari-Pothwari, Inku†, Sarazi;
Sindhi: Sindhi, Jadgali, Kutchi, Luwati, Memoni, Khetrani, Kholosi.

Western Zone

Western Indo-Aryan languages are spoken in the central and western areas within India, such as Madhya Pradesh and Rajasthan, in addition to contiguous regions in Pakistan. Gujarati is the official language of Gujarat, and is spoken by over 50 million people. In Europe, various Romani languages are spoken by the Romani people, an itinerant community who historically migrated from India. The Western Indo-Aryan languages are thought to have diverged from their northwestern counterparts, although they have a common antecedent in Shauraseni Prakrit.

Rajasthani: Standard Rajasthani, Bagri, Marwari, Mewati, Dhundari, Harauti, Mewari, Shekhawati, Dhatki, Malvi, Nimadi, Gujari, Goaria, Loarki, Kanjari, Od;

Gujarati: Gujarati, Jandavra, Saurashtra, Aer, Vaghri, Parkari Koli, Kachi Koli, Wadiyara Koli;

Bhil: Kalto, Vasavi, Wagdi, Gamit, Vaagri Booli;
- Northern Bhil: Bauria, Bhilori, Magari;
- Central Bhil: Bhili proper, Bhilali, Chodri, Dhodia, Dhanki, Dubli;
- Bareli: Palya Bareli, Pauri Bareli, Rathwi Bareli, Pardhi;

Khandeshi
Lambadi
Domaaki
Domari

Romani: Carpathian Romani, Balkan Romani, Vlax Romani;
- Northern Romani: Sinte Romani, Finnish Kalo, Baltic Romani.

Central Zone (Madhya or Hindi)

Within India, Hindi languages are spoken primarily in the Hindi belt regions and Gangetic plains, including Delhi and the surrounding areas, where they are often transitional with neighbouring lects. Many of these languages, including Braj and Awadhi, have rich literary and poetic traditions. Urdu, a Persianized derivative of Khariboli, is the official language of Pakistan and also has strong historical connections to India, where it has also been designated with official status. Hindi, a standardized and Sanskritized register of Khariboli, is the official language of the Government of India. Together with Urdu, it is the third most-spoken language in the world.

Western Hindi: Hindustani (including Standard Hindi and Standard Urdu), Braj, Haryanvi, Bundeli, Kannauji, Parya;

Eastern Hindi: Bagheli, Chhattisgarhi, Surgujia;
- Awadhi: Fiji Hindi.

Eastern Zone

Eastern Indo-Aryan languages are spoken throughout the eastern subcontinent, including Odisha and Bihar, alongside other regions surrounding the northwestern Himalayan corridor. Bengali is the seventh most-spoken language in the world, and has a strong literary tradition; the national anthems of India and Bangladesh are written in Bengali. Assamese and Odia are the official languages of Assam and Odisha, respectively. Eastern Indo-Aryan languages are ultimately derived from Magadhi Prakrit.

Bihari: Tharu,^[25] Majhi, Kurmali, Sadri (Nagpuri), Maithili, Angika, Bajjika, Musasa, Kumhali;
- Kuswaric:^[26] Danwar, Bote-Darai;
- Bhojpuri: Caribbean Hindustani;
- Magahi: Khortha;

Halbic: Halbi, Bhatri, Kamar, Mirgan, Nahari;

Odia: Baleswari, Garhjati (Northwestern Odia), Central Odia, Ganjami, Sambalpuri, Desia, Bodo Parja, Reli, Kupia;

Bengali–Assamese: Bishnupriya Manipuri, Sylheti, Hajong, Chittagonian, Chakma, Tanchangya, Rohingya;
- Bengali-Gauda: Bengali, Bangali, Rarhi, Varendri, Sundarbani, Manbhumi, Dhakaiya Kutti, Dobhashi;
- Kamarupic: Assamese, Kamrupi, Goalpariya, Rangpuri, Surjapuri, Rajbanshi;

Southern Zone

Marathi-Konkani languages are ultimately descended from Maharashtri Prakrit; whereas Insular Indo-Aryan languages are descended from Elu Prakrit and possess several characteristics that markedly distinguish them from most of their mainland Indo-Aryan counterparts.

Marathi-Konkani
- Marathic: Marathi, Varhadi, Andh, Berar-Deccan Marathi, Phudagi, Katkari, Varli, Kadodi;
- Konkanic: Konkani, Canarese Konkani, Maharashtrian Konkani.

Insular Indo-Aryan
- Sinhala
- Maldivian: Dhivehi, Mahl.

Unclassified

The following languages are related to each other, but are otherwise unclassified within Indo-Aryan:

Chinali–Lahul Lohar:^[27] Chinali, Lahul Lohar.

History

Proto-Indo-Aryan

Proto-Indo-Aryan, or sometimes Proto-Indic, is the reconstructed proto-language of the Indo-Aryan languages. It is intended to reconstruct the language of the pre-Vedic Indo-Aryans. Proto-Indo-Aryan is meant to be the predecessor of Old Indo-Aryan (1500–300 BCE), which is directly attested as Vedic and Mitanni-Aryan. Despite the great archaicity of Vedic, however, the other Indo-Aryan languages preserve a small number of archaic features lost in Vedic.

Mitanni-Aryan hypothesis

Some theonyms, proper names and other terminology of the Mitanni exhibit an Indo-Aryan superstrate, suggesting that an Indo-Aryan elite imposed itself over the Hurrians in the course of the Indo-Aryan expansion. In a treaty between the Hittites and the Mitanni, the deities Mitra, Varuna, Indra, and the Ashvins (Nasatya) are invoked. Kikkuli’s horse training text includes technical terms such as aika (cf. Sanskrit eka, “one”), tera (tri, “three”), panza (pancha, “five”), satta (sapta, “seven”), na (nava, “nine”), vartana (vartana, “turn”, round in the horse race). The numeral aika “one” is of particular importance because it places the superstrate in the vicinity of Indo-Aryan proper as opposed to Indo-Iranian in general or early Iranian (which has aiva).^[28]

Another text has babru (babhru, “brown”), parita (palita, “grey”), and pinkara (pingala, “red”). Their chief festival was the celebration of the solstice (vishuva) which was common in most cultures in the ancient world. The Mitanni warriors were called marya, the term for “warrior” in Sanskrit as well; note mišta-nnu (= miẓḍha, ≈ Sanskrit mīḍha) “payment (for catching a fugitive)” (M. Mayrhofer, Etymologisches Wörterbuch des Altindoarischen, Heidelberg, 1986–2000; Vol. II:358).

Sanskritic interpretations of Mitanni royal names render Artashumara (artaššumara) as Ṛtasmara “who thinks of Ṛta” (Mayrhofer II 780), Biridashva (biridašṷa, biriiašṷa) as Prītāśva “whose horse is dear” (Mayrhofer II 182), Priyamazda (priiamazda) as Priyamedha “whose wisdom is dear” (Mayrhofer II 189, II 378), Citrarata as Citraratha “whose chariot is shining” (Mayrhofer I 553), Indaruda/Endaruta as Indrota “helped by Indra” (Mayrhofer I 134), Shativaza (šattiṷaza) as Sātivāja “winning the race prize” (Mayrhofer II 540, 696), Šubandhu as Subandhu “having good relatives” (a name in Palestine, Mayrhofer II 209, 735), Tushratta (tṷišeratta, tušratta, etc.) as *tṷaiašaratha, Vedic Tvastar “whose chariot is vehement” (Mayrhofer, Etym. Wb., I 686, I 736).

Indian subcontinent

Dates indicate only a rough time frame.

Proto-Indo-Aryan (before 1500 BCE, reconstructed)
Old Indo-Aryan (ca. 1500–300 BCE)
- early Old Indo-Aryan: includes Vedic Sanskrit (ca. 1500 to 500 BCE)
- late Old Indo-Aryan: Epic Sanskrit, Classical Sanskrit (ca. 200 CE to 1300 CE)
- Mitanni Indo-Aryan (ca. 1400 BCE) (middle Indo-Aryan features)
Middle Indo-Aryan or Prakrits (ca. 300 BCE to 1500 CE)
- early Buddhist texts (ca. 6th or 5th century BCE)
- early Middle Indo-Aryan: e.g. Ashokan Prakrits, Pali, Gandhari (ca. 300 BCE to 200 BCE)
- middle Middle Indo-Aryan: e.g. Dramatic Prakrits, Elu (ca. 200 BCE to 700 CE)
- late Middle Indo-Aryan: e.g. Abahattha (ca. 700 CE to 1500 CE)
Early Modern Indo-Aryan (Late Medieval India): e.g. early Dakhini and emergence of the Dehlavi dialect

Old Indo-Aryan

The earliest evidence of the group is from Vedic Sanskrit, which is used in the ancient preserved texts of the Indian subcontinent, the foundational canon of the Hindu synthesis known as the Vedas. The Indo-Aryan superstrate in Mitanni is of similar age to the language of the Rigveda, but the only evidence of it is a few proper names and specialized loanwords.^[29]

While Old Indo-Aryan is the earliest stage of the Indo-Aryan branch, from which all known languages of the later stages (Middle and New Indo-Aryan) are derived, some documented Middle Indo-Aryan variants cannot fully be derived from the documented form of Old Indo-Aryan (i.e. Sanskrit), but betray features that must go back to other undocumented variants or dialects of Old Indo-Aryan.^[30]

From Vedic Sanskrit, “Sanskrit” (literally “put together”, “perfected” or “elaborated”) developed as the prestige language of culture, science and religion as well as the court, theatre, etc. Sanskrit of the later Vedic texts is comparable to Classical Sanskrit, but is largely mutually unintelligible with Vedic Sanskrit.^[31]

Middle Indo-Aryan (Prakrits)

Mitanni inscriptions show some Middle Indo-Aryan characteristics along with Old Indo-Aryan ones; for example, Old Indo-Aryan sapta becomes satta (pt develops into Middle Indo-Aryan tt). According to S.S. Misra, this language may be similar to Buddhist Hybrid Sanskrit, which might not be a mixed language but an early Middle Indo-Aryan occurring much before Prakrit.^[n 1]^[n 2]
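The pt > tt development mentioned above can be written as a simple rewrite rule. The following Python sketch is a toy illustration: the rule table contains only the single change named in the text, and the names `ASSIMILATION_RULES` and `apply_rules` are assumptions rather than part of any established tool.

```python
# Consonant-cluster changes, keyed by the older cluster.
# Only the change cited in the text is listed; the table is meant to be extended.
ASSIMILATION_RULES = {"pt": "tt"}


def apply_rules(word, rules=ASSIMILATION_RULES):
    """Rewrite every listed cluster in a romanised word."""
    for cluster, outcome in rules.items():
        word = word.replace(cluster, outcome)
    return word


middle_form = apply_rules("sapta")
print("sapta >", middle_form)
```

Words without a listed cluster, such as nava, pass through unchanged. Real sound changes are conditioned by their phonetic environment and ordered relative to one another, which plain string replacement does not capture.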

Outside the learned sphere of Sanskrit, vernacular dialects (Prakrits) continued to evolve. The oldest attested Prakrits are the Buddhist and Jain canonical languages Pali and Ardhamagadhi Prakrit, respectively. Inscriptions in Ashokan Prakrit were also part of this early Middle Indo-Aryan stage.

By medieval times, the Prakrits had diversified into various Middle Indo-Aryan languages. Apabhraṃśa is the conventional cover term for transitional dialects connecting late Middle Indo-Aryan with early Modern Indo-Aryan, spanning roughly the 6th to 13th centuries. Some of these dialects showed considerable literary production; the Śravakacāra of Devasena (dated to the 930s) is now considered to be the first Hindi book.

The next major milestone occurred with the Muslim conquests in the Indian subcontinent in the 13th–16th centuries. Under the flourishing Turco-Mongol Mughal Empire, Persian became very influential as the language of prestige of the Islamic courts due to adoption of the foreign language by the Mughal emperors. However, Persian was soon displaced by Hindustani. This Indo-Aryan language combines Persian, Arabic, and Turkic elements in its vocabulary with the grammar of the local dialects.

The two largest languages that formed from Apabhraṃśa were Bengali and Hindustani; others include Assamese, Sindhi, Gujarati, Odia, Marathi, and Punjabi.

New Indo-Aryan

Medieval Hindustani

In the Central Zone Hindi-speaking areas, for a long time the prestige dialect was Braj Bhasha, but this was replaced in the 19th century by Dehlavi-based Hindustani. Hindustani was strongly influenced by Persian, with this and later Sanskrit influence leading to the emergence of Modern Standard Hindi and Modern Standard Urdu as registers of the Hindustani language.^[32]^[33] This state of affairs continued until the division of the British Indian Empire in 1947, when Hindi became the official language in India and Urdu became official in Pakistan. Despite the different scripts, the fundamental grammar remains identical; the difference is more sociolinguistic than purely linguistic.^[34]^[35]^[36] Today it is widely understood or spoken as a second or third language throughout South Asia^[37] and is one of the most widely known languages in the world in terms of number of speakers.

Outside the Indian subcontinent

Domari

Domari is an Indo-Aryan language spoken by older Dom people scattered across the Middle East. The language is reported to be spoken as far north as Azerbaijan and as far south as central Sudan.^[38]^:1 Based on the systematicity of sound changes, linguists have concluded that the ethnonyms Domari and Romani derive from the Indo-Aryan word ḍom.^[39]

Lomavren

Lomavren is a nearly extinct mixed language, spoken by the Lom people, that arose from language contact between a language related to Romani and Domari^[40] and the Armenian language.

Romani

The Romani language is usually included in the Western Indo-Aryan languages.^[41] Romani varieties, which are mainly spoken throughout Europe, are noted for their relatively conservative nature, maintaining the Middle Indo-Aryan present-tense person concord markers alongside consonantal endings for nominal case. Indeed, these features are no longer evident in most other modern Central Indo-Aryan languages. Moreover, Romani shares an innovative pattern of past-tense person, which corresponds to Dardic languages, such as Kashmiri and Shina. This is believed to be further indication that proto-Romani speakers were originally situated in central regions of the subcontinent, before migrating to northwestern regions. However, there are no known historical sources regarding the development of the Romani language specifically within India.

Research conducted by nineteenth-century scholars Pott (1845) and Miklosich (1882–1888) demonstrated that the Romani language is most aptly designated as a New Indo-Aryan language (NIA), as opposed to Middle Indo-Aryan (MIA), establishing that proto-Romani speakers could not have left India significantly earlier than AD 1000.

The principal argument favouring a migration during or after the transition period to NIA is the loss of the old system of nominal case, coupled with its reduction to a two-way nominative-oblique case system. A secondary argument concerns the system of gender differentiation: Romani has only two genders (masculine and feminine), whereas Middle Indo-Aryan languages generally employed three genders (masculine, feminine and neuter), and some modern Indo-Aryan languages retain this system today.

It is suggested that loss of the neuter gender did not occur until the transition to NIA. During this process, most of the neuter nouns became masculine, while several became feminine. For example, the neuter aggi “fire” in Prakrit morphed into the feminine āg in Hindi, and jag in Romani. The parallels in grammatical gender evolution between Romani and other NIA languages have additionally been cited as indications that the forerunner of Romani remained on the Indian subcontinent until a later period, possibly as late as the tenth century.

Sindhic migrations

Kholosi, Jadgali, and Luwati represent offshoots of the Sindhic subfamily of Indo-Aryan that have established themselves in the Persian Gulf region, perhaps through sea-based migrations. These are of a later origin than the Rom and Dom migrations, which represent a different part of Indo-Aryan as well.

Phonology

Consonants

Stop positions

The normative system of New Indo-Aryan stops consists of five points of articulation: labial, dental, “retroflex”, palatal, and velar, which is the same as that of Sanskrit. The “retroflex” position may involve retroflexion, or curling the tongue to make the contact with the underside of the tip, or merely retraction. The point of contact may be alveolar or postalveolar, and the distinctive quality may arise more from the shaping than from the position of the tongue. Palatal stops have affricated release and are traditionally included as involving a distinctive tongue position (blade in contact with hard palate). Although they are widely transcribed as [tʃ], Masica (1991:94) claims that [cʃ] is a more accurate rendering.

Moving away from the normative system, some languages and dialects have alveolar affricates [ts] instead of palatal, though some among them retain [tʃ] in certain positions: before front vowels (esp. /i/), before /j/, or when geminated. Alveolar as an additional point of articulation occurs in Marathi and Konkani, where dialect mixture and other factors upset the aforementioned complementation to produce minimal environments; in some West Pahari dialects, through internal developments (*t̪ɾ, t̪ > /tʃ/); and in Kashmiri. The addition of a retroflex affricate to this in some Dardic languages brings the number of stop positions to a maximum of seven (barring borrowed /q/), while a reduction to the inventory involves *ts > /s/, which has happened in Assamese, Chittagonian, Sinhala (though there have been other sources of a secondary /ts/), and Southern Mewari.

Further reductions in the number of stop articulations are found in Assamese and Romani, which have lost the characteristic dental/retroflex contrast, and in Chittagonian, which may lose its labial and velar articulations through spirantisation in many positions (> [f, x]).^[42]

Stop series	Language(s)
/p/, /t̪/, /ʈ/, /tʃ/, /k/	Hindi, Punjabi, Dogri, Sindhi, Gujarati, Sinhala, Odia, Standard Bengali, dialects of Rajasthani (except Lamani, NW. Marwari, S. Mewari)
/p/, /t̪/, /ʈ/, /tɕ/, /k/	Sanskrit^[43], Maithili, Magahi, Bhojpuri^[44]
/p/, /t̪/, /ʈ/, /ts/, /k/	Nepali, dialects of Rajasthani (Lamani and NW. Marwari), Northern Lahnda’s Kagani, Kumauni, many West Pahari dialects (not Chamba Mandeali, Jaunsari, or Sirmauri)
/p/, /t̪/, /ʈ/, /ts/, /tʃ/, /k/	Marathi, Konkani, certain W. Pahari dialects (Bhadrawahi, Bhalesi, Padari, Simla, Satlej, maybe Kulu), Kashmiri
/p/, /t̪/, /ʈ/, /ts/, /tʃ/, /tʂ/, /k/	Shina, Bashkarik, Gawarbati, Phalura, Kalasha, Khowar, Shumashti, Kanyawali, Pashai
/p/, /t̪/, /ʈ/, /k/	Rajasthani’s S. Mewari
/p/, /t̪/, /t/, /ts/, /tɕ/, /k/	E. and N. dialects of Bengali (Dhaka, Mymensing, Rajshahi)
/p/, /t/, /k/	Assamese
/p/, /t/, /tʃ/, /k/	Romani
/t̪/, /ʈ/, /k/ (with /i/ and /u/)	Sylheti
/t̪/, /t/	Chittagonian
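The counts discussed in this section can be read directly off the table. As a minimal sketch, the Python snippet below stores a handful of the stop series listed above and derives the number of stop positions and the presence of a dental/retroflex contrast; the selection of languages and the names `STOP_SERIES` and `has_dental_retroflex_contrast` are choices made for illustration.

```python
# Voiceless stop series for a few languages, copied from the table above.
STOP_SERIES = {
    "Hindi": ["p", "t̪", "ʈ", "tʃ", "k"],
    "Nepali": ["p", "t̪", "ʈ", "ts", "k"],
    "Marathi": ["p", "t̪", "ʈ", "ts", "tʃ", "k"],
    "Shina": ["p", "t̪", "ʈ", "ts", "tʃ", "tʂ", "k"],
    "Assamese": ["p", "t", "k"],
    "Romani": ["p", "t", "tʃ", "k"],
}


def has_dental_retroflex_contrast(series):
    """True when both a dental and a retroflex stop are present."""
    return "t̪" in series and "ʈ" in series


position_counts = {lang: len(series) for lang, series in STOP_SERIES.items()}
largest = max(position_counts, key=position_counts.get)

for lang, series in STOP_SERIES.items():
    contrast = "yes" if has_dental_retroflex_contrast(series) else "no"
    print(f"{lang}: {position_counts[lang]} positions, dental/retroflex contrast: {contrast}")
print("Most stop positions:", largest)
```

On this data Shina reaches the maximum of seven positions, while Assamese and Romani lack the dental/retroflex contrast, matching the description in the prose.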

Nasals

Sanskrit was noted as having five nasal-stop articulations corresponding to its oral stops, and among modern languages and dialects Dogri, Kacchi, Kalasha, Rudhari, Shina, Saurashtri, and Sindhi have been analysed as having this full complement of phonemic nasals /m/ /n/ /ɳ/ /ɲ/ /ŋ/, with the last two generally as the result of the loss of the stop from a homorganic nasal + stop cluster ([ɲj] > [ɲ] and [ŋɡ] > [ŋ]), though there are other sources as well.^[45]

Charts

The following are consonant systems of major and representative New Indo-Aryan languages, mostly following Masica (1991:106–107), though here they are in IPA. Parentheses indicate those consonants found only in loanwords; square brackets indicate those with “very low functional load”. The arrangement is roughly geographical.

Romani
p	t	(ts)	tʃ	k	pʲ	tʲ	kʲ
b	d	(dz)	dʒ	ɡ	bʲ	dʲ	ɡʲ
pʰ	tʰ		tʃʰ	kʰ
m	n					nʲ
(f)	s		ʃ	x	(fʲ)	sʲ
v	(z)		ʒ	ɦ	vʲ	zʲ
	ɾ	l				lʲ
			j

Shina
p	t̪	ʈ	ts	tʃ	tʂ	k
b	d̪	ɖ		dʒ	ɖʐ	ɡ
pʰ	t̪ʰ	ʈʰ	tsʰ	tʃʰ	tʂʰ	kʰ
m	n	ɳ		ɲ		ŋ
(f)	s	ʂ		ɕ
	z	ʐ		ʑ		ɦ
	ɾ l	ɽ
w				j

Kashmiri
p	t̪	ʈ	ts	tʃ	k	pʲ	t̪ʲ	ʈʲ	tsʲ	kʲ
b	d̪	ɖ		dʒ	ɡ	bʲ	d̪ʲ	ɖʲ		ɡʲ
pʰ	t̪ʰ	ʈʰ	tsʰ	tʃʰ	kʰ	pʲʰ	t̪ʲʰ	ʈʲʰ	tsʲʰ	kʲʰ
m	n			ɲ		mʲ	nʲ
	s			ʃ			sʲ
	z				ɦ		zʲ			ɦʲ
	ɾ l						ɾʲ lʲ
w				j		wʲ

Saraiki
p	t̪	ʈ	tʃ	k
b	d̪	ɖ	dʒ	ɡ
pʰ	t̪ʰ	ʈʰ	tʃʰ	kʰ
bʱ	d̪ʱ	ɖʱ	dʒʱ	ɡʱ
ɓ		ɗ	ʄ	ɠ
m	n	ɳ	ɲ	ŋ
mʱ	nʱ	ɳʱ
	s		(ʃ)	(x)
	(z)			(ɣ) ɦ
	ɾ l	ɽ
	ɾʱ lʱ	ɽʱ
w			j
wʱ

Punjabi
p	t̪	ʈ	tʃ	k
b	d̪	ɖ	dʒ	ɡ
pʰ	t̪ʰ	ʈʰ	tʃʰ	kʰ
m	n	ɳ	[ɲ	ŋ
(f)	s	ʃ
	(z)			ɦ
	ɾ l	ɽ ɭ
[w]			[j]

Nepali
p	t̪	ʈ	ts	k
b	d̪	ɖ	dz	ɡ
pʰ	t̪ʰ	ʈʰ	tsʰ	kʰ
bʱ	d̪ʱ	ɖʱ	dzʱ	ɡʱ
m	n			ŋ
mʱ	nʱ
	s		ʃ	ɦ
	ɾ l
	ɾʱ lʱ
[w]			[j]

Sylheti^[46]
	t̪	ʈ
b	d̪	ɖ		ɡ
m	n			ŋ
ɸ	s		ʃ	x
	z			ɦ
	r l

Sindhi
p	t̪	ʈ	tʃ	k
b	d̪	ɖ	dʒ	ɡ
pʰ	t̪ʰ	ʈʰ	tʃʰ	kʰ
bʱ	d̪ʱ	ɖʱ	dʒʱ	ɡʱ
ɓ		ɗ	ʄ	ɠ
m	n	ɳ	ɲ	ŋ
mʱ	nʱ	ɳʱ
	s		(ʃ)	(x)
	(z)			(ɣ) ɦ
	ɾ l	ɽ
	ɾʱ lʱ	ɽʱ
w			j
wʱ

Marwari
p	t̪	ʈ	tʃ	k
b	d̪	ɖ	dʒ	ɡ
pʰ	t̪ʰ	ʈʰ	tʃʰ	kʰ
bʱ	d̪ʱ	ɖʱ	dʒʱ	ɡʱ
ɓ	ɗ̪	ɗ		ɠ
m	n	ɳ
mʱ	nʱ
	s			ɦ
	ɾ l	ɽ ɭ
w			j
wʱ

Hindustani
p	t̪	ʈ	tʃ	(q)	k
b	d̪	ɖ	dʒ	(ɣ)	ɡ
pʰ	t̪ʰ	ʈʰ	tʃʰ	(x)	kʰ
bʱ	d̪ʱ	ɖʱ	dʒʱ		ɡʱ
m	n	(ɳ)
(f)	s	(ʂ)	ʃ	(ʒ)
	(z)				ɦ
	[r] ɾ l	ɽ
		ɽʱ
ʋ[w]			j

Assamese
p	t	k
b	d	ɡ
pʰ	tʰ	kʰ
bʱ	dʱ	ɡʱ
m	n	ŋ
	s	x
	z	ɦ
	ɹ l
[w]

Bengali
p	t̪	ʈ	tʃ	k
b	d̪	ɖ	dʒ	ɡ
pʰ	t̪ʰ	ʈʰ	tʃʰ	kʰ
bʱ	d̪ʱ	ɖʱ	dʒʱ	ɡʱ
m	n
			ʃ	ɦ
	ɾ l	ɽ
[w]			[j]

Gujarati
p	t̪	ʈ	tʃ	k
b	d̪	ɖ	dʒ	ɡ
pʰ	t̪ʰ	ʈʰ	tʃʰ	kʰ
bʱ	d̪ʱ	ɖʱ	dʒʱ	ɡʱ
m	n	ɳ
mʱ	nʱ	ɳʱ
	s		ʃ	ɦ
	ɾ l	ɭ
	ɾʱ lʱ
w			j

Marathi
p	t̪	ʈ	ts	tʃ	k
b	d̪	ɖ	dz	dʒ	ɡ
pʰ	t̪ʰ	ʈʰ		tʃʰ	kʰ
bʱ	d̪ʱ	ɖʱ	dzʱ	dʒʱ	ɡʱ
m	n	ɳ
mʱ	nʱ
	s			ʃ	ɦ
	ɾ l	ɭ
	ɾʱ lʱ
w			j
wʱ

Odia
p	t̪	ʈ	tʃ	k
b	d̪	ɖ	dʒ	ɡ
pʰ	t̪ʰ	ʈʰ	tʃʰ	kʰ
bʱ	d̪ʱ	ɖʱ	dʒʱ	ɡʱ
m	n	ɳ
	s			ɦ
	ɾ l	[ɽ] ɭ
		[ɽʱ]
[w]			[j]

Sinhala
p	t̪	ʈ	tʃ	k
b	d̪	ɖ	dʒ	ɡ
ᵐb	ⁿ̪d̪	ᶯɖ		ᵑɡ
m	n		ɲ	ŋ
	s			ɦ
	ɾ l
w			j

Language and dialect

In the context of South Asia, the choice between the appellations “language” and “dialect” is a difficult one, and any distinction made using these terms is obscured by their ambiguity. In one general colloquial sense, a language is a “developed” dialect: one that is standardised, has a written tradition and enjoys social prestige. As there are degrees of development, the boundary between a language and a dialect thus defined is not clear-cut, and there is a large middle ground where assignment is contestable. There is a second meaning of these terms, in which the distinction is drawn on the basis of linguistic similarity. Though seemingly a “proper” linguistics sense of the terms, it is still problematic: methods that have been proposed for quantifying difference (for example, based on mutual intelligibility) have not been seriously applied in practice; and any relationship established in this framework is relative.^[47]
