Three Ways of Knowing the Same People


There is a community of about 7 million people in eastern India, spread across Jharkhand, West Bengal, Bihar, and Odisha, who remember, in song, walking through a mountain pass guided by a deity to reach a golden land of five rivers, which they then had to abandon overnight. Modern genetics can trace where their paternal ancestors came from, some 4,000 years ago. And ancient Sanskrit texts called their kings demons.

The community is the Santal people. The Santals are one branch of the larger Munda family — along with the Mundari, the Ho, the Kharia — who together form the Austroasiatic substrate of eastern India. They are not marginal or obscure; they are among the oldest populations in South Asia with a continuous oral tradition. And they are knowable through three completely different archives: their own songs, the genetic record, and the outside world’s mythological gaze.

Each archive gives a radically different picture. The friction between the three is the real history.


Lens One: The Song

In the 1870s, a Norwegian missionary named Lars Skrefsrud sat with Santal elders in the Santhal Parganas — the upland territory the British had administratively carved out of Bengal and Bihar — and wrote down what they told him. The result was the Horkoren Mare Hapramko Reak Katha, “The Story of the Ancestors of the Santal People.” It is an origin narrative. It is also, with careful reading, a migration itinerary.

The journey runs in five named stages.

Hihiri Pipiri. The primordial homeland. Pipiri-am in Santali means butterfly — a butterfly homeland: mild, temperate, pleasant. This is before the trouble begins.

Harata. Near-annihilation. A great fire reduces the entire people to a single couple. The Santal oral tradition preserves this catastrophe without explaining its cause. Was it a real fire? A famine? A military bottleneck — a moment when the population was compressed by external force and nearly extinguished? One plausible reading: the Harata episode encodes a genuine demographic crisis, possibly corresponding to the period when Munda populations came under severe pressure. This is contested and not yet demonstrated linguistically — the honest position is that we do not know what event it encodes.

Sasan Beda. Recovery and codification. The survivors reach an alluvial plain by a great river and begin again. Seven clans crystallize: Murmu, Kisku, Hembrom, Marandi, Soren, Tudu, Hansdak. Exogamous clan laws are established: you may not marry within your clan. This is not incidental detail; this is the founding of a society. The seven clans persist today, unchanged, as the organizing structure of Santal identity.

Jarpi. An impassable mountain range. The people cannot cross. They propitiate মারাং বুরু (Marang Buru, “Great Mountain”), the supreme deity of the Santal pantheon. Marang Buru reveals the সিন দুয়ার (Sin Duar, “Sun Door”), a pass through the mountains. The people cross.

সিন দুয়ার sin duar · door/gate; from Sanskrit dvāra, cognate with English 'door'; a door has two leaves (compare Latin duo, Greek duo, Sanskrit dvi, 'two'), and in Santali du- carries the sense of opening or beginning; all these meanings converge in this single word

Chae Champa. The golden kingdom. The Santals arrive in a fortified, agriculturally advanced, sovereign land. This is the apex of the narrative: prosperity, self-governance, cultural coherence. And then — according to the tradition — they had to leave. Overnight.

The Sacred Thread Episode

There is one detail in the Horkoren that I keep returning to because it is so precisely structured as a memory of loss.

The seven sons — the founders of the seven clans — once wore the sacred thread. The sacred thread, the yajñopavīta, is the Brahmanical marker of twice-born caste status. It confers ritual standing — access to Vedic rites, the right to be initiated, the acknowledgment of full humanity within the social order the Brahmanical system created.

In the Santal narrative: while the seven sons bathed in a river, the Dhaimana snake crept out and stole all seven threads. Since that day, the Santals do not wear the sacred thread. And since that day, they kill the Dhaimana snake on sight.

The architecture of this memory is striking. It does not say “we never had the sacred thread.” It says: we had it, and it was stolen. The specificity — seven threads, seven sons, one snake, the river, the bathing — is the specificity of traumatic memory, not of symbolic invention. This encodes a claim: there was a time when the Santal people held ritual standing comparable to the twice-born castes, and that standing was taken from them.

Taken by whom? The snake is a figure, not a person. But the Brahmanical order that denied Munda peoples ritual status was very much composed of persons. One plausible reading: the Dhaimana snake compresses the process by which Austroasiatic peoples were written out of the Vedic ritual hierarchy — a process that happened over centuries, collapsed into a single image.

The Kherwal Identity

The Santals do not primarily call themselves Santals. The autonym, the name a people calls itself, is খেরওয়াল (Kherwāl). The Kherwal (also written Kherwar) identity encompasses Santals and several related Munda-speaking peoples: a confederal self-designation predating the colonial administrative categories that sorted them into “tribes.”

In the 19th century, the Kherwar revivalist movement emerged in the Santhal Parganas, explicitly attempting to restore a pre-Hindu identity — to recover something of the selfhood that existed before the sacred thread was stolen. It resisted both Brahmanical Hinduism and Christian missionary conversion, insisting on the integrity of Bon Bonga (forest spirit religion) and the Santal seasonal calendar. It was an argument that the Horkoren narrative was not merely a story but a political claim.


Lens Two: The Y Chromosome

The genetic picture for the Munda peoples is striking and, at first glance, counterintuitive.

The majority of Munda men — across Santali, Mundari, and Ho communities — carry Y-chromosome haplogroup O1b1a1a. This is a Southeast Asian male lineage. It is widespread in Mon, Khmer, and Vietnamese populations today. It is not a South Asian lineage; it originated somewhere in mainland Southeast Asia and spread north and west.

The genetic inference, synthesized by linguist Paul Sidwell in 2018, is that the male ancestors of the Munda peoples arrived on the Odisha coast, by sea, around 2000–1500 BCE, roughly 3,500 to 4,000 years ago. They came from the east.

The maternal (mitochondrial DNA) picture is entirely different: Munda maternal lineages are overwhelmingly South Asian, shared with neighboring populations. This asymmetry — Southeast Asian fathers, South Asian mothers — is the genetic signature of a male-biased migration. A relatively small group of men, arriving by boat on the eastern coast of the subcontinent, integrating with the existing South Asian population. The oral memory “we came from elsewhere” is genetically corroborated, specifically in the paternal line.

Now hold this next to the oral tradition. The Horkoren describes a journey ending with the Santals arriving through a mountain pass — the Sin Duar revealed by Marang Buru. The genetics says the male ancestors came from the east, by sea.

These do not contradict each other. They operate at different timescales.

The genetic origin is 4,000 years old. The oral migration memory may be a much more recent memory: the movement of the Munda peoples within the subcontinent, centuries or millennia after the Southeast Asian seafarers had long since settled and integrated with South Asian populations. The song remembers the last great migration. The genome remembers the first arrival. Neither is wrong. They are different things remembered by different media.

This is what I mean by the friction between archives. Each is accurate. Each is incomplete. The real history lives in the gap between them.


Lens Three: The Sanskrit Gaze

The Mahabharata has a genealogy of the eastern kingdoms. It is not flattering.

Vanga (Bengal), Anga (eastern Bihar), Pundra (north Bengal), Suhma (southwest Bengal), and Kalinga (Odisha) are all described as founded by the illegitimate sons of Bali — the অসুর (Asura, “demon-king”), defeated and pressed into the underworld by Vamana, the dwarf avatar of Vishnu. The eastern kingdoms are, in Brahmanical cosmography, demonic in origin.

অসুর asura · demon; but in earlier Vedic usage also 'lord, powerful one' — the fully negative sense developed as Aryan/non-Aryan cultural boundaries hardened

This is not mythology in the sense of being fictional. This is mythology in the sense of being a structured political statement dressed as cosmogony. The Mahabharata is acknowledging — in the only register available to it — that the eastern kingdoms had non-Aryan ruling elites of genuine power who could not simply be ignored, and so had to be incorporated into the cosmology as subordinate, marked, permanently stigmatized.

The Buddhist text Arya Manjusri Mulakalpa — composed around the 8th or 9th century CE — describes the people of Vanga and Harikela (coastal Bengal) speaking “Asura-speech.” But here the text does something unexpected: it praises this speech as “fluent and poetic like the current of the Ganga.” The designation Asura here is sociolinguistic — it means: eastern, powerful, non-Aryan, and eloquent. The Asura-speakers are not monstrous; they are eloquent in a register the text cannot fully assimilate into its own categories.

The Arthashastra of Kautilya, the manual of statecraft traditionally dated to the 4th century BCE, describes the asura-vijayi, the demonic conqueror, whose victory is total: the complete seizure of an enemy’s land, goods, cattle, and family. This is the vocabulary of conquest. The complete subjugation and erasure of a defeated people.

The Munda oral histories — the memory of Chae Champa abandoned overnight, the sacred thread stolen, the migration into the hills and forest margins — are the subaltern record of populations on the receiving end of exactly this. The Arthashastra describes the policy from above. The Horkoren describes the same events from below, encoded in song, transmitted across thirty generations.


The Territory of Three States

I want to be precise about geography, because the colonial map distorts this.

The Santal migration itinerary covers territory that is today Jharkhand, Bihar, and Odisha. It is not narrowly “Bengali” history. The Kherwal identity predates the existence of West Bengal, Bihar, and Jharkhand as administrative entities by several thousand years. The British created the Santhal Parganas as a separate administrative unit in 1855 — after the Santhal Rebellion (হুল, Hul, “the uprising”) nearly overthrew British authority in the region. Before the British, there was no hard administrative boundary separating the Santali-speaking uplands from the Bengali-speaking lowlands. There was a zone of contact, trade, labor, and conflict.

Ancient kingdoms of eastern India

The Anga kingdom, centered in what is now eastern Bihar, appears in the same Mahabharata genealogy as Vanga (Bengal) and Kalinga (Odisha). All three are labeled as Asura-descended. The stigmatization was regional, not narrowly Bengali. The outside world’s gaze covered the entire eastern arc, from the Ganga delta to the Odisha coast to the Jharkhand plateau.

This matters. The deep history of Bengali is not narrowly Bengali. It is the deep history of the eastern plateau — a landscape that modern state boundaries have divided but that was, for thousands of years, one terrain.


The Words as Bridge

ঢেঁকি (dheki, “rice-husking lever”). ডাঙা (danga, “raised ground”). ঝাড় (jhar, “forest, waterfall”). হাঁড়ি (hari, “earthen pot”). The words from the first post in this series.

These are the words of the people described in the three lenses above. The Santal oral epic describes people who farmed rice and worked highland ground. The genetic evidence roots them in eastern India, with paternal ancestors landing on the Odisha coast and maternal lineages local to the subcontinent; their territory was the eastern plateau: the Rāṛh, the Jharkhand uplands, the Odisha laterite. The Sanskrit texts acknowledge their power by stigmatizing it.

And their words are in the Bengali language. Not borrowed from a prestigious neighbor but absorbed from a population that was linguistically incorporated — whose vocabulary for daily life traveled into the new language because there were no Sanskrit alternatives for the things it described.

When I say ঢেঁকি (dheki), I am saying something the Santal people said. I do not know its history when I say it. Most Bengali speakers do not. But it is there, in the grain of the language, 4,000 years deep.

That is what substrate words are. Not decoration, not curiosity — the living residue of incorporation.

Sources

  1. Lars Skrefsrud (comp.); trans. P.O. Bodding. Traditions and Institutions of the Santals (Horkoren Mare Hapramko Reak' Katha). Santali original 1887; English trans. 1942. Universitetet i Oslo, Etnografiske Museum Bulletin No. 6. OCLC 7289140.
  2. Paul Sidwell. "The Austroasiatic Language Family". The Oxford Handbook of the Languages of South Asia (2018). Ed. Cardona & Jain. Oxford University Press.
  3. David Reich et al. "Reconstructing Indian Population History". Nature (2009). Vol. 461, pp. 489–494. doi:10.1038/nature08365
  4. Vagheesh Narasimhan et al. "The Formation of Human Populations in South and Central Asia". Science (2019). Vol. 365, eaat7487. doi:10.1126/science.aat7487
  5. H.H. Risley. The Tribes and Castes of Bengal (1891). 4 vols. Bengal Secretariat Press, Calcutta.

Next in this series: The Iron That Cut Their Own Forest