What’s in a name

Even simple information such as the names of candidates can provide a rich field for analysis. (For example, a few years ago, we conducted a similar analysis on the names of students in Tamil Nadu, and found that north Indian surnames consistently outperform.)

Something as basic as the length of a name shows patterns worth analysing.

State-wise name length

Gujarat is among the states whose candidates have rather long names. Rajendrasinh Ghanshyamsinh Rana (Rajubhai Rana) of BJP who won at Bhavnagar is an example of such a name. In Maharashtra, Bhonsle Shrimant Chh. Udyanraje Pratapsinhmaharaj of NCP who won at Satara is another such example.

At the other extreme, G.D. is the full official name of a BSP candidate at Tikamgarh, MP – the shortest name in our elections. Several candidates in Uttar Pradesh (Anil, Aman, Asha, Boby) and Kerala (Babu, Baby) have 4-letter names.
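For readers who want to try this on their own data, here is a minimal Python sketch of a state-wise name length comparison. It is not the code behind the map: the record layout, the function names and the choice to count letters only (ignoring spaces and dots) are our assumptions, and the handful of rows are just the examples mentioned above.

```python
from collections import defaultdict

# Illustrative rows only: (state, candidate name), taken from the examples above.
candidates = [
    ("Gujarat", "Rajendrasinh Ghanshyamsinh Rana"),
    ("Madhya Pradesh", "G.D."),
    ("Uttar Pradesh", "Anil"),
    ("Uttar Pradesh", "Boby"),
    ("Kerala", "Babu"),
    ("Kerala", "Baby"),
]

def name_length(name):
    """Count letters only (assumption: spaces and punctuation do not count)."""
    return sum(ch.isalpha() for ch in name)

def mean_length_by_state(records):
    """Average name length per state."""
    lengths = defaultdict(list)
    for state, name in records:
        lengths[state].append(name_length(name))
    return {state: sum(v) / len(v) for state, v in lengths.items()}

averages = mean_length_by_state(candidates)

# Longest average first, the same ordering a darker-to-lighter map would use.
for state, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{state}: {avg:.1f}")
```

Counting letters rather than characters is what makes "G.D." a 2-letter name here; a different convention would shift every number slightly but probably not the ordering of states.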

Here’s a map of the length of the candidate names – darker regions representing longer candidate names, and lighter regions the shorter names.



Name frequency

The most common name among candidates was Om Prakash – with 122 such candidates (spelt also as Om Parkash). This is apart from several Om Prakash Singhs, Om Prakash Sharmas and others.

Ashok Kumar is the second most popular name, followed by Ram Singh.
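A count like this depends on treating spelling variants as the same name. The sketch below shows one way to do it in Python; the `VARIANTS` map and the `normalise` helper are hypothetical, and the list of names is a toy sample rather than the election data.

```python
from collections import Counter

# Hypothetical spelling map; a real one would be built by inspecting the data.
VARIANTS = {"parkash": "prakash"}

def normalise(name):
    """Lower-case the name, drop dots, and map known spelling variants."""
    words = name.lower().replace(".", " ").split()
    return " ".join(VARIANTS.get(w, w) for w in words)

# Toy sample, not real counts.
names = [
    "Om Prakash", "Om Parkash", "OM PRAKASH",
    "Ashok Kumar", "Ashok Kumar",
    "Ram Singh",
    "Om Prakash Singh",  # a different full name, counted separately
]

counts = Counter(normalise(n) for n in names)
top = counts.most_common(3)
print(top)
```

Matching on the full normalised name is what keeps "Om Prakash Singh" out of the "Om Prakash" total.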

This can lead to confusion at the ballot. For example, at Kollam, Kerala, there were two people named N. Peethambarakurup – one the Congress candidate (who won) and the other an independent. At Thiruvallur (TN), Badaun (UP) and several other places, other candidates had exactly the same name as the winning candidate. However, in none of these cases did the namesake receive enough votes to make a difference to the victory.
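A check of this kind can be sketched in Python as follows. The rows are invented placeholders (seats, names, parties and vote counts are all made up), and we assume one particular reading of "enough votes to make a difference": the namesake's votes being at least as large as the winning margin.

```python
from collections import defaultdict

# Invented rows for illustration: (constituency, candidate, party, votes).
results = [
    ("Seat A", "R. Kumar", "Party X", 5000),
    ("Seat A", "S. Lal", "Party Y", 4200),
    ("Seat A", "R. Kumar", "Independent", 300),
    ("Seat B", "M. Das", "Party Y", 6100),
    ("Seat B", "K. Nath", "Party X", 5900),
]

def namesake_report(rows):
    """For each seat where someone shares the winner's name, compare
    the namesake's votes with the winning margin."""
    by_seat = defaultdict(list)
    for seat, name, party, votes in rows:
        by_seat[seat].append((votes, name, party))

    report = {}
    for seat, cands in by_seat.items():
        ranked = sorted(cands, reverse=True)  # highest votes first
        if len(ranked) < 2:
            continue  # no margin to compare against
        win_votes, win_name, _ = ranked[0]
        margin = win_votes - ranked[1][0]
        key = win_name.strip().lower()
        namesake_votes = sum(
            v for v, n, _ in ranked[1:] if n.strip().lower() == key
        )
        if namesake_votes:
            report[seat] = {
                "margin": margin,
                "namesake_votes": namesake_votes,
                # Assumption: "makes a difference" means votes >= margin.
                "could_matter": namesake_votes >= margin,
            }
    return report

report = namesake_report(results)
print(report)
```

In this toy data only Seat A has a namesake of the winner, and the 300 votes fall well short of the 800-vote margin.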

Word cloud

Breaking up the names, the surname or middle name Singh is by far the most common among all candidates.
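Breaking up names into words is simple to sketch in Python. The rule used below for dropping initials (any token ending in a dot, or a single letter) is our assumption and would need tuning on real data; the sample names are the ones quoted in this post.

```python
from collections import Counter

def name_words(name):
    """Split a name on whitespace and drop initials such as 'N.' or 'Chh.'."""
    return [
        token.title()
        for token in name.split()
        if len(token) > 1 and not token.endswith(".")
    ]

# Names quoted in this post, used only as a small sample.
names = [
    "Om Prakash Singh",
    "Ram Singh",
    "Ashok Kumar",
    "N. Peethambarakurup",
    "Bhonsle Shrimant Chh. Udyanraje Pratapsinhmaharaj",
]

# Count each word at most once per candidate.
word_counts = Counter(word for n in names for word in set(name_words(n)))
print(word_counts.most_common(3))
```

The resulting counts are exactly what a word cloud needs as input: each word sized by its frequency.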

Name trends

Over the years, the “Singh”s have had the strongest representation among the candidates, though the “Kumar”s have grown steadily and significantly to take the second rank. The representation of the “Lal”s has declined from 2nd rank to 3rd, the “Nath”s from 3rd to 5th, and the “Das”s from 5th to 7th. The representation of “Yadav”s has grown steadily as well.
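Rank movements like these can be derived from per-election word counts. Here is a minimal Python sketch, assuming the counts have already been computed; the two election labels and all the numbers are invented for illustration and are not the figures behind the chart.

```python
def rank_words(counts):
    """Return {word: rank}, with rank 1 for the most frequent word.
    Ties are broken alphabetically (an arbitrary choice)."""
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: i for i, word in enumerate(ordered, start=1)}

# Invented counts for two hypothetical elections.
counts_by_election = {
    "earlier": {"Singh": 90, "Lal": 40, "Nath": 30, "Kumar": 20},
    "later": {"Singh": 120, "Kumar": 80, "Lal": 35, "Nath": 15},
}

ranks = {e: rank_words(c) for e, c in counts_by_election.items()}

# Positive change means the word moved up the ranking.
change = {
    word: ranks["earlier"][word] - ranks["later"][word]
    for word in ranks["earlier"]
    if word in ranks["later"]
}
print(change)
```

Ranking within each election, rather than comparing raw counts, keeps the comparison fair when the total number of candidates differs from one election to the next.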

Name ranks

We rarely come across data that is useless or irrelevant. Most data, even plain text, even the names of candidates, can yield insights if looked at in different ways.

Hopefully, this post will inspire some of you to look at your data with a different lens.
