Making of the election visuals

This post is by Anand, Chief Data Scientist at Gramener

I was on vacation in December – which is a rare occurrence, since my work is such fun that I don’t need much of a vacation. But the upside of a formal vacation is that I get to switch my phone off and get several uninterrupted hours.

Code for the 2013 election scraper

So, on Sat Dec 14 2013, after watching Men in Black III, I got bored and scraped the 2013 assembly election results from the ECI website. (My memory isn’t very good. I know the date only because I track my code on git and my movie watching on Excel.)

Unlike many other ECI pages, this one was fairly easy. You can see the code here – it’s fairly small – and I had the names, parties and votes for each of the 7,238 candidates from the 2013 elections.

The next weekend, I took a shot at scraping the election statistics page that had results for every assembly election. This was a tougher challenge by an order of magnitude. Most of the old results are in PDF files. I tried a few PDF table parsing solutions, like Tabula, but they all involved significant manual effort, or were not accurate enough.

Finally, I used xpdf to convert the PDF to text, and then parsed the text. There were a number of quirks that needed to be taken into account. The detailed results, for example, start where the words “DETAILED RESULTS” are mentioned for the first time in the document. The exceptions are Goa 1989 and UP 1996. Similarly, most elections have a serial number before the name of the candidate. Some 2008-2009 elections, however, have two numbers before the name. I’ve no idea what these are.
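As a rough illustration of those two quirks (this is not the actual parser from the repository, just a minimal sketch), the logic could look something like the following. The function names, the regular expression and the sample lines are all hypothetical, and the real text layout produced by xpdf may well differ from what is assumed here.

```python
import re

MARKER = "DETAILED RESULTS"

# Assumed layout: one or two leading integers, then the rest of the line.
# The meaning of the optional second number is unknown; it is only captured.
LEADING_NUMBERS = re.compile(r"^\s*(\d+)\s+(?:(\d+)\s+)?(\S.*)$")


def detailed_section(text):
    """Return the text from the first occurrence of the marker onward ('' if absent)."""
    start = text.find(MARKER)
    return text[start:] if start != -1 else ""


def split_serial(line):
    """Split a line into (leading numbers, remainder), or None if it has no leading number."""
    match = LEADING_NUMBERS.match(line)
    if match is None:
        return None
    first, second, rest = match.groups()
    numbers = [int(first)] + ([int(second)] if second else [])
    return numbers, rest.strip()


# Made-up sample lines, purely for illustration.
one_number = split_serial("1 A KUMAR INC 12345")
two_numbers = split_serial("12 3 B DEVI BJP 678")
```

A sketch like this would still need special handling for the exceptions mentioned above (Goa 1989 and UP 1996), and it cannot tell a second leading number apart from a name that happens to start with digits.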

But after a week of struggling with the variations, I finally had a parser that captured all the assembly election results. All of this code and data is available on the datameet election data repository under an open license.

Next came the visualisation, for which I settled on this one:


Each row shows who won the assembly elections and stayed in power for how long – effectively capturing the history of Indian assembly elections at one shot.
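The data behind a row like that can be thought of as a list of tenure spans. Here is a minimal sketch of one way to derive them, assuming each state's results have already been reduced to (election year, winning party) pairs. The function name, the input shape and the sample data are hypothetical, and the sketch ignores anything that happens between elections.

```python
def tenure_spans(results):
    """Collapse (year, winner) pairs into spans of consecutive wins by one party.

    Each span is a dict with 'party', 'start' and 'end', where 'end' is the
    year the next different winner took over, or None for the latest span.
    """
    spans = []
    for year, winner in sorted(results):
        if spans and spans[-1]["party"] == winner:
            continue  # same party re-elected: the current span simply continues
        if spans:
            spans[-1]["end"] = year  # previous winner's run ends at this election
        spans.append({"party": winner, "start": year, "end": None})
    return spans


# Made-up results for a single state, purely for illustration.
sample = [(2008, "B"), (1998, "A"), (2003, "A"), (2013, "A")]
spans = tenure_spans(sample)
```

Each span then maps to one coloured segment in a row, with its width proportional to the number of years between `start` and `end`.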

Clicking on the top-level visual drills down into the details of each election, broken down in a variety of ways, answering questions such as: Which party has won in a given constituency, and how did it perform over time? What kind of presence does a party have in a state? What is the geographic distribution of the winning party?



Over several months, this evolved into multiple branches of visualisations that now power the CNN-IBN Microsoft Analytics Centre and the Economic Times constituency tracker.

My favourite moment was on 7th April, the day of the launch, when I was at the studio and ended up sneaking my way onto national television (which, of course, is a lot more fun than being on it legitimately).

At the moment, all of our public-facing visuals focus on the Lok Sabha elections, but the origins were in the assembly elections, and we’ll be improving those visuals and making them public as well in a few months.

2 thoughts on “Making of the election visuals”

  1. Brilliant work by the team. But I seriously think your team missed out on vote share %. It is a very important attribute and shows significant traits beyond seats. You had votes for every seat, so you could have shown Pol-date/State/Year-wise vote percent.
