Invasion of the info age Indiana

Ishan Srivastava

They incessantly talk about mining and scraping and digging out. An innocent bystander might mistake the discussion for one about geological activity. But this is a crowd made up of people passionate about only one thing: data. Loads and loads of it. From geeks wearing T-shirts with “Hacker” written on them to social scientists and NGO activists in kurtas. From 14-year-old hackers to grey-haired policy researchers. They usually meet over mailing lists, Google groups and video calls. But now the movement is getting more visible with events in Gurgaon, Bangalore, Hyderabad and Chennai.

And if you sit through these discussions, you will be surprised at what’s thrown up. You will, for instance, find out that when it rains, the groundwater level actually goes down in Orissa even as it goes up in Rajasthan (as expected). That students born in August and September, on average, do better than other students in 10th and 12th exams in India, while those born in June score the lowest on average. It is the excitement of finding such counter-intuitive facts that brings enthusiasts together. Some just work towards procuring this information while others work to derive meaningful trends from it. “Earlier, if you wanted to find such trends, you would have to go join a research firm and work with data provided by them. Today, availability of such information has converted every inquisitive person into a scientist and every computer into a laboratory,” says S Anand, chief data scientist at startup Gramener.com and a key person behind the movement. “Official reports don’t always address your questions. Working with data allows you to ask questions which matter to you and find your own answers.”

Some are also driven by gaps in information that exist in our society. Many large cities in India, including metros like Chennai, don’t even have an updated list of bus stops or bus routes. Some people at data meets take up documenting this information and making it available to the public, free, in a usable form.

It all started in January last year when Anand was at Infosys along with his colleague and friend Thejesh G N. The duo dabbled in data analysis and visualisation techniques before getting in touch with other like-minded people. The interactions started with mailing lists and soon led to Skype sessions. A common theme was: where do we get more data? “We were of the view that in India, if we looked hard enough for data, we would find it,” he says.

Data is obtained through various means. It can be publicly available information or information accessed by making use of the RTI Act. Other sources include research from books and outreach programmes, i.e. collecting data from the field.

Beginning October last year, the group also started holding small meetings, with about 10-20 participants at the first few sessions. Its first formal event, the Open Data Camp (ODC), was held at Google’s office in Bangalore on March 24 this year. Around 250 registered, of which about 150 attended. Another one took place in Hyderabad on June 23 at the Indian School of Business campus. The group has been growing online, too, with 2-3 registrations on its mailing list every day. “The community is expanding mainly through word of mouth,” says Nisha Thompson, project manager at India Water Portal and a key figure behind the organisation of data meets.

Data meets in various cities are independent of each other, have different organisers and have their own focus, based on members’ interests and backgrounds. While people interested in the tech aspects dominate the Pune and Gurgaon sessions, people in Bangalore focus more on finding sources of data and their application. In Hyderabad, they focus on corporate data while the Chennai crowd is predominantly drawn from the social sector. The common thread that binds the community, albeit loosely, is shared interests without a larger defined agenda.

In terms of focus, August last year was a turning point for most such groups. It was the first time they were approached by NGOs who needed data as well as analysis of the information they had. “This was the perfect collaboration. It went from a geek forum to a geek-plus-NGO forum,” says Anand.

Soon, individuals with a strong background in technology who excelled in techniques like scraping (pulling data from web pages and other readable formats such as PDF) were joined by social researchers and activists who saw this exercise as an effective tool to create more transparency and accountability in governance. Transparent Chennai, a non-profit organisation that takes up pedestrians’ problems, slum issues, accountability of councillors and public toilets, was one such organisation. “Data about the poor is simply not collected and most of what is provided to us can just be a bundle of papers,” says Nithya Raman, founder of Transparent Chennai and also a speaker at a recent data meet in Bangalore.

The NGO has been working with data meets to create data sets, such as those measuring access to water and mapping people living in recognised and unrecognised slums. More often than not, the result is an improvement upon existing government data, which is refined further to make it relevant for more users. In the case of slums, for instance, their data suggests that more people live in unrecognised slums, with no clear rights, than in the official database. “We are also inviting municipal corporation officials to attend data meets from now on,” says Raman. “From research to action, this is how we are trying to close the loop.”

However, it doesn’t mean that this is the only path that the data community will take. “We have deliberately kept the community loose. We don’t want to set directions. Even if we try to, it won’t work,” says Anand. “It is primarily a knowledge-sharing platform driven by people’s interests.”

Along with larger events, there are a number of smaller events, too, which provide a platform for focused discussion and bring interested data novices into the fold. They go by various names (hackathons, scrapathons, designjams, datajams) and are aimed at different groups of people, but what brings them all together is an unrelenting love for data.

