Comparing school performance

Continuing the design jams, we had one at Akshara’s office last weekend. The dataset we decided to pursue was the Karnataka SSLC results, which we had for 5 years.

We addressed two questions:

  1. How do Government schools perform when compared to private schools?
  2. How does the medium of instruction affect marks in different subjects?

Here’s the result of comparing Government and private schools.

govt-private-schools

Each box is a school. The size of the box represents the number of students from that school who appeared in the Class X exam. (Only schools with at least 60 students were considered.) The colour represents the average mark – red is low, and green is high.
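The numbers behind such a chart can be sketched as follows. This is a minimal illustration, not the code used at the design jam: the record layout (one `(school_id, mark_percent)` pair per student), the function names and the plain red-to-green ramp are all assumptions.

```python
from collections import defaultdict

MIN_STUDENTS = 60  # threshold stated in the text


def summarise_schools(records, min_students=MIN_STUDENTS):
    """records: iterable of (school_id, mark_percent) pairs, one per student.

    Returns {school_id: (student_count, average_mark)} for schools that
    have at least `min_students` students.
    """
    totals = defaultdict(lambda: [0, 0.0])  # school -> [count, sum of marks]
    for school, mark in records:
        totals[school][0] += 1
        totals[school][1] += mark
    return {
        school: (count, mark_sum / count)
        for school, (count, mark_sum) in totals.items()
        if count >= min_students
    }


def mark_to_colour(avg, low=0.0, high=100.0):
    """Linear red-to-green ramp (an assumed scale): low -> red, high -> green."""
    t = min(max((avg - low) / (high - low), 0.0), 1.0)
    red, green = round(255 * (1 - t)), round(255 * t)
    return f"#{red:02x}{green:02x}00"
```

The student count would drive the box size and `mark_to_colour` the fill, whichever treemap layout is used to draw the boxes.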

What’s immediately obvious is that private schools perform much better on average than Government schools. What’s less clear is where this difference starts. The series of graphs below shows the number of schools in various mark ranges. The first shows schools with an average of 0 – 30%, the next 0 – 40%, and so on until 0 – 80%. The series then shows schools with an average of 30% – 100%, then 40% – 100%, and so on until 80% – 100%.

bschool-00-30, bschool-00-40, bschool-00-50, bschool-00-60, bschool-00-70, bschool-00-80, bschool-30-100, bschool-40-100, bschool-50-100, bschool-60-100, bschool-70-100, bschool-80-100
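The counts behind these twelve graphs can be reproduced roughly as below. This is a sketch under assumptions: the input is a list of `(school_type, average_mark)` pairs, and range bounds are treated as inclusive, which the original does not specify.

```python
# The twelve ranges described in the text: 0-30 ... 0-80, then 30-100 ... 80-100.
RANGES = [(0, hi) for hi in range(30, 90, 10)] + [(lo, 100) for lo in range(30, 90, 10)]


def count_by_range(schools, ranges=RANGES):
    """schools: iterable of (school_type, average_mark) pairs.

    Returns {(lo, hi): {school_type: count}}. The ranges overlap, so one
    school can be counted in several of them.
    """
    counts = {r: {} for r in ranges}
    for school_type, avg in schools:
        for lo, hi in ranges:
            if lo <= avg <= hi:  # inclusive bounds are an assumption
                bucket = counts[(lo, hi)]
                bucket[school_type] = bucket.get(school_type, 0) + 1
    return counts
```

Comparing `counts[(0, 30)]` with `counts[(80, 100)]` gives the two ends of the series.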

From the first graph, you can see that there are about as many poor schools (average 0 – 30%) among private schools as among Government schools. But from the last graph, you can see that there are far more good private schools (average 80 – 100%) than good Government schools.

So, there are poorly performing schools among the private schools as well. However, there are very few excellent Government schools.

We also compared the impact of the medium of instruction across subjects. The table below shows a box for each subject taken under each medium of instruction. The size of the box represents the number of students taking that combination. The colour indicates the average mark (red is low, green is high).

subject-medium

Clearly, Sanskrit is a high-scoring language. (At least one person at the design jam chose Sanskrit for this very reason.) Kannada scores well too – especially as a first or third language; but not as well as a second language.

On average, English medium students have the highest marks, followed by Kannada medium students. Students studying in other mediums of instruction perform poorly in most subjects barring their language.

There’s clearly a strong correlation between the medium and the subject. Kannada medium students score high in Kannada, Urdu medium students score high in Urdu, and so on. But while English medium students do score high in English, they tend to score much better at Kannada, Urdu and Sanskrit!

You can explore these results at http://gramener/karnatakamarks/

Migration patterns

ISB’s SRITNE (Srini Raju Centre for IT and the Networked Economy) is a research centre that focuses on the business and societal value of IT. Gramener collaborates with SRITNE to develop and promote visual analytics, and to help foster a culture of open data within the community.

As part of this collaboration, we jointly presented ‘Visualisation of Migration Patterns in India’ at the Bangalore Open Data Camp. ISB provided access to the source dataset for this analysis, and the visualisations were done primarily on the Gramener Visualisation Server. MS Excel and R were used for exploratory analysis. The aim of this exercise was to take an outside, analytics view of the migration patterns and explore alternate possibilities of representation and visualisation, rather than to view them through the lens of a demographics analysis expert.

The Indian NSSO (National Sample Survey Office) conducted the 64th round survey on ‘Employment & Unemployment and Migration Particulars’ from July ’07 to June ’08, covering 1,25,578 households and 5,72,254 persons.

Of this sample, ~30% were found to be migrants, i.e. those whose last usual place of residence (UPR) was different from the present place of enumeration. In this survey, the usual place of residence of a person was defined as a village/town where the person had stayed continuously for six months or more. Amongst the migrants, a majority were found to be moving within the state (85%) as opposed to those moving across states (15%). Women formed a sizeable majority of this migrant population.

Intra-state migration patterns

image

The map on the left has the intra-state migration pattern (excluding inter-state numbers), showing the absolute number of migrants moving within each state/UT. Green indicates higher migration and red lower. Based on this map, the 5 most populous states in India account for the top 5 intra-state movements, except for Bihar, which comes a close 6th. If we rescale the numbers by taking migrants as a percent of the state/UT’s survey size, as shown in the right map, the results change completely. The top 5 states with the highest percent churn are Andhra Pradesh, Himachal Pradesh, Kerala, Gujarat and Andaman & Nicobar Islands.
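The rescaling step is simple enough to sketch. The function names and the dictionary inputs (state mapped to person counts) are assumptions for illustration, and the figures in the usage note are made up, not taken from the survey.

```python
def migrant_share(migrants, surveyed):
    """Both arguments map state -> number of persons.

    Returns state -> migrants as a percent of that state's survey size.
    States with no surveyed persons are skipped to avoid dividing by zero.
    """
    return {
        state: 100.0 * migrants[state] / surveyed[state]
        for state in migrants
        if surveyed.get(state)
    }


def top_n(values, n=5):
    """Return the n keys with the largest values, largest first."""
    return sorted(values, key=values.get, reverse=True)[:n]
```

With toy inputs such as `migrants = {"X": 300, "Y": 100}` and `surveyed = {"X": 3000, "Y": 200}`, state X leads on absolute numbers while state Y leads on percent, which is the kind of reversal the two maps show.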

Inter-state migration patterns

image

If we now look at the inter-state migration pattern (excluding within-state movements) by plotting the net inflow of migrants into each state/UT (left-hand map), the states with the highest net outflow of migrants are Uttar Pradesh and Bihar, while those with the highest net inflow are Maharashtra and Delhi. If we rescale the numbers as a percent of the state/UT’s survey sample, the story changes yet again. The Union Territories all come out with the highest net percent inflow, with Chandigarh showing the highest value at 41%.
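Net inflow here is people arriving from other states minus people leaving for other states. A minimal sketch, assuming the flows are available as a dictionary keyed by `(from_state, to_state)` (a hypothetical layout, not the NSSO file format):

```python
from collections import Counter


def net_inflow(flows):
    """flows maps (from_state, to_state) -> number of migrants.

    Returns state -> inflow minus outflow. Within-state movements are
    ignored, matching the inter-state view described in the text.
    """
    net = Counter()
    for (origin, dest), people in flows.items():
        if origin == dest:
            continue  # skip intra-state movement
        net[dest] += people
        net[origin] -= people
    return dict(net)
```

Because every migrant leaves one state and enters another, the values always sum to zero, which is a handy sanity check before rescaling by survey size.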

Inter-state migration Heat-map

state-migration-heatmap

To get a sense of the exchange of migrants between the states, we plotted the numbers on a heat-map. The y-axis of the heat-map has the ‘From-State’ while the ‘To-State’ is on the x-axis. The height of each heat-map box is proportional to the net outflow from the contributor state, while the width of each box is proportional to the net inflow into the recipient state. The colour represents the number of people moving between the two states: the darker the box, the more people.

As can be seen, the top destinations for people leaving UP are Delhi, Maharashtra and Uttaranchal, in that order. For Bihar and Rajasthan, the top destinations are highlighted accordingly. What is more interesting is the pattern of top destinations for each of the states. A clear trend is the consistent preference of people across regions to migrate into geographically nearby states. The survey also covered a set of international in-migrants: Bangladesh, the top contributing country, has a sizeable proportion of its migrants moving to West Bengal.
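The grid behind such a heat-map is an origin-destination matrix. The sketch below builds one from a hypothetical `(from_state, to_state) -> people` dictionary. It uses the row and column totals (gross outflow and inflow) to size the boxes, which is an assumption, since the description above speaks of net figures.

```python
def od_matrix(flows):
    """flows maps (from_state, to_state) -> number of migrants.

    Returns (states, matrix, row_totals, col_totals), where matrix[i][j]
    is the number moving from states[i] to states[j].
    """
    states = sorted({state for pair in flows for state in pair})
    index = {state: i for i, state in enumerate(states)}
    matrix = [[0] * len(states) for _ in states]
    for (origin, dest), people in flows.items():
        matrix[index[origin]][index[dest]] += people
    row_totals = [sum(row) for row in matrix]        # outflow per origin
    col_totals = [sum(col) for col in zip(*matrix)]  # inflow per destination
    return states, matrix, row_totals, col_totals
```

Each cell value would set the colour, each row total the box height and each column total the box width.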

Migration across Rural-Urban areas

image

When migration was viewed from the perspective of movement across rural and urban areas, a surprising trend was the extent of movement within rural areas – more than half of migration in India happens between rural regions. About 40% of migration is towards urban areas. The Union Territories and North-Eastern states run counter to this trend: over 70% of migration in these areas is towards urban regions, unlike the rest of India.

Reasons for migration

migration-reason-age-gender

When we analysed the reasons for migration at the country level, some key trends emerged. Women, who form a sizeable majority of the migrants, primarily migrate on account of ‘Marriage’, and their typical age at marriage is between 15 and 24. For men, the key reason for migration is ‘Employment-related’, and this primarily happens in the age band of 18 to 40. Consequently, migration due to ‘Movement of Parent/Earning member’ forms another key reason. ‘Education’ is also a driver of migration, and this typically happens for men and women until the age of around 23 years.

When we looked at the reasons for migration by state, a few interesting patterns showed up. People in Tripura migrate mostly due to forced reasons/disasters, whereas UP witnesses marriage-related movement. Kerala and West Bengal witness migration because of housing-related reasons, whereas a lot of people in the scenic state of Himachal Pradesh migrate for post-retirement life.

migration-status-reason

It is evident from the above heatmap that a majority of the women who migrate for marriage end up doing domestic duties, while men who move for employment end up as wage employees/labourers.

migration-reason-year-gender

The survey sample had a good mix of people who had migrated over the years, dating as far back as the 1930s. When we analysed how the reasons for migration evolved, interesting trends emerged. Until Independence, migration was subdued and was restricted to women getting married. Post-Independence, migration numbers increased steadily over the next 60 years. After the 1970s, more and more people started moving for employment-related reasons. This was accompanied by migration of their dependent families. Only after the 1990s did people move in significantly larger numbers, and for reasons such as business, education, housing, post-retirement and healthcare – more in line with the Indian economic development story over the past 60 years!

Karnataka ground water quality

We took Karnataka’s ground water quality data from a 2004 Karnataka Rural Water Supply and Sanitation Agency (KRWSSA) report (via IndiaWaterPortal), and tried to see if there were any patterns.

The executive summary of the report shows the number of villages in each district affected by problems of excess nitrate, iron, fluoride or total dissolved salts. We plotted those on the district map.

karnataka-water-quality-2004

The interesting pattern that emerges is that the places that have excess nitrate (NO3) concentration are the ones that have excess iron (Fe) concentration as well. The places that have excess fluoride (F) concentration are the ones that have excess salts (TDS).

karnataka-water-quality-correlation

The correlation scatterplot alongside further demonstrates this point. There is a fairly good correlation between Fe and NO3, and between TDS and F. There’s little correlation across these groups, however.

(From a glance at the scatterplots, though, it becomes immediately obvious that this is based on too few data points.)
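The correlation mentioned above can be checked with a plain Pearson coefficient over the per-district counts of affected villages. This is a sketch only: the post does not say which correlation measure the scatterplot used, so Pearson is an assumption, and the function name is illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)
```

Calling `pearson(fe_villages, no3_villages)` with one value per district in each list would give a figure near 1 for a strong positive relationship and near 0 for none. With only one point per district, the caveat about too few data points applies to this number as well.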

We can also readily see that the bulk of the water quality issues are in interior Karnataka. The coastal areas are relatively fine.

Geographic visualisations are an extremely powerful way of inferring patterns when the underlying data is geographic in nature. At Gramener, we use our visualisation server to automatically create graphs such as these based on an underlying data source.

We have provided a sample of our tool at http://gramener.com/indiamap. You can enter your own data and see how it shows up on any district or state map. Happy mapping!