A few times every year, Gramener gets together to share what we’ve learnt, and train those new to the family. Last week, our Boot Camp was conducted across multiple locations for the first time, with the team connecting from Bangalore, Coimbatore, Delhi, Hyderabad, Mumbai, and New Jersey.
It was a packed two-day agenda:
- On analysis, we covered our analysis offerings, Autolysis (our automated analysis tool), and the Spectrum of Analytics, “From simple pivoting to deep learning”.
- On technology, we covered the online and self-hosted services we use for our infrastructure, and how to handle large-scale data (in memory and in databases).
- On design, we discussed Responsive Design, Data Infographics, D3 4.0 and its nuances, and our new charting library Gramex Charts.
- In our quarterly update, a key highlight was our Government work on Swachh Bharat, the Ministry of Commerce Trade dashboard, and the Padma Awards.
- Finally, we awarded the Knights of Gramener — the “Sparks” award for innovation, the “Design Artist”, the “Magician Analyst”, the “Safe Hands”, the “All Rounder”, and more.
On Saturday, our Hyderabad team headed out to Leonia and the Bangalore team to Guhantara to wind down.
In all, it was an instructive and entertaining week for us — but that’s not all. We’d like you to join us as well. In a few months, we’re planning a series of events (public data projects, hackathons, and trainings) that are open to our clients and the public. Keep a watch on this space.
What configuration should a data scientist go for?
A KDnuggets poll indicates a 3-4 core 5-16GB Windows machine.
A StackExchange thread recommends a 16GB RAM, 1TB SSD Linux system with a GPU.
A Quora thread converges around 16GB RAM.
RAM matters. Our experience is that RAM is the biggest bottleneck with large datasets. Things speed up by an order of magnitude when all your processing is in-memory. 16GB of RAM is ideal. Do not go below 8GB.
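As a rough illustration of the "fits in memory" test, here is a minimal sketch. The `overhead_factor` (how much larger data becomes once loaded and processed) and `reserved_bytes` (memory left for the OS and other programs) are assumptions for illustration, not figures from our measurements. Tune them to your own workload.

```python
GB = 1024 ** 3


def fits_in_memory(dataset_bytes, ram_bytes, overhead_factor=3.0, reserved_bytes=2 * GB):
    """Return True if the dataset is likely to be processed fully in RAM.

    overhead_factor and reserved_bytes are illustrative assumptions:
    in-memory copies and intermediate results usually need more space
    than the raw file, and the OS needs some memory for itself.
    """
    usable_bytes = ram_bytes - reserved_bytes
    return dataset_bytes * overhead_factor <= usable_bytes


# Compare an 8GB and a 16GB machine for a hypothetical 4GB dataset.
for ram_gb in (8, 16):
    print(f"{ram_gb}GB RAM:", fits_in_memory(4 * GB, ram_gb * GB))
```

Under these assumed numbers, the 4GB dataset fits comfortably on the 16GB machine but not on the 8GB one.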
Big drives. The next biggest driver is the hard disk speed. But you don’t necessarily need an SSD. If your data fits in memory, then most data access is sequential. For sequential reads, an SSD is only ~2X faster than a regular hard disk, but much more expensive. (If you’re running a database, then an SSD makes more sense.) Among hard disks, larger ones are also faster due to higher storage density. So prefer the 1 TB disks.
The CPU doesn’t matter. Make sure you have more cores than data intensive processes, but other than that, it’s not an issue.
However, one common theme we find is that heavy data science work happens on the cloud, not on the laptop. That’s what you need to be looking for — a good cloud environment that you can connect to.
For example, this Frontanalytics report recommends a basic laptop with long battery life, the ability to multi-task (i.e. multiple cores), and a backlit keyboard for the night.
Maybe you just need a USB port in your arm.
How large is Rs 1,000 crores? Here’s a picture.
Two years ago, when exploring the wealth of candidates, we put together a few visuals to show how large a bundle Rs 1 lakh would form, all the way up to Rs 10,000 crores — in denominations of Rs 1,000.
Post the demonetisation of these notes, we were amused to find that the top searches that led to our blog were:
- volume of 100 crore rupees
- indian money 1000 bundles
- what is the size of an 1000 thousand crores
- height of a bundle of 1000 rs notes
- weight of 1 crores in 100 rupees
- one 1000 rupees weight
For those looking for the answer: the notes don’t take much space, but they’re quite heavy.
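If you want to work out the numbers yourself, the note count is simple arithmetic. Here is a minimal sketch: `notes_needed` and `bundle_weight_kg` are hypothetical helper names, and the weight of a single note is deliberately left as an input you supply, since we are not asserting a figure here.

```python
LAKH = 10 ** 5
CRORE = 10 ** 7


def notes_needed(amount_rupees, denomination=1000):
    """Number of notes of one denomination needed to make up the amount."""
    notes, remainder = divmod(amount_rupees, denomination)
    if remainder:
        raise ValueError("amount is not a whole multiple of the denomination")
    return notes


def bundle_weight_kg(amount_rupees, grams_per_note, denomination=1000):
    """Total weight in kg; grams_per_note is an assumption supplied by the caller."""
    return notes_needed(amount_rupees, denomination) * grams_per_note / 1000


for label, amount in [
    ("Rs 1 lakh", 1 * LAKH),
    ("Rs 1,000 crores", 1000 * CRORE),
    ("Rs 10,000 crores", 10000 * CRORE),
]:
    print(f"{label}: {notes_needed(amount):,} notes of Rs 1,000")
```

Rs 1 lakh is just 100 notes, while Rs 1,000 crores needs 1 crore notes. That count is why the weight adds up quickly even when each note is light.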