One of our colleagues nearly lost a piece of analysis recently. Here’s the most boring but valuable advice she can get on how to organise analysis, or any form of work for that matter. Of course, you could always learn git.
Gramener has won the Silver prize in the prestigious Analytics Solutions category at this year’s ExpressITAwards. The awards were judged by an eminent jury panel comprising corporate strategists, academicians and thought-leaders from the IT industry. Team Gramener is proud to receive the award from the chief guest of the event, Honourable Union Minister Mrs Nirmala Sitharaman.
RAM matters. Our experience is that RAM is the biggest bottleneck with large datasets. Things speed up by an order of magnitude when all your processing is in-memory. 16 GB of RAM is an ideal configuration. Do not go below 8 GB.
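A quick way to sanity-check the "does it fit in memory" question is sketched below. The function names, the 3x expansion factor (raw files often grow once parsed into in-memory structures) and the 50% usable fraction are all assumptions for illustration, not measured figures; tune them for your own data.

```python
import os

GB = 1024 ** 3


def total_ram_bytes():
    """Return physical RAM in bytes, or None if the platform does not expose it.

    Uses os.sysconf, which is available on Linux and most Unix systems
    but not on Windows.
    """
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None


def fits_in_memory(data_bytes, ram_bytes, expansion=3.0, usable_fraction=0.5):
    """Rough check: does the dataset fit comfortably in RAM?

    expansion       -- assumed growth of the data once loaded (hypothetical 3x)
    usable_fraction -- assumed share of RAM left for your analysis (hypothetical 50%)
    """
    return data_bytes * expansion <= ram_bytes * usable_fraction


if __name__ == "__main__":
    ram = total_ram_bytes()
    if ram is None:
        print("Could not detect RAM on this platform")
    else:
        print("2 GB file fits comfortably:", fits_in_memory(2 * GB, ram))
```

Under these assumptions a 2 GB file fits comfortably on a 16 GB machine but not on an 8 GB one, which is in line with the advice above.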
Big drives. The next biggest driver is hard disk speed, but you don’t necessarily need an SSD. If your data fits in memory, then most disk access is sequential. For that kind of access, an SSD is only ~2X faster than a regular hard disk, but much more expensive. (If you’re running a database, then an SSD makes more sense.) Among hard disks, larger ones are also faster due to higher storage density, so prefer the 1 TB disks.
The CPU doesn’t matter. Make sure you have more cores than data-intensive processes, but other than that, it’s not an issue.
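The "more cores than data-intensive processes" rule is easy to check in code. This is a minimal sketch; `has_enough_cores` is a hypothetical helper name, and it counts logical cores as reported by the operating system.

```python
import os


def has_enough_cores(n_heavy_processes, n_cores=None):
    """True if there are strictly more cores than data-intensive processes."""
    if n_cores is None:
        # os.cpu_count() can return None if the count is undetermined
        n_cores = os.cpu_count() or 1
    return n_cores > n_heavy_processes


if __name__ == "__main__":
    # Example: planning to run 3 heavy jobs side by side on this machine
    print("Enough cores for 3 heavy jobs:", has_enough_cores(3))
```

The strict comparison leaves at least one core free for the operating system and everything else you have open.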
However, one common theme we find is that heavy data science work happens in the cloud, not on the laptop. That’s what you need to look for: a good cloud environment that you can connect to.
For example, this Frontanalytics report recommends a basic laptop with long battery life, the ability to multi-task (i.e. multiple cores), and a backlit keyboard for working at night.