A Data Scientist’s Laptop

What configuration should a data scientist go for?

A KDnuggets poll indicates a Windows machine with 3-4 cores and 5-16GB of RAM.

A StackExchange thread recommends a 16GB RAM, 1TB SSD Linux system with a GPU.

A Quora thread converges around 16GB RAM.

RAM matters. Our experience is that RAM is the biggest bottleneck with large datasets. Things speed up by an order of magnitude when all your processing is in-memory. 16GB of RAM is an ideal configuration. Do not go below 8GB.
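A rough way to check whether a dataset will stay in memory is sketched below. This is only a back-of-the-envelope estimate: the function name `fits_in_memory`, the 3x working-copy overhead, and the 2GB reserved for the operating system are all assumptions for illustration, not measured figures.

```python
GIB = 1024 ** 3  # bytes in one gibibyte

def fits_in_memory(dataset_bytes, ram_bytes,
                   overhead_factor=3.0, os_reserve_bytes=2 * GIB):
    """Estimate whether a dataset can be processed fully in RAM.

    overhead_factor (assumed): working copies and intermediate results
    often take several times the raw data size.
    os_reserve_bytes (assumed): memory left for the OS and other apps.
    """
    usable = ram_bytes - os_reserve_bytes
    return dataset_bytes * overhead_factor <= usable

# Compare the two RAM sizes discussed above for a 4GB dataset.
for ram_gib in (8, 16):
    ok = fits_in_memory(4 * GIB, ram_gib * GIB)
    status = "in-memory" if ok else "likely to spill to disk"
    print(f"{ram_gib}GB RAM, 4GB dataset: {status}")
```

Under these assumed numbers, a 4GB dataset is comfortable on 16GB but not on 8GB. Adjust the overhead factor to match your own tools.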

Big drives. The next biggest driver is hard disk speed. But you don’t necessarily need an SSD. If your data fits in memory, then most data access is sequential. For that kind of access, an SSD is only ~2X faster than a regular hard disk, but much more expensive. (If you’re running a database, then an SSD makes more sense.) Among hard disks, larger ones are also faster due to their higher storage density. So prefer the 1TB disks.

The CPU doesn’t matter. Make sure you have more cores than data-intensive processes, but other than that, it’s not an issue.
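The "more cores than data-intensive processes" rule can be sketched as a small helper. The name `worker_count` and the choice to keep one core free are our own assumptions here. `os.cpu_count()` is the standard-library call that reports the number of logical cores.

```python
import os

def worker_count(total_cores=None, reserved_cores=1):
    """Pick how many data-intensive processes to run at once.

    Keeps reserved_cores (assumed default: 1) free for the OS and
    everything else, and never returns less than 1.
    """
    if total_cores is None:
        # cpu_count() can return None on some platforms; fall back to 1.
        total_cores = os.cpu_count() or 1
    return max(1, total_cores - reserved_cores)

print(f"Run at most {worker_count()} data-intensive processes on this machine")
```

On a 4-core laptop this suggests 3 heavy processes at a time, which fits the 3-4 core machines mentioned above.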

However, one common theme we find is that heavy data science work happens on the cloud, not on the laptop. That’s what you need to be looking for: a good cloud environment that you can connect to.

For example, this Frontanalytics report recommends a basic laptop with long battery life, the ability to multi-task (i.e. multiple cores), and a backlit keyboard for the night.

Maybe you just need a USB port in your arm.

Damn. Not only did he not install it, he sutured a 'Vista-Ready' sticker onto my arm.
