The Ten Commandments of Big Data

Published June 30, 2014 | arvindl

1. Thou shalt decide on a business problem before committing any time, money, or animal sacrifice to the project.
2. Thou shalt start small and build, and shalt not commit to cataloging the Internet on thy first project.
3. Thou shalt not spend unnecessarily. Open Source begat Big Data, and from Open Source did commercial packages spring forth.
4. Thou shalt not fear the Unstructured Data, for Hadoop and NoSQL are with thee.
5. Be fruitful and multiply thy commodity hardware, for dedicated, proprietary hardware is neither holy nor necessary.
6. Thou shalt speak in tongues, like those named PHP, Python, SQL and Java.
7. IT, thou shalt not wage war on thy brethren the Businessmen, for without them thou hast no problem to solve, and without thee they have no solution.
8. Thou shalt not use the unclean data. Thou must wash it first, thereby ensuring the salvation of thy project.
9. Thou shalt not overfit the model, lest thou burn in statistical hell forevermore.
10. Thou shalt trust thy results, but verify.

This article originally appeared here. Republished with permission.