Every Hadoop implementation encounters the occasional crisis, including moments when the folks running Hadoop feel like their hair is on fire. Sometimes the crisis hits before you get to production, which can cause organizations to throw the Hadoop baby out with the bathwater. Often it hits after the first production launch, which means you have a “success disaster” on your hands (although it will probably feel more like disaster than success).
Implementing and scaling Hadoop is enormously complicated. However, if you learn to recognize problems early, you can prevent your hair (and your Hadoop implementation) from igniting. Here are some signs of danger, along with lessons we’ve learned for heading them off.
Danger sign 1: You never get to production
Moving from proof of concept (POC) to production is a significant step for big data workloads, and scaling Hadoop jobs is fraught with challenges. Sometimes large jobs just won’t finish, or a job that ran fine in testing won’t run at production scale. Data can also be an issue: the POC often uses unrealistically small or uniform datasets, so problems that only appear with production-size, more varied data stay hidden during testing.