Why you should apply Kanban principles to big data projects

Published November 27, 2019 | arvindl

Kanban has proven to be a viable approach for many types of projects, from large ones with multiple layers of complexity to smaller jobs that one or two people can comfortably handle. With the rise of big data over the last decade, it is no surprise that researchers and developers in the field have turned to Kanban to address the problems they commonly face. From what we’ve seen so far, it looks like a good fit for this environment, even if it has some quirks that companies need to get used to. But that is true of pretty much any project organization methodology.

The unique challenges of big data

Big data has brought challenges that people are still adapting to. It’s not an easy field to get into, because it requires knowledge of several disciplines that don’t necessarily overlap. Between large databases, connected services, high-powered computation clusters, and the heavy mathematical requirements of the field, there’s a lot one needs to know to handle a project adequately. Kanban has turned out to be a good way to bring all of this together in a central point of reference. It gives project managers and everyone else involved an easy overview of what each resource is being used for, which tasks are assigned to which people, and how the work is broken down into smaller pieces.

Is Kanban ideal for you?

There has been some debate over whether Kanban is an appropriate go-to solution for the problems of big data, but it does seem to have more features suited to the field than other methodologies and approaches. Scrum commonly gets brought up in these discussions, and while it has some merit here, it’s far from a universal solution. In fact, Scrum can have detrimental effects on a typical big data project because of the way it works internally; for example, its fixed-length sprints fit poorly with exploratory work whose scope is hard to estimate up front.

Wrangling large volumes of work

Something common to big data, and something Kanban can address very well, is the large amount of work that is always on the horizon. It’s not just about the volume of data to be processed. Often you’ll have to run many different analyses to get the right kind of output, and you might need to split the load between several researchers just to process everything in time.

If you’ve worked with Kanban before, you’re probably starting to see where this is going. It’s well suited to handling these kinds of complex problems, and, even more importantly in many cases, it continues to handle them comfortably as they scale up. Simply setting up a Kanban board online and getting your teammates on it can bring immediate improvements. Kanbanize is a popular solution that’s already actively used in many similar environments.
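To make the idea of a board concrete, here is a minimal sketch in Python. It is not how Kanbanize or any other tool works internally; the `Column` and `Board` classes and the example task names are hypothetical. It also assumes a work-in-progress (WIP) limit on the middle column, which is a standard Kanban practice rather than something covered above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    """One column of a board. A wip_limit of None means unlimited (assumption)."""
    name: str
    wip_limit: Optional[int] = None
    cards: List[str] = field(default_factory=list)

    def has_room(self) -> bool:
        return self.wip_limit is None or len(self.cards) < self.wip_limit


class Board:
    """Hypothetical board: cards move one column at a time, left to right."""

    def __init__(self, columns: List[Column]) -> None:
        self.order = [c.name for c in columns]
        self.columns = {c.name: c for c in columns}

    def add(self, card: str) -> None:
        # New work always enters the first column.
        self.columns[self.order[0]].cards.append(card)

    def pull(self, card: str, to: str) -> bool:
        """Pull a card into column `to` from the column just before it.

        Returns False (and moves nothing) if the target column is full.
        """
        idx = self.order.index(to)
        if idx == 0:
            raise ValueError("cannot pull into the first column")
        src = self.columns[self.order[idx - 1]]
        dst = self.columns[to]
        if card not in src.cards:
            raise ValueError(f"{card!r} is not in {src.name!r}")
        if not dst.has_room():
            return False
        src.cards.remove(card)
        dst.cards.append(card)
        return True


# Example tasks are made up for illustration.
board = Board([Column("To do"), Column("In progress", wip_limit=2), Column("Done")])
tasks = ["ingest clickstream", "clean nulls", "fit baseline model"]
for task in tasks:
    board.add(task)

# The third pull is refused because "In progress" already holds two cards.
results = [board.pull(task, "In progress") for task in tasks]
```

The refused pull is the point of the sketch: the third analysis waits in "To do" until one of the first two is finished, which keeps a large backlog from turning into too much parallel work.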

Easy adoption of different technologies

Kanban also lets you experiment with different solutions to the problems you’re facing without a lot of extra work to integrate them into your workflow. Big data is a rapidly evolving field with a lot going on at any given time. There are always new frameworks, libraries, and research papers, and you have to keep up with all of them while looking for opportunities to bring them into your own workflow. This can be very complicated without a way of seeing the big picture, and that is exactly where Kanban comes into play. It can reduce the integration of a new technology to a set of small steps, and the rest comes down to actually working through them.

Simplicity is key

While Kanban can help you work on large, complex projects with ease, its beauty lies in its simplicity and in the way it lets you turn any huge piece of work into smaller, sequential tasks that follow each other logically. This is very important in a field like big data, where each new task can seem insurmountable when you’re first presented with it. It can take a while for someone who has never worked this way to get used to it, but it can completely transform the way you see working on big data projects.
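As a rough illustration of turning one large job into small sequential tasks, the sketch below splits a job into ordered cards and picks the next one to work on. The `Card` class, the helper functions, and the "Churn model" job with its steps are all hypothetical examples, not part of any particular Kanban tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Card:
    """A single small task; `step` records its position in the sequence."""
    title: str
    step: int
    done: bool = False


def split_into_cards(job: str, steps: List[str]) -> List[Card]:
    """Turn one large job into one card per step, numbered from 1."""
    return [Card(title=f"{job}: {name}", step=i) for i, name in enumerate(steps, start=1)]


def next_card(cards: List[Card]) -> Optional[Card]:
    """Return the earliest unfinished card, or None when everything is done."""
    for card in sorted(cards, key=lambda c: c.step):
        if not card.done:
            return card
    return None


# A made-up job broken into four sequential steps.
cards = split_into_cards(
    "Churn model",
    ["load raw events", "build features", "train model", "evaluate"],
)
cards[0].done = True      # the first step has been finished
nxt = next_card(cards)    # the card to pick up next
```

Here `nxt` is the "build features" card: at any moment there is only one small, well-defined thing to do next, however large the overall job is.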

The whole field is still developing rapidly, and it’s very likely that some of the trends we’re seeing now will stabilize for the long term while others fall out of favor. It’s important to pay attention to both, because you’ll need to keep refining your toolset to deal with the new problems that come up along the way. For now, learning to use Kanban properly for this type of work is one of the best investments of your time you can make.