Big Data, the processing of data sets that do not fit on a single computer, has come of age. It’s not just the level of interest shown at conferences like Strata but also the types of people participating. Sure, there are loads of companies out there with products in this space, but plenty of end users are coming forward too, and many of them are from outside technology companies. At the London edition of Strata, one of the speakers was Ben Goldacre, doctor and author of the awesome Bad Science, who discussed the impact of missing data, which is a huge issue for medical studies. Even the White House has weighed in on behalf of Big Data and emphasized its importance to business.
If you are going to use a technology, I’m a big fan of going to the source, and thankfully a lot of the published work in this space is freely available. So I’ve collected some of the papers that are key to this area: five are about Big Data itself, and the bonus one is about operational monitoring for massively distributed systems.