6.1.3 What Does “Big Data” Mean? One possible distinction between data science and statistics is the amount of data we’re working with. Technology coverage in the 2010s (and continuing to the present) made it hard to resist the idea that big data represents some kind of revolution that has turned the whole world of information and technology topsy-turvy. But is this really true? Does big data change everything? Business analyst Doug Laney suggested that three characteristics make big data different from what came before: volume, velocity, and variety. Volume refers to the sheer amount of data. Velocity focuses on how quickly data arrives as well as how quickly those data become “stale.” Finally, variety reflects the fact that there may be many different kinds of data. Together, these three characteristics are often referred to as the “three Vs” model of big data. Note, however, that even before the dawn of the computer age we’ve had a variety of data, some of which arrives quite quickly, and that can add up to quite a lot of total storage over time. Think, for example, of the large variety and volume of data that has arrived annually at Library of Congress since the 1800s! So, it is difficult to tell that big data is fundamentally a brand new thing. Furthermore, there are some concerns that we should exercise when it comes to big data. For example, when a data set gets to a certain size (into the range of thousands of rows), conventional tests of statistical significance are meaningless, because even the most tiny and trivial results are statistically significant. We’ll talk more about statistical significance later in the semester; for the time being, though, it suffices to say that statistical significance is how researchers have traditionally determined whether their results are important or not. If big data makes statistical significance more likely, then researchers who have access to more data will get more important results, whether or not that’s actually true in practical terms! Besides that, the quality and suitability of the data matters a lot: More data does not always mean better data.
I find this part of this section. Will better help me understand about big data Business analyst Doug Laney suggested that three characteristics make big data different from what came before: volume, velocity, and variety. Volume refers to the sheer amount of data. Velocity focuses on how quickly data arrives as well as how quickly those data become “stale.” Finally, variety reflects the fact that there may be many different kinds of data. Together, these three characteristics are often referred to as the “three Vs” model of big data. Note, however, that even before the dawn of the computer age we’ve had a variety of data, some of which arrives quite quickly, and that can add up to quite a lot of total storage over time.