Flipped Statistics: The Changing Paradigm of Education Data

Part 4 of Our 4-Part Series on Big Data in Learning

Big data has led to big changes in statistics. One could argue that our existing paradigm has been rendered impotent in many cases by big data.

Historically, it has been difficult, and often impossible, to gather all the data in a target audience or domain. We were taught that sampling is the way to go, and that to avoid selection and sample bias we should steer clear of self-selected or snowball samples and randomize instead. All of this assumed scarcity of data.

In the age of digital abundance, in some cases, this model has been abandoned in favor of massive sets of unstructured data, where N=ALL and data is used as a huge battering ram to solve a problem. When data is scarce, we tend to sample and randomize, but this comes at a price—loss of detail. When N=ALL, we can mine the data, and messiness gets diluted in the big numbers. There’s no need to sample when you have massive amounts of data; the numbers do the talking. 
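The "loss of detail" that comes with sampling can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than real data: a hypothetical population of 10,000 learner records, 20 of which show a rare pattern, and a 1% simple random sample. The names `RARE_IDS` and `count_rare` are invented for the example.

```python
import random

# Hypothetical population: 10,000 learner records, 20 of which show a rare
# pattern (for example, repeated failure on one specific concept).
POPULATION_SIZE = 10_000
RARE_IDS = set(range(0, POPULATION_SIZE, 500))  # 20 illustrative learner ids


def count_rare(learner_ids):
    """Count how many of the given learner ids show the rare pattern."""
    return sum(1 for learner_id in learner_ids if learner_id in RARE_IDS)


everyone = list(range(POPULATION_SIZE))

# Classic approach: a simple random sample of 1% of learners.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample = rng.sample(everyone, k=100)

rare_in_all = count_rare(everyone)   # N=ALL sees every rare case
rare_in_sample = count_rare(sample)  # a 1% sample can see at most a subset

print(f"Rare cases in full data: {rare_in_all}")
print(f"Rare cases in 1% sample: {rare_in_sample}")
```

With only 20 rare cases in 10,000 records, a 1% sample is expected to contain about 0.2 of them, so most samples will miss the pattern entirely. The full data set finds all 20 by construction.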

Scarcity vs. abundance 

Sampling vs. N=ALL

Exactitude vs. messiness

Causation vs. correlation

This takes us back to a form of pure empiricism, where we infer from data and use the results in decision-making. In this sense, it’s a form of pure science, with data as the mother lode and algorithms doing the heavy lifting. The shift from causation to correlation is another interesting mental reboot: big data algorithms are often simply matching patterns or identifying correlation, not causation.

There is, of course, a danger here in inductive thinking. The risks of inferences based simply on what has happened before have been brought to the fore several times in recent stock-market crashes, bubbles, and financial crises. Some smart deduction and causal analysis may also be necessary to counter the inductive trend.

It is no accident that this is happening now. The technological changes that have enabled the big data revolution go beyond massive data production. The revolution is also the result of the plummeting cost of storage and processing, along with advances in smart algorithms that put big data to good use. Mobile devices, and the growing number of people spending a growing amount of time online, have provided the raw data.


Across the educational landscape, big data is changing learning by providing a sound basis for learners, teachers, managers, and policymakers to improve learning, teaching, and organizations. Above all, the big win in big data is to use it to improve the learning experience through personalized learning. Data must be brought to bear at every point for the learner through meaningful feedback, the presentation of the right material, and optimized paths through the learning experience. On the grander scale, too much data is hidden, so more and more open data is needed. Data must also be searchable. Data must also be governed and managed. There is also the issue of visualization. Whether for the learner, teacher, or at an organizational, national, or international level, data will only be understood through visualization. We must remember as well that data is also being used to do great harm. Big data in the hands of small minds can be dangerous. 

How do you feel about the application of big data in education? What innovations might it spur? What unaddressed issues concern you? The collection and use of data is a pivotal topic that’s worthy of examination. Let’s keep our discussion going at hello@cogbooks.com. Share your thoughts, and perhaps inspire our next series on the intersection of big data and education.

Give Students Greater Agency and Instructors More Control

CogBooks weaves student agency, instructor empowerment, and curriculum affordability ($39.95 per course) into a comprehensive, adaptive learning platform. Simple to adopt and manage, this tool is a direct replacement for textbooks. Higher education institutions or instructors can choose CogBooks for a single course or create an entire degree program such as the Biospine Initiative at Arizona State University. The CogBooks adaptive learning platform has been used by more than 200,000 students worldwide. It is proven to reduce dropouts by 90%* while improving student performance by 24%.* Connect with us if you’re interested in learning more, creating a custom course, or developing an entire degree program.

*Data from a consecutive four-year study in Introduction to Biology for Non-Majors at Arizona State University.