Big Data Data sets that range from many terabytes to petabytes in size, and
that usually consist of less-structured information such as Web log files.
Hadoop cluster A type of scalable computer cluster inspired by the Google
Cluster Architecture and intended for the cost-effective processing of
less-structured information.
Apache Hadoop The core of an open-source ecosystem that makes Big Data analysis
more feasible through the efficient use of commodity computer clusters.
Cascading A bridge between Hadoop and common Java-based programming
techniques that were not previously usable in cluster-computing environments.
NoSQL A class of non-relational data stores and data analysis techniques that
are intended for various kinds of less-structured data. Many of these
techniques are part of the Hadoop ecosystem.
Gray Data Data from multiple sources that isn’t formatted or vetted for
specific needs but is worth exploring with the help of Hadoop cluster
analysis techniques.