此文說的大數據,不是常規數據庫,不是人們經常談論的 data marts 以及 data warehouse。

來源: 2014-07-22 08:35:07 [博客] [舊帖] [給我悄悄話] 本文已被閱讀:

常規數據庫,以及人們經常談論的 data marts 以及 data warehouse, require data to be stored in relational database with fixed columns and constraints. like Primary key, foreign key, etc.
Of course data mart or data warehouse can do many data analysis and give many results.

Here, so called big data, means much larger data volume than regular data mart or data warehouse and the data structure is arbitrary, there is no fixed columns.

As more and more data is generate by all sorts of devices and social media network, there are demands for this kind of data analysis.

Apache Hadoop is one of this kind of big data system started in Yahoo.
Map Reduce is the model to process this kind of data which is started by Google.
Apache hadoop support a data analysis scripting language call Pig, and SQL kind of analysis language called Hive.