In a data lake, data is stored in a repository in its raw form, which you can think of as a blob. The data in a data lake does not have a particular schema or format.
|Photo Credit: Srini|
- In a traditional database, the design and schema must be defined before you enter any data. A data lake imposes no such format. It works like a raw dump.
- You can send this dump to a Hadoop repository for data analysis. The repository can be loaded incrementally, and it can grow into a large database.
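The contrast in the two points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `customers` table, the in-memory `lake` list, and the `read_json_records` helper are hypothetical names invented for the example, and a real data lake would sit on distributed storage rather than a Python list.

```python
import json
import sqlite3

# Traditional database: the schema must exist before any row is inserted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Asha"))

# Hypothetical in-memory "lake": raw bytes are appended as-is, with no schema check.
lake = []
lake.append(b'{"id": 2, "name": "Ravi", "city": "Pune"}')  # a JSON record
lake.append(b"2019-01-05,login,user=ravi")                  # a CSV-like log line
lake.append(bytes([0x89, 0x50, 0x4E, 0x47]))                # arbitrary binary bytes


def read_json_records(blobs):
    """Apply structure at read time: keep only blobs that parse as JSON objects."""
    records = []
    for blob in blobs:
        try:
            value = json.loads(blob.decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            continue  # not JSON; leave it in the lake untouched
        if isinstance(value, dict):
            records.append(value)
    return records


db_rows = conn.execute("SELECT id, name FROM customers").fetchall()
json_records = read_json_records(lake)
```

The database rejects anything that does not fit the table, while the lake accepts every blob and leaves interpretation to whoever reads it later.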
Data Lake vs. Hadoop
A data lake is a dump of data with no format. Several preparation steps are required before the data is sent for analytics. One is data security and encryption. These steps must be completed before you send your data to the Hadoop repository.
In practice, Hadoop data analytics needs a good deal of additional pre-processing before the data can be analyzed.
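As a rough sketch of what one such pre-processing step might look like, the example below masks a sensitive field before a record is handed on for analytics. It uses keyed hashing (HMAC-SHA256) from the Python standard library, which is pseudonymization rather than encryption; a real pipeline would likely use a dedicated encryption library and proper key management. The key, the `email` field, and the function names are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical demo key; a real pipeline would load this from a secrets store.
SECRET_KEY = b"demo-key-not-for-production"

# Hypothetical list of fields that must not reach the analytics store in clear text.
SENSITIVE_FIELDS = {"email"}


def mask_value(value: str) -> str:
    """Replace a value with its HMAC-SHA256 hex digest (one-way, keyed)."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def preprocess(raw_blob: bytes) -> dict:
    """Parse a raw JSON blob and mask its sensitive fields."""
    record = json.loads(raw_blob.decode("utf-8"))
    return {
        key: mask_value(str(value)) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }


raw = b'{"id": 7, "email": "ravi@example.com"}'
clean = preprocess(raw)
```

Because the hash is keyed and deterministic, the same input always maps to the same masked value, so records can still be grouped or joined on that field without exposing the original.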