In a data lake, data is stored in a repository in its raw form; you can think of each item as a blob. Data in the lake has no fixed format, whereas a database requires a schema.
Database
- In a database, you need to define the schema before you store any data in it.
- A relational database should follow Codd's rules.
- Here the data is fully formatted (structured).
- The data is stored in tables, so you need SQL to read the records.
- Scalability is a weak point: performance tends to drop as data volumes grow very large.
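As a minimal sketch of the points above, the following uses Python's built-in `sqlite3` module to show that the schema comes first and that records are read with SQL. The `customers` table, its columns, and the sample rows are made up for illustration only.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# The schema must exist before any row can be stored.
# (Hypothetical table and columns, chosen only for this example.)
conn.execute(
    "CREATE TABLE customers ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " city TEXT NOT NULL)"
)

conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ben", "Leeds")],
)

# Records are read back with SQL.
rows = conn.execute("SELECT name, city FROM customers ORDER BY id").fetchall()

# A row that does not fit the schema (missing city) is rejected.
try:
    conn.execute("INSERT INTO customers (id, name) VALUES (3, 'Cara')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The last insert fails because the schema declares `city` as `NOT NULL`; this is the kind of up-front formatting a data lake does not enforce.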
Data lake
- It doesn't have any format; it's just a dump of raw data.
- You can send this dump to a Hadoop repository for data analysis.
- This repository can grow incrementally, and you can build a database from it.
- Because the dump has no format, the data needs to be pre-formatted before it is sent for analytics.
- Data security and encryption: you need to apply these before you send data to Hadoop.
- For real-time use, you need to pre-process the data.
- You need to send this data to a data warehouse to get insights.
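The pre-format step mentioned in the list above could look roughly like the sketch below. It is only an illustration under assumptions: the sample blobs, the `pre_format` helper, and the `name`/`city` fields are hypothetical, and a real pipeline would run on whatever storage and tools you actually use.

```python
import csv
import io
import json

# Hypothetical raw "dump": blobs land in the lake as-is, in mixed formats.
raw_blobs = [
    b'{"name": "Asha", "city": "Pune"}',   # a JSON document
    b"name,city\nBen,Leeds\n",             # a small CSV file
    b"free text note with no structure",   # nothing usable for this schema
]


def pre_format(blob):
    """Try to turn one raw blob into a list of {"name", "city"} records."""
    text = blob.decode("utf-8", errors="replace").strip()

    # First guess: the blob is a single JSON object.
    try:
        record = json.loads(text)
        if isinstance(record, dict):
            return [{"name": record.get("name"), "city": record.get("city")}]
    except json.JSONDecodeError:
        pass

    # Second guess: the blob is CSV with a header row.
    rows = csv.DictReader(io.StringIO(text))
    return [
        {"name": r["name"], "city": r["city"]}
        for r in rows
        if r.get("name") and r.get("city")
    ]


# Only blobs that could be given a format move on towards analytics.
records = [rec for blob in raw_blobs for rec in pre_format(blob)]
```

Here the format is applied when the data is read, not when it is stored, which is the main contrast with the database case.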