Featured post

Best Machine Learning Book for Beginners

You need a mix of different technologies for Data Science projects. Instead of trying to learn many skills, focus on a few. The four main steps of any project are data extraction, model development, artificial intelligence, and presentation. It is hard to do well in interviews when you claim many skills, so keep your skill list short.
One person cannot do all the work on a project, however many skills they have. It is better to learn a few skills such as Python, MATLAB, Tableau, and RDBMS, so that you can get a job on a data science project quickly.
Among Data Science skills, Machine Learning is the new concept. You can learn Python like any other language, and the same goes for Tableau. Machine Learning is the area that needs about 60% of your effort, so pick a good Machine Learning book to start with.


Data lake Repository You Need to Know About it

A data lake stores data internally in a repository. You can think of the stored format as a blob. The data in a data lake does not have a particular schema or format.

Data lake repository example
Photo Credit: Srini

SQL Database

  • Take a traditional database: the database design and schema must be defined before you enter data. A data lake has no such format. It is like a dump.
  • You can send this dump to a Hadoop repository for data analysis. The repository can grow incrementally, so you can also build a large database.
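The contrast above can be sketched in a few lines of Python. This is only an illustration under assumptions: the in-memory SQLite database stands in for a traditional SQL database, and the `lake` list with its `dump_to_lake` and `read_from_lake` helpers is a hypothetical stand-in for a real blob store, not part of any data lake product.

```python
import json
import sqlite3

# Traditional database: the schema must exist before any row is inserted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Asha"))


def insert_without_schema(connection):
    """Try to insert into a table that was never defined; report success."""
    try:
        connection.execute("INSERT INTO orders (id) VALUES (1)")
        return True
    except sqlite3.OperationalError:
        # SQLite rejects the row because no 'orders' table was designed first.
        return False


# Data lake style (hypothetical stand-in): records of any shape are
# appended as raw blobs, with no schema checked at write time.
lake = []


def dump_to_lake(record):
    """Serialize any JSON-compatible record and append it as raw bytes."""
    lake.append(json.dumps(record).encode("utf-8"))


def read_from_lake():
    """Decode every blob; the reader decides how to interpret each one."""
    return [json.loads(blob) for blob in lake]


dump_to_lake({"id": 1, "name": "Asha"})
dump_to_lake({"event": "click", "page": "/home", "ms": 42})
dump_to_lake(["free", "form", "list"])
```

The database refuses data that has no table designed for it, while the list accepts three records with three different shapes. The cost shows up later, when whoever reads the dump has to work out what each blob contains.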

Data lake Vs Hadoop

A data lake is a dump of data with no format. Several preparation steps are required before the data is sent for analytics. One is data security and encryption. These steps must be done before you send your data to a Hadoop repository.

In practice, Hadoop data analytics needs a lot of other data pre-processing before it can proceed.
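As a rough sketch of what one such preparation step might look like, the Python below cleans a raw dump and masks sensitive fields before the records are handed on. Everything here is an assumption for illustration: the field names in `SENSITIVE_FIELDS`, the `mask_value` and `prepare_record` helpers, and the fixed salt are all hypothetical. Note also that salted SHA-256 hashing is used as a simple stand-in for masking; it is not encryption, which would need a dedicated cryptography library and proper key management.

```python
import hashlib
import json

# Hypothetical list of fields to protect before analytics.
SENSITIVE_FIELDS = {"email", "phone"}


def mask_value(value, salt="demo-salt"):
    """Replace a value with a short salted SHA-256 digest (masking, not encryption)."""
    digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return digest[:16]


def prepare_record(raw_blob):
    """Parse one raw blob; return a masked dict, or None if it is not a record."""
    record = json.loads(raw_blob)
    if not isinstance(record, dict):
        return None  # skip blobs that do not look like key/value records
    return {
        key: mask_value(value) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }


# A tiny raw dump with mixed shapes, as a data lake might hold.
raw_dump = [
    b'{"id": 1, "email": "a@example.com"}',
    b'["not", "a", "record"]',
]

# Keep only the blobs that could be turned into masked records.
prepared = [r for r in (prepare_record(b) for b in raw_dump) if r is not None]
```

Only the first blob survives as a record, and its `email` value is replaced by a digest, so the downstream repository never sees the original address.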
