Posts

Showing posts from October, 2017

Featured Post

Step-by-Step Guide to Reading Different Files in Python

In the world of data science, automation, and general programming, working with files is unavoidable. Whether you're dealing with CSV reports, JSON APIs, Excel sheets, or text logs, Python provides rich and easy-to-use libraries for reading different file formats. In this guide, we'll explore how to read different files in Python, with code examples and best practices.

1. Reading Text Files (.txt)

Text files are the simplest form of files. Python's built-in open() function handles them effortlessly.

Example:

    # Open and read a text file
    with open("sample.txt", "r") as file:
        content = file.read()
    print(content)

Explanation: "r" mode means read. with open() automatically closes the file when done.

Best Practice: Always use with when handling files, so the file handle is released even if an error occurs.

2. Reading CSV Files (.csv)

CSV files are widely used for storing tabular data. Python has a built-in csv module and a powerful pandas library. Using cs...
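The excerpt cuts off mid-sentence, but to complete the CSV step in the same spirit, here is a minimal sketch (not the original post's code; the file name report.csv and its header row are assumptions) showing both the built-in csv module and pandas:

    import csv

    # Built-in csv module: each row becomes a dict keyed by the header row
    with open("report.csv", "r", newline="") as file:
        reader = csv.DictReader(file)
        for row in reader:
            print(row)

    # The same file with pandas (requires: pip install pandas)
    import pandas as pd

    df = pd.read_csv("report.csv")
    print(df.head())  # first five rows as a DataFrame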

Hadoop vs RDBMS: Real Differences

Hadoop comes into the picture to process large volumes of unstructured data; structured data is already taken care of by traditional databases.

Traditional databases. Traditional relational databases have been able to store massive data sets for a long time. An Oracle 10g database can store over 8 petabytes, while for many years DB2 databases have been capable of storing well over 500 petabytes. Of course, this is all theoretical. No customer has an Oracle or DB2 database that approaches sizes even close to that. Why? Because the speed, or velocity, at which data can be loaded and queries can be executed approaches zero well before then. Similarly, all traditional relational databases can store any variety of data as text or binary large objects. The problem is that large volumes of unstructured data cannot be moved fast enough to enable rapid search and retrieval.

Hadoop processing. Running constant and predictable workloads is what your existing data warehouse ha...
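The excerpt is truncated, but to make the unstructured-data point concrete, here is a minimal sketch (not from the post) of the kind of job Hadoop runs at scale: a word count over raw log lines, written as plain Python map and reduce functions with a small local driver standing in for Hadoop's shuffle-and-sort phase. The sample log lines are invented for illustration.

    from itertools import groupby

    def mapper(lines):
        # Emit (word, 1) for every word in the unstructured text
        for line in lines:
            for word in line.strip().split():
                yield word.lower(), 1

    def reducer(pairs):
        # Sum counts per word; pairs arrive sorted by key, as Hadoop's shuffle guarantees
        for word, group in groupby(pairs, key=lambda kv: kv[0]):
            yield word, sum(count for _, count in group)

    # Local simulation of map -> shuffle/sort -> reduce; on a cluster the same
    # functions would run as a Hadoop Streaming or Spark job over HDFS files
    sample_log = ["error disk full", "warning disk slow", "error network down"]
    for word, total in reducer(sorted(mapper(sample_log))):
        print(word, total)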

The In-and-Out of Nodes in Blockchain

Blockchain is a decentralized technology, or distributed ledger, on which transactions are anonymously recorded. This means the transaction ledger is maintained simultaneously across a network of unrelated computers or servers called "nodes", like a spreadsheet that is duplicated thousands of times across a network of computers.

The ledger contains a continuous and complete record (the "chain") of all transactions performed, which are grouped into blocks. A block is only added to the chain if the nodes, which are members of the blockchain network with high levels of computing power, reach consensus on the next 'valid' block to be added to the chain. A transaction can only be verified and form part of a candidate block if all the nodes on the network confirm that the transaction is valid.

Related: 11 Useful Blockchain Advantages to Read Now | Blockchain Smart Contract: The Perfect Example
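To make the "chain" idea concrete, here is a minimal sketch (not from the post, and far simpler than a real network with consensus and mining) of how each block commits to its predecessor through a hash, so that tampering with an earlier block is detectable:

    import hashlib
    import json
    import time

    def make_block(transactions, previous_hash):
        # A block records its transactions, a timestamp, and the previous block's hash
        block = {
            "timestamp": time.time(),
            "transactions": transactions,
            "previous_hash": previous_hash,
        }
        # The block's own hash covers all the fields above, chaining it to its predecessor
        block["hash"] = hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()
        return block

    genesis = make_block(["genesis"], previous_hash="0" * 64)
    block_1 = make_block(["alice pays bob 5"], previous_hash=genesis["hash"])

    # Tampering with the genesis block changes its hash, so block_1's link no longer matches
    genesis["transactions"] = ["tampered"]
    recomputed = hashlib.sha256(json.dumps(
        {k: genesis[k] for k in ("timestamp", "transactions", "previous_hash")},
        sort_keys=True).encode()).hexdigest()
    print(block_1["previous_hash"] == recomputed)  # False: the tampering is detected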

Hadoop: How to Improve a College (A Mini Project)

This is based on my research into improving an engineering college using data analytics. It is a great subject that any engineering student can take up as a final-year project, and in my view it has dual benefits. The first is for students: they can gain a lot of analytics knowledge and apply it to help keep their college on the list of top colleges. The second is for engineering colleges: they can use it to improve the quality of education and become one of the top colleges.

The project theme is data analytics, and there are two parts:

1. Use Hadoop technologies to study the student database and what students did at school level. This gives a lot of insight into student interests. Approach each student and collect innovative ideas to improve the college (a minimal sketch of this kind of analysis follows below).
2. Use the faculty database to get the skills and projects faculty worked on in previous years. This helps to match the right faculty to new innovative projects.

Basically the qualities of g...
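As an illustration only (the post does not include code, and the columns and values below are invented), a first pass at mining student interests from a school-level student database could look like this with pandas; on a real Hadoop cluster the same aggregation would typically run through Hive, Pig, or Spark over HDFS:

    import pandas as pd

    # Hypothetical extract of the student database, e.g. exported from HDFS to a table
    students = pd.DataFrame({
        "student_id": [1, 2, 3, 4, 5],
        "school_activity": ["robotics", "coding club", "robotics", "debate", "coding club"],
        "entrance_rank": [120, 450, 80, 900, 300],
    })

    # Which school-level interests are most common in the incoming batch?
    print(students["school_activity"].value_counts())

    # Average entrance rank per interest group, to spot strong cohorts worth investing in
    print(students.groupby("school_activity")["entrance_rank"].mean())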