Posts

Showing posts with the label structured vs unstructured

Featured Post

Step-by-Step Guide to Reading Different Files in Python

In the world of data science, automation, and general programming, working with files is unavoidable. Whether you're dealing with CSV reports, JSON APIs, Excel sheets, or text logs, Python provides rich, easy-to-use libraries for reading different file formats. In this guide, we'll explore how to read different files in Python, with code examples and best practices.

1. Reading Text Files (.txt)

Text files are the simplest form of files. Python's built-in open() function handles them effortlessly.

Example:

    # Open and read a text file
    with open("sample.txt", "r") as file:
        content = file.read()
        print(content)

Explanation: "r" mode means read. The with statement automatically closes the file when the block ends, even if an error occurs inside it.

Best Practice: Always use with to handle files so that file handles are not left open.

2. Reading CSV Files (.csv)

CSV files are widely used for storing tabular data. Python has a built-in csv module, and the third-party pandas library is a powerful alternative. Using cs...

6 Exclusive Differences Between Structured and Unstructured Data

Here's a basic interview question for big data engineers. It counts as basic because many bachelor's degree programs now offer courses on big data, yet for a beginner, understanding data can be a little tricky, so interviewers stress this point. Don't worry: I have simplified it so you get a clear concept. Here I share a total of six differences between the two. In today's world, we have a lot of data, and much of it is in an unstructured format.

Structured Data

The major data format is text, which can be string or numeric. Dates are also supported.
The data model is fixed before inserting the data.
Data is stored in the form of a table, making it easy to search.
Not easy to scale.
Version is maintained as a column in the table.
Transaction management and concurrency are easy to support.

Unstructured Data

The data format can be anything from text to images, audio to videos.
The data model cannot be fixed since the nature of the data can change. Consider a tweet message that could be text foll...
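
The contrast between a fixed data model and data whose shape can change can be sketched in Python. This is only an illustration, not taken from the post: the `employees` table, its columns, the sample rows, and the tweet records are all hypothetical, and SQLite (via the standard-library `sqlite3` module) stands in for any relational store.

```python
import sqlite3

# Structured: the data model (table and columns) is fixed before any insert.
# The table name, columns, and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT NOT NULL, joined TEXT)"
)
conn.executemany(
    "INSERT INTO employees (id, name, joined) VALUES (?, ?, ?)",
    [(1, "Asha", "2021-04-01"), (2, "Ben", "2022-09-15")],
)

# Tabular storage makes searching straightforward (ISO dates compare as text).
rows = conn.execute(
    "SELECT name FROM employees WHERE joined >= ?", ("2022-01-01",)
).fetchall()
structured_names = [name for (name,) in rows]

# A row that does not fit the fixed model is rejected.
try:
    conn.execute("INSERT INTO employees (id, name, nickname) VALUES (3, 'Cy', 'C')")
    schema_enforced = False
except sqlite3.OperationalError:
    schema_enforced = True

# Unstructured (or loosely structured): each record can have a different shape.
# These tweet-like records are hypothetical examples.
tweets = [
    {"text": "Launching today!"},
    {"text": "Demo clip", "media": [{"type": "video", "url": "https://example.com/clip.mp4"}]},
]
# Code that reads such data has to cope with fields that may be missing.
media_types = [m["type"] for t in tweets for m in t.get("media", [])]

print(structured_names, schema_enforced, media_types)
conn.close()
```

The table rejects the insert with an unknown column because its model was fixed up front, while the list of records accepts any shape and leaves the handling of missing fields to the reader of the data.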