RDBMS vs NoSQL: awesome differences to read now

NoSQL and RDBMS (SQL) databases differ in several important ways. The main differences are summarized below so that you can compare them quickly.

💡Traditional Database

  • A schema is required. Traditional data warehouses typically use an RDBMS to store their data marts.
  • The database is queried with SQL, which has a specific syntax and set of rules for interacting with the data.
  • Less scalable. Most relational databases scale up on a single, larger server more easily than they scale out across many machines.
  • Making the database scalable is expensive.
  • Data must fit a predefined format: tables, columns, and data types.
  • Data is stored in row format.
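
The "schema is required" point can be seen in a minimal sketch using Python's built-in sqlite3 module as a stand-in for a traditional RDBMS. The table name, column names, and sample values are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# The schema must exist before any row can be stored.
# Table and column names here are hypothetical.
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "city TEXT NOT NULL)"
)

# A row that matches the schema is accepted.
conn.execute(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    (1, "Asha", "Pune"),
)

# A row that uses a column the schema does not define is rejected.
try:
    conn.execute(
        "INSERT INTO customers (id, name, twitter) VALUES (?, ?, ?)",
        (2, "Ravi", "@ravi"),
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# Only the row that fits the schema was stored.
rows = conn.execute("SELECT id, name, city FROM customers").fetchall()
print(rejected, rows)
```

To store the new field, the table would first have to be altered, which is the rigidity the list above describes.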

NoSQL database

Growing internet usage and the rising number of connected devices led to the invention of databases that can store many kinds of data.

NoSQL Special Features
  • A fixed schema is not required, so the database can handle multiple data types and structures. This is the power of NoSQL.
  • NoSQL is often a good fit for analytical workloads, which need to be flexible, scalable, and able to store data in varied formats.
  • The increased usage of web applications and the wide availability of broadband generate a large variety of data, which makes NoSQL valuable for new-generation businesses.
  • Data is not tied to fixed rows. Depending on the database, it is stored as key-value pairs, documents, wide columns, or graphs.
  • NoSQL databases are usually accessed through client libraries in languages such as Python, Ruby, PHP, and Java.
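
As a rough illustration of schema-free, key-value storage, the sketch below uses a plain Python dict holding JSON documents. The dict is only a stand-in and not a real NoSQL client; the `put` and `get` helpers, the keys, and the sample records are all hypothetical.

```python
import json

# A plain dict stands in for a key-value store; this is not a real NoSQL client.
store = {}


def put(key, document):
    """Serialize a document to JSON and store it under the given key."""
    store[key] = json.dumps(document)


def get(key):
    """Load the JSON document stored under the given key."""
    return json.loads(store[key])


# Two records with different fields; no schema is declared up front.
put("customer:1", {"name": "Asha", "city": "Pune"})
put("customer:2", {"name": "Ravi", "twitter": "@ravi", "devices": ["phone", "tablet"]})

# Each document keeps its own set of fields.
fields_1 = sorted(get("customer:1"))
fields_2 = sorted(get("customer:2"))
print(fields_1, fields_2)
```

The trade-off is that the application, not the database, becomes responsible for knowing which fields a given record may contain.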
