2 Top Differences: Automation vs. Internet of Things

There are five reasons why IoT, combined with automation, creates opportunities to deliver better products or services. Sensor data is a golden asset: you can derive benefits from it and apply them to products or services.

Automation and IoT are two different things.

Automation

Automation works on the data collected from various devices: when something goes wrong, the system acts on that data by itself.

The best example is an aircraft. Based on sensor-generated data, the automation tool takes corrective action during a flight from one country to another.
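The idea of acting on sensor data can be sketched in a few lines of Python. This is a minimal illustration, not how a real flight system works: the sensor names, the safe ranges, and the `corrective_action` function are all hypothetical and chosen only to show a rule reacting to readings.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor: str
    value: float


# Hypothetical safe ranges per sensor; real limits depend on the system.
SAFE_RANGES = {
    "altitude_ft": (30000.0, 40000.0),
    "cabin_pressure_psi": (10.0, 12.5),
}


def corrective_action(reading: Reading) -> str:
    """Return the action a simple automation rule takes for one reading."""
    low, high = SAFE_RANGES[reading.sensor]
    if reading.value < low:
        return f"increase {reading.sensor}"
    if reading.value > high:
        return f"decrease {reading.sensor}"
    return "no action"


readings = [
    Reading("altitude_ft", 35000.0),        # inside the safe range
    Reading("altitude_ft", 29000.0),        # too low
    Reading("cabin_pressure_psi", 13.1),    # too high
]
actions = [corrective_action(r) for r in readings]
for reading, action in zip(readings, actions):
    print(f"{reading.sensor}={reading.value}: {action}")
```

The rule itself is the automation; the readings are what the devices supply.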

  Internet of Things

Five developments characterize the Internet of Things:

  1. More mobile phones than fixed lines
  2. New architecture models (e.g., cloud computing)
  3. A new protocol (IPv6)
  4. Everything is sensor-laden
  5. More machines than people

The Growth of Internet Usage

The internet doubles in size every 5.32 years. More devices can be connected to the internet because each one is reached through an IP address. IPv4 is limited to about 4.3 billion (2^32) addresses.

IPv6, by contrast, allows 2^128 addresses. Total IP traffic over the internet was 1 zettabyte as of 2011.
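The numbers above are easy to check with a short Python sketch. The address counts follow directly from the 32-bit and 128-bit address lengths. The `size_multiple` helper is a hypothetical name that assumes the 5.32-year doubling period quoted in this post.

```python
ipv4_addresses = 2 ** 32    # IPv4 uses 32-bit addresses
ipv6_addresses = 2 ** 128   # IPv6 uses 128-bit addresses

# How many times larger the IPv6 address space is: 2^96.
growth_factor = ipv6_addresses // ipv4_addresses


def size_multiple(years: float, doubling_period: float = 5.32) -> float:
    """Size relative to today after `years`, assuming steady doubling."""
    return 2 ** (years / doubling_period)


print(f"IPv4 addresses: {ipv4_addresses:,}")     # 4,294,967,296
print(f"IPv6 addresses: {ipv6_addresses:.2e}")   # roughly 3.4e38
print(f"IPv6 / IPv4:    2^{growth_factor.bit_length() - 1}")
print(f"Size after two doubling periods: {size_multiple(10.64):.1f}x")
```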

Wisdom from Data

Raw data is refined in three stages:

  1. Information: the data that remains after you clean the raw data.
  2. Knowledge: the ideas or patterns you obtain from the cleaned data.
  3. Wisdom: the models you build on those patterns so that you can automate certain tasks.
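The stages above can be sketched with a tiny Python example. Everything here is assumed for illustration: the temperature readings, the valid range used for cleaning, the 23.0 threshold, and the "turn on cooling" action are hypothetical, and the "model" is only a straight-line extrapolation.

```python
from statistics import mean

# Data: raw sensor readings, including a missing value and an obvious glitch.
raw = [21.5, None, 22.0, 999.0, 22.5, 23.0]

# Information: cleaned data (missing and out-of-range values dropped).
information = [x for x in raw if x is not None and -50.0 <= x <= 60.0]

# Knowledge: a pattern in the cleaned data, here the average change per step.
steps = [b - a for a, b in zip(information, information[1:])]
trend = mean(steps)

# Wisdom: a simple model that drives an automated decision.
predicted_next = information[-1] + trend
action = "turn on cooling" if predicted_next > 23.0 else "do nothing"

print(information)      # [21.5, 22.0, 22.5, 23.0]
print(trend)            # 0.5
print(predicted_next)   # 23.5
print(action)           # turn on cooling
```

Each variable maps to one stage, so the refinement from raw data to an automated task is visible line by line.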
