SAP HANA In-memory Real Usage

Below is a list of questions on SAP HANA in-memory computing that explain its real-world usage.

1. What is in-memory computing?

A1) In-memory computing is a technology that processes massive quantities of data in main memory to provide immediate results from analysis and transactions.

The data that is processed is ideally real-time data (that is, data that is available for processing or analysis immediately after it is created).

2. How does in-memory computing work?

A2) It rests on two ideas. First, keep data in main memory to speed up data access, and minimize data movement by using columnar storage, compression, and calculations performed at the database level.

Second, divide and conquer: use the multi-core architecture of modern processors and multi-processor servers (or even scale out into a distributed landscape) to grow beyond what a single server can supply.
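The two ideas in A2 can be sketched in a few lines of Python. This is only an illustration, not how SAP HANA is implemented: the sample records and the helper names `partition` and `parallel_sum` are made up for the example. It shows the same data in row and column layout, then sums one column by splitting it into chunks that are processed by separate workers. (Python threads are used here only to show the structure; they do not give a real speed-up for this kind of arithmetic.)

```python
from concurrent.futures import ThreadPoolExecutor

# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "EU", "amount": 100},
    {"id": 2, "region": "US", "amount": 250},
    {"id": 3, "region": "EU", "amount": 75},
    {"id": 4, "region": "US", "amount": 125},
]

# Column-oriented layout: one contiguous list per field, so an
# aggregate over "amount" never touches "id" or "region".
columns = {name: [row[name] for row in rows] for name in rows[0]}


def partition(values, parts):
    """Split a list into roughly equal contiguous chunks (hypothetical helper)."""
    size = max(1, -(-len(values) // parts))  # ceiling division, at least 1
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum(values, workers=2):
    """Sum each chunk in its own worker, then combine the partial sums."""
    chunks = partition(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, chunks))


total = parallel_sum(columns["amount"])
print(total)
```

The point of the column layout is that an aggregate reads one list instead of every field of every record, and the point of the chunking is that each partial sum is independent, so the work can be spread over cores or servers.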

3. What is the benefit of keeping data in memory?

A3) Accessing data in main memory is much faster than accessing data on disk.

4. If data is in memory (i.e., RAM), what happens when power is lost?

A4) In database technology, atomicity, consistency, isolation, and durability (ACID) are the requirements that ensure that database transactions are processed reliably:
    • A transaction must be atomic. If part of a transaction fails, the entire transaction must fail and leave the database state unchanged.
    • The consistency of a database must be preserved by the transactions that it performs.
    • Isolation ensures that no transaction interferes with another transaction.
    • Durability means that after a transaction is committed, it remains committed.

The first three requirements are not affected by the in-memory concept, but durability cannot be met by storing data in main memory alone. Main memory is volatile storage: it loses its content when power is lost. To make data persistent, it must also be written to non-volatile storage, such as HDDs, solid-state drives (SSDs), or flash devices.

5. How does SAP HANA store data in non-volatile storage?

A5) The storage that a database uses to store data (in this case, main memory) is divided into pages. When a transaction changes data, the corresponding pages are marked and written to non-volatile storage at regular intervals.

In addition, a database log captures all changes that are made by transactions. Each committed transaction generates a log entry that is written to non-volatile storage, which ensures that all transactions are permanent.
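The combination described in A5 (periodic writes of changed data plus a log entry per committed transaction) can be sketched with a toy key-value store. This is a simplified model under stated assumptions, not SAP HANA's actual mechanism: the class `TinyDurableStore`, the file names, and the JSON format are invented for the example, and the savepoint here writes the whole in-memory state rather than only the changed pages.

```python
import json
import os
import tempfile


class TinyDurableStore:
    """Toy in-memory store with a redo log and savepoints (hypothetical)."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "redo.log")
        self.savepoint_path = os.path.join(directory, "savepoint.json")
        self.data = {}
        self._recover()

    def _recover(self):
        # Load the last savepoint, then replay log entries written after it.
        if os.path.exists(self.savepoint_path):
            with open(self.savepoint_path) as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    entry = json.loads(line)
                    self.data[entry["key"]] = entry["value"]

    def commit(self, key, value):
        # The log entry reaches non-volatile storage before memory is updated.
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.data[key] = value

    def savepoint(self):
        # Persist the in-memory state atomically, then start a fresh log.
        tmp_path = self.savepoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.savepoint_path)
        open(self.log_path, "w").close()


with tempfile.TemporaryDirectory() as directory:
    store = TinyDurableStore(directory)
    store.commit("a", 1)
    store.savepoint()
    store.commit("b", 2)  # exists only in the log, not in a savepoint
    del store             # simulate power loss: main memory is gone
    recovered = TinyDurableStore(directory).data

print(recovered)
```

After the simulated power loss, the new instance rebuilds its state from the last savepoint and then replays the log, so the commit made after the savepoint is not lost.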

6. How does SAP HANA minimize data movement?

A6) Although today's memory capacities allow enormous amounts of data to be kept in memory, compressing the data in memory is still preferable.

The goal is to compress data in a way that does not use up the performance that is gained while still minimizing data movement from RAM to the processor.
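The answer above does not name a specific compression scheme. Dictionary encoding is one technique commonly used in column stores, and the sketch below uses it purely as an illustration; the function names and the sample column are made up for the example. Each distinct value is stored once, the column itself becomes a list of small integer IDs, and a filter can compare IDs without decompressing anything.

```python
def dictionary_encode(column):
    """Replace each value with an integer ID into a sorted dictionary."""
    dictionary = sorted(set(column))
    ids = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [ids[value] for value in column]


def count_equal(dictionary, encoded, target):
    """Count matches by comparing integer IDs instead of full strings."""
    if target not in dictionary:
        return 0
    target_id = dictionary.index(target)
    return sum(1 for value_id in encoded if value_id == target_id)


cities = ["Berlin", "Paris", "Berlin", "Madrid", "Paris", "Berlin"]
dictionary, encoded = dictionary_encode(cities)
berlin_count = count_equal(dictionary, encoded, "Berlin")

print(dictionary)    # ['Berlin', 'Madrid', 'Paris']
print(encoded)       # [0, 2, 0, 1, 2, 0]
print(berlin_count)  # 3
```

Because the query works directly on the compact IDs, less data has to travel from RAM to the processor, and no time is spent decompressing before the comparison.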

