Big Data: Top NoSQL Interview Questions (2 of 5)

1) What is the most important characteristic of NoSQL?
High Availability

2) What are the different types of NoSQL databases?
Key-Value stores
Column Stores
Graph Stores
Document Stores

3) What is Oracle NoSQL Database?
Oracle NoSQL Database is a distributed key-value database designed to provide highly reliable, scalable, and available data storage across a configurable set of systems.

4) What is the DB engine used in Oracle NoSQL Database?
Oracle NoSQL Database uses Oracle Berkeley DB Java Edition as the underlying data storage engine.

5) What is the architecture of Oracle NoSQL Database?
Oracle NoSQL Database is a shared-nothing system designed to run and scale on commodity hardware. Key-value pairs are hash partitioned across server groups known as shards. At any point in time, a single key-value pair is always associated with a unique shard in the system.
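
The hash partitioning described above can be illustrated with a minimal Python sketch. This is not Oracle's actual hashing scheme: the function name `shard_for_key`, the choice of SHA-256, and the sample keys are all assumptions made for illustration. It only shows the general idea that a stable hash of the key always selects exactly one shard.

```python
import hashlib


def shard_for_key(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (illustrative only)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # A cryptographic digest is used here only because it is stable across
    # processes; Python's built-in hash() is randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical keys; each one lands on exactly one of the three shards.
keys = ["user/1001", "user/1002", "order/77"]
placement = {key: shard_for_key(key, num_shards=3) for key in keys}
```

Because the hash is deterministic, repeated lookups for the same key always go to the same shard, which is what lets a shared-nothing system route a request without consulting the other shards.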

6) What are the unique features of Oracle NoSQL?
Oracle NoSQL Database leverages the high availability features in Berkeley DB in order to provide resiliency, fault tolerance, and read scalability.

7) What is the concept of shared nothing in Oracle NoSQL Database?
A shared-nothing system partitions the data into disjoint subsets (called shards). Each shard is managed by a node, along with replicas that provide high availability.

8) How does Oracle NoSQL achieve high availability?
Availability is achieved by adding redundancy to the system. In NoSQL systems, redundancy is commonly achieved by maintaining multiple copies of the data on multiple nodes. Each shard comprises two or more nodes (called replicas) that have identical copies of the data.
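
A rough Python sketch of this redundancy idea follows. It is a toy model and not how Oracle NoSQL Database replicates data internally: the `Shard` class, its methods, and the synchronous copy to every live replica are assumptions chosen to keep the example short.

```python
class Shard:
    """Toy shard that keeps an identical copy of the data on every replica."""

    def __init__(self, replica_count: int = 3) -> None:
        if replica_count < 2:
            raise ValueError("a shard needs two or more replicas")
        self.replicas = [dict() for _ in range(replica_count)]
        self.alive = [True] * replica_count

    def put(self, key: str, value: bytes) -> None:
        # Copy the write to every replica that is currently up.
        for index, replica in enumerate(self.replicas):
            if self.alive[index]:
                replica[key] = value

    def fail(self, index: int) -> None:
        # Simulate losing one node.
        self.alive[index] = False

    def get(self, key: str) -> bytes:
        # Any surviving replica can serve the read.
        for index, replica in enumerate(self.replicas):
            if self.alive[index]:
                return replica[key]
        raise RuntimeError("no replica available")


shard = Shard(replica_count=3)
shard.put("user/1001", b"alice")
shard.fail(0)  # one node goes down
value_after_failure = shard.get("user/1001")  # still readable from another replica
```

The read succeeds after the simulated failure because the data also lives on the remaining replicas, which is the point of keeping multiple copies per shard.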

9) What is ACID in Oracle NoSQL?
Atomicity, consistency, isolation, and durability (ACID) are the key characteristics provided by transactions. Oracle NoSQL Database leverages the transaction capabilities of the underlying Berkeley DB storage engine.

10) How is data modeled in Oracle NoSQL Database?
Oracle NoSQL Database provides a key-value paradigm to the application developer. Every entity (record) is a set of key-value pairs. A key has multiple components, specified as an ordered list. The major key identifies the entity and consists of the leading components of the key. The subsequent components are called minor keys.
This organization is similar to a directory path specification in a file system (for example, /Major/minor1/minor2/). The "value" part of the key-value pair is simply an uninterpreted string of bytes of arbitrary length.
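
The major/minor key layout can be sketched in Python as follows. This is only an illustration of the structure, not the Oracle NoSQL client API: the `Key` class, `to_path`, `get_by_major`, and the sample records are hypothetical names and data.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Key:
    """Ordered key components: leading major part, trailing minor part."""

    major: Tuple[str, ...]
    minor: Tuple[str, ...] = ()

    def to_path(self) -> str:
        # Render the key like a directory path, e.g. /Major/minor1/minor2/
        return "/" + "/".join(self.major + self.minor) + "/"


# The value is an uninterpreted byte string; the store never looks inside it.
store: Dict[Key, bytes] = {}
store[Key(("users", "1001"), ("profile",))] = b'{"name": "alice"}'
store[Key(("users", "1001"), ("settings", "theme"))] = b"dark"
store[Key(("users", "1002"), ("profile",))] = b'{"name": "bob"}'


def get_by_major(major: Tuple[str, ...]) -> Dict[Key, bytes]:
    """Return every record that belongs to the entity named by the major key."""
    return {key: value for key, value in store.items() if key.major == major}


records_for_1001 = get_by_major(("users", "1001"))
```

Here the major key `("users", "1001")` identifies one entity, and the minor keys select individual pieces of it, much like subdirectories under a common parent path.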