SQL Interview Success: Unlocking the Top 5 Frequently Asked Queries

Here are five of the most commonly asked SQL queries in interviews. You can expect them in Data Analyst and Data Engineer interviews.

Top SQL Queries for Interviews

01. Joins

A commonly asked question gives you two tables and asks how many rows each join type returns and what the result set looks like.



















Inner join


5 rows will be returned.

The result will be:

1    1
1    1
1    1
1    1
3    3
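The source tables are not reproduced in the text, so the sketch below assumes two hypothetical single-column tables, TableA and TableB, with values chosen to be consistent with the five-row inner join result above. It uses Python's built-in sqlite3 module to show why duplicate keys multiply the row count, and how a left join differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: the table names and values are assumptions for illustration.
conn.executescript("""
    CREATE TABLE TableA (id INTEGER);
    CREATE TABLE TableB (id INTEGER);
    INSERT INTO TableA (id) VALUES (1), (1), (2), (3);
    INSERT INTO TableB (id) VALUES (1), (1), (3), (4);
""")

# Inner join: each 1 in TableA matches both 1s in TableB (2 x 2 = 4 rows),
# and the single 3 matches once, giving 5 rows in total.
inner_rows = conn.execute(
    "SELECT a.id, b.id FROM TableA a "
    "INNER JOIN TableB b ON a.id = b.id ORDER BY a.id"
).fetchall()

# Left join: the same 5 rows plus the unmatched 2 from TableA, padded with NULL.
left_rows = conn.execute(
    "SELECT a.id, b.id FROM TableA a "
    "LEFT JOIN TableB b ON a.id = b.id ORDER BY a.id"
).fetchall()

print(len(inner_rows), inner_rows)
print(len(left_rows), left_rows)
```

A quick way to predict the count in an interview is to multiply the number of occurrences of each key on both sides and add the products together.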

02. Substring and Concat

Here, we need to write an SQL query that converts the first letter of a name to upper case and the remaining letters to lower case.










SELECT CONCAT(UPPER(SUBSTRING(name, 1, 1)), LOWER(SUBSTRING(name, 2))) AS capitalized_name FROM Table1;
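To try the idea without a database server, here is a minimal sketch using Python's built-in sqlite3 module. It is a variation rather than the exact query above: SQLite's SUBSTR and the || operator stand in for SUBSTRING and CONCAT, and the sample names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (name TEXT)")
# Sample names are assumptions for illustration.
conn.executemany(
    "INSERT INTO Table1 (name) VALUES (?)",
    [("aLICE",), ("bob",), ("CAROL",)],
)

# Upper-case the first character, lower-case everything from position 2 onward,
# then glue the two pieces together with ||.
capitalized = [
    row[0]
    for row in conn.execute(
        "SELECT UPPER(SUBSTR(name, 1, 1)) || LOWER(SUBSTR(name, 2)) "
        "AS capitalized_name FROM Table1 ORDER BY rowid"
    )
]
print(capitalized)
```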

03. Case statement

SQL Query


SELECT Code1, Code2,
    CASE
        WHEN Code1 = 'A' AND Code2 = 'AA' THEN 'A | AA'
        WHEN Code1 = 'B' AND Code2 = 'BB' THEN 'B | BB'
        WHEN Code1 = 'C' AND Code2 = 'CC' THEN 'C | CC'
    END AS Combined
FROM Table1;  -- Table1 is a placeholder table name
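The following is a minimal runnable sketch of a CASE expression like the one above, using Python's built-in sqlite3 module. The table name Codes, its rows, and the assumption that the intended output is the two codes joined by " | " are all illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, used only to exercise the CASE expression.
conn.executescript("""
    CREATE TABLE Codes (Code1 TEXT, Code2 TEXT);
    INSERT INTO Codes (Code1, Code2)
    VALUES ('A', 'AA'), ('B', 'BB'), ('C', 'CC'), ('D', 'DD');
""")

case_rows = conn.execute("""
    SELECT Code1, Code2,
        CASE
            WHEN Code1 = 'A' AND Code2 = 'AA' THEN 'A | AA'
            WHEN Code1 = 'B' AND Code2 = 'BB' THEN 'B | BB'
            WHEN Code1 = 'C' AND Code2 = 'CC' THEN 'C | CC'
        END AS Combined
    FROM Codes
    ORDER BY Code1
""").fetchall()

for row in case_rows:
    print(row)
```

Because there is no ELSE branch, the row ('D', 'DD') gets NULL in the Combined column. Interviewers often ask about that default.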



04. Questions on NULLs

a). What is a NULL value in SQL, and what does it signify?

A NULL value in SQL represents the absence of a value in a field. It signifies that the value is unknown or undefined.

b). How are NULL values treated in SQL comparisons and arithmetic operations?

Comparisons involving NULL typically result in NULL (UNKNOWN), except for the IS NULL and IS NOT NULL operators, which specifically check for NULL values. Arithmetic operations involving NULL generally yield NULL as the result.
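A small sketch of this behavior, run through Python's built-in sqlite3 module, where SQL NULL comes back as Python None and true comes back as 1:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparisons and arithmetic with NULL yield NULL;
# only IS NULL / IS NOT NULL give a definite answer.
null_checks = conn.execute(
    "SELECT NULL = NULL, NULL <> 1, 1 + NULL, NULL IS NULL, 1 IS NOT NULL"
).fetchone()
print(null_checks)
```

This is also why a filter such as WHERE col = NULL never matches any row: the condition evaluates to NULL, not true.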

c). What is the difference between a NULL value and an empty string ('') in SQL?

NULL represents the absence of a value, while an empty string ('') is a value: a string with zero characters. Oracle is a notable exception, because it treats an empty string as NULL.

d). How can you check if a column contains NULL values in SQL?

You can use the IS NULL or IS NOT NULL operators in a WHERE clause to check for NULL or non-NULL values in a column.

e). How do you handle NULL values in SQL queries to avoid unexpected results?

Handling NULL values involves using functions like COALESCE, IFNULL, or CASE statements to replace NULL values with a specified default value or handle them appropriately in expressions.

f). Can you perform calculations involving columns with NULL values in SQL? If so, how?

Yes, you can perform calculations involving columns with NULL values. However, any arithmetic that touches a NULL yields NULL, so you need to replace NULLs with an appropriate value first, using COALESCE (standard SQL) or a vendor-specific function such as IFNULL (MySQL) or ISNULL (SQL Server).
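As an illustration, the sketch below uses a hypothetical Orders table (the table, its columns, and its values are assumptions) and compares a calculation with and without COALESCE, using Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: order 2 has no discount recorded (NULL).
conn.executescript("""
    CREATE TABLE Orders (id INTEGER, price REAL, discount REAL);
    INSERT INTO Orders (id, price, discount)
    VALUES (1, 100.0, 10.0), (2, 50.0, NULL);
""")

totals = conn.execute("""
    SELECT id,
           price - discount              AS raw_total,   -- NULL when discount is NULL
           price - COALESCE(discount, 0) AS safe_total   -- NULL discount treated as 0
    FROM Orders
    ORDER BY id
""").fetchall()
print(totals)
```

Whether 0 is the right default depends on what a missing value means in your data, so treat the replacement value as a business decision rather than a technical one.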

g). What happens if you use the DISTINCT keyword in a SQL query with NULL values?

The DISTINCT keyword eliminates duplicate values from the result set. Although NULL = NULL does not evaluate to true, DISTINCT treats all NULL values as duplicates of one another, so only one NULL is included in the result set.
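A quick check with Python's built-in sqlite3 module, assuming a hypothetical Cities table that contains two NULLs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cities (city TEXT)")
# Hypothetical rows: 'Paris' and NULL each appear twice.
conn.executemany(
    "INSERT INTO Cities (city) VALUES (?)",
    [("Paris",), (None,), ("Paris",), (None,), ("Rome",)],
)

distinct_cities = conn.execute("SELECT DISTINCT city FROM Cities").fetchall()
print(distinct_cities)  # row order is not guaranteed without ORDER BY
```

The result has three rows: Paris, Rome, and a single NULL.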

h). How can you replace NULL values with a specific value in SQL queries?

You can use the COALESCE function or the CASE statement to replace NULL values with a specific value in SQL queries.

i). Explain the behavior of aggregate functions like SUM() and AVG() when NULL values are present in the data.

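Aggregate functions like SUM() and AVG() ignore NULL values and operate only on the non-NULL values in the specified column, so AVG() divides by the number of non-NULL values rather than the number of rows. If every value is NULL, SUM() and AVG() return NULL, not 0.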
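The sketch below shows this with a hypothetical Scores table (names and values are assumptions), using Python's built-in sqlite3 module. It also contrasts COUNT(score), which skips NULLs, with COUNT(*), which counts every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Scores (score INTEGER)")
# Hypothetical rows: one of the three scores is missing.
conn.executemany("INSERT INTO Scores (score) VALUES (?)", [(10,), (None,), (20,)])

agg = conn.execute(
    "SELECT SUM(score), AVG(score), COUNT(score), COUNT(*) FROM Scores"
).fetchone()
print(agg)

# When every input is NULL, SUM returns NULL rather than 0.
all_null_sum = conn.execute(
    "SELECT SUM(score) FROM Scores WHERE score IS NULL"
).fetchone()[0]
print(all_null_sum)
```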

j). Can we create an index on a column with NULL values in SQL? If yes, how does it work?

Yes, you can create an index on a column containing NULL values, but how NULLs are indexed depends on the database. MySQL (InnoDB), PostgreSQL, and SQL Server store NULL entries in B-tree indexes, so a condition such as IS NULL can use the index. Oracle does not store a row in a B-tree index when all of its indexed columns are NULL, so a single-column index there cannot be used to find NULL values.

These answers cover the basic concepts and practices related to NULL values in SQL.

05. Window functions ROW_NUMBER(), RANK(), DENSE_RANK()

Here's the SQL query for ROW_NUMBER(). Similarly, you can use RANK() and DENSE_RANK() functions in place of ROW_NUMBER().

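SELECT *,
       ROW_NUMBER() OVER (ORDER BY name) AS row_num  -- "name" is a placeholder; order by the column that defines the ranking
FROM Person;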
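To see how the three functions differ on ties, here is a minimal sketch using Python's built-in sqlite3 module. The name and score columns and the sample rows are assumptions, and window functions require SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns and rows: Ann and Bob tie on score.
conn.executescript("""
    CREATE TABLE Person (name TEXT, score INTEGER);
    INSERT INTO Person (name, score)
    VALUES ('Ann', 90), ('Bob', 90), ('Cid', 80), ('Dee', 70);
""")

ranked = conn.execute("""
    SELECT name, score,
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,    -- always unique
           RANK()       OVER (ORDER BY score DESC)       AS rnk,        -- ties share a rank, gaps follow
           DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk   -- ties share a rank, no gaps
    FROM Person
    ORDER BY score DESC, name
""").fetchall()

for row in ranked:
    print(row)
```

With this data, ROW_NUMBER() gives 1, 2, 3, 4, RANK() gives 1, 1, 3, 4, and DENSE_RANK() gives 1, 1, 2, 3.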


In addition to the above queries, it is worth reviewing how to write a semi join and an anti join.
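One common way to express both is with EXISTS and NOT EXISTS. The sketch below uses Python's built-in sqlite3 module with hypothetical Customers and Orders tables (all names and rows are assumptions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: Ann has two orders, Cid has one, Bob has none.
conn.executescript("""
    CREATE TABLE Customers (id INTEGER, name TEXT);
    CREATE TABLE Orders (customer_id INTEGER);
    INSERT INTO Customers (id, name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO Orders (customer_id) VALUES (1), (1), (3);
""")

# Semi join: customers with at least one order. Each customer appears once,
# even when several orders match.
semi = conn.execute("""
    SELECT c.name FROM Customers c
    WHERE EXISTS (SELECT 1 FROM Orders o WHERE o.customer_id = c.id)
    ORDER BY c.name
""").fetchall()

# Anti join: customers with no orders at all.
anti = conn.execute("""
    SELECT c.name FROM Customers c
    WHERE NOT EXISTS (SELECT 1 FROM Orders o WHERE o.customer_id = c.id)
    ORDER BY c.name
""").fetchall()

print(semi, anti)
```

An inner join on the same tables would return Ann twice, which is the key difference from a semi join. NOT EXISTS is also generally a safer anti join than NOT IN, because NOT IN returns no rows when the subquery produces a NULL.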

