8 Top Tricky AWS Interview Questions

In this post, I walk through tricky AWS (Amazon Web Services) interview questions. They cover EBS, S3, AMIs, and Amazon EC2 instances.

Q1. What is Elastic Block Store (EBS)? What type of performance can you expect? How do you back it up? How do you improve performance?

A1. EBS is a virtualized SAN, or storage area network. That means it is RAID storage to start with, so it’s redundant and fault-tolerant. If a disk in that RAID dies, you don’t lose data.

Great! It is also virtualized, so you can provision and allocate storage, and attach it to your server with various API calls. No calling the storage expert and asking him or her to run specialized commands from the hardware vendor.
 
Performance on EBS can be variable. That is, it can go above the SLA performance level, then drop below it. The SLA gives you the average disk I/O rate you can expect.

This can frustrate some folks, especially performance experts who expect reliable and consistent disk throughput from a server. Traditional physically hosted servers behave that way. Virtual AWS instances do not.
Back up EBS volumes by using the snapshot facility, via an API call or via a GUI tool like Elasticfox. Improve performance by using Linux software RAID and striping across four volumes.
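As a rough illustration of the "snapshot via an API call" route, here is a minimal Python sketch. It assumes the boto3 library and an EC2 client created with boto3.client("ec2"); the function name snapshot_volume and the volume ID used below are hypothetical and only serve the example.

```python
from datetime import datetime, timezone


def snapshot_volume(ec2, volume_id, note="manual backup"):
    """Request a point-in-time snapshot of one EBS volume.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the ID of the snapshot that was started.
    """
    # A timestamp in the description makes snapshots easier to tell apart.
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")
    response = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"{note} of {volume_id} at {stamp}",
    )
    return response["SnapshotId"]
```

With real credentials configured, calling snapshot_volume(boto3.client("ec2"), "vol-0abc") would start a snapshot of that (made-up) volume. The snapshot completes in the background, so a script that depends on it should wait for it to finish before relying on it.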
Q2. What is S3? What is it used for? Should encryption be used?

A2. S3 stands for Simple Storage Service. You can think of it like FTP storage: you can move files to and from it, but you cannot mount it like a file system.

AWS automatically stores your snapshots there, as well as your AMIs. Encryption should be considered for sensitive data, as S3 is a proprietary technology developed by Amazon itself and is as yet unproven from a security standpoint.
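One way to act on the encryption advice is to ask S3 to encrypt each object at rest when you upload it. The sketch below assumes boto3 and an S3 client from boto3.client("s3"); the helper names upload_encrypted and is_encrypted, and the bucket and key in the usage note, are hypothetical.

```python
def upload_encrypted(s3, bucket, key, data):
    """Store `data` (bytes) in S3 and request server-side AES-256 encryption.

    `s3` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        ServerSideEncryption="AES256",  # S3-managed keys
    )


def is_encrypted(s3, bucket, key):
    """Return True if S3 reports server-side encryption for the object."""
    head = s3.head_object(Bucket=bucket, Key=key)
    return "ServerSideEncryption" in head
```

For example, upload_encrypted(s3, "my-bucket", "report.csv", b"...") followed by is_encrypted(s3, "my-bucket", "report.csv") gives a quick check that the setting took effect. Server-side encryption only protects data at rest; for very sensitive data you may prefer to encrypt before the upload.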
Q3. What is an AMI? How do I build one?

A3. AMI stands for Amazon Machine Image. It is effectively a snapshot of the root file system. Commodity hardware servers have a BIOS that points to the master boot record in the first block of a disk.
A disk image, though, can sit anywhere physically on a disk, so Linux can boot from an arbitrary location on the EBS storage network.
Build a new AMI by first spinning up an instance from a trusted AMI, then adding packages and components as required. Be wary of putting sensitive data onto an AMI.
For instance, your access credentials should be added to an instance after spin-up. With a database, mount an outside volume that holds your MySQL data after spin-up as well.
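Once the instance has its packages installed (and no credentials or data baked in), the image itself can be created with one API call. This is a minimal sketch assuming boto3 and an EC2 client from boto3.client("ec2"); the function name build_ami and the example IDs are hypothetical.

```python
def build_ami(ec2, instance_id, name, description="built from a trusted base AMI"):
    """Create an AMI from a prepared instance and wait until it is usable.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the ID of the new image.
    """
    response = ec2.create_image(
        InstanceId=instance_id,
        Name=name,
        Description=description,
    )
    image_id = response["ImageId"]
    # Block until AWS reports the image as available for launching.
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])
    return image_id
```

A call such as build_ami(ec2, "i-0abc", "web-base-v1") would return the new image ID, which you can then use to launch further instances.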
Q4. Can I vertically scale an Amazon instance? How?

A4. Yes. This is an incredible feature of AWS and cloud virtualization. Spin up a new, larger instance than the one you are currently running. Stop that instance, detach the root EBS volume from it, and discard that volume.

Then stop your live instance and detach its root volume. Note the unique device ID, then attach that root volume to your new server and start it. Voila, you have scaled vertically in place! Read online for more questions.
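The root-volume swap described above can be sketched in Python as follows. It assumes boto3 and an EC2 client from boto3.client("ec2"), and that both instances sit in the same Availability Zone; the function name move_root_volume and all IDs are hypothetical. The device argument is the device ID you noted from the live instance.

```python
def move_root_volume(ec2, live_instance, new_instance, spare_root, live_root, device):
    """Move the live instance's root volume onto a new, larger instance.

    `ec2` is assumed to be a boto3 EC2 client. `spare_root` is the root volume
    the new instance was launched with; it is detached here and left for the
    caller to delete.
    """
    # 1. Stop the new, larger instance and detach its own root volume.
    ec2.stop_instances(InstanceIds=[new_instance])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[new_instance])
    ec2.detach_volume(VolumeId=spare_root, InstanceId=new_instance)

    # 2. Stop the live instance and detach its root volume.
    ec2.stop_instances(InstanceIds=[live_instance])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[live_instance])
    ec2.detach_volume(VolumeId=live_root, InstanceId=live_instance)
    ec2.get_waiter("volume_available").wait(VolumeIds=[live_root, spare_root])

    # 3. Attach the live root volume to the new instance and boot it.
    ec2.attach_volume(VolumeId=live_root, InstanceId=new_instance, Device=device)
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[live_root])
    ec2.start_instances(InstanceIds=[new_instance])
```

The waits matter: a volume cannot be attached elsewhere until its detach has finished, and detaching a root volume is only possible while the instance is stopped. Taking a snapshot of the live root volume before running anything like this is a sensible precaution.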
