Featured Post

8 Ways to Optimize AWS Glue Jobs in a Nutshell

Image
  Improving the performance of AWS Glue jobs involves several strategies that target different aspects of the ETL (Extract, Transform, Load) process. Here are some key practices. 1. Optimize Job Scripts Partitioning : Ensure your data is properly partitioned. Partitioning divides your data into manageable chunks, allowing parallel processing and reducing the amount of data scanned. Filtering : Apply pushdown predicates to filter data early in the ETL process, reducing the amount of data processed downstream. Compression : Use compressed file formats (e.g., Parquet, ORC) for your data sources and sinks. These formats not only reduce storage costs but also improve I/O performance. Optimize Transformations : Minimize the number of transformations and actions in your script. Combine transformations where possible and use DataFrame APIs which are optimized for performance. 2. Use Appropriate Data Formats Parquet and ORC : These columnar formats are efficient for storage and querying, signif

R objects useful command to delete them

R-Commands
R-Commands
The entities that R creates and manipulates are known as objects. These may be variables, arrays of numbers, character strings, functions, or more general structures built from such components. During an R session, objects are created and stored by name. This post tells you how to delete them.

The R command
> objects()

(alternatively, ls()) can be used to display the names of (most of) the objects which are currently stored within R. The collection of objects currently stored is called the workspace. The data visualization in R Language with GGplot a good idea to start.

To remove objects the function rm is available:
> rm(x, y, z, ink, junk, temp, foo, bar)

All objects created during an R session can be stored permanently in a file for use in future R sessions.

At the end of each R session you are given the opportunity to save all the currently available objects. If you indicate that you want to do this, the objects are written to a file called .RData5 in the current directory, and the command lines used in the session are saved to a file called .Rhistory.

When R is started at later time from the same directory it reloads the workspace from this file. At the same time the associated commands history is reloaded.
  • It is recommended that you should use separate working directories for analyses conducted  ith R. 
  • It is quite common for objects with names x and y to be created during an analysis. Names like this are often meaningful in the context of a single analysis, but it can be quite hard to decide what they might be when the several analyses have been conducted in the same directory.

Comments

Popular posts from this blog

How to Fix datetime Import Error in Python Quickly

How to Check Kafka Available Brokers

SQL Query: 3 Methods for Calculating Cumulative SUM