Posts

Featured Post

8 Ways to Optimize AWS Glue Jobs in a Nutshell

Improving the performance of AWS Glue jobs involves several strategies that target different aspects of the ETL (Extract, Transform, Load) process. Here are some key practices.

1. Optimize Job Scripts

Partitioning: Ensure your data is properly partitioned. Partitioning divides your data into manageable chunks, allowing parallel processing and reducing the amount of data scanned.

Filtering: Apply pushdown predicates to filter data early in the ETL process, reducing the amount of data processed downstream.

Compression: Use compressed file formats (e.g., Parquet, ORC) for your data sources and sinks. These formats not only reduce storage costs but also improve I/O performance.

Optimize Transformations: Minimize the number of transformations and actions in your script. Combine transformations where possible and use DataFrame APIs, which are optimized for performance.

2. Use Appropriate Data Formats

Parquet and ORC: These columnar formats are efficient for storage and querying, signif
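
The partitioning and early-filtering ideas in this excerpt can be illustrated without any AWS dependency. The sketch below is a plain-Python stand-in, not Glue code: it assumes a hypothetical Hive-style layout (`year=.../month=...`) and made-up file names, and shows how filtering on partition keys first shrinks the set of files that would have to be read at all.

```python
from typing import Dict, List

# Hypothetical object keys laid out in Hive-style partitions (year=/month=).
FILES: List[str] = [
    "sales/year=2023/month=12/part-0000.parquet",
    "sales/year=2024/month=01/part-0000.parquet",
    "sales/year=2024/month=01/part-0001.parquet",
    "sales/year=2024/month=02/part-0000.parquet",
]


def partition_values(path: str) -> Dict[str, str]:
    """Extract the key=value partition segments from a path."""
    return dict(seg.split("=", 1) for seg in path.split("/") if "=" in seg)


def prune(paths: List[str], **wanted: str) -> List[str]:
    """Keep only the paths whose partition values match every requested key."""
    return [
        p for p in paths
        if all(partition_values(p).get(k) == v for k, v in wanted.items())
    ]


selected = prune(FILES, year="2024", month="01")
print(f"{len(selected)} of {len(FILES)} files need to be scanned")
for path in selected:
    print(path)
```

In a real job the same effect would normally come from a pushdown predicate on the partition columns, so that non-matching partitions are skipped before any rows are loaded.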

2 User Input Python Sample Programs

Here are Python programs that take user input and respond to the user. Such programs are also called interactive programs.

Python reads user input from the command line via the input() function. (Python 2 also had raw_input(); in Python 3 only input() exists, and it always returns a string.) Typically, you assign user input to a variable that holds all the characters the user types on the keyboard. Input ends when the user presses the <return> key; the trailing newline is not included in the returned string.

#1 User input sample program

The following program takes input and reports whether the given value is a number or a string.

    my_input = input("Enter something: ")
    try:
        # eval() executes whatever is typed, so use it only with trusted input
        x = 0 + eval(my_input)
        print('You entered the number:', my_input)
    except Exception:
        print(my_input, 'is a string')

Output

    Enter something:  100
    You entered the number: 100

#2 User input sample program

The following program takes two inputs from the user and calcula
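
The first sample program relies on eval(), which executes whatever the user types. A minimal alternative sketch, assuming that "number" means anything float() can parse, is shown below. The helper name classify is hypothetical, and the sample strings stand in for values that would normally come from input().

```python
def classify(text: str) -> str:
    """Return 'number' if text parses as a float, otherwise 'string'."""
    try:
        float(text)
    except ValueError:
        return "string"
    return "number"


# Sample values used in place of interactive input so the sketch runs unattended.
for sample in ["100", "3.14", "hello"]:
    if classify(sample) == "number":
        print("You entered the number:", sample)
    else:
        print(sample, "is a string")
```

In an interactive session you would pass the result of input("Enter something: ") to classify() instead of looping over fixed samples. Because nothing is evaluated as code, text such as a function call is simply reported as a string.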

The Quick and Easy Way to Analyze Numpy Arrays

The quickest and easiest way to analyze NumPy arrays is to create them with numpy.array() and then apply NumPy's built-in functions. These functions let you find the sum, mean, standard deviation, max, min, and other useful statistics of the values contained in a NumPy array.

Sum

You can find the sum of NumPy arrays using the np.sum() function. For example:

    import numpy as np
    a = np.array([1,2,3,4,5])
    b = np.array([6,7,8,9,10])
    result = np.sum([a,b])
    print(result)  # Output will be 55

Mean

You can find the mean of a NumPy array using the np.mean() function. This function takes in an array as an argument and returns the mean of all the values in the array. For example, the mean of a NumPy array of [1,2,3,4,5] would be

    result = np.mean([1,2,3,4,5])
    print(result)  # Output: 3.0

Standard Deviation

To find the standard deviation of a NumPy array, you can use the NumPy std() function. This function takes in an array as a par
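
The functions mentioned in this excerpt also accept an axis argument, which is useful once the data has more than one dimension. The sketch below assumes NumPy is installed and uses a small made-up 2x5 array; the variable names are illustrative only.

```python
import numpy as np

# Hypothetical data: two rows of five values each.
data = np.array([[1, 2, 3, 4, 5],
                 [6, 7, 8, 9, 10]])

total = data.sum()             # sum of every element
col_sums = data.sum(axis=0)    # one sum per column
row_means = data.mean(axis=1)  # one mean per row
spread = data.std()            # population standard deviation (ddof=0 by default)

print(total)                   # 55
print(col_sums)                # [ 7  9 11 13 15]
print(row_means)               # [3. 8.]
print(data.min(), data.max())  # 1 10
print(spread)
```

With axis=0 the reduction runs down the rows and yields one value per column; with axis=1 it runs across the columns and yields one value per row. Passing ddof=1 to std() gives the sample standard deviation instead of the population one.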

These 10 Skills You Need to Become Data Analyst

To become a data analyst with Python, there are several technical skills you need to learn. Here are the key ones:

#1 Python Programming

Python is widely used in data analysis due to its simplicity, versatility, and the availability of powerful libraries. You should have a strong understanding of Python fundamentals, including data types, variables, loops, conditional statements, functions, and file handling.

#2 Data Manipulation Libraries

Familiarize yourself with libraries like NumPy and Pandas, which are essential for data manipulation and analysis. NumPy provides support for efficient numerical operations, while Pandas offers data structures (e.g., DataFrames) for easy data manipulation, cleaning, and transformation.

#3 Data Visualization

Gain proficiency in data visualization libraries like Matplotlib and Seaborn. These libraries enable you to create insightful visual representations of data, such as line plots, scatter plots, bar charts, histograms, and heatmaps.

#4 SQL (Structu

How to Find Non-word Character: Python Regex Example

In Python, the regular expression pattern \W (upper case W) matches any non-word character. For ASCII text, the word characters are [a-zA-Z0-9_]; with Python 3 str patterns, Unicode letters and digits also count as word characters unless the re.ASCII flag is set. \W matches everything else. Here's an example of usage.

Regex examples to find non-word char

#1 Example

    import re
    text = "Hello, world! How are you today?"
    non_words = re.findall(r'\W', text)
    print(non_words)

In the above example, the re.findall() function is used to find all non-word characters in the text string using the regular expression pattern \W. The output is a list of the non-word characters found in the string, in order of appearance:

Output

    [',', ' ', '!', ' ', ' ', ' ', ' ', '?']

This includes punctuation marks and spaces but excludes letters, digits, and underscores, which are considered word characters in regular expressions.

#2 Example

    import re
    text = "Hello, world! How are non-word-char:! you today?"
    non_words = re.findall(r'non-word-char:\W', text)
    print(non_words)

Output

    ['non-wo
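
Beyond listing non-word characters with re.findall(), the same \W pattern is commonly used to split text into words or to strip punctuation. The following is a small sketch using the sentence from the first example; the variable names are illustrative.

```python
import re

text = "Hello, world! How are you today?"

# Split on runs of non-word characters to recover the words;
# the filter drops the empty string left after the trailing '?'.
words = [w for w in re.split(r"\W+", text) if w]

# Remove every non-word character (punctuation and spaces).
compact = re.sub(r"\W", "", text)

# Count how many non-word characters the string contains.
count = len(re.findall(r"\W", text))

print(words)    # ['Hello', 'world', 'How', 'are', 'you', 'today']
print(compact)  # HelloworldHowareyoutoday
print(count)    # 8
```

Using \W+ rather than \W in the split treats ", " as a single separator, which avoids empty strings between words.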

How to Write ETL Logic in Python: Sample Code to Practice

Here's example Python code that uses the mysql-connector library to connect to a MySQL database, extract data from a table, transform it, and load it into a JSON file.

Python ETL Sample Code

    import mysql.connector
    import json

    # Connect to the MySQL database
    cnx = mysql.connector.connect(user='username', password='password',
                                  host='localhost',
                                  database='database_name')

    # Define a cursor to execute SQL queries
    cursor = cnx.cursor()

    # Define the SQL query to extract data
    query = ("SELECT column1, column2, column3 FROM table_name")

    # Execute the SQL query
    cursor.execute(query)

    # Fetch all rows from the result set
    rows = cursor.fetchall()

    # Transform the rows into a list of dictionaries
    result = []
    for row in rows:
        result.append({'column1': row[0], 'column2': row[1], 'column3': row[2]})

    # Save the result as a JSON file
    with open('ou
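
To practice the same extract, transform, load flow without a MySQL server, the sketch below swaps in Python's built-in sqlite3 module with an in-memory database. This is a substitution for practice purposes, not the post's own method. The table and column names mirror the excerpt, while the sample rows are made up.

```python
import json
import sqlite3

# Extract: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (column1 INTEGER, column2 TEXT, column3 REAL)"
)
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [(1, "alpha", 1.5), (2, "beta", 2.5)],  # hypothetical sample rows
)
cursor = conn.execute(
    "SELECT column1, column2, column3 FROM table_name ORDER BY column1"
)

# Transform: build dictionaries from the cursor's column names instead of
# hard-coded indexes, so the mapping follows the SELECT list.
columns = [desc[0] for desc in cursor.description]
result = [dict(zip(columns, row)) for row in cursor.fetchall()]
conn.close()

# Load: serialize to a JSON string; json.dump() would write to a file instead.
payload = json.dumps(result, indent=2)
print(payload)
```

Reading the column names from cursor.description keeps the transform step in sync with the query, so adding a column to the SELECT list does not require touching the loop.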

Quick Guide to AI Prompt Engineering

Here are the roles and responsibilities of an AI prompt engineer, a job in growing demand in the USA and the rest of the world. Prompt engineering is the process of designing effective and engaging conversation starters for chatbots and virtual assistants.

Guide on Prompt Engineering

A key aspect of chatbot development, prompt engineering involves a deep understanding of user interests and behavior, as well as the ability to create prompts that are both relevant and varied. Discover more about the importance of prompt engineering and its role in creating successful chatbots and virtual assistants.

Prompt Writer for Apps like ChatGPT

A prompt writer for an app like ChatGPT is someone who creates prompts or conversation starters for the chatbot to engage users in conversation. These prompts are designed to be interesting, relevant, and varied to keep users engaged and encourage them to keep interacting with the chatbot. Prompt writers for chatbots like ChatGPT often have a backgroun

How to Write Lambda Function Quickly in Python: 5 Examples

Here are the top Python lambda function examples for your projects and interviews. Python's lambda functions are a powerful way to create small, anonymous functions on the fly. In this post, we'll explore some examples of how to use lambda functions in Python.

5 Best Python Lambda Function Examples

#1 Sorting a List of Tuples by the Second Element

This example passes a lambda as the sort key to order a list of tuples by the second element of each tuple.

    my_list = [(1, 2), (4, 1), (9, 10), (13, 6), (5, 7)]
    sorted_list = sorted(my_list, key=lambda x: x[1])
    print(sorted_list)

Output:

    [(4, 1), (1, 2), (13, 6), (5, 7), (9, 10)]

#2 Finding the Maximum Value in a List of Dictionaries

This example uses a lambda to find the maximum entry in a list of dictionaries based on a specific key.

    my_list = [{'name': 'Alice', 'age': 25}, {'name': 'Bob', 'age': 30}, {'name': &
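
The key-function idea behind both examples carries over to max(), min(), and sorted() alike. Here is a minimal sketch using made-up records; the names and ages are hypothetical and are not taken from the post.

```python
# Hypothetical records; the names and ages are made up for illustration.
people = [
    {"name": "Dana", "age": 41},
    {"name": "Eli", "age": 29},
    {"name": "Fay", "age": 35},
]

# The lambda tells max()/min() which field to compare.
oldest = max(people, key=lambda person: person["age"])
youngest = min(people, key=lambda person: person["age"])

# The same kind of key works with sorted(); reverse=True flips the order.
by_name_desc = sorted(people, key=lambda person: person["name"], reverse=True)

print(oldest["name"])                     # Dana
print(youngest["name"])                   # Eli
print([p["name"] for p in by_name_desc])  # ['Fay', 'Eli', 'Dana']
```

In each call the lambda is evaluated once per element, and the function compares the returned values rather than the dictionaries themselves, which cannot be ordered directly.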