How to Quickly Verify That SSH Is Installed on a Hadoop Cluster

The commands below check whether SSH is installed on your Hadoop cluster.

[hadoop-user@master]$ which ssh
/usr/bin/ssh
[hadoop-user@master]$ which sshd
/usr/bin/sshd
[hadoop-user@master]$ which ssh-keygen
/usr/bin/ssh-keygen

If each command prints a path, as above, SSH is installed. If a command does not print a path, that SSH tool is not installed on the node (or is not on your PATH).
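The three checks above can be folded into one small script. This is a minimal sketch, not part of the original post: the function name `check_ssh_tools` is hypothetical, and it uses the POSIX `command -v` in place of `which`. Note that on some systems `sshd` lives in a directory such as `/usr/sbin` that may not be on a regular user's PATH, so a "not found" result for `sshd` is worth double-checking as an administrator.

```shell
#!/bin/sh
# Hypothetical helper: report where each named tool lives on PATH.
# Returns 0 if every tool was found, 1 if at least one is missing.
check_ssh_tools() {
    missing=0
    for tool in "$@"; do
        if path=$(command -v "$tool" 2>/dev/null); then
            echo "$tool: $path"
        else
            echo "$tool: not found"
            missing=1
        fi
    done
    return "$missing"
}

# Check the same three tools as the transcript above.
check_ssh_tools ssh sshd ssh-keygen || echo "At least one SSH tool is missing"
```

Run it on each node of the cluster (for example on the master first), since SSH must be present on every machine that Hadoop needs to reach.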

Resolution:

If you receive an error message like this:

/usr/bin/which: no ssh in (/usr/bin:/usr/sbin....)

You need to install OpenSSH (www.openssh.com), either via a Linux package manager or by downloading the source directly.

Note: Installation is usually done by a system administrator.
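For an administrator doing the install through a package manager, the sketch below prints a suggested command without running anything. The function name `suggest_install_cmd` is hypothetical, and the package names (`openssh-client`/`openssh-server` on Debian-style systems, `openssh-clients`/`openssh-server` on Red Hat-style systems) are assumptions based on common distribution naming, not something stated in the original post. Confirm them against your own distribution before installing.

```shell
#!/bin/sh
# Hypothetical helper: print (but do not run) an install command
# for whichever package manager is available on this machine.
suggest_install_cmd() {
    if command -v apt-get >/dev/null 2>&1; then
        echo "sudo apt-get install openssh-client openssh-server"
    elif command -v dnf >/dev/null 2>&1; then
        echo "sudo dnf install openssh-clients openssh-server"
    elif command -v yum >/dev/null 2>&1; then
        echo "sudo yum install openssh-clients openssh-server"
    else
        echo "No supported package manager found; download the source from www.openssh.com"
    fi
}

suggest_install_cmd
```

After installing, rerun the `which` checks to confirm that `ssh`, `sshd`, and `ssh-keygen` now resolve to a path.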
