Showing posts with the label HDFS commands

Featured Post

How to Check Column Nulls and Replace: Pandas

Here is a post that shows how to count nulls in a Pandas DataFrame and replace them with a value of your choice. The process has two steps: counting the null values and replacing them.

Count null values (column-wise) in Pandas

```
## count null values column-wise
null_counts = df.isnull().sum()
print(null_counts)
```

Output:

```
Column1    1
Column2    1
Column3    5
dtype: int64
```

In the above code, we first create a sample Pandas DataFrame `df` with some null values. Then, we use the `isnull()` function to create a DataFrame of the same shape as `df`, where each element is a boolean value indicating whether that element is null or not. Finally, we use the `sum()` function to count the number of null values in each column of the resulting DataFrame. The output shows the count of null values column-wise.

## Code snippet to count null values column-wise:

```
df.isnull().sum()
```

## Code snippet to count null values row-wise:

```
df.isnull().sum(axis=1)
```

In the above code, `df` is the Panda…

The best helpful hdfs file system commands (3 of 4)

dus - hadoop fs -dus PATH
Reports the sum of the file sizes under PATH as a single aggregate total rather than listing each file individually.

expunge - hadoop fs -expunge
Empties the trash. If the trash feature is enabled, a deleted file is first moved into the temporary .Trash/ folder. The file is permanently deleted from the .Trash/ folder only after a user-configurable delay.

get - hadoop fs -get [-ignorecrc] [-crc] SRC LOCALDST
Copies files to the local file system.
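
The three commands above can also be assembled from a script. The following is a minimal Python sketch, not an official Hadoop API: the helper names (`hdfs_dus`, `hdfs_expunge`, `hdfs_get`) and the example paths are hypothetical, and the code only builds the argument lists using the flags listed above. It does not contact a cluster.

```python
import shlex


def hdfs_dus(path):
    """Build the argument list for `hadoop fs -dus PATH` (aggregate size)."""
    return ["hadoop", "fs", "-dus", path]


def hdfs_expunge():
    """Build the argument list for `hadoop fs -expunge` (empty the trash)."""
    return ["hadoop", "fs", "-expunge"]


def hdfs_get(src, localdst, ignore_crc=False, crc=False):
    """Build the argument list for `hadoop fs -get [-ignorecrc] [-crc] SRC LOCALDST`."""
    args = ["hadoop", "fs", "-get"]
    if ignore_crc:
        args.append("-ignorecrc")
    if crc:
        args.append("-crc")
    # Source (HDFS) path first, then the local destination.
    args += [src, localdst]
    return args


if __name__ == "__main__":
    # Hypothetical paths, used only for illustration.
    commands = [
        hdfs_dus("/user/demo/logs"),
        hdfs_expunge(),
        hdfs_get("/user/demo/logs/part-0", "/tmp/part-0", ignore_crc=True),
    ]
    for cmd in commands:
        # Print the shell-quoted command line instead of executing it.
        print(shlex.join(cmd))
```

On a machine where the `hadoop` client is installed and configured, each list could be passed to `subprocess.run(cmd, check=True)` to actually run the command. Building the command as a list rather than a single string avoids shell-quoting problems with paths that contain spaces.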