Featured Post

How to Read a CSV File from Amazon S3 Using Python (With Headers and Rows Displayed)

Image
  Introduction If you’re working with cloud data, especially on AWS, chances are you’ll encounter data stored in CSV files inside an Amazon S3 bucket . Whether you're building a data pipeline or a quick analysis tool, reading data directly from S3 in Python is a fast, reliable, and scalable way to get started. In this blog post, we’ll walk through: Setting up access to S3 Reading a CSV file using Python and Boto3 Displaying headers and rows Tips to handle larger datasets Let’s jump in! What You’ll Need An AWS account An S3 bucket with a CSV file uploaded AWS credentials (access key and secret key) Python 3.x installed boto3 and pandas libraries installed (you can install them via pip) pip install boto3 pandas Step-by-Step: Read CSV from S3 Let’s say your S3 bucket is named my-data-bucket , and your CSV file is sample-data/employees.csv . ✅ Step 1: Import Required Libraries import boto3 import pandas as pd from io import StringIO boto3 is...

R language five useful real functions

In Data Science R language plays a crucial role. In the R language, there are five top functions present. These functions I have explained in this post.
#5-key-points-in-r

1. Storing Values

  • Stores a value to variable. The value can be same or mixed data type.
  • It is available /* */ to give comments for your scripts inside
  • Char, Double, Boolean and Decimal are more frequently used data types

2. Reading data from files

  • Large data objects will usually be read as values from external files rather than entered during an R session at the keyboard. 
  • R input facilities are simple and their requirements are fairly strict and even rather inflexible. There is a clear presumption by the designers of R that you will be able to modify your input files using other tools, such as file editors or Perl1 to fit in with the requirements of R. Generally this is very simple.
  • If variables are to be held mainly in data frames, as we strongly suggest they should be, an entire data frame can be read directly with the read.table() function. 
  • There is also a more primitive input function, scan(), that can be called directly. For more details on importing data into R and also exporting data, see the R Data Import/Export manual.

3. Accessing builtin datasets

  • Around 100 datasets are supplied with R (in package datasets), and others are available in packages (including the recommended packages supplied with R). To see the list of datasets currently available use data().
  • All the datasets supplied with R are available directly by name. However, many packages still use the obsolete convention in which data was also used to load datasets into R, for example data(infert) and this can still be used with the standard packages (as in this example). 
  • In most cases this will load an R object of the same name. However, in a few cases it loads several objects, so see the on-line help for the object to see what to expect.

4. Grouped expressions

  • R is an expression language in the sense that its only command type is a function or expression which returns a result. Even an assignment is an expression whose result is the value assigned,and it may be used wherever any expression may be used; in particular multiple assignments are possible. 
  • Commands may be grouped together in braces, {expr_1; ...; expr_m}, in which case the value of the group is the result of the last expression in the group evaluated. Since such a group is also an expression it may, for example, be itself included in parentheses and used a part of an even larger expression, and so on

5. Writing your own functions

  • R language allows the user to create objects of mode function. These are true R functions that are stored in a special internal form and may be used in further expressions and so on. In the process, the language gains enormously in power,convenience and elegance, and learning to write useful functions is one of the main ways to make your use of R comfortable and productive. 
  • It should be emphasized that most of the functions supplied as part of the R system, such as mean(), var(), postscript() and so on, are themselves written in R and thus do not differ materially from user written functions.

Comments

Popular posts from this blog

SQL Query: 3 Methods for Calculating Cumulative SUM

5 SQL Queries That Popularly Used in Data Analysis

Big Data: Top Cloud Computing Interview Questions (1 of 4)