Posts

Showing posts with the label data science data to insights

Featured Post

SQL Query: 3 Methods for Calculating Cumulative SUM

Image
SQL provides various constructs for calculating cumulative sums, offering flexibility and efficiency in data analysis. In this article, we explore three distinct SQL queries that facilitate the computation of cumulative sums. Each query leverages different SQL constructs to achieve the desired outcome, catering to diverse analytical needs and preferences. Using Window Functions (e.g., PostgreSQL, SQL Server, Oracle) SELECT id, value, SUM(value) OVER (ORDER BY id) AS cumulative_sum  FROM your_table; This query uses the SUM() window function with the OVER clause to calculate the cumulative sum of the value column ordered by the id column. Using Subqueries (e.g., MySQL, SQLite): SELECT t1.id, t1.value, SUM(t2.value) AS cumulative_sum FROM your_table t1 JOIN your_table t2 ON t1.id >= t2.id GROUP BY t1.id, t1.value ORDER BY t1.id; This query uses a self-join to calculate the cumulative sum. It joins the table with itself, matching rows where the id in the first table is greater than or

How to Identify Data Relevant for Data Science Analytics

Your government, your web server, your business partners, even your body. While we aren’t drowning in a sea of data, we’re finding that almost everything can (or has) been instrumented. We frequently combine publishing industry data from Nielsen Book Scan with our own sales data, publicly available Amazon data, and even job data to see what’s happening in the publishing industry. Data is everywhere Sites like Infochimps and Factual provide access to many large datasets, including climate data, MySpace activity streams, and game logs from sporting events. Factual enlists users to update and improve its datasets, which cover topics as diverse as endocrinologists to hiking trails. How the data is growing Much of the data we currently work with is the direct consequence of Web 2.0, and of Moore’s Law applied to data. The Web has people spending more time online and leaving a trail of data wherever they go. Mobile applications leave an even richer data trail since many of them a