Featured Post

The Quick and Easy Way to Analyze NumPy Arrays

The quickest and easiest way to analyze NumPy arrays is with NumPy's built-in aggregation functions. Once your values are stored in an array (created with numpy.array()), functions such as np.sum(), np.mean(), np.std(), np.max(), and np.min() let you quickly summarize the values the array contains.

Sum

You can find the sum of NumPy arrays using the np.sum() function. For example:

import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([6, 7, 8, 9, 10])

result = np.sum([a, b])
print(result)
# Output: 55

Mean

You can find the mean of a NumPy array using the np.mean() function. This function takes in an array as an argument and returns the mean of all the values in the array. For example, the mean of a NumPy array of [1, 2, 3, 4, 5] would be:

result = np.mean([1, 2, 3, 4, 5])
print(result)
# Output: 3.0

Standard Deviation

To find the standard deviation of a NumPy array, you can use the np.std() function. This function takes in an array as a parameter and returns the standard deviation of the values in the array.
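Following the same pattern as the examples above, here is a minimal sketch of np.std() in action, together with np.min() and np.max(), the other summaries mentioned in the introduction:

import numpy as np

a = np.array([1, 2, 3, 4, 5])

print(np.std(a))  # Output: 1.4142135623730951 (population standard deviation)
print(np.min(a))  # Output: 1
print(np.max(a))  # Output: 5

Note that np.std() computes the population standard deviation by default; pass ddof=1 if you want the sample standard deviation instead.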

5 Top features of Columnar Databases (1 of 2)

The traditional RDBMS - Since the days of punch cards and magnetic tapes, files have been physically contiguous bytes that are accessed from start (open file) to finish (end-of-file flag = TRUE).

Yes, the storage could be split up on a disk and the assorted data pages connected by pointer chains, but it is still the same model. Then the file is broken into records (more physically contiguous bytes), and records are broken into fields (still more physically contiguous bytes).

A file is processed record by record (read/fetch next) or sequentially navigated in terms of a physical storage location (go to end of file, go back/forward n records, follow a pointer chain, etc.). There is no parallelism in this model. There is also an assumption of a physical ordering of the records within the file and an ordering of the fields within the records.
A lot of time and resources have been spent sorting records to make this access practical; you did not do random access on a magnetic tape, and you could not do it with a deck of punch cards.
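To make this concrete, here is a minimal Python sketch of the traditional model, assuming a hypothetical file of fixed-length records (the file name and field layout are invented for illustration):

import struct

# Hypothetical fixed-length record: 10-byte name, 4-byte id, 8-byte balance.
RECORD = struct.Struct("<10sid")

def read_sequential(path):
    """Process a file record by record, from open to end-of-file:
    the traditional model, with no random access and no parallelism."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD.size)   # read/fetch next record
            if len(chunk) < RECORD.size:  # end-of-file flag = TRUE
                break
            name, rec_id, balance = RECORD.unpack(chunk)
            yield name.rstrip(b"\x00").decode(), rec_id, balance

# Usage: every record ahead of the one you want must be read first.
# for name, rec_id, balance in read_sequential("accounts.dat"):
#     print(name, rec_id, balance)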

When we got to RDBMS and SQL, this file-system model was still the dominant mindset. Even Dr. Codd fell prey to it: his early model required a PRIMARY KEY in every table, which corresponded to the sort order of a sequential file.

Later, he realized that a key is a key, and there is no need to make one of them special in RDBMS. However, SQL had already incorporated the old terminology, and the early SQL engines were built on existing file systems, so it stuck.

The columnar model takes a fundamentally different approach, but it is one that works nicely with SQL and the relational model.

In RDBMS, a table is an unordered set of rows that all have exactly the same structure. A row is an unordered set of columns, each of which holds a scalar value drawn from a known domain. You access the columns by name, not by a physical position in the storage, though shorthand conventions such as "SELECT *" exist to save typing.

The logical model is as follows: a table is a set of rows with one and only one structure; a row is a set of columns; a column holds scalar values drawn from one and only one domain. Storage usually follows this pattern with conventional file systems, using files for tables, records for rows, and fields for columns. But the logical model has nothing to do with physical storage.

In the columnar model, we take a table and store each column in its own structure; rows and tables are reassembled from these columns. Looking at the usual picture of a table, you can see why these are called vertical storage models, as opposed to horizontal storage models (see the sketch below).
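Here is a minimal Python sketch of the two layouts, using a hypothetical three-column table (the names and values are invented for illustration):

# Row-oriented (horizontal) storage: each record keeps its fields together.
rows = [
    {"id": 1, "name": "Alice", "balance": 100.0},
    {"id": 2, "name": "Bob", "balance": 250.0},
    {"id": 3, "name": "Carol", "balance": 75.0},
]

# Column-oriented (vertical) storage: each column lives in its own structure.
columns = {
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Carol"],
    "balance": [100.0, 250.0, 75.0],
}

# An aggregate over one column touches only that structure; the other
# fields of every row never have to be read.
total = sum(columns["balance"])
print(total)  # Output: 425.0

# A row is reassembled from the columns on demand, by position.
def fetch_row(cols, i):
    return {name: values[i] for name, values in cols.items()}

print(fetch_row(columns, 1))  # Output: {'id': 2, 'name': 'Bob', 'balance': 250.0}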
