SAN Storage: All about its 4 Real Usages

To understand storage area network (SAN) fundamentals, you first need to understand the applications a SAN serves. These may be horizontal applications (e.g., backup, archiving, data replication, disaster protection, and data warehousing) or vertical applications (e.g., online transaction processing (OLTP), enterprise resource planning (ERP) business applications, electronic commerce, broadcasting, prepress, medical, and geophysics).

SAN is also well suited to making performance and high availability more scalable and more affordable in applications such as clustering and data sharing. This article discusses two major horizontal applications, backup and data sharing, and how they interact with SAN. If you are a job seeker, the list below also works as a quick SAN refresher before interviews.


1. Realtime (or window-less) backup

The importance of window-less backup (also called hot backup) becomes obvious once you consider the large volume of data held in a SAN's centralized backup library. Realtime backup essentially lets you back up a volume or file periodically and automatically without affecting normal system operations.


The technique commonly used is called a snapshot, where you make a copy of the volume needing backup, and then back up the copy while accessing and modifying the original volume in normal operations. Network Integrity leads in development, and EMC and HDS have implemented solutions in currently available products. Major providers of total backup solutions include ADIC, ATL, StorageTek, Hewlett-Packard (HP), Exabyte, and Overland.
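The snapshot idea described above can be sketched in a few lines of Python. This is only a toy model, not how any vendor's product works: `Volume`, `snapshot`, and `back_up` are hypothetical names, and a plain in-memory copy stands in for whatever mechanism a real storage array uses to freeze a point-in-time image.

```python
import copy


class Volume:
    """Toy in-memory volume: maps block numbers to block contents."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)

    def snapshot(self):
        # Point-in-time copy; later writes to the volume do not affect it.
        return copy.deepcopy(self.blocks)


def back_up(snapshot, backup_store):
    # Write the frozen copy (not the live volume) to the backup target.
    backup_store.update(snapshot)


volume = Volume({0: "alpha", 1: "beta"})
frozen = volume.snapshot()

# Normal operations continue against the original volume.
volume.blocks[1] = "beta-v2"

backup = {}
back_up(frozen, backup)

print(backup)         # {0: 'alpha', 1: 'beta'}
print(volume.blocks)  # {0: 'alpha', 1: 'beta-v2'}
```

The backup holds the data as it was when the snapshot was taken, while the live volume already carries the newer write.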


2. Resource sharing

A storage subsystem attached to multiple computer platforms is divided into partitions, each partition being accessible only to its owning platform or to a certain number of homogeneous platforms. The administrator can reassign storage capacity to different platforms as needs change.
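As a rough illustration of the partitioning described above, the sketch below models a subsystem whose capacity is split into per-platform partitions that an administrator can resize. The class `PartitionedSubsystem`, its methods, and the platform names are all made up for this example.

```python
class PartitionedSubsystem:
    """Toy model: fixed capacity divided into per-platform partitions."""

    def __init__(self, total_gb):
        self.total_gb = total_gb
        self.partitions = {}  # platform name -> partition size in GB

    def free_gb(self):
        # Capacity not yet assigned to any platform.
        return self.total_gb - sum(self.partitions.values())

    def assign(self, platform, size_gb):
        # Administrator action: set (grow or shrink) a platform's partition.
        current = self.partitions.get(platform, 0)
        if size_gb - current > self.free_gb():
            raise ValueError("not enough unassigned capacity")
        self.partitions[platform] = size_gb

    def can_access(self, platform, owner):
        # A partition is accessible only to the platform that owns it.
        return platform == owner and owner in self.partitions


san = PartitionedSubsystem(total_gb=1000)
san.assign("unix-a", 600)
san.assign("windows-b", 400)

# Needs change: shrink one partition, then grow the other.
san.assign("unix-a", 300)
san.assign("windows-b", 700)

print(san.partitions)                         # {'unix-a': 300, 'windows-b': 700}
print(san.can_access("unix-a", "windows-b"))  # False
```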


One of the benefits of SAN connectivity is its ability to share resources (e.g., a large tape library) among multiple backup servers. Such sharing enables administrators to consolidate backups that would otherwise go from many different servers to locally attached tape drives into one tape library.


3. Dynamic resource sharing

All storage is available to any connected host; hosts are allocated storage as they need it. If one host needs the storage, it can use any or all the available space. If a host deletes a file, that space is available to any other host. This dynamic storage sharing operates automatically and transparently. Dynamic resource sharing means that the systems administrator doesn't have to partition the storage before storing the data.
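To contrast this with fixed partitions, here is a minimal sketch of a shared pool where space is handed out on demand and returned when files are deleted. `SharedPool` and the host names are hypothetical, and the sizes are arbitrary.

```python
class SharedPool:
    """Toy model: one pool of capacity shared by every connected host."""

    def __init__(self, total_gb):
        self.total_gb = total_gb
        self.used = {}  # host name -> GB currently in use

    def free_gb(self):
        return self.total_gb - sum(self.used.values())

    def write(self, host, size_gb):
        # No pre-partitioning: any host may use any free space.
        if size_gb > self.free_gb():
            raise ValueError("pool is full")
        self.used[host] = self.used.get(host, 0) + size_gb

    def delete(self, host, size_gb):
        # Space released by one host becomes available to all hosts.
        released = min(size_gb, self.used.get(host, 0))
        self.used[host] = self.used.get(host, 0) - released


pool = SharedPool(total_gb=100)
pool.write("host-a", 80)
pool.delete("host-a", 50)  # host-a frees 50 GB
pool.write("host-b", 70)   # host-b reuses that space

print(pool.used)       # {'host-a': 30, 'host-b': 70}
print(pool.free_gb())  # 0
```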

Data copy sharing: This process involves replication of the data. Data is the same across copies at the time of copy creation, but the copies can change independently afterward. There is no assurance that they will remain identical. Data access is usually prevented during replication so the copy accurately reflects all the data at a particular time.


For large amounts of data, the time needed to make the copy can be significant, and the amount of storage needed to hold the copy can be very large. SAN facilitates data-copy sharing by allowing high-bandwidth connections to transfer large volumes of data.
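A quick back-of-the-envelope calculation shows why bandwidth matters here. The figures below are illustrative assumptions, not measurements of any particular SAN, and `copy_time_hours` is a hypothetical helper that ignores protocol overhead and contention.

```python
def copy_time_hours(data_gb, throughput_mb_per_s):
    """Estimate copy time, using decimal units (1 GB = 1000 MB)."""
    seconds = data_gb * 1000 / throughput_mb_per_s
    return seconds / 3600


# Assumed workload: a 2000 GB data set copied over two different links.
slow = copy_time_hours(2000, 100)  # 100 MB/s sustained
fast = copy_time_hours(2000, 400)  # 400 MB/s sustained

print(f"{slow:.2f} hours at 100 MB/s")  # 5.56 hours at 100 MB/s
print(f"{fast:.2f} hours at 400 MB/s")  # 1.39 hours at 400 MB/s
```

Quadrupling the sustained throughput cuts the copy window to a quarter, which is the kind of gain the text attributes to high-bandwidth SAN connections.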

4. True data sharing

If you are sharing data without making a copy, multiple computer platforms can access the same physical instance of the recorded data on a storage subsystem. This type of sharing is called true data sharing. Different levels of performance and complexity exist in implementing true data sharing:

The first level is when heterogeneous platforms can access data, but only the original data owner can modify it.

The second level is when multiple heterogeneous platforms can update and rewrite a data item, but only one at a time. In this case, you must use a locking mechanism that briefly prevents other platforms from updating the data while one update is in progress.

The third level is called concurrent data sharing and exists when all platforms can either read or update the data at the same time.
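The second level, one writer at a time, can be illustrated with a lock. In this sketch Python threads stand in for platforms and `threading.Lock` stands in for the locking mechanism; a real SAN would need a lock that works across hosts, which this example does not attempt to show. `SharedRecord` and `platform_worker` are hypothetical names.

```python
import threading


class SharedRecord:
    """One physical copy of a data item, updated by one writer at a time."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()

    def read(self):
        return self.value

    def update(self, fn):
        # Only the holder of the lock may read-modify-write the item.
        with self._lock:
            self.value = fn(self.value)


record = SharedRecord(0)


def platform_worker():
    # Each "platform" applies 1000 increments to the same record.
    for _ in range(1000):
        record.update(lambda v: v + 1)


threads = [threading.Thread(target=platform_worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(record.read())  # 4000
```

Because every update happens under the lock, no increment is lost even though four writers share the same record.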

The advantages of true data sharing are numerous. With only one copy of the data, you never need to replicate it for use elsewhere, you simplify data maintenance, and you eliminate problems caused by out-of-sync conditions. True data sharing among platforms running heterogeneous operating systems requires translating to one common operating system. Examples of vendors offering implementations of true data sharing in a SAN architecture are Sequent, Mercury Computer Systems, DataDirect, Transoft, Retrieve, and Network Disk.
