Introduction to PivotPal Tools

A guide to the core functions of the PivotPal Python package and the tables they produce.


Table of Contents

  1. Value Distribution Table: Dive deep into the distribution of values for a specified column.
  2. Unique Values in Dataset: Uncover the unique values present in each column.
  3. Missing Values Analysis: Identify and understand the gaps in your dataset.
  4. Duplicate Row Analysis: Detect and quantify duplicate rows in your dataset.

1. Value Distribution Table

Overview: Analyzes the distribution of values for a specified column in a dataset.

Explanation: For each unique value in the specified column, this function reports how often the value occurs and what percentage of the column it accounts for. It's useful for understanding the spread and frequency of data points within a column.

pp.distribution(your_data, 'column_name')

Column Name    Count     %
Value1         Count1    %
Value2         Count2    %
Value3         Count3    %

The table above has one row per unique value in the specified column, with that value's count and percentage.
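PivotPal's internals are not shown in this guide, so the following is only a sketch of how a comparable count-and-percentage table could be computed with plain pandas. It assumes the data is a pandas DataFrame; the function name `value_distribution` and the sample data are hypothetical and are not part of PivotPal.

```python
import pandas as pd


def value_distribution(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Count and percentage of each distinct value in one column.

    Hypothetical helper built on plain pandas; not PivotPal's implementation.
    """
    # dropna=False keeps missing values as their own row in the table.
    counts = df[column].value_counts(dropna=False)
    result = pd.DataFrame({
        "Count": counts,
        "%": (counts / counts.sum() * 100).round(2),
    })
    result.index.name = column
    return result


# Illustrative data only.
sample = pd.DataFrame({"color": ["red", "blue", "red", "red"]})
table = value_distribution(sample, "color")
print(table)
```

Passing `dropna=False` is a design choice in this sketch: it makes missing values visible in the distribution instead of silently dropping them.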

2. Unique Values in Dataset

Overview: Provides a count of unique values for each column in the dataset.

Explanation: This function counts the unique values present in each column of the dataset, which helps you gauge how diverse the data in each column is.

pp.unique(your_data)

Column Name    Unique Count
Column1        Count1
Column2        Count2
Column3        Count3

The table above lists the number of unique values for each column in the dataset. A low count suggests a categorical column, while a count close to the number of rows suggests an identifier or a continuous measurement.
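As a rough illustration of the per-column unique count, the same kind of table can be built with pandas' `DataFrame.nunique`. This is a sketch under the assumption that the data is a pandas DataFrame; `unique_counts` and the sample data are hypothetical names, not PivotPal's API.

```python
import pandas as pd


def unique_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Number of distinct non-missing values in each column.

    Hypothetical helper built on plain pandas; not PivotPal's implementation.
    """
    counts = df.nunique(dropna=True)  # one entry per column
    return (
        counts.rename("Unique Count")
        .rename_axis("Column Name")
        .reset_index()
    )


# Illustrative data only.
sample = pd.DataFrame({
    "city": ["Oslo", "Lima", "Oslo"],
    "year": [2020, 2021, 2022],
})
table = unique_counts(sample)
print(table)
```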

3. Missing Values Analysis

Overview: Provides a summary of missing values for each column in the dataset.

Explanation: This function identifies columns with missing values, providing a count and percentage of missing data. It's crucial for data cleaning and preprocessing.

pp.missing(your_data)

Column Name    Missing Count    Missing %
Column1        Count1           %
Column2        Count2           %
Column3        Count3           %

The table above highlights columns with missing values. For each column it gives the count and percentage of missing entries, which helps you assess data quality before analysis.
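The count and percentage of missing data can be sketched with pandas' `isna` as shown below. This assumes the data is a pandas DataFrame and that "missing" means NaN or None; `missing_summary` and the sample data are hypothetical, and PivotPal may define or filter missing values differently.

```python
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing entries in each column.

    Hypothetical helper built on plain pandas; not PivotPal's implementation.
    """
    missing = df.isna().sum()  # missing entries per column
    summary = pd.DataFrame({
        "Missing Count": missing,
        "Missing %": (missing / len(df) * 100).round(2),
    })
    summary.index.name = "Column Name"
    return summary


# Illustrative data only.
sample = pd.DataFrame({
    "a": [1, None, 3, None],
    "b": ["x", "y", "z", "w"],
})
summary = missing_summary(sample)
print(summary)
```

To list only the columns that actually have gaps, the result can be filtered with `summary[summary["Missing Count"] > 0]`.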

4. Duplicate Row Analysis

Overview: Identifies and counts duplicate rows in the dataset.

Explanation: Duplicate rows can skew analysis and lead to incorrect conclusions. This function identifies them so that you can review them and, where appropriate, remove them.

pp.duplicates(your_data)

Row Index    Duplicate Count
Index1       Count1
Index2       Count2
Index3       Count3

The table above lists each duplicated row by its index, along with its duplicate count.
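One way to approximate this analysis with plain pandas is to group identical rows and keep the groups that occur more than once. This is a sketch only: it assumes the data is a pandas DataFrame, it reports the duplicated row values rather than a row index, and `duplicate_summary` and the sample data are hypothetical names that are not part of PivotPal.

```python
import pandas as pd


def duplicate_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Each distinct row that occurs more than once, with its occurrence count.

    Hypothetical helper built on plain pandas; not PivotPal's implementation.
    """
    # Group on every column so that only fully identical rows share a group.
    sizes = df.groupby(list(df.columns), dropna=False).size()
    sizes = sizes.rename("Duplicate Count")
    return sizes[sizes > 1].reset_index()


# Illustrative data only: the row (1, "a") appears three times.
sample = pd.DataFrame({
    "id": [1, 2, 1, 3, 1],
    "name": ["a", "b", "a", "c", "a"],
})
table = duplicate_summary(sample)
print(table)

# Number of extra copies beyond each first occurrence.
extra_copies = int(sample.duplicated().sum())
```

If the duplicates turn out to be unwanted, `sample.drop_duplicates()` returns a copy that keeps only the first occurrence of each row.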