KitDocumentation

ColumnQuantile

Gets the value of a numerical column at a given quantile. You supply the quantile as a decimal fraction (for example, 0.25 for the 25th percentile), and the command generates the code that computes it. Quantiles can be computed for one or more specified columns, or for all numerical columns. You can also request one or more quantiles in a single call.
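As a rough illustration of the multi-column case, the sketch below calls pandas `DataFrame.quantile` on selected columns and on all numerical columns. The sample data and the `AccountBalance` and `Location` columns are hypothetical, and this is not necessarily the code ColumnQuantile itself generates.

```python
import pandas as pd

# Hypothetical sample data; only CustomerAge appears in this documentation.
df = pd.DataFrame({
    "CustomerAge": [20, 30, 40, 50, 60],
    "AccountBalance": [100.0, 200.0, 300.0, 400.0, 500.0],
    "Location": ["A", "B", "C", "D", "E"],
})

# Quantiles for explicitly selected columns: one row per quantile, one column per input column.
selected = df[["CustomerAge", "AccountBalance"]].quantile([0.25, 0.75])

# Quantiles for every numerical column; non-numeric columns such as Location are skipped.
allNumeric = df.quantile([0.25, 0.75], numeric_only=True)

print(selected)
print(allNumeric)
```

Passing `numeric_only=True` explicitly avoids relying on its default, which differs between pandas versions.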

Options

columns: Specifies the columns to calculate quantile values for
where: Specifies a condition for the calculation
group: Groups the data before performing the operation
addToDataframe: Indicates whether to add the result to the dataframe
quantile: One or more quantiles to extract, given as decimal values between 0 and 1 (defaults to 0.5, the median)
variablesForQuantiles: Create a variable for each quantile for easy use
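The examples below do not show output for where, group, or addToDataframe. The following is a minimal sketch of what equivalent pandas code could look like for those options, under stated assumptions: the `TransactionAmount` and `Channel` columns, the `CustomerAgeQuantile` column name, and the sample data are all hypothetical, and the code the command actually generates may differ.

```python
import pandas as pd

# Hypothetical sample data; only CustomerAge appears in this documentation.
bankTransactionsDf = pd.DataFrame({
    "CustomerAge": [20, 30, 40, 50, 60, 70],
    "TransactionAmount": [10.0, 200.0, 35.0, 400.0, 55.0, 600.0],
    "Channel": ["Online", "ATM", "Online", "ATM", "Online", "ATM"],
})

# where (assumed behavior): filter rows first, then compute the quantile.
filtered = bankTransactionsDf[bankTransactionsDf["TransactionAmount"] > 50]
whereQuantile = filtered["CustomerAge"].quantile(0.5)

# group (assumed behavior): compute one quantile per group.
groupQuantile = bankTransactionsDf.groupby("Channel")["CustomerAge"].quantile(0.5)

# addToDataframe (assumed behavior): broadcast each group's quantile back onto its rows.
bankTransactionsDf["CustomerAgeQuantile"] = (
    bankTransactionsDf.groupby("Channel")["CustomerAge"]
    .transform(lambda s: s.quantile(0.5))
)

print(whereQuantile)
print(groupQuantile)
print(bankTransactionsDf)
```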

Examples

Example 1 - Get Default Quantile of a Single Column

By default, quantile returns the median (0.5 quantile) when no specific value is provided. In this example, we compute the median CustomerAge to understand the central tendency of the column.
#> ColumnQuantile CustomerAge
AFLEFT 
bankTransactionsDfQuantile = bankTransactionsDf['CustomerAge'].quantile() AFRIGHT
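To see the default in action, the short sketch below runs the same pandas call on a small hypothetical Series and compares it with `median()`. With an even number of values, pandas interpolates linearly between the two middle values by default.

```python
import pandas as pd

# Hypothetical ages, chosen with an even count so interpolation is visible.
ages = pd.Series([22, 35, 41, 58], name="CustomerAge")

# With no argument, Series.quantile uses q=0.5.
defaultQuantile = ages.quantile()

# The 0.5 quantile falls halfway between 35 and 41.
print(defaultQuantile)                   # 38.0
print(defaultQuantile == ages.median())  # True
```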

Example 2 - Get Specific Quantiles of a Single Column

Rather than retrieving only the default median, we can specify the exact quantiles we want. Here we calculate the 0.25, 0.5, 0.75, and 1.0 quantiles (the 25th, 50th, and 75th percentiles and the maximum) of CustomerAge to better understand its distribution.
#> ColumnQuantile --columns CustomerAge --quantile .25 .5 .75 1
AFLEFT 
bankTransactionsDfQuantile = bankTransactionsDf['CustomerAge'].quantile( [ .25, .5, .75, 1 ] ) AFRIGHT
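When a list of quantiles is passed, pandas returns a Series indexed by the quantile values rather than a single number. The sketch below uses hypothetical ages to show that shape and how one entry might be looked up by its quantile.

```python
import pandas as pd

# Hypothetical ages; five evenly spaced values keep the quantiles easy to check by hand.
ages = pd.Series([20, 30, 40, 50, 60], name="CustomerAge")

quantiles = ages.quantile([.25, .5, .75, 1])

# The index holds the requested quantiles, the values hold the results.
print(quantiles.index.tolist())  # [0.25, 0.5, 0.75, 1.0]
print(quantiles.tolist())        # [30.0, 40.0, 50.0, 60.0]

# Look up a single result by its quantile label.
q25 = quantiles.loc[0.25]

# The 1.0 quantile is the column maximum.
print(quantiles.loc[1.0] == ages.max())  # True
```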

Example 3 - Create Variables for Each Quantile

In addition to computing quantiles, we can assign each quantile value to its own variable. The variables are numbered from 0 in the order the quantiles were given, which makes it easier to compare against or reference specific thresholds later in the code.
#> ColumnQuantile --columns CustomerAge --quantile .25 .5 .75 1 --variablesForQuantiles
AFLEFT 
bankTransactionsDfQuantile = bankTransactionsDf['CustomerAge'].quantile( [ .25, .5, .75, 1 ] )
bankTransactionsDfQuantile0 = bankTransactionsDfQuantile.iloc[0]
bankTransactionsDfQuantile1 = bankTransactionsDfQuantile.iloc[1]
bankTransactionsDfQuantile2 = bankTransactionsDfQuantile.iloc[2]
bankTransactionsDfQuantile3 = bankTransactionsDfQuantile.iloc[3] AFRIGHT
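One way the per-quantile variables might be used afterwards is as filter thresholds. The sketch below rebuilds two of them on hypothetical data and keeps the rows between the 0.25 and 0.75 quantiles. The `middleHalf` name is an illustration only.

```python
import pandas as pd

# Hypothetical sample data standing in for bankTransactionsDf.
bankTransactionsDf = pd.DataFrame({"CustomerAge": [20, 30, 40, 50, 60]})

bankTransactionsDfQuantile = bankTransactionsDf["CustomerAge"].quantile([.25, .5, .75, 1])
bankTransactionsDfQuantile0 = bankTransactionsDfQuantile.iloc[0]  # 0.25 quantile
bankTransactionsDfQuantile2 = bankTransactionsDfQuantile.iloc[2]  # 0.75 quantile

# Keep customers whose age lies between the two thresholds (both ends inclusive).
middleHalf = bankTransactionsDf[
    bankTransactionsDf["CustomerAge"].between(
        bankTransactionsDfQuantile0, bankTransactionsDfQuantile2
    )
]

print(middleHalf["CustomerAge"].tolist())  # [30, 40, 50]
```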