ColumnQuantile
Gets the value of a numerical column at a given quantile. You provide the quantile as a decimal (for example, .25 for the 25th percentile), and code is generated to compute it. Quantiles can be computed for one or more specified columns, or for all numerical columns. You can also pass more than one decimal value to get several quantiles at once.
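To make the idea of a quantile value concrete, the sketch below computes one by hand using linear interpolation between neighboring sorted values. Linear interpolation is the default of pandas `Series.quantile`, which the generated code in the examples on this page calls without an interpolation argument. The function name `linear_quantile` and the sample ages are hypothetical and used only for illustration.

```python
import math


def linear_quantile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    # Fractional position of the quantile within the sorted data.
    position = q * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    # Blend the two neighboring values by the fractional part of the position.
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


ages = [22, 35, 41, 58, 64]  # hypothetical sample ages
median_age = linear_quantile(ages, 0.5)
lower_quartile = linear_quantile(ages, 0.25)
```

When the position falls between two rows, the result is a weighted average of them, so a quantile value does not have to be a value that actually occurs in the column.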
Options
columns: Specifies the columns to calculate quantile values for
where: Specifies a condition for the calculation
group: Groups the data before performing the operation
addToDataframe: Indicates whether to add the result to the dataframe
quantile: One or more quantiles to extract (defaults to 0.5, the median)
variablesForQuantiles: Creates a variable for each quantile for easy reuse
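This page does not show the code generated for the where and group options or for the all-numerical-columns case. As a rough guide only, the hand-written pandas sketch below shows calculations of the same kind. It is not necessarily what ColumnQuantile emits, and the sample data, the `TransactionAmount` and `TransactionType` columns, and the result variable names are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data; TransactionAmount and TransactionType are assumed column names.
bankTransactionsDf = pd.DataFrame({
    "CustomerAge": [22, 35, 41, 58, 64, 70],
    "TransactionAmount": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "TransactionType": ["Debit", "Credit", "Debit", "Credit", "Debit", "Credit"],
})

# All numerical columns: keep numeric dtypes, then take the median of each column.
allNumeric = bankTransactionsDf.select_dtypes(include="number").quantile(0.5)

# where-style condition: filter the rows first, then compute the quantile.
overThirty = bankTransactionsDf[bankTransactionsDf["CustomerAge"] > 30]
filteredMedian = overThirty["CustomerAge"].quantile(0.5)

# group-style calculation: one quantile value per group.
groupedMedian = bankTransactionsDf.groupby("TransactionType")["CustomerAge"].quantile(0.5)
```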
Examples
Example 1 - Get Default Quantile of a Single Column
When no quantile is specified, ColumnQuantile returns the median (the 0.5 quantile). In this example, we compute the median CustomerAge to understand the central tendency of the column.
#> ColumnQuantile CustomerAge
bankTransactionsDfQuantile = bankTransactionsDf['CustomerAge'].quantile()
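To try the generated line without the real dataset, we can run it against a small stand-in dataframe. The sample ages are hypothetical, and `sameAsMedian` is an illustrative name that is not produced by the command. The check relies on pandas `Series.quantile` defaulting to q=0.5.

```python
import pandas as pd

# Hypothetical stand-in for the real bankTransactionsDf.
bankTransactionsDf = pd.DataFrame({"CustomerAge": [22, 35, 41, 58, 64]})

# Same call as the generated code: no argument means the 0.5 quantile.
bankTransactionsDfQuantile = bankTransactionsDf['CustomerAge'].quantile()

# The default quantile matches the column median.
sameAsMedian = bankTransactionsDfQuantile == bankTransactionsDf['CustomerAge'].median()
```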
Example 2 - Get Specific Quantiles of a Single Column
Rather than retrieving only the default median, we can specify the exact quantiles we want. Here we calculate the 25th, 50th, 75th, and 100th percentiles of CustomerAge (the 100th percentile is the column maximum) to better understand its distribution.
#> ColumnQuantile --columns CustomerAge --quantile .25 .5 .75 1
bankTransactionsDfQuantile = bankTransactionsDf['CustomerAge'].quantile( [ .25, .5, .75, 1 ] )
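When a list of quantiles is passed, pandas returns a Series indexed by the requested quantile values instead of a single number. The sketch below shows how individual results could be read back by label. The sample ages and the names `upperQuartile` and `maximum` are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the real bankTransactionsDf.
bankTransactionsDf = pd.DataFrame({"CustomerAge": [22, 35, 41, 58, 64]})

bankTransactionsDfQuantile = bankTransactionsDf['CustomerAge'].quantile([.25, .5, .75, 1])

# The result is a Series whose index holds the requested quantiles as floats.
upperQuartile = bankTransactionsDfQuantile.loc[0.75]
maximum = bankTransactionsDfQuantile.loc[1.0]
```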
Example 3 - Create Variables for Each Quantile
In addition to computing quantiles, we can assign each quantile value to its own variable. This allows for easier use later in code when comparing or referencing specific thresholds. The variables are numbered by position in the quantile list, starting at 0.
#> ColumnQuantile --columns CustomerAge --quantile .25 .5 .75 1 --variablesForQuantiles
bankTransactionsDfQuantile = bankTransactionsDf['CustomerAge'].quantile( [ .25, .5, .75, 1 ] )
bankTransactionsDfQuantile0 = bankTransactionsDfQuantile.iloc[0]
bankTransactionsDfQuantile1 = bankTransactionsDfQuantile.iloc[1]
bankTransactionsDfQuantile2 = bankTransactionsDfQuantile.iloc[2]
bankTransactionsDfQuantile3 = bankTransactionsDfQuantile.iloc[3]
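As one possible use of these variables, the sketch below takes the first and third of them (the 25th and 75th percentiles) to compute an interquartile range and to keep only the rows between the two thresholds. The sample ages and the names `interquartileRange` and `middleHalf` are hypothetical and are not produced by the command.

```python
import pandas as pd

# Hypothetical stand-in for the real bankTransactionsDf.
bankTransactionsDf = pd.DataFrame({"CustomerAge": [22, 35, 41, 58, 64]})

bankTransactionsDfQuantile = bankTransactionsDf['CustomerAge'].quantile([.25, .5, .75, 1])
bankTransactionsDfQuantile0 = bankTransactionsDfQuantile.iloc[0]  # 25th percentile
bankTransactionsDfQuantile2 = bankTransactionsDfQuantile.iloc[2]  # 75th percentile

# Interquartile range from the two quartile variables.
interquartileRange = bankTransactionsDfQuantile2 - bankTransactionsDfQuantile0

# Keep only rows whose age lies between the two quartiles (bounds included).
middleHalf = bankTransactionsDf[
    bankTransactionsDf['CustomerAge'].between(
        bankTransactionsDfQuantile0, bankTransactionsDfQuantile2
    )
]
```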