ColumnStandardDeviation

Calculates the standard deviation of one or more columns. It can also calculate the standard deviation within groups defined by another column, over a rolling (sliding) window, or only for rows that meet a condition, and it can add the result to the dataframe as a new column.

Options

columns: Specifies the columns for which to calculate the standard deviation

where: Specifies a condition that rows must meet to be included in the calculation

group: Groups the data before performing the operation

rolling: Specifies the rolling window over which to perform the calculation

addToDataframe: Indicates whether to add the result to the dataframe
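
The examples below cover only the columns and group options. As a rough guide to the remaining options, the following sketch shows pandas operations that match their descriptions. The sample data, the condition, the window size of 3, and the new column name Humidity3pmStd are all assumptions made for illustration, and the code the command actually produces may differ.

```python
import pandas as pd

# Hypothetical sample data standing in for weatherDf
weatherDf = pd.DataFrame({
    'Humidity3pm': [20.0, 30.0, 40.0, 50.0, 60.0],
    'RainTomorrow': ['No', 'No', 'Yes', 'Yes', 'Yes'],
})

# where: restrict the calculation to rows that meet a condition
stdWhere = weatherDf.loc[weatherDf['RainTomorrow'] == 'Yes', 'Humidity3pm'].std()

# rolling: standard deviation over a sliding window of 3 rows
# (the first two rows have no full window and come out as NaN)
stdRolling = weatherDf['Humidity3pm'].rolling(window=3).std()

# addToDataframe: store the result as a new column (column name is an assumption)
weatherDf['Humidity3pmStd'] = stdRolling
```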

Examples

Example 1 - Get Standard Deviation of a Single Column

Standard deviation provides insight into the variability or spread of values in a column. In this example, we compute the standard deviation of the Humidity3pm column to understand how much the afternoon humidity tends to vary day to day.

#> ColumnStandardDeviation Humidity3pm
weatherDfStd = weatherDf['Humidity3pm'].std()
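
Note that the pandas std() method returns the sample standard deviation by default (ddof=1, dividing by n - 1). If the population standard deviation is wanted instead, ddof=0 can be passed. A minimal sketch, using made-up values rather than the real weather data:

```python
import pandas as pd

# Hypothetical values standing in for weatherDf['Humidity3pm']
weatherDf = pd.DataFrame({'Humidity3pm': [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]})

# Sample standard deviation: ddof=1 is the pandas default
sampleStd = weatherDf['Humidity3pm'].std()

# Population standard deviation: divide by n instead of n - 1
populationStd = weatherDf['Humidity3pm'].std(ddof=0)

print(sampleStd, populationStd)
```

For large datasets the two values are nearly identical; the difference matters mainly for small samples.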

Example 2 - Get Standard Deviation of Multiple Columns

When multiple columns are of interest, we can compute their standard deviations in a single step. This example evaluates the variation in morning temperature, afternoon temperature, and rainfall across the entire dataset.

#> ColumnStandardDeviation Temp3pm Temp9am Rainfall
weatherDfStd = weatherDf[['Temp3pm', 'Temp9am', 'Rainfall']].std()
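
Calling std() on a multi-column selection returns a Series indexed by column name, and missing values are skipped by default (skipna=True). The sketch below shows both behaviors on a small made-up frame; the values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; one Rainfall reading is missing
weatherDf = pd.DataFrame({
    'Temp9am': [10.0, 12.0, 14.0],
    'Temp3pm': [20.0, 24.0, 28.0],
    'Rainfall': [0.0, np.nan, 2.0],
})

weatherDfStd = weatherDf[['Temp3pm', 'Temp9am', 'Rainfall']].std()

# The result is a Series keyed by column name
temp3pmStd = weatherDfStd['Temp3pm']

# The missing Rainfall value is ignored, so this uses the two remaining readings
rainfallStd = weatherDfStd['Rainfall']
```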

Example 3 - Get Grouped Standard Deviations

To compare how variability differs across groups, we can compute the standard deviation within each group. In this example, we group by RainTomorrow and calculate the standard deviations of Sunshine and Rainfall to understand how weather variability differs between days that are followed by rain and days that are not.

#> ColumnStandardDeviation Sunshine Rainfall --group RainTomorrow
weatherDfStd = weatherDf.groupby('RainTomorrow')[['Sunshine', 'Rainfall']].std()
weatherDfStd = pd.DataFrame(weatherDfStd).reset_index()
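
After reset_index(), the group key becomes an ordinary column and each row holds the standard deviations for one group. The following sketch, using made-up data, shows the resulting layout. It also shows groupby(..., as_index=False), an alternative way to get the same layout in one step; this is offered as an option and is not necessarily what the command produces.

```python
import pandas as pd

# Hypothetical sample data standing in for weatherDf
weatherDf = pd.DataFrame({
    'RainTomorrow': ['No', 'No', 'Yes', 'Yes'],
    'Sunshine': [8.0, 10.0, 2.0, 6.0],
    'Rainfall': [0.0, 0.0, 4.0, 8.0],
})

# Group, compute, then flatten the index, as in the example above
weatherDfStd = weatherDf.groupby('RainTomorrow')[['Sunshine', 'Rainfall']].std()
weatherDfStd = weatherDfStd.reset_index()

# Alternative: keep the group key as a regular column from the start
alternativeStd = weatherDf.groupby('RainTomorrow', as_index=False)[['Sunshine', 'Rainfall']].std()

# One row per RainTomorrow value, one column per requested statistic
print(weatherDfStd)
```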