ColumnStandardDeviation
Calculates the standard deviation of one or more columns. It can also compute the standard deviation within groups defined by another column, over a rolling (sliding) window, or only for rows that meet a condition, and it can add the result to the dataframe as a new column.
Options
columns: Specifies the columns for which to calculate the standard deviation
where: Specifies a condition for the calculation
group: Groups the data before performing the operation
rolling: Specifies the rolling window over which to perform the calculation
addToDataframe: Indicates whether to add the result to the dataframe
Examples
Example 1 - Get Standard Deviation of a Single Column
Standard deviation provides insight into the variability or spread of values in a column. In this example, we compute the standard deviation of the Humidity3pm column to understand how widely afternoon humidity varies across the recorded days.
#> ColumnStandardDeviation Humidity3pm
weatherDfStd = weatherDf['Humidity3pm'].std()
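As a small illustration of what the generated call computes: pandas `Series.std()` returns the sample standard deviation by default (`ddof=1`, dividing by n - 1) and skips missing values. The sketch below uses a toy series in place of the real Humidity3pm column; the values and the variable names `humidity`, `sample_std`, and `population_std` are assumptions made for illustration only.

```python
import pandas as pd

# Toy stand-in for the Humidity3pm column (illustrative values, not from the dataset)
humidity = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0], name="Humidity3pm")

# Default behaviour: sample standard deviation (ddof=1, divides by n - 1)
sample_std = humidity.std()

# Population standard deviation (ddof=0, divides by n)
population_std = humidity.std(ddof=0)

print(sample_std, population_std)
```

If the population standard deviation is what you need, pass `ddof=0` explicitly; the two values converge as the number of rows grows.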
Example 2 - Get Standard Deviation of Multiple Columns
When multiple columns are of interest, we can compute their standard deviations in a single step. This example evaluates the variation in morning temperature, afternoon temperature, and rainfall across the entire dataset.
#> ColumnStandardDeviation Temp3pm Temp9am Rainfall
weatherDfStd = weatherDf[['Temp3pm', 'Temp9am', 'Rainfall']].std()
Example 3 - Get Grouped Standard Deviations
To compare how variability differs across groups, we can compute the standard deviation within each group. In this example, we group by RainTomorrow and calculate the standard deviations of Sunshine and Rainfall to understand how weather variability differs between days that are followed by rain and days that are not.
#> ColumnStandardDeviation Sunshine Rainfall --group RainTomorrow
weatherDfStd = weatherDf.groupby('RainTomorrow')[['Sunshine', 'Rainfall']].std()
weatherDfStd = pd.DataFrame(weatherDfStd).reset_index()
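The where, rolling, and addToDataframe options are not covered by the examples above. The sketch below shows roughly equivalent pandas operations on a toy dataframe. It is not the code the command generates: the mapping of each option to a pandas idiom is an assumption, as are the sample values, the window size of 3, and the new column names `Humidity3pmRollingStd` and `Humidity3pmGroupStd`.

```python
import pandas as pd

# Toy stand-in for weatherDf (illustrative values only)
weatherDf = pd.DataFrame({
    "Humidity3pm": [40.0, 55.0, 60.0, 35.0, 70.0, 65.0],
    "Rainfall": [0.0, 2.4, 0.0, 0.0, 5.1, 1.2],
    "RainTomorrow": ["No", "Yes", "No", "No", "Yes", "Yes"],
})

# where (assumed mapping): keep only rows that meet a condition, then take the std
rainyStd = weatherDf.loc[weatherDf["Rainfall"] > 0, "Humidity3pm"].std()

# rolling (assumed mapping): std over a sliding window of 3 rows;
# the first two entries are NaN because the window is not yet full
rollingStd = weatherDf["Humidity3pm"].rolling(window=3).std()

# addToDataframe (assumed mapping): store the result as a new column
# (the column name is hypothetical)
weatherDf["Humidity3pmRollingStd"] = rollingStd

# group combined with addToDataframe (assumed mapping):
# broadcast each group's std back onto the rows of that group
weatherDf["Humidity3pmGroupStd"] = (
    weatherDf.groupby("RainTomorrow")["Humidity3pm"].transform("std")
)

print(weatherDf)
```

Using `transform("std")` rather than `.std()` keeps the result aligned with the original rows, which is what makes it possible to assign a grouped statistic as a column.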