RowCount
Counts the rows in a dataframe. The kit supports the following counts:
- Count the number of total rows
- Count the rows grouped by a column
- Count the rows where a value appears
- Count the rows that match a criterion
- Count the rows with missing data
- Count the rows without missing data
Options
where: Specifies a condition for counting rows
values: Specifies values for counting rows
group: Counts the rows of each data group
notMissing: Counts only the rows that have no missing values in any column
missing: Counts only the rows that have a missing value in at least one column
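The examples on this page do not show the code generated for the group and where options. As a rough sketch only, assuming they correspond to ordinary pandas grouping and boolean filtering, the equivalent counts could be written as follows. The `city` and `rating` columns and the sample data are hypothetical, and the kit's actual output may differ.

```python
import pandas as pd

# Hypothetical sample data; the real pizzeriasDf columns are not shown on this page.
pizzeriasDf = pd.DataFrame({
    "city": ["Naples", "Rome", "Naples", "Milan"],
    "rating": [4.5, 3.0, 4.0, 2.5],
})

# Assumed analogue of the group option: one row count per distinct value of a column.
rowsPerCity = pizzeriasDf.groupby("city").size()

# Assumed analogue of the where option: count the rows for which a condition holds.
highRatedCount = int((pizzeriasDf["rating"] >= 4.0).sum())

print(rowsPerCity)
print(highRatedCount)
```

Summing a boolean mask works because True counts as 1 and False as 0, so no filtered copy of the dataframe is needed.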
Examples
Example 1 - Count Total Number of Rows
One of the simplest checks you can make is to count how many rows are in a dataframe. This is useful to verify that the data loaded correctly or to understand the dataset size before applying transformations.
#> RowCount --print
pizzeriasDfRowCount = pizzeriasDf.shape[0]
print(pizzeriasDfRowCount)
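To try the generated code outside the kit, a minimal self-contained sketch might look like this. The two-row dataframe is a hypothetical stand-in for pizzeriasDf, whose real contents are not shown here.

```python
import pandas as pd

# Hypothetical stand-in for pizzeriasDf.
pizzeriasDf = pd.DataFrame({
    "name": ["Luigi's", "Slice House"],
    "rating": [4.5, 3.0],
})

# shape is a (rows, columns) tuple, so index 0 is the row count.
pizzeriasDfRowCount = pizzeriasDf.shape[0]

# len() on a dataframe returns the same row count.
sameCount = len(pizzeriasDf)

print(pizzeriasDfRowCount)
```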
Example 2 - Count Rows with Missing Values
This example counts how many rows contain missing values in any column. It’s a useful first step when deciding whether to clean, impute, or drop incomplete records.
#> RowCount --missing
pizzeriasDfMissingRowCount = pizzeriasDf.isnull().any(axis=1).sum()
Example 3 - Count Rows with No Missing Values
In contrast to counting rows with missing data, this counts only those rows that are fully populated. It helps you isolate the clean records in your dataset.
#> RowCount --notMissing
pizzeriasDfNotMissingRowCount = pizzeriasDf.shape[0] - pizzeriasDf.isnull().any(axis=1).sum()
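The missing and notMissing counts are complementary: every row falls into exactly one of the two groups, so together they add up to the total row count. The sketch below checks this on a small hypothetical dataframe (the data and the helper names `hasMissing`, `missingCount`, and `notMissingCount` are illustrative, not kit output).

```python
import numpy as np
import pandas as pd

# Hypothetical sample: rows 1 and 2 each have one missing value.
pizzeriasDf = pd.DataFrame({
    "name": ["Luigi's", "Slice House", None, "Crust & Co"],
    "rating": [4.5, np.nan, 3.0, 4.0],
})

# True for each row that has at least one missing value.
hasMissing = pizzeriasDf.isnull().any(axis=1)

missingCount = int(hasMissing.sum())
notMissingCount = int((~hasMissing).sum())
total = pizzeriasDf.shape[0]

# dropna() keeps only fully populated rows, so its length matches notMissingCount.
fullyPopulated = len(pizzeriasDf.dropna())

print(missingCount, notMissingCount, total)
```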
Example 4 - Count Rows Containing a Specific Value
This example checks how many rows contain the exact value 3.0 in any column. This can be useful when searching for numeric markers or flags across a dataframe.
#> RowCount --values 3.0
pizzeriasDfRowCount_1 = (pizzeriasDf.isin([3.0]).any(axis=1).sum())
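One detail worth noting is that this expression counts rows, not occurrences: a row in which 3.0 appears in several columns is still counted once. A small sketch with hypothetical data (the `rating` and `avgWaitMin` columns are invented for illustration) makes the difference visible.

```python
import pandas as pd

# Hypothetical sample: 3.0 appears three times, but in only two rows.
pizzeriasDf = pd.DataFrame({
    "name": ["Luigi's", "Slice House", "Crust & Co", "Dough Bros"],
    "rating": [4.5, 3.0, 4.0, 2.5],
    "avgWaitMin": [3.0, 3.0, 10.0, 5.0],
})

# Cell-by-cell membership test against the list of values.
matches = pizzeriasDf.isin([3.0])

# Rows with at least one matching cell.
rowsWithValue = int(matches.any(axis=1).sum())

# Total matching cells, for comparison.
cellsWithValue = int(matches.sum().sum())

print(rowsWithValue, cellsWithValue)
```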