Skip to content

rockfish.metrics

metrics

Classes

Functions

session_length(dataset: LocalDataset, session_length_field: str = 'session_length') -> LocalDataset

Returns a new dataset with the metadata of table and a field containing the session length.

Parameters:

Name Type Description Default
dataset LocalDataset

Input dataset.

required
session_length_field str

New field name to hold session length.

'session_length'

Raises:

Type Description
rf.errors.TableMetadataMissingError

When the dataset does not contain dataset metadata.

count_all(dataset: LocalDataset, field: str, *, nlargest: Optional[int] = None) -> LocalDataset

Returns a new table containing the distinct values for the specified field and a new field with the number times the value occurred. The new field is named field_count.

Parameters:

Name Type Description Default
dataset LocalDataset

Input dataset.

required
field str

Field name to count.

required
nlargest Optional[int]

Limit results to nlargest values.

None

interarrivals(dataset: LocalDataset, time_field: str, interarrival_field: str = 'interarrival', unit: Precision = 's') -> LocalDataset

Create a table containing the interarrival times.

Metadata fields are determined using the schema metadata.

The table will contain the metadata and the interarrival times. Each session has n-1 rows due to the first point not having a interarrival delta.

Parameters:

Name Type Description Default
dataset LocalDataset

Input dataset.

required
time_field str

Field in dataset containing times.

required
interarrival_field str

Field to be added with interarrival times.

'interarrival'

Raises:

Type Description
arrow.MetadataMissingError

When table does not contain schema metadata.

aggregate(dataset: LocalDataset, field: str, agg_func_name: AggregateMethod, group_fields: Optional[list[str]] = None, output_field: Optional[str] = None) -> LocalDataset

Returns a new table containing the aggregate values in the specified field for each group. If group_fields == [], the aggregation will be applied to the entire dataset.

Parameters:

Name Type Description Default
dataset LocalDataset

Input dataset.

required
field str

Field name to compute aggregation.

required
agg_func_name AggregateMethod

Aggregation function to use. For numerical field: "sum", "mean", "min", "max", "variance", "stddev", "count", "count_distinct". For categorical field: "count", "count_distinct".

required
group_fields Optional[list[str]]

Optional. The fields to group by. The precedence order for group fields is this group_fields > table_metadata.group_fields > table_metadata.metadata. Default is None.

None
output_field Optional[str]

Optional. Name of the output field. Default is None. The output will be named field_agg_func_name.

None

transitions_within_sessions(dataset: LocalDataset, field: str, metadata_fields: Optional[list[str]] = None, k_gram: Optional[int] = 2, collapse: bool = False) -> LocalDataset

Creates a new dataset that lists session keys to define sessions and the state transitions for a given stateful field within each session.

Parameters:

Name Type Description Default
dataset LocalDataset

The input dataset.

required
field str

A stateful field.

required
metadata_fields Optional[list[str]]

Optional. The metadata fields to group by. If provided, it will take precedence over the schema metadata. Default is None.

None
k_gram Optional[int]

Optional. The number of states to consider in each transition. Default is 2. If None, the full collapsed transitions for all states will be considered, regardless of the value of collapse.

2
collapse bool

Whether to collapse repeated consecutive states into a single state. Default is False. This parameter is effective only when k_gram is not None.

False