rockfish.metrics
Functions
session_length(dataset: LocalDataset, session_length_field: str = 'session_length') -> LocalDataset
Returns a new dataset containing the table's metadata fields and a new field holding the session length.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | LocalDataset | Input dataset. | required |
session_length_field | str | New field name to hold session length. | 'session_length' |
Raises:
Type | Description |
---|---|
rf.errors.TableMetadataMissingError | When the dataset does not contain dataset metadata. |
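The following is a minimal pure-Python sketch of the computation described above, not the rockfish implementation. It assumes that a session is a unique combination of metadata values and that the session length is the number of rows in that session. The helper `session_length_sketch` and the sample field names are hypothetical.

```python
from collections import Counter

def session_length_sketch(rows, metadata_fields, session_length_field="session_length"):
    """Count rows per session (assumption: one session per metadata combination)."""
    counts = Counter(tuple(row[f] for f in metadata_fields) for row in rows)
    # One output row per session: the metadata values plus the row count.
    return [
        {**dict(zip(metadata_fields, key)), session_length_field: n}
        for key, n in counts.items()
    ]

# Hypothetical sample data: two sessions keyed by "user".
rows = [
    {"user": "a", "event": "login"},
    {"user": "a", "event": "click"},
    {"user": "b", "event": "login"},
]
lengths = session_length_sketch(rows, ["user"])
```

Here `lengths` holds one row per user, with the row count stored under `session_length`.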
count_all(dataset: LocalDataset, field: str, *, nlargest: Optional[int] = None) -> LocalDataset
Returns a new table containing the distinct values of the specified field and a new field with the number of times each value occurred. The new field is named field_count.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | LocalDataset | Input dataset. | required |
field | str | Field name to count. | required |
nlargest | Optional[int] | Limit results to nlargest values. | None |
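As a rough illustration of the counting semantics, here is a pure-Python sketch built on `collections.Counter`. It is not the rockfish implementation. Two assumptions are made: the count column is named by appending `_count` to the field name, and results are ordered from most to least frequent. The helper `count_all_sketch` is hypothetical.

```python
from collections import Counter
from typing import Optional

def count_all_sketch(rows, field, *, nlargest: Optional[int] = None):
    """Count occurrences of each distinct value of `field`."""
    counts = Counter(row[field] for row in rows)
    # most_common(None) returns every value; most_common(k) keeps the k largest.
    # Assumption: the count column is "<field>_count".
    return [
        {field: value, f"{field}_count": n}
        for value, n in counts.most_common(nlargest)
    ]

# Hypothetical sample data.
rows = [{"proto": p} for p in ["tcp", "udp", "tcp", "icmp", "tcp", "udp"]]
all_counts = count_all_sketch(rows, "proto")
top_two = count_all_sketch(rows, "proto", nlargest=2)
```

With `nlargest=2`, only the two most frequent values (`tcp` and `udp`) remain.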
interarrivals(dataset: LocalDataset, time_field: str, interarrival_field: str = 'interarrival', unit: Precision = 's') -> LocalDataset
Create a table containing the interarrival times.
Metadata fields are determined using the schema metadata.
The table will contain the metadata and the interarrival times. A session with n records yields n-1 rows, because the first record has no preceding record from which to compute an interarrival delta.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | LocalDataset | Input dataset. | required |
time_field | str | Field in dataset containing times. | required |
interarrival_field | str | Field to be added with interarrival times. | 'interarrival' |
Raises:
Type | Description |
---|---|
arrow.MetadataMissingError | When the table does not contain schema metadata. |
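The n-1 rows per session can be seen in a small pure-Python sketch. This is an illustration under stated assumptions, not the rockfish implementation: timestamps are plain numbers in a single unit, rows are sorted by time within each session, and the `unit` parameter is not modeled. The helper `interarrivals_sketch` is hypothetical.

```python
from collections import defaultdict

def interarrivals_sketch(rows, metadata_fields, time_field,
                         interarrival_field="interarrival"):
    """Compute deltas between consecutive timestamps within each session."""
    sessions = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in metadata_fields)
        sessions[key].append(row[time_field])

    out = []
    for key, times in sessions.items():
        times.sort()  # assumption: deltas are taken in time order
        # Pairing each timestamp with its successor gives n-1 deltas for n points.
        for prev, cur in zip(times, times[1:]):
            out.append({**dict(zip(metadata_fields, key)),
                        interarrival_field: cur - prev})
    return out

# Hypothetical sample data: user "a" has three records, user "b" has one.
rows = [
    {"user": "a", "ts": 0},
    {"user": "a", "ts": 5},
    {"user": "a", "ts": 7},
    {"user": "b", "ts": 10},
]
deltas = interarrivals_sketch(rows, ["user"], "ts")
```

User `a` contributes two rows (three records minus one), while user `b` contributes none because a single record has no predecessor.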
aggregate(dataset: LocalDataset, field: str, agg_func_name: AggregateMethod, group_fields: Optional[list[str]] = None, output_field: Optional[str] = None) -> LocalDataset
Returns a new table containing the aggregate value of the specified field for each group. If group_fields is an empty list ([]), the aggregation is applied to the entire dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | LocalDataset | Input dataset. | required |
field | str | Field name to compute aggregation. | required |
agg_func_name | AggregateMethod | Aggregation function to use. For a numerical field: "sum", "mean", "min", "max", "variance", "stddev", "count", "count_distinct". For a categorical field: "count", "count_distinct". | required |
group_fields | Optional[list[str]] | Optional. The fields to group by. Order of precedence: the group_fields argument > table_metadata.group_fields > table_metadata.metadata. | None |
output_field | Optional[str] | Optional. Name of the output field. If None, the output is named field_agg_func_name. | None |
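The grouping behavior and the precedence rule for group fields can be sketched in pure Python as follows. This is an illustration, not the rockfish implementation. Assumptions: the default output name joins the field and the aggregation name with an underscore, and a value of `None` at any precedence level means "not set". "variance" and "stddev" are omitted because the source does not say whether they are sample or population statistics. All helper names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Subset of the documented aggregation names.
AGGREGATES = {
    "sum": sum,
    "mean": mean,
    "min": min,
    "max": max,
    "count": len,
    "count_distinct": lambda values: len(set(values)),
}

def resolve_group_fields(group_fields, table_group_fields, table_metadata):
    """Apply the documented precedence: argument > group_fields > metadata."""
    if group_fields is not None:
        return group_fields  # an explicit [] means "aggregate everything"
    if table_group_fields is not None:
        return table_group_fields
    return table_metadata

def aggregate_sketch(rows, field, agg_func_name, group_fields, output_field=None):
    """Aggregate `field` per group; an empty group list yields a single row."""
    output_field = output_field or f"{field}_{agg_func_name}"  # assumed naming
    groups = defaultdict(list)
    for row in rows:
        # With group_fields == [], every row maps to the same empty key.
        groups[tuple(row[g] for g in group_fields)].append(row[field])
    func = AGGREGATES[agg_func_name]
    return [
        {**dict(zip(group_fields, key)), output_field: func(values)}
        for key, values in groups.items()
    ]

# Hypothetical sample data.
rows = [
    {"user": "a", "bytes": 10},
    {"user": "a", "bytes": 30},
    {"user": "b", "bytes": 5},
]
per_user = aggregate_sketch(rows, "bytes", "sum", ["user"])
overall = aggregate_sketch(rows, "bytes", "max", [])
```

`per_user` has one row per user, whereas `overall` collapses to a single row because the group list is empty.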
transitions_within_sessions(dataset: LocalDataset, field: str, metadata_fields: Optional[list[str]] = None, k_gram: Optional[int] = 2, collapse: bool = False) -> LocalDataset
Creates a new dataset that lists the session keys identifying each session together with the state transitions of a given stateful field within that session.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | LocalDataset | The input dataset. | required |
field | str | A stateful field. | required |
metadata_fields | Optional[list[str]] | Optional. The metadata fields to group by. If provided, it will take precedence over the schema metadata. Default is None. | None |
k_gram | Optional[int] | Optional. The number of states to consider in each transition. Default is 2. If None, the full collapsed transitions for all states will be considered, regardless of the value of collapse. | 2 |
collapse | bool | Whether to collapse repeated consecutive states into a single state. Default is False. This parameter is effective only when k_gram is not None. | False |
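The interaction between k_gram and collapse is easier to follow on a concrete sequence. The sketch below handles the states of a single session in pure Python. It is not the rockfish implementation, and it assumes that a transition is a sliding window of k_gram consecutive states. The helper `transitions_sketch` is hypothetical.

```python
from itertools import groupby

def transitions_sketch(states, k_gram=2, collapse=False):
    """Return the k-gram transitions for one session's ordered states."""
    if k_gram is None or collapse:
        # groupby merges runs of equal consecutive states into one state.
        states = [state for state, _ in groupby(states)]
    if k_gram is None:
        # The whole collapsed sequence is a single transition.
        return [tuple(states)]
    # Sliding window of k_gram states (assumed definition of a transition).
    return [tuple(states[i:i + k_gram]) for i in range(len(states) - k_gram + 1)]

# Hypothetical state sequence for one session.
states = ["idle", "idle", "active", "active", "idle"]
raw = transitions_sketch(states)
collapsed = transitions_sketch(states, collapse=True)
full = transitions_sketch(states, k_gram=None)
```

Without collapsing, self-transitions such as `("idle", "idle")` appear in `raw`. With `collapse=True` only changes of state remain, and with `k_gram=None` the result is the single collapsed path whether or not `collapse` is set.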