rockfish.labs.vis
Functions
plot_bar(datasets: list[LocalDataset], field: str, weights: Optional[str] = None, order: Optional[list] = None, orient: str = 'vertical', nlargest: Optional[int] = 10, stat: BinStat = 'percent', **kwargs)
Plot data as a bar plot.
This plot should only be used for categorical data. For numerical data, consider using plot_kde.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datasets | list[LocalDataset] | List of Datasets all with the same schema. | required |
field | str | A categorical field name. | required |
weights | Optional[str] | If set, a field name containing frequencies of each category for the specified field. Set this if you have pre-aggregated data. | None |
order | Optional[list] | Order of categories to display. | None |
orient | str | Orientation of the plot. Can be either "vertical" or "horizontal". Default is "vertical", meaning the x-axis will represent the field. If set to "horizontal", the y-axis will represent the field. | 'vertical' |
nlargest | Optional[int] | Limit the number of categories to display. Has no effect if the Dataset is pre-aggregated and weights is provided. Default is 10. Set to None to display all categories. | 10 |
stat | BinStat | Statistic to compute for each bin. Default is "percent", which represents the percentage of each bin relative to the total count and is useful for comparing distributions with different data sizes. | 'percent' |
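
To make the stat, weights, and nlargest options more concrete, here is a minimal pure-Python sketch of what a "percent" statistic over the most frequent categories means. It is an illustration only, not the rockfish implementation: the helper name percent_by_category and the sample data are hypothetical, and the sketch computes percentages against the full total before truncating, which is an assumption. Unlike the library behaviour described above, this helper would also apply nlargest to weighted input, so the pre-aggregated call passes nlargest=None.

```python
from collections import Counter
from typing import Optional


def percent_by_category(values, weights=None, nlargest: Optional[int] = 10):
    """Return {category: percent of total} for the most frequent categories.

    Hypothetical helper for illustration only; not part of rockfish.
    """
    counts = Counter()
    if weights is None:
        # Raw data: every row counts once.
        counts.update(values)
    else:
        # Pre-aggregated data: each row carries its own frequency.
        for value, weight in zip(values, weights):
            counts[value] += weight
    total = sum(counts.values())
    # most_common(None) returns every category, mirroring nlargest=None.
    top = counts.most_common(nlargest)
    return {category: 100.0 * count / total for category, count in top}


# Raw categorical values, limited to the two most frequent categories.
raw = ["tcp", "udp", "tcp", "icmp", "tcp", "udp", "tcp", "tcp"]
top2 = percent_by_category(raw, nlargest=2)

# The same distribution, pre-aggregated into categories and frequencies.
aggregated = percent_by_category(
    ["tcp", "udp", "icmp"], weights=[5, 2, 1], nlargest=None
)
```

Because percentages are relative to each dataset's own total, two datasets of very different sizes can be compared on the same axis.
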
plot_kde(datasets: list[LocalDataset], field: str, weights: Optional[str] = None, duration_unit: DurationUnit = 's', **kwargs)
Create a kernel density estimate plot.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datasets | list[LocalDataset] | List of Datasets all with the same schema. | required |
field | str | A continuous numerical field. | required |
weights | Optional[str] | If set, a field name containing the weights for the specified field. Set this if you have pre-aggregated data. | None |
duration_unit | DurationUnit | When the specified field is a duration type, display it using these units. | 's' |
kwargs | | Additional arguments are passed to the seaborn displot function. | {} |
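
As a rough illustration of what a kernel density estimate with weights computes, the sketch below implements a weighted Gaussian KDE in plain Python. It is not how rockfish or seaborn implement it: the helper name gaussian_kde, the fixed bandwidth, and the sample values are all assumptions made for the example. It shows that pre-aggregated data with weights yields the same curve as the equivalent raw samples.

```python
import math


def gaussian_kde(samples, bandwidth, weights=None):
    """Return a function estimating the density at x.

    Hypothetical helper for illustration only; assumes a Gaussian kernel
    and a fixed, user-chosen bandwidth.
    """
    if weights is None:
        weights = [1.0] * len(samples)
    total = sum(weights)

    def density(x):
        acc = 0.0
        for sample, weight in zip(samples, weights):
            z = (x - sample) / bandwidth
            acc += weight * math.exp(-0.5 * z * z)
        # Normalize so the estimate integrates to 1.
        return acc / (total * bandwidth * math.sqrt(2.0 * math.pi))

    return density


# Raw samples and the equivalent pre-aggregated (value, weight) form.
raw = gaussian_kde([1.0, 1.0, 1.0, 4.0], bandwidth=0.5)
agg = gaussian_kde([1.0, 4.0], bandwidth=0.5, weights=[3, 1])

# Approximate the integral over [-5, 10) with a Riemann sum.
step = 0.01
area = sum(raw(-5.0 + i * step) * step for i in range(1500))
```
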
plot_cdf(datasets: list[LocalDataset], field: str, weights: Optional[str] = None, duration_unit: DurationUnit = 's', **kwargs)
Create a cumulative distribution function plot.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datasets | list[LocalDataset] | List of Datasets all with the same schema. | required |
field | str | A continuous numerical field. | required |
weights | Optional[str] | If set, a field name containing the weights for the specified field. Set this if you have pre-aggregated data. | None |
duration_unit | DurationUnit | When the specified field is a duration type, display it using these units. | 's' |
kwargs | | Additional arguments are passed to the seaborn displot function. | {} |
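
The following sketch shows one way to think about an empirical cumulative distribution function with optional weights. It is illustrative only and does not reflect the library's internals; the helper name weighted_ecdf and the data are hypothetical. Each output point gives the proportion of total weight at or below a value, so raw and pre-aggregated inputs describing the same data produce the same steps.

```python
def weighted_ecdf(values, weights=None):
    """Return sorted (value, cumulative proportion) pairs.

    Hypothetical helper for illustration only; not part of rockfish.
    """
    if weights is None:
        weights = [1.0] * len(values)
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    points = []
    running = 0.0
    for value, weight in pairs:
        running += weight
        if points and points[-1][0] == value:
            # Merge repeated values into a single step.
            points[-1] = (value, running / total)
        else:
            points.append((value, running / total))
    return points


# Four raw observations, and the same data pre-aggregated with weights.
raw = weighted_ecdf([10, 20, 20, 40])
agg = weighted_ecdf([10, 20, 40], weights=[1, 2, 1])
```
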
plot_hist(datasets: list[LocalDataset], field: str, weights: Optional[str] = None, duration_unit: DurationUnit = 's', stat: BinStat = 'density', **kwargs)
Create a histogram plot.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datasets | list[LocalDataset] | List of Datasets all with the same schema. | required |
field | str | A continuous numerical field name. | required |
weights | Optional[str] | If set, a field name containing the weights for the specified field. Set this if you have pre-aggregated data. | None |
duration_unit | DurationUnit | When the specified field is a duration type, display it using these units. | 's' |
stat | BinStat | Statistic to compute for each bin. Default is "density", which normalizes the histogram so that the area under it equals 1 and is useful for comparing distributions with different sample sizes. | 'density' |
kwargs | | Additional arguments are passed to the seaborn displot function. | {} |
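
The "density" normalization described above can be checked by hand. The sketch below is a plain-Python illustration, not the library's code; the helper name density_histogram, the bin edges, and the data are assumptions. Each bin's density is its count divided by the total count times the bin width, so the bar areas sum to 1 regardless of sample size.

```python
def density_histogram(values, bin_edges):
    """Return per-bin densities so that sum(density * bin_width) == 1.

    Hypothetical helper for illustration only; values outside the
    edges are ignored.
    """
    counts = [0] * (len(bin_edges) - 1)
    for value in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            # Bins are half-open [lo, hi), except the last, which is closed.
            in_bin = bin_edges[i] <= value < bin_edges[i + 1]
            if in_bin or (is_last and value == bin_edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [
        count / (total * (bin_edges[i + 1] - bin_edges[i]))
        for i, count in enumerate(counts)
    ]


edges = [0.0, 2.0, 4.0, 6.0]
small = density_histogram([1, 1, 3, 5], edges)
# The same shape with 100 times as many samples gives the same densities.
large = density_histogram([1, 1, 3, 5] * 100, edges)
area = sum(d * (edges[i + 1] - edges[i]) for i, d in enumerate(small))
```
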
plot_distribution(datasets: list[LocalDataset], field: str, weights: Optional[str] = None, order: Optional[list] = None, **kwargs)
Plot the data as either a histogram or kde depending on the type of data.
If you don't like which one this picks, you can call one of the lower-level
functions directly, either plot_kde or plot_hist.
plot_scatter(datasets: list[LocalDataset], field_x: str, field_y: str, **kwargs)
Create a scatter plot of the data in the x and y fields of the tables.
Each table is plotted with a different color and listed in the legend by name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datasets | list[LocalDataset] | Tables containing plot data. Each table must have fields with names field_x and field_y. | required |
field_x | str | A continuous numerical field name to plot as the x-axis. | required |
field_y | str | A continuous numerical field name to plot as the y-axis. | required |
plot_correlation(datasets: list[LocalDataset], field_x: str, field_y: str, **kwargs)
Create a scatter plot with the Pearson correlation coefficient.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datasets | list[LocalDataset] | Tables containing plot data. Each table must have fields with names field_x and field_y. | required |
field_x | str | A continuous numerical field name to plot as the x-axis. | required |
field_y | str | A continuous numerical field name to plot as the y-axis. | required |
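
For reference, the Pearson correlation coefficient between two numerical fields can be computed as in the sketch below. This is the textbook formula written in plain Python, not the library's own code; the helper name pearson_r and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists.

    Hypothetical helper for illustration only; not part of rockfish.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


perfect = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
inverse = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])
partial = pearson_r([1, 2, 3, 4], [1, 3, 2, 4])
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none.
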
plot_correlation_heatmap(datasets: list[LocalDataset], fields: list[str], cmap='Reds', **kwargs)
Plot correlation heatmap.
This plot should only be used for numerical columns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datasets | list[LocalDataset] | List of Datasets all with the same schema. | required |
fields | list[str] | List of field names for the numerical values. | required |
kwargs | | Additional arguments are passed to the seaborn heatmap function. | {} |
plot_association_heatmap(datasets: list[LocalDataset], fields: list, correction: bool = False, cmap='Blues', **kwargs)
Plot association heatmap. This plot should only be used for categorical columns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datasets | list[LocalDataset] | List of Datasets all with the same schema. | required |
fields | list | List of field names for the categorical values. | required |
correction | bool | Boolean value. If True, apply bias correction to Cramer's V. Default is False. | False |
kwargs | | Additional arguments are passed to the seaborn heatmap function. | {} |
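
The correction parameter refers to Cramer's V, a measure of association between two categorical fields that ranges from 0 (no association) to 1 (perfect association). The sketch below computes it from scratch in plain Python. It assumes the commonly used Bergsma bias correction; whether rockfish uses exactly this form is not confirmed here, and the helper name cramers_v and the data are hypothetical.

```python
import math
from collections import Counter


def cramers_v(xs, ys, correction=False):
    """Return Cramer's V for two equal-length categorical sequences.

    Hypothetical helper for illustration only; the bias correction
    follows Bergsma's formula, which is an assumption about what
    "correction" means.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)
    if correction:
        # Shrink phi^2 and the table dimensions to reduce upward bias.
        phi2 = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
        r = r - (r - 1) ** 2 / (n - 1)
        k = k - (k - 1) ** 2 / (n - 1)
    denom = min(r - 1, k - 1)
    return math.sqrt(phi2 / denom) if denom > 0 else 0.0


# Perfectly associated fields: each x value always pairs with the same y.
v_perfect = cramers_v(["a", "b"] * 10, ["x", "y"] * 10)
v_perfect_corrected = cramers_v(["a", "b"] * 10, ["x", "y"] * 10, correction=True)

# Independent fields: every (x, y) combination occurs equally often.
v_independent = cramers_v(["a", "a", "b", "b"] * 5, ["x", "y", "x", "y"] * 5)
```
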
custom_plot(datasets: list, query: str, plot_func, *args, **kwargs)
Create a custom plot using custom datasets via the chosen plot_func.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datasets | list | List of LocalDatasets all with the same schema. | required |
query | str | An SQL query against a table named | required |
plot_func | | A callable plotting function chosen from | required |
kwargs | | Additional arguments except | {} |