Genentech/pysummaries

Welcome to PySummaries documentation!

PySummaries is a Python package to easily produce table summarizations from pandas, polars or PyArrow dataframes.

Other dataframe libraries supported by narwhals (e.g. Modin, cuDF) may also work but are untested.

For more detailed information, please see the documentation.

Installation

You can install the package with pip:

pip install pysummaries

QuickStart

Let's say we have a dataframe with some data we want to summarize. Let's take a look at the data:

import pandas as pd

from pysummaries import get_table_summary, get_sample_data

df = get_sample_data()
df


Now, let's build a "table one" summary, stratified by the group column.

Two backends are available for the HTML representation: the PySummaries native one, and one built on the popular great_tables package.

Let's start first with the PySummaries native backend:

summary_table = get_table_summary(df, strata='group', backend='native')
summary_table

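To make the idea of a stratified summary concrete, here is a minimal, library-independent sketch of the kind of computation such a table involves. It is not how PySummaries is implemented. The column names (age, sex), the sample rows, and the helper summarize_by_stratum are all hypothetical, and the choice of statistics (mean and standard deviation for a numeric column, counts and percentages for a categorical one) is an assumption about what a typical "table one" reports.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev

# Hypothetical rows; these column names are illustrative only and are
# not the columns returned by get_sample_data().
rows = [
    {"group": "A", "age": 34, "sex": "F"},
    {"group": "A", "age": 41, "sex": "M"},
    {"group": "A", "age": 29, "sex": "F"},
    {"group": "B", "age": 52, "sex": "M"},
    {"group": "B", "age": 47, "sex": "M"},
    {"group": "B", "age": 60, "sex": "F"},
]


def summarize_by_stratum(rows, strata, numeric, categorical):
    """Hypothetical helper: summarize one numeric and one categorical column per stratum."""
    # Split the rows by the value of the stratification column.
    by_stratum = defaultdict(list)
    for row in rows:
        by_stratum[row[strata]].append(row)

    summary = {}
    for level, members in sorted(by_stratum.items()):
        values = [m[numeric] for m in members]
        cells = {"n": str(len(members))}
        # Sample standard deviation; stdev() needs at least two values per stratum.
        cells[f"{numeric}, mean (SD)"] = f"{mean(values):.1f} ({stdev(values):.1f})"
        # Count and percentage of each category within the stratum.
        counts = Counter(m[categorical] for m in members)
        for category, count in sorted(counts.items()):
            pct = 100 * count / len(members)
            cells[f"{categorical} = {category}, n (%)"] = f"{count} ({pct:.1f}%)"
        summary[level] = cells
    return summary


summary = summarize_by_stratum(rows, strata="group", numeric="age", categorical="sex")
for level, cells in summary.items():
    print(level, cells)
```

Each stratum becomes one column of the final table and each key in cells becomes one row. A package such as PySummaries additionally handles rendering, missing values, and many columns at once.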

And now, let's try the great_tables backend!

summary_table = get_table_summary(df, strata='group', backend='gt')
summary_table


You can customize your tables! For this and more options, please see the documentation.