How to use pandas unstack
Overview
- Expands the MultiIndex produced by the aggregate function and converts one of its levels (the innermost by default) into columns
- Behaves much like pivot
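To make the comparison with pivot concrete, here is a minimal sketch that builds the same wide table both ways. The sample data and the variable names `unstacked` and `pivoted` are made up for illustration and are not part of the example in this note.

```python
import pandas as pd

# Hypothetical sample data, used only for this comparison
df = pd.DataFrame(
    {
        "date": ["2023-01-01", "2023-01-01", "2023-01-02"],
        "name": ["yamada", "takahashi", "yamada"],
        "count": [1, 2, 3],
    }
)

# unstack: start from a (date, name) MultiIndex and move the innermost
# level ("name") into the columns
unstacked = df.set_index(["date", "name"])["count"].unstack()

# pivot: build the wide table directly from the flat columns
pivoted = df.pivot(index="date", columns="name", values="count")

# Both tables have one row per date and one column per name;
# (date, name) pairs that do not occur are filled with NaN
print(unstacked)
print(unstacked.equals(pivoted))
```

The practical difference is the starting point: pivot works on flat columns, while unstack works on data that already carries a MultiIndex, which is typically what a groupby over several keys gives you.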
Example
import pandas as pd
# Build the sample data: one row per (date, name) occurrence
df = pd.DataFrame(
    {
        "date": ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-03"],
        "name": ["yamada", "takahashi", "ishikawa", "kobayashi"],
    }
)
print(df.to_markdown(index=False))
"""
| date | name |
|:-----------|:----------|
| 2023-01-01 | yamada |
| 2023-01-01 | takahashi |
| 2023-01-02 | ishikawa |
| 2023-01-03 | kobayashi |
"""
# Aggregate by (date, name); the result has a MultiIndex, which reset_index() turns back into ordinary columns
x = df.groupby(by=["date", "name"]).agg(count=("date", "count")).reset_index()
print(x.to_markdown(index=False))
"""
| date | name | count |
|:-----------|:----------|--------:|
| 2023-01-01 | takahashi | 1 |
| 2023-01-01 | yamada | 1 |
| 2023-01-02 | ishikawa | 1 |
| 2023-01-03 | kobayashi | 1 |
"""
# Convert the innermost level of the MultiIndex (name) into columns with unstack()
x = df.groupby(by=["date", "name"]).agg(count=("date", "count")).unstack().reset_index()
print(x.to_markdown(index=False))
"""
| ('date', '') | ('count', 'ishikawa') | ('count', 'kobayashi') | ('count', 'takahashi') | ('count', 'yamada') |
|:---------------|------------------------:|-------------------------:|-------------------------:|----------------------:|
| 2023-01-01 | nan | nan | 1 | 1 |
| 2023-01-02 | 1 | nan | nan | nan |
| 2023-01-03 | nan | 1 | nan | nan |
"""
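The result above has two rough edges: the column labels are tuples such as ('count', 'yamada'), and missing combinations show up as nan. The following is one possible way to tidy this up, not the only one. It swaps `agg(count=...)` for `size()`, which returns a Series, so unstack produces single-level columns, and it passes `fill_value=0` to fill the gaps. The variable name `wide` is chosen here for illustration.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame(
    {
        "date": ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-03"],
        "name": ["yamada", "takahashi", "ishikawa", "kobayashi"],
    }
)

wide = (
    df.groupby(["date", "name"])
    .size()                     # Series of row counts indexed by (date, name)
    .unstack(fill_value=0)      # missing (date, name) pairs become 0, not NaN
    .rename_axis(columns=None)  # drop the "name" label from the column axis
    .reset_index()              # turn "date" back into an ordinary column
)

# Columns are now plain strings: date plus one column per name
print(wide)
```

Because the values are filled with 0 instead of NaN, the counts stay integers rather than being converted to floats.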