Fashion brand CO2e emissions 👟
Fashion brands increasingly have to be aware of, and report on, their environmental footprint.
The following dataset comes from a real fashion brand and has been anonymized. Each row represents a product manufactured in a given year.
import icanexplain as ice

def fmt_CO2e(kg):
    # Small values are shown in kg, larger ones in kilotonnes (1kt = 1e6kg)
    if abs(kg) < 1e3:
        return f'{kg:,.2f}kgCO2e'
    return f'{kg / 1e6:,.1f}ktCO2e'

products = ice.datasets.load_product_footprints()
products.sample(5).style.format({'footprint': fmt_CO2e, 'units': '{:,d}'})
| | year | category | product_id | footprint | units |
|---|---|---|---|---|---|
| 79622 | 2022 | PANTS | 0c7938bf | 13.38kgCO2e | 105 |
| 23575 | 2021 | PANTS | 7693f75b | 36.50kgCO2e | 41 |
| 113417 | 2023 | PANTS | c5c54140 | 26.89kgCO2e | 288 |
| 67791 | 2022 | PANTS | aed08558 | 106.76kgCO2e | 301 |
| 49045 | 2022 | PANTS | a1cf7d5c | 35.67kgCO2e | 925 |
The `footprint` column indicates the product's carbon footprint in kgCO2e. The `units` column corresponds to the number of units produced.
Companies usually report their emissions on a yearly basis. Here we look at the average footprint per unit for each year: we multiply the footprint of each product by the number of units produced, sum the results, and divide by the total number of units.
(
    products
    .groupby('year')
    .apply(lambda g: (g['footprint'] * g['units']).sum() / g['units'].sum(), include_groups=False)
    .to_frame('average')
    .assign(diff=lambda x: x.average.diff())
    .style.format(fmt_CO2e, na_rep='')
)
| year | average | diff |
|---|---|---|
| 2021 | 21.95kgCO2e | |
| 2022 | 21.71kgCO2e | -0.24kgCO2e |
| 2023 | 22.74kgCO2e | 1.03kgCO2e |
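As a minimal sketch of the arithmetic behind this table, the unit-weighted average can be worked out by hand on two made-up products. The numbers below are hypothetical and are not taken from the dataset.

```python
# Hypothetical products: (footprint in kgCO2e per unit, units produced)
toy_products = [(10.0, 900), (40.0, 100)]

total_emissions = sum(footprint * units for footprint, units in toy_products)
total_units = sum(units for _, units in toy_products)

# Unit-weighted average footprint, as in the groupby above
weighted_average = total_emissions / total_units

# Unweighted average of the two footprints, for comparison
simple_average = sum(footprint for footprint, _ in toy_products) / len(toy_products)

print(f'{weighted_average:.2f}kgCO2e per unit')  # 13.00kgCO2e per unit
print(f'{simple_average:.2f}kgCO2e per unit')  # 25.00kgCO2e per unit
```

The weighted average sits much closer to the high-volume product, which is why the number of units produced matters as much as the footprints themselves.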
The average footprint went down between 2021 and 2022, then went back up in 2023. When fashion brands see this, their first question is always the same: why?
The overall average footprint can change for two reasons:
- The average footprint per product category evolved.
- The mix of product categories evolved.
The second reason is called the mix effect. For instance, let's say t-shirts have a lower footprint than jackets. If the share of jackets produced in 2023 is higher than in 2022, the average footprint will go up.
The jackets in 2023 aren't necessarily the same as those of 2022. They could be more sustainable, and have a lower footprint. This is the tricky part: we need to disentangle the mix effect from the evolution of the footprint of each product category. That is the value proposition of this package.
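To make the disentangling concrete, here is a minimal sketch of one common way to split a change in a weighted average into an inner part and a mix part. It is an assumption made for illustration, not necessarily the exact formula `MeanExplainer` uses, and the data is hypothetical: jackets get cleaner, but their share of production grows.

```python
# Hypothetical data: category -> (average footprint in kgCO2e, units produced)
old = {'TSHIRT': (5.0, 800), 'JACKET': (50.0, 200)}
new = {'TSHIRT': (5.0, 600), 'JACKET': (45.0, 400)}


def shares(period):
    """Share of total units for each category."""
    total = sum(units for _, units in period.values())
    return {cat: units / total for cat, (_, units) in period.items()}


def overall_mean(period):
    """Unit-weighted average footprint across categories."""
    s = shares(period)
    return sum(s[cat] * footprint for cat, (footprint, _) in period.items())


old_share, new_share = shares(old), shares(new)
old_mean = overall_mean(old)

inner, mix = {}, {}
for cat in old:  # assumes both periods contain the same categories
    old_fp, new_fp = old[cat][0], new[cat][0]
    # Inner: the category's footprint changed, weighted by its new share
    inner[cat] = new_share[cat] * (new_fp - old_fp)
    # Mix: the category's share changed, valued relative to the old overall mean
    mix[cat] = (new_share[cat] - old_share[cat]) * (old_fp - old_mean)

change = overall_mean(new) - old_mean
print('inner', inner)
print('mix', mix)
print('change', change)
```

In this toy case the overall average rises from 14 to 21 kgCO2e even though jackets improved: their inner contribution is negative, but the mix contributions are larger and positive. Under this particular decomposition, the inner and mix contributions add up exactly to the overall change.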
explainer = ice.MeanExplainer(
    fact='footprint',
    count='units',
    period='year',
    group='category',
)
explanation = explainer(products)
explanation.style.format({'inner': fmt_CO2e, 'mix': fmt_CO2e}, na_rep='')
| year | category | inner | mix |
|---|---|---|---|
| 2022 | DRESS | 0.05kgCO2e | -0.14kgCO2e |
| 2022 | JACKET | -0.17kgCO2e | -0.69kgCO2e |
| 2022 | PANTS | 0.61kgCO2e | 0.20kgCO2e |
| 2022 | SHIRT | -0.02kgCO2e | 0.00kgCO2e |
| 2022 | SWEATER | -0.39kgCO2e | -0.09kgCO2e |
| 2022 | TSHIRT | 0.08kgCO2e | 0.30kgCO2e |
| 2023 | DRESS | -0.08kgCO2e | 0.51kgCO2e |
| 2023 | JACKET | -0.13kgCO2e | 0.97kgCO2e |
| 2023 | PANTS | -0.22kgCO2e | -0.09kgCO2e |
| 2023 | SHIRT | 0.02kgCO2e | -0.03kgCO2e |
| 2023 | SWEATER | -0.06kgCO2e | 0.36kgCO2e |
| 2023 | TSHIRT | -0.16kgCO2e | -0.06kgCO2e |
Here's the meaning of each column:
- `inner` is the difference due to the change in the average footprint per unit within each category. A negative inner value means the footprint per unit shifted in a way that reduced the average footprint.
- `mix` is the difference due to the change in the share of units produced in each category. A negative mix value means the units produced shifted in a way that reduced the average footprint. For instance, low-emission products seem to have been prioritized in 2022 (the mix column sums to roughly -0.42kgCO2e), but not in 2023 (roughly +1.66kgCO2e).
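As a sanity check, the contributions in the table above should add up, per year, to the year-over-year differences reported earlier (-0.24kgCO2e and 1.03kgCO2e). The sketch below re-types the rounded values from the table, so small rounding gaps are expected.

```python
# (inner, mix) per category in kgCO2e, copied from the rounded table above
explanation_values = {
    2022: {
        'DRESS': (0.05, -0.14),
        'JACKET': (-0.17, -0.69),
        'PANTS': (0.61, 0.20),
        'SHIRT': (-0.02, 0.00),
        'SWEATER': (-0.39, -0.09),
        'TSHIRT': (0.08, 0.30),
    },
    2023: {
        'DRESS': (-0.08, 0.51),
        'JACKET': (-0.13, 0.97),
        'PANTS': (-0.22, -0.09),
        'SHIRT': (0.02, -0.03),
        'SWEATER': (-0.06, 0.36),
        'TSHIRT': (-0.16, -0.06),
    },
}

totals = {}
for year, by_category in explanation_values.items():
    inner_total = sum(inner for inner, _ in by_category.values())
    mix_total = sum(mix for _, mix in by_category.values())
    totals[year] = (inner_total, mix_total, inner_total + mix_total)
    print(f'{year}: inner={inner_total:+.2f} mix={mix_total:+.2f} '
          f'total={inner_total + mix_total:+.2f}')
```

With these rounded inputs, 2023 sums to +1.03kgCO2e, matching the reported difference. 2022 sums to -0.26kgCO2e against a reported -0.24kgCO2e, a gap that is consistent with rounding each cell to two decimals.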
A convenient way to read these values is to use a waterfall chart.
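As a rough sketch of what a waterfall chart encodes, the snippet below walks from the 2022 average to the 2023 average using the rounded per-category contributions from the table, and prints the running level after each step. The plain-text rendering is an illustrative stand-in, not the package's own charting output.

```python
start = 21.71  # 2022 average in kgCO2e per unit, from the yearly table

# 2023 contributions in kgCO2e, copied from the rounded explanation table
inner_2023 = {'DRESS': -0.08, 'JACKET': -0.13, 'PANTS': -0.22,
              'SHIRT': 0.02, 'SWEATER': -0.06, 'TSHIRT': -0.16}
mix_2023 = {'DRESS': 0.51, 'JACKET': 0.97, 'PANTS': -0.09,
            'SHIRT': -0.03, 'SWEATER': 0.36, 'TSHIRT': -0.06}

steps = [(f'{cat} inner', delta) for cat, delta in inner_2023.items()]
steps += [(f'{cat} mix', delta) for cat, delta in mix_2023.items()]

level = start
levels = []
print(f'{"2022 average":<16}{level:>16.2f}')
for label, delta in steps:
    # Each bar of a waterfall starts where the previous one ended
    level += delta
    levels.append(level)
    print(f'{label:<16}{delta:>+8.2f}{level:>8.2f}')
print(f'{"2023 average":<16}{level:>16.2f}')
```

The last running level lands on the 2023 average of 22.74kgCO2e, which is what makes the waterfall a faithful picture of the decomposition: every bar is one contribution, and together they bridge the two yearly averages.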
This is better than reporting the average footprint and the units produced separately, because it quantifies how much each one contributed to the change. Here, the small decrease in 2022 is driven by the mix effect, which outweighs a slight increase in per-category footprints. In 2023 the picture flips: the mix effect pushes the average up, and lower per-category footprints only partly offset it. Importantly, each of these effects is calculated, not just assumed.
It's natural to want to deepen the analysis. For instance:
- Why is there a significant inner contribution for pants in 2022? Is it because the materials are less sustainable? Or because the pants got heavier?
- The increase in 2023 is mainly due to the mix effect, in particular for jackets and dresses. Can this be broken down into marketing segments? For instance, is the shift mainly driven by online or in-person sales? How does this break down by country?
These questions hint at the interactive aspect of this kind of analysis. Once you break down a metric's evolution along a dimension, the next steps are to break down the metric (question 1) and/or include another dimension (question 2).