Data pre-processing for Climate Spirals Visualisation

In [1]:
%matplotlib inline
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats

CO2 emissions from PRIMAP-hist

Historical CO2 emissions are taken from the PRIMAP-hist dataset (Gütschow et al., 2016, in review).

In [2]:
primaphist = pd.read_csv("PRIMAP-hist_v1.0_14-Apr-2016.csv")
primaphist = primaphist[(primaphist.country == "EARTH") & 
                        (primaphist.entity == "CO2") &
                        (primaphist.category == "CAT0")]
primaphist = primaphist.drop(['scenario', 'country', 'category', 'entity', 'unit'], axis=1).T
primaphist.index.name = "year"
primaphist.index = primaphist.index.astype(int)
primaphist.columns = ["CAT0"]
primaphist = primaphist / 1000000 # convert to Gt
primaphist.plot()
plt.xlabel("Year")
plt.ylabel(u"Emissions in GtCO$_2$")
Out[2]:
<matplotlib.text.Text at 0x7ff1d7d8fcd0>

Bunkers (international aviation and shipping) are not included in PRIMAP-hist, so they are taken from the CDIAC dataset, which includes bunkers from 1950 onwards.

In [3]:
!head nation.1751_2013.csv
"Nation","Year","Total CO2 emissions from fossil-fuels and cement production (thousand metric tons of C)","Emissions from solid fuel consumption","Emissions from liquid fuel consumption","Emissions from gas fuel consumption","Emissions from cement production","Emissions from gas flaring","Per capita CO2 emissions (metric tons of carbon)","Emissions from bunker fuels (not included in the totals)"
(Note: missing values denoted by ".")
Source: Tom Boden and Bob Andres (Oak Ridge National Laboratory); Gregg Marland (Appalachian State University)
DOI: 10.3334/CDIAC/00001_V2016
"AFGHANISTAN",1949,4,4,0,0,0,.,.,0
"AFGHANISTAN",1950,23,6,18,0,0,0,0,0
"AFGHANISTAN",1951,25,7,18,0,0,0,0,0
"AFGHANISTAN",1952,25,9,17,0,0,0,0,0
"AFGHANISTAN",1953,29,10,18,0,0,0,0,0
"AFGHANISTAN",1954,29,12,18,0,0,0,0,0
In [4]:
cdiac = pd.read_csv("nation.1751_2013.csv", skiprows=[1, 2, 3], usecols=[0, 1, 9]) 
In [5]:
bunkers = cdiac.groupby("Year").sum().loc[1850:2013] * 3.667 / 1000000 # from kt C to Gt CO2
bunkers.columns = ["Bunkers"]
bunkers = bunkers.Bunkers
bunkers.plot()
plt.ylabel("Gt CO$_2$")
plt.title("Global sum of CDIAC bunkers")
Out[5]:
<matplotlib.text.Text at 0x7ff1d7c108d0>
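
The factor 3.667 in the conversion above is close to 44/12, the ratio of the molar mass of CO2 to that of carbon. The following is a minimal sketch of that conversion, not part of the original notebook. The helper name `kt_c_to_gt_co2` is hypothetical, and the molar masses are rounded values assumed for illustration.

```python
# Approximate molar masses in g/mol (rounded, assumed for illustration).
M_CO2 = 44.01
M_C = 12.011

# Mass of CO2 per unit mass of carbon, roughly 3.664.
C_TO_CO2 = M_CO2 / M_C


def kt_c_to_gt_co2(kt_c, factor=3.667):
    """Convert thousand metric tons of carbon to Gt CO2 (hypothetical helper).

    The default factor is the rounded value used in this notebook.
    """
    # 1 Gt equals 1,000,000 kt.
    return kt_c * factor / 1_000_000


# 1 Gt of carbon (1,000,000 kt C) expressed in Gt CO2.
example = kt_c_to_gt_co2(1_000_000)
```

The small difference between 3.664 and 3.667 only reflects how the molar masses are rounded.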

Extension of CDIAC bunkers

CDIAC bunkers are extended to 2014 in the same way as missing data in the PRIMAP-hist paper (linear extrapolation of the 15-year trend).

In [6]:
start = bunkers.index[-1] - 15  # first year of the trend window
sel = bunkers.loc[start:]
# Fit a linear trend to the window and evaluate it through 2014.
slope, intercept, r_value, p_value, std_err = stats.linregress(sel.index, sel)
new_index = pd.Index(range(start, 2015))
extrapolation = pd.Series(new_index * slope + intercept, index=new_index)
ax = extrapolation.plot()
sel.plot(ax=ax)
# Series.set_value was removed in pandas 1.0; label-based assignment appends 2014.
bunkers.loc[2014] = extrapolation.loc[2014]
bunkers = bunkers.round(2)
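
The trend extension above can be illustrated with a standalone least-squares fit. This is a sketch on made-up data, written in pure Python so that it runs without pandas or SciPy. The function name `linear_fit` is hypothetical, and the window 1998 to 2013 is chosen to mirror the `start = last year - 15` selection, which spans 16 data points.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic series (not CDIAC data): 1.0 in 1998, rising by 0.02 per year.
years = list(range(1998, 2014))
values = [1.0 + 0.02 * (year - 1998) for year in years]

slope, intercept = linear_fit(years, values)

# Evaluate the fitted line one year past the end of the data.
extrapolated_2014 = slope * 2014 + intercept
```

For real data the fitted value for 2014 depends on how noisy the last years are, so a different window length could give a noticeably different extension.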

Total emissions are PRIMAP-hist plus CDIAC bunkers (assuming that pre-1950 bunkers are negligible).

In [7]:
total = primaphist.join(bunkers).sum(axis=1)
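
The effect of the negligible-bunkers assumption can be shown without pandas. In this sketch the values are made up, and the dictionaries `primap` and `bunker_values` are hypothetical stand-ins for the two series. Years without a bunker value add nothing to the total, which is also what a row-wise pandas sum does by default when it skips missing values.

```python
# Made-up illustrative values in Gt CO2, not taken from either dataset.
primap = {1948: 5.0, 1949: 5.2, 1950: 5.5, 1951: 5.9}
bunker_values = {1950: 0.2, 1951: 0.3}  # no entries before 1950

# Years without a bunker value contribute zero to the combined total.
combined = {
    year: value + bunker_values.get(year, 0.0)
    for year, value in primap.items()
}
```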

BP estimates that energy-related CO2 emissions increased by 0.1% in 2015. Assuming a similarly small change in the non-energy sectors, a rough estimate for 2015 emissions is obtained by simply re-using the 2014 value.

In [8]:
total.loc[2015] = total.loc[2014]  # Series.set_value was removed in pandas 1.0
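
To see how little is lost by re-using the 2014 value, the 0.1% increase can be applied explicitly. This sketch assumes the 2014 total of 41.80 Gt CO2 shown in the export table of this notebook and applies the BP growth rate to all sectors, which is a simplification.

```python
value_2014 = 41.80    # Gt CO2, 2014 total as shown in the export table
growth_2015 = 0.001   # 0.1 % growth, as quoted from BP in the text

# 2015 estimate if the growth rate applied to all sectors.
grown = value_2014 * (1 + growth_2015)

# Difference to simply re-using the 2014 value, about 0.04 Gt CO2.
difference = grown - value_2014
```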
In [9]:
ax = primaphist.plot()
total.plot(ax=ax)
plt.title("Sum of PRIMAP-hist plus CDIAC bunkers and PRIMAP-hist CAT0 only")
plt.legend(["PRIMAP-hist", "PRIMAP-hist + CDIAC bunkers (extended until 2015)"], loc="best")
plt.ylabel("Gt CO$_2$")
Out[9]:
<matplotlib.text.Text at 0x7ff1d7ca1810>

Export for visualisation

In [10]:
export = pd.DataFrame({"value": total})
In [11]:
export.reset_index().to_csv("../public/emissions.csv", index=False)
export.tail()
Out[11]:
value
year
2011 40.35
2012 40.70
2013 41.42
2014 41.80
2015 41.80

Comparison with IPCC AR5 Synthesis Report budgets

Table 2.2 in the IPCC AR5 Synthesis Report: http://www.ipcc.ch/pdf/assessment-report/ar5/syr/SYR_AR5_FINAL_full_wcover.pdf

Cumulative CO2 emissions from 1870 (Fractions of simulations: 66%), RCP scenarios only

<2 °C   2900 GtCO2
<1.5 °C 2250 GtCO2

Cumulative CO2 emissions from 2011 (Fractions of simulations: 66%), RCP scenarios only

<2 °C   1000 GtCO2
<1.5 °C  400 GtCO2


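Cumulative emissions 1850-1869 (PRIMAP-hist): 46.66 GtCO2

Historical 1850-2010 emissions for PRIMAP-hist plus CDIAC bunkers (about 1992 Gt) are ~90 Gt higher than the 1900 Gt implied by the RCP-based <2 °C budgets (2900 Gt from 1870 minus 1000 Gt from 2011). Because the budgets count from 1870, the like-for-like 1870-2010 sum is about 1945 Gt, which is ~45 Gt higher.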

In [12]:
total.loc[1850:1869].sum()
Out[12]:
46.659999999999997
In [13]:
total.loc[1850:2010].sum()
Out[13]:
1991.5500000000002
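
The budget comparison can be reproduced from the numbers quoted in this section. This sketch assumes that the 1900 Gt figure is the difference between the two <2 °C budgets in Table 2.2. The variable names are illustrative only.

```python
budget_from_1870 = 2900.0   # Gt CO2, <2 °C, 66 % of simulations
budget_from_2011 = 1000.0   # Gt CO2, same threshold, counted from 2011

# Emissions implied by the budgets for the period 1870-2010.
implied_1870_2010 = budget_from_1870 - budget_from_2011

total_1850_2010 = 1991.55   # from Out[13]
total_1850_1869 = 46.66     # from Out[12]

# Restrict the historical sum to the same period as the budgets.
total_1870_2010 = total_1850_2010 - total_1850_1869

gap_full_period = total_1850_2010 - implied_1870_2010   # roughly 90 Gt
gap_same_period = total_1870_2010 - implied_1870_2010   # roughly 45 Gt
```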

CO2 concentrations

CO2 concentrations are taken from the CMIP6 concentration dataset, version from 1 July 2016.

In [14]:
cmip6 = pd.read_csv(
    "mole_fraction_of_carbon_dioxide_in_air_input4MIPs_GHGConcentrations_CMIP_UoM-CMIP-1-1-0_gr3-GMNHSH_000001-201412.csv",
    index_col=["year", "month"]
)
cmip6 = cmip6.loc[1850:]
cmip6 = cmip6.drop(['datenum', 'datetime', 'day'], axis=1)
cmip6.head()
Out[14]:
data_mean_global data_mean_nh data_mean_sh
year month
1850 1 284.944656 285.629154 284.260158
2 285.333792 286.291159 284.376425
3 285.682825 286.941396 284.424254
4 285.931179 287.485812 284.376545
5 285.885928 287.405461 284.366396
In [15]:
cmip6.plot()
Out[15]:
<matplotlib.axes._subplots.AxesSubplot at 0x7ff1d7963ed0>
In [16]:
cmip6.loc[2010:].plot()
cmip6 = cmip6.reset_index()
In [17]:
cmip6.rename(columns={"data_mean_global": "value"}).to_csv("../public/concentrations.csv", index=False)
cmip6.rename(columns={"data_mean_nh": "value"}).to_csv("../public/concentrations_nh.csv", index=False)
cmip6.rename(columns={"data_mean_sh": "value"}).to_csv("../public/concentrations_sh.csv", index=False)

Global Temperatures

Global temperature data are taken from the HadCRUT4 near-surface temperature dataset.

http://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/download.html

In [18]:
hadcrut = pd.read_csv(
    "HadCRUT.4.4.0.0.monthly_ns_avg.txt",
    sep=r"\s+",  # whitespace-separated; delim_whitespace is deprecated in recent pandas
    usecols=[0, 1],
    header=None
)
hadcrut['year'] = hadcrut.iloc[:, 0].apply(lambda x: x.split("/")[0]).astype(int)
hadcrut['month'] = hadcrut.iloc[:, 0].apply(lambda x: x.split("/")[1]).astype(int)

hadcrut = hadcrut.rename(columns={1: "value"})
hadcrut = hadcrut.iloc[:, 1:]


hadcrut = hadcrut.set_index(['year', 'month'])

hadcrut -= hadcrut.loc[1850:1900].mean()
hadcrut.plot()
hadcrut = hadcrut.reset_index()
plt.xlabel("Time")
plt.ylabel(u"Temperature anomalies (°C) (relative to 1850-1900 mean)")
plt.legend("")
Out[18]:
<matplotlib.legend.Legend at 0x7ff1d78becd0>
In [19]:
hadcrut.to_csv("../public/temperatures.csv", index=False)
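
The baseline shift applied to the temperature series above subtracts the 1850-1900 mean, so that anomalies are expressed relative to that early period (HadCRUT4 itself is published relative to 1961-1990). This is a minimal sketch of the operation in pure Python. The function name `rebaseline` is hypothetical and the values are made up, not HadCRUT4 data.

```python
# Made-up monthly anomalies keyed by (year, month); not HadCRUT4 values.
anomalies = {
    (1850, 1): -0.70, (1850, 2): -0.30,
    (1900, 1): -0.40, (1900, 2): -0.20,
    (2015, 1): 0.70, (2015, 2): 0.65,
}


def rebaseline(series, first_year, last_year):
    """Shift a series so its mean over first_year..last_year (inclusive) is zero."""
    reference = [
        value for (year, _month), value in series.items()
        if first_year <= year <= last_year
    ]
    offset = sum(reference) / len(reference)
    return {key: value - offset for key, value in series.items()}


# Express all values relative to the 1850-1900 mean.
shifted = rebaseline(anomalies, 1850, 1900)
```

After the shift, every value moves by the same offset, so differences between months are unchanged.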