World Bank data for Exiobase 2.0 #19

Open
bixiou wants to merge 48 commits into base: master

@bixiou bixiou commented May 2, 2018

Data for 2007 from World Bank. Includes data on Population, GDP (in current $ or current PPP), Trade (Ex, Im), Labor, as well as Land area and Energy use.

The (quite arbitrary) order of regions is preserved in the data.

Warning: there is no data for Taiwan, and a few other data points are missing as well.

Use this code to create a DataFrame from the data:

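import pandas as pd

# path_misc is assumed to be defined elsewhere as the folder containing WB_data.xls
# Read the first sheet and keep the first 47 rows
WB_data = pd.read_excel(path_misc + '/WB_data.xls', sheet_name=0).iloc[0:47]
WB_data = WB_data.set_index('region')
# Euro conversion (WB_data is in dollars but Exiobase is in euros)
USD_EUR = 0.730754
money_cols = ['GDP_PPP', 'GDP', 'GDPpc', 'GDPpcPPP', 'Export', 'Import']
WB_data[money_cols] = USD_EUR * WB_data[money_cols]
# Usage: select rows (regions) and columns (variables) in a single .loc call
WB_data.loc[['FR', 'AT'], ['GDPpc', 'GDP']]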
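
Since some data points are missing (Taiwan in particular), it may help to list the gaps before using the table. The snippet below is only a sketch: it runs on a small stand-in DataFrame with placeholder values rather than on the real WB_data.xls, and the helper name `missing_points` is hypothetical.

```python
import numpy as np
import pandas as pd


def missing_points(df):
    """Map each region with at least one gap to the list of columns that are NaN."""
    mask = df.isna()
    return {
        region: list(mask.columns[mask.loc[region].to_numpy()])
        for region in df.index
        if mask.loc[region].any()
    }


# Stand-in table with placeholder values (NOT real World Bank figures)
demo = pd.DataFrame(
    {'Population': [1.0, np.nan, 3.0], 'GDP': [10.0, np.nan, np.nan]},
    index=pd.Index(['FR', 'TW', 'XX'], name='region'),
)
gaps = missing_points(demo)
print(gaps)
```

On the real table, the same call would be `missing_points(WB_data)`, assuming the index holds unique region codes.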

@coveralls

Pull Request Test Coverage Report for Build 32

  • 0 of 0 (NaN%) changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 42.906%

Totals Coverage Status
Change from base Build 31: 0.0%
Covered Lines: 1010
Relevant Lines: 2354

💛 - Coveralls

@coveralls

coveralls commented May 2, 2018

Pull Request Test Coverage Report for Build 34

  • 0 of 0 (NaN%) changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 42.906%

Totals Coverage Status
Change from base Build 31: 0.0%
Covered Lines: 1010
Relevant Lines: 2354

💛 - Coveralls

I made an error while creating the data for the RoW regions. I have therefore withdrawn the variables that cannot be computed for RoW regions (because they are not extensive).
I added PPP conversion factors.
Note that from GDP, Population and PPP, the following variables can be deduced: GDP_PPP, GDPpc, GDPpcPPP. I left them in the data for convenience. Note that there is an inconsistency in the Cyprus data from the World Bank.
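
As a rough sketch of how the three derived variables could be recomputed from the three base ones: the code below assumes that the `PPP` column holds a price level ratio (PPP conversion factor divided by the market exchange rate), so that GDP in PPP terms is GDP divided by `PPP`. That interpretation, the column name `PPP`, and the helper name `add_derived` are assumptions, and the numbers are placeholders rather than World Bank values.

```python
import pandas as pd


def add_derived(df):
    """Return a copy of df with GDP_PPP, GDPpc and GDPpcPPP recomputed.

    Assumes df has columns 'GDP', 'Population' and 'PPP', where 'PPP' is a
    price level ratio (an assumption, not confirmed by the data file).
    """
    out = df.copy()
    out['GDP_PPP'] = out['GDP'] / out['PPP']
    out['GDPpc'] = out['GDP'] / out['Population']
    out['GDPpcPPP'] = out['GDP_PPP'] / out['Population']
    return out


# Placeholder values for two fictitious regions
demo = pd.DataFrame(
    {'GDP': [100.0, 60.0], 'Population': [10.0, 20.0], 'PPP': [0.5, 1.0]},
    index=pd.Index(['A', 'B'], name='region'),
)
derived = add_derived(demo)
print(derived)
```

Because GDP and Population are extensive, they can be summed over the countries of a RoW region first and the ratios taken afterwards, whereas the per-capita values themselves cannot simply be added.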
@bixiou
Author

bixiou commented May 3, 2018

WARNING: there is a problem with RoW Europe (WE) and RoW Asia & Pacific (WA): the World Bank does not provide data for Europe alone, only for Europe & Central Asia. Similarly, it does not provide data for Asia & Pacific, only for South Asia and for East Asia & Pacific.
Hence, I mapped the Exiobase aggregates onto the World Bank ones, although they do not coincide. In particular, Central Asia is included in RoW Europe in WB_data.xls, which makes its population four times higher than in population.txt. The effect on RoW Asia is smaller (8%), given the much larger population of RoW Asia without Central Asia.

Preliminary comparisons of GDP and trade data computed from Exiobase (using my own functions) give me the impression that Exiobase data are not adjusted to be consistent with national accounts.
Also, there seems to be a discrepancy between the values of imports and exports computed from Exiobase and those from the World Bank, the former being about 20% lower. I wonder where this comes from.
Finally, Exiobase data seem to be in basic prices (as shown by the fact that the value added data correspond to GDP at factor cost, not to GDP per se), and I wonder how the rest of GDP (namely, value added taxes) is accounted for in the data.
Admittedly, I have only been using pymrio for a week, so I still have a lot to learn about all of this. In any case, I congratulate you, Konstantin and your colleagues, for this fantastic work!
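
For reference, the import/export comparison mentioned above can be sketched as follows. This is not the function actually used for the figures in this comment and not a pymrio API. It assumes a multi-regional table with an intermediate flow matrix `Z` and a final demand matrix `Y` whose rows and columns are labelled by region, and it counts as exports every flow from a region's rows to another region's columns. All names and numbers are placeholders.

```python
import numpy as np


def trade_by_region(Z, Y, row_regions, z_col_regions, y_col_regions):
    """Return (exports, imports) dicts keyed by region.

    Exports of r: flows from rows of r to columns (Z or Y) of other regions.
    Imports of r: flows from rows of other regions to columns of r.
    """
    row_regions = np.asarray(row_regions)
    flows = np.hstack([Z, Y])
    col_regions = np.concatenate([z_col_regions, y_col_regions])
    exports, imports = {}, {}
    for r in np.unique(row_regions):
        own_rows = row_regions == r
        own_cols = col_regions == r
        exports[str(r)] = float(flows[own_rows][:, ~own_cols].sum())
        imports[str(r)] = float(flows[~own_rows][:, own_cols].sum())
    return exports, imports


def relative_gap(mrio_value, reference_value):
    """Relative difference of the MRIO value with respect to the reference."""
    return (mrio_value - reference_value) / reference_value


# Toy table: two regions, one sector each (placeholder numbers)
Z = np.array([[5.0, 2.0], [1.0, 4.0]])
Y = np.array([[10.0, 3.0], [2.0, 8.0]])
regions = ['A', 'B']
exports, imports = trade_by_region(Z, Y, regions, regions, regions)
print(exports, imports)

# A value of -0.2 would correspond to the MRIO figure being 20% lower
gap = relative_gap(80.0, 100.0)
print(gap)
```

The sketch only measures the gap; it does not explain it.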
@konstantinstadler
Member

Hi bixiou,
Thanks for the contribution - great that you find pymrio useful.
From my side, pymrio is currently on hold since I am on paternity leave until September. I will have a look into the issues and the pull request when I am back. Just a short clarification before you run off in the wrong direction: pymrio aims to be generic for various MRIOs, not only for exiobase2. My plan is to update the population (and also GDP PPP) data when I am back, based on UNdata with some manual work for Taiwan. This will then feed into a dynamic aggregation to be consistent across the different supported MRIO databases.
best
kst

@bixiou
Author

bixiou commented May 8, 2018

OK. Enjoy the family time, Konstantin!

@bixiou
Author

bixiou commented Jul 27, 2018

I cleaned up the code I use and made it object-oriented so that I could push it to my branch of pymrio.

Yet, I am not sure whether it should be merged into the main branch of pymrio. Some functions are useful (they give the impacts, the EROIs or the structural path analysis of some sectors, etc.), but the code is not the neatest. This is because I integrate several databases and have to include hand-written patches for each of them in the functions. Thus, the code runs well for the databases that are integrated (Exiobase 1 and 2, THEMIS and Cecilia), but probably not for WIOD or Eora. Moreover, THEMIS and Cecilia are little-known databases (and the former is not publicly available).

Finally, the code does not pass the tests, and I haven't written any documentation (each function has its docstring, but it's not as neat as in the main branch of pymrio).

If you think it should be merged, I'll work on passing the tests and documenting it better. I can also push other functions that I have, more related to trade and labor, and implementing choropleth maps.

If you think it shouldn't, I hope you'll be able to merge only my first commits, which add the data from the World Bank, and discard the later commits of my branch.
