The World Bank publishes an annual global Doing Business report, available at <http://www.doingbusiness.org/>. The latest data were pulled from the report updated on February 15, 2018, which covers 189 economies for the years 2004 to 2017.
Series pulled from the World Bank Doing Business report
Instructions on pulling Doing Business series
- The data can be accessed from the World Bank’s Doing Business database page: http://data.worldbank.org/data-catalog/doing-business-database. Under the tab entitled "Data & Resources", the raw data can be downloaded in either CSV or Excel format.
- Taking the Excel file as an example, the workbook contains three sheets named Data, Series, and Footnote. The Doing Business data come from the Data sheet. Before the data can be imported into IFs, they need some pre-processing. First, concord the country names with the IFs country names, since IFs does not have a dedicated country concordance list for Doing Business data. Next, split the data into individual files, one per series. After these two steps, the data should be ready for import.
- One thing to note during country concordance is that the data include some sub-areas (e.g., China - Beijing). Those rows should be ignored, since only country-level data are needed. Entries such as 'Hong Kong, China' are different: they are main areas, and their data are needed. To tell sub-areas from main areas, look for the dash separator ' - ' between the country and the sub-area (i.e., China - Beijing is treated as a sub-area). A hyphen inside a country name, as in Guinea-Bissau, does not mark a sub-area.
- Certain series in the system are no longer measured in the Doing Business data, so series that do not appear in the downloaded data were typically not updated. Because the World Bank revises its data and methodology each year, some series are not consistent across years and need to be handled carefully.
- Series that were not updated are as follows (11 series in total): BetterBusinessIndex, BusinessRegulationIndex, ConstructionPermitDays, ConstructionPermitProcedures, EmpRigiIndex, ExportDocumentationNeeded, ImportDocumentationNeeded, HiringDiffIndex, RedundancyCostperWeekSalary, RedundancyIndex, TotalTaxRate. Note that the Tax%Profit series was updated using the "Total tax rate (% of profit)" series in the raw downloaded data.
- Series that have been updated but need to be put on hold because their values are inconsistent across years: SeriesGovWBDoingBusContainerExportCostUSD, SeriesGovWBDoingBusContainerImportCostUSD, SeriesGovWBDoingBusCreditTransparency, SeriesGovWBDoingBusImportDays, SeriesGovWBDoingBusLegalRightsStrengthIndex. In the most recent data, these 5 series cover only 2014-2017, while the data in IFs cover 2006-2014. Moreover, the values for the overlapping year 2014 differ so much that the new data might need to be created as new series.
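The pre-processing steps and the overlap check described in the list above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure actually used for the pull: the column names "Country Name" and "Series Name", the sample rows and values, the concordance entry, and the 10% tolerance are all hypothetical, and the real Data sheet layout and IFs country names should be checked before adapting it.

```python
from collections import defaultdict

# Hypothetical rows standing in for the Data sheet; column names and
# values are assumptions, not taken from the real download.
SAMPLE_ROWS = [
    {"Country Name": "China", "Series Name": "Total tax rate (% of profit)", "2014": 64.6},
    {"Country Name": "China - Beijing", "Series Name": "Total tax rate (% of profit)", "2014": 64.0},
    {"Country Name": "Hong Kong, China", "Series Name": "Total tax rate (% of profit)", "2014": 22.8},
    {"Country Name": "Guinea-Bissau", "Series Name": "Total tax rate (% of profit)", "2014": 45.5},
]

# Hypothetical concordance from World Bank names to IFs names.
SAMPLE_CONCORDANCE = {"Hong Kong, China": "Hong Kong"}


def is_sub_area(name):
    """A sub-area is written 'Country - Area'; requiring the spaces around
    the dash keeps hyphenated country names such as Guinea-Bissau."""
    return " - " in name


def preprocess(rows, concordance):
    """Drop sub-areas, rename countries, and group the rows by series."""
    by_series = defaultdict(list)
    for row in rows:
        name = row["Country Name"]
        if is_sub_area(name):
            continue
        cleaned = dict(row)
        # Names missing from the concordance are kept unchanged.
        cleaned["Country Name"] = concordance.get(name, name)
        by_series[row["Series Name"]].append(cleaned)
    return dict(by_series)


def overlap_mismatches(old, new, year, rel_tol=0.10):
    """Compare two {country: {year: value}} tables in the overlapping year
    and return the countries whose values differ by more than rel_tol."""
    flagged = {}
    for country, old_years in old.items():
        a = old_years.get(year)
        b = new.get(country, {}).get(year)
        if a is None or b is None:
            continue
        scale = max(abs(a), abs(b))
        if scale > 0 and abs(a - b) / scale > rel_tol:
            flagged[country] = (a, b)
    return flagged
```

Each group returned by `preprocess` corresponds to one series and could then be written to its own file for import. Running `overlap_mismatches` on the 2014 values of the existing IFs data and the new download gives a quick list of countries where the two sources disagree, which may help decide whether a series should be put on hold.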
What are the biggest differences (in both relative and absolute terms) in 2015? 2050? 2100?
What are the effects of these changes on other variables in the model?
Any other major changes or anomalies?