The World Bank publishes an annual global Doing Business report, available at <http://www.doingbusiness.org/>. The latest data was pulled from the report updated on February 15, 2019, which covers 189 economies for the years 2004 to 2019.
Series pulled from the World Bank Doing Business report
Instructions on pulling doing business series
- The data can be downloaded from the World Bank’s Doing Business database page, http://data.worldbank.org/data-catalog/doing-business-database. Under the "Data & Resources" tab, the raw data is available in either CSV or Excel format.
- Using the Excel file as an example: the workbook contains three sheets named Data, Series, and Footnote. The Doing Business data comes from the Data sheet. The raw data needs some pre-processing before it can be imported into IFs. First, concord the country names with the IFs country names, since IFs does not have a country concordance list specific to Doing Business data. Next, split the data into individual files, one per series. After these two steps, the data is ready for import.
- During country concordance, note that the data includes some sub-areas (e.g., China - Beijing). Ignore these rows, since only country-level data is needed. Areas such as 'Hong Kong, China' are different: their data is needed. To tell sub-areas from main areas, look for the dash symbol '-' (e.g., China - Beijing is treated as a sub-area).
- Some series already in the system are not measured in the Doing Business data. Series that were missing from the downloaded data were therefore typically not updated. Because the World Bank revises its data and methodology each year, some series are not consistent across years and need to be handled carefully.
- Series that were not updated (11 series in total): BetterBusinessIndex, BusinessRegulationIndex, ConstructionPermitDays, ConstructionPermitProcedures, EmpRigiIndex, ExportDocumentationNeeded, ImportDocumentationNeeded, HiringDiffIndex, RedundancyCostperWeekSalary, RedundancyIndex, TotalTaxRate. Note that the Tax%Profit series was updated using the "Total tax rate (% of profit)" series in the raw downloaded data.
- Series that were updated but must be put on hold because their values are inconsistent across years: SeriesGovWBDoingBusContainerExportCostUSD, SeriesGovWBDoingBusContainerImportCostUSD, SeriesGovWBDoingBusCreditTransparency, SeriesGovWBDoingBusImportDays, SeriesGovWBDoingBusLegalRightsStrengthIndex. In the most recent data, these 5 series cover only the years 2014-2017, while the data in IFs covers 2006-2014. Moreover, the values for the overlapping year 2014 differ so much that they might need to be created as new series.
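The pre-processing steps above (dropping sub-areas, concording country names, and splitting the data by series) could be sketched roughly as follows. This is only an illustration in Python using the standard library, not the script used for the actual pull. The column names, the sample rows, and the concordance mapping are all hypothetical. It also assumes that sub-areas are written with spaces around the dash (as in "China - Beijing"), so that hyphenated country names such as Guinea-Bissau are not dropped by mistake.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; the real export may label these differently.
COUNTRY_COL = "Country Name"
SERIES_COL = "Series Name"


def is_sub_area(name):
    """Treat names with a spaced dash, e.g. 'China - Beijing', as sub-areas."""
    return " - " in name


def split_by_series(rows, concordance):
    """Drop sub-areas, map names to IFs names, and group rows by series."""
    by_series = defaultdict(list)
    unmatched = set()
    for row in rows:
        name = row[COUNTRY_COL].strip()
        if is_sub_area(name):
            continue  # sub-area rows are ignored
        ifs_name = concordance.get(name)
        if ifs_name is None:
            unmatched.add(name)  # needs a manual concordance entry
            continue
        cleaned = dict(row)
        cleaned[COUNTRY_COL] = ifs_name
        by_series[row[SERIES_COL]].append(cleaned)
    return dict(by_series), unmatched


# Tiny made-up sample standing in for the downloaded Data sheet.
SAMPLE = """Country Name,Series Name,2018
China,Time to import (days),10
China - Beijing,Time to import (days),9
"Hong Kong, China",Time to import (days),1
Atlantis,Time to import (days),5
"""

# Illustrative mapping only; not the actual IFs concordance list.
concordance = {"China": "China", "Hong Kong, China": "Hong Kong"}

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
by_series, unmatched = split_by_series(rows, concordance)
for series, series_rows in by_series.items():
    print(series, [r[COUNTRY_COL] for r in series_rows])
print("unmatched:", sorted(unmatched))
```

Collecting unmatched names instead of silently dropping them makes it easier to spot countries that still need a concordance entry. Each group in `by_series` could then be written to its own file for import.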
For the 2019 update, the WB Doing Business data was pulled from the website using R, then cleaned and concorded with the corresponding variable names and the IFs country concordance list. The R script can be found in the Data Team Shared folder. The Doing Business methodology was updated in 2015. The changes that affected our pulls involve the following series, which are no longer reported (they stopped in 2015); each one is now disaggregated into two separate series:
GovWBDoingBusImportDays; SeriesGovWBDoingBusContainerImportCostUSD; SeriesGovWBDoingBusContainerExportCostUSD.
The new series, GovWBDoingBusContainExportCostUSDNew,
The new series, GovWBDoingBusContainImportCostUSDNew,
The new series, GovWBDoingBusImportDaysNew
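When deciding whether an updated series can extend an existing one or has to be kept as a separate "New" series, it may help to compare the two in the years where they overlap (2014 for the series placed on hold above). The following is a minimal Python sketch of such a check. The function name, the 5% tolerance, and all numbers are hypothetical and chosen only for illustration.

```python
def overlap_report(old, new, tolerance=0.05):
    """Compare two {(country, year): value} series on their shared keys.

    Returns (country, year, abs_diff, rel_diff) tuples for every shared
    observation whose relative difference exceeds `tolerance`
    (a hypothetical threshold, 5% by default).
    """
    flagged = []
    for key in sorted(old.keys() & new.keys()):
        old_val, new_val = old[key], new[key]
        abs_diff = new_val - old_val
        if old_val != 0:
            rel_diff = abs_diff / old_val
        else:
            # Relative change is undefined for a zero base value.
            rel_diff = 0.0 if abs_diff == 0 else float("inf")
        if abs(rel_diff) > tolerance:
            flagged.append((key[0], key[1], abs_diff, rel_diff))
    return flagged


# Made-up numbers for illustration only; not actual Doing Business values.
old_series = {("A", 2013): 100.0, ("A", 2014): 100.0, ("B", 2014): 200.0}
new_series = {("A", 2014): 150.0, ("B", 2014): 204.0, ("A", 2015): 155.0}

for country, year, abs_diff, rel_diff in overlap_report(old_series, new_series):
    print(f"{country} {year}: abs={abs_diff:+.1f}, rel={rel_diff:+.1%}")
```

If many countries are flagged in the overlapping year, the two series probably should not be spliced together.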
What are the biggest differences (in both relative and absolute terms) in 2015? 2050? 2100?
What are the effects of these changes on other variables in the model?
Any other major changes or anomalies?