Difference between revisions of "Database files in IFs"
Revision as of 17:18, 28 March 2016
IFs Historical Database Files
IFs uses Microsoft Access files to store its data and its data dictionary (metadata). All data files are in the “IFs\Data” folder. The data and related files are listed below:
- IFsHistSeries.Mdb is the largest and most frequently used IFs data file. It contains more than three thousand data tables, each with 186 rows (one row of data per country) and several columns (one column per year). The figure below shows data from an IFsHistSeries table.
- DataDict.Mdb is the data dictionary file. It holds a table with one row of metadata (e.g., definition, unit, source, date of last update) for each of the data tables in IFsHistSeries.Mdb.
- IFs.Mdb is the Microsoft Access file that contains several IFs data tables. One of these tables, "Country Translation", is required for automated IFs data import/update. The Country Translation table maintains (and updates) a concordance list between the country names used by IFs and those used by data sources.
- IFsWVSCohort.Mdb is the file that contains data from the waves of the World Values Survey, a global survey of cultural values conducted by the University of Michigan.
- IFsDataImport.Mdb is the MS Access database that holds the data series imported using the IFs software's automated single-series import interface.
- IFsDataImportBatch.Mdb is the MS Access database that holds the data series imported using the IFs software's automated batch import interface.
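As a rough illustration of how the Country Translation concordance and the one-row-per-country, one-column-per-year table layout fit together, the sketch below translates source country names and pivots long records into a wide table. It is not the IFs import code: the concordance entries, the function names (`to_ifs_name`, `pivot_to_wide`), and the sample values are all invented for illustration, and plain Python structures stand in for the Access tables.

```python
from collections import defaultdict

# Hypothetical concordance: source-specific country name -> IFs country name.
# In IFs this role is played by the "Country Translation" table in IFs.Mdb;
# the entries below are made up for illustration.
COUNTRY_TRANSLATION = {
    "Korea, Rep.": "Korea South",
    "Russian Federation": "Russia",
}


def to_ifs_name(source_name, concordance=COUNTRY_TRANSLATION):
    """Return the mapped country name, or the trimmed source name if unmapped."""
    cleaned = source_name.strip()
    return concordance.get(cleaned, cleaned)


def pivot_to_wide(records, concordance=COUNTRY_TRANSLATION):
    """Turn (country, year, value) records into one row per country
    with one column per year, mirroring the layout described above."""
    years = sorted({year for _, year, _ in records})
    rows = defaultdict(dict)
    for source_name, year, value in records:
        rows[to_ifs_name(source_name, concordance)][year] = value
    # Country-year cells with no record are left as None (no data).
    table = {
        country: [rows[country].get(year) for year in years]
        for country in sorted(rows)
    }
    return years, table


# Made-up sample records in "long" form, as a data source might supply them.
records = [
    ("Russian Federation", 2010, 1.5),
    ("Russian Federation", 2011, 1.7),
    ("Korea, Rep.", 2011, 3.2),
]
years, table = pivot_to_wide(records)
```

Here `years` holds the column headers and `table` maps each translated country name to its row of values. Falling back to the source name when no mapping exists is a design choice of this sketch; a real import would more likely flag unmapped names for review.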
IFs Data Table Naming Convention
Names of all data tables in IFsHistSeries.MDB start with the prefix “Series”. The “Series” prefix is followed by an issue-area prefix, e.g., “Ag” for agriculture or “Ed” for education. This second-tier prefix may be followed by further prefixes (e.g., "EdSec" for secondary education) or may be absent altogether (e.g., in some of the earlier imports).
Series names may also carry a suffix, usually to distinguish among sources for the same or similar series.
No spaces or symbols (other than %) are allowed in a series name.
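The naming rules above can be checked mechanically. The following is a minimal sketch, not part of IFs itself: the function name `is_valid_series_name` and the example names are hypothetical, and the pattern assumes that digits are allowed and that an underscore counts as a disallowed symbol, neither of which the convention states explicitly.

```python
import re

# Assumed pattern: the literal prefix "Series" followed by one or more
# letters, digits, or "%" characters, and nothing else.
SERIES_NAME = re.compile(r"Series[A-Za-z0-9%]+")


def is_valid_series_name(name):
    """Return True if the name has the "Series" prefix and no spaces
    or symbols other than "%"."""
    return SERIES_NAME.fullmatch(name) is not None


# Hypothetical table names used only to exercise the check.
candidates = ["SeriesEdSecEnroll%", "Series Ag Production", "AgProduction"]
results = {name: is_valid_series_name(name) for name in candidates}
```

Because the issue-area prefix is sometimes absent in earlier imports, the sketch deliberately does not try to validate the second-tier prefix.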
IFs DataDict Columns
The DataDict.mdb file serves as a reference for all series in the IFsHistSeries.mdb file. Every series in IFsHistSeries has an entry in DataDict containing all of the metadata on that series. The Data Dictionary lists each variable, the group to which it belongs (e.g., Agriculture, Economics), its subgroup (e.g., Trade, Consumption), and additional identifying information. This information includes whether or not the data is a series (Yes/No), along with the CoVaTrA and Cohort columns. It also includes a definition of the variable and a column for an extended definition provided by the data source. The data dictionary has columns identifying the years for which a series has data, the source of the data, the original source (e.g., a series may have been pulled from the FAO website but may have originated as World Bank research), the name of the series at the source, and which team member last updated the series and when. It also includes instructions on how data should be aggregated or disaggregated for provincial models (e.g., by population or GDP distribution). Some additional information used by the model is also supplied: whether a datum of 0 should be treated as a null or as a zero, whether a series is used in the preprocessor, whether it is compared to other forecasts, the number of decimal places to read, and any formulas applied to the data.
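To make the role of these columns more concrete, the sketch below models a small subset of a data dictionary entry and shows how two of the model-facing settings (the zero-versus-null flag and the number of decimal places) might be applied when reading a value. All field names, the `DictEntry` class, the `read_value` function, and the sample entry are hypothetical; the actual DataDict column names and the way IFs applies them are not given here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DictEntry:
    # All field names are hypothetical stand-ins for DataDict columns.
    table: str           # name of the table in IFsHistSeries
    group: str           # e.g., Agriculture, Economics
    subgroup: str        # e.g., Trade, Consumption
    definition: str
    units: str
    source: str
    zero_is_null: bool   # treat a stored 0 as "no data"?
    decimals: int        # number of decimal places to read


def read_value(raw: Optional[float], entry: DictEntry) -> Optional[float]:
    """Apply the entry's zero-handling and decimal settings to one datum."""
    if raw is None:
        return None
    if raw == 0 and entry.zero_is_null:
        return None  # a stored zero means "missing" for this series
    return round(raw, entry.decimals)


# Made-up entry used only to exercise the function.
entry = DictEntry(
    table="SeriesAgExample",
    group="Agriculture",
    subgroup="Trade",
    definition="Illustrative series",
    units="percent",
    source="Hypothetical source",
    zero_is_null=True,
    decimals=2,
)
cleaned = [read_value(v, entry) for v in (0, 3.14159, None)]
```

Keeping the zero-handling rule in the metadata rather than in the data table lets the same stored value be interpreted per series, which appears to be the intent of the column described above.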