
The most recent and complete education model documentation is available on Pardee's website. Although the text in this interactive system is, for some IFs models, often significantly out of date, you may still find the basic description useful to you.


The education model of IFs simulates patterns of educational participation and attainment in 186 countries over a long time horizon under alternative assumptions about uncertainties and interventions (Irfan 2008).  Its purpose is to serve as a generalized thinking and analysis tool for educational futures within a broader human development context. 

The model forecasts gender- and country-specific access, participation and progression rates at levels of formal education starting from elementary through lower and upper secondary to tertiary. The model also forecasts costs and public spending by level of education. Dropout, completion and transition to the next level of schooling are all mapped onto corresponding age cohorts thus allowing the model to forecast educational attainment for the entire population at any point in time within the forecast horizon.

From simple accounting of grade progressions to complex budget balancing and budget impact algorithms, the model draws upon the extant understanding of and standards for national systems of education around the world (e.g., UNESCO's ISCED classification, explained later). One difference between other attempts at forecasting educational participation and attainment (e.g., McMahon 1999; Bruns, Mingat and Rakotomalala 2003; Wils and O’Connor 2003; Delamonica, Mehrotra and Vandemoortele 2001; Cuaresma and Lutz 2007) and our forecasting is the embedding of education within an integrated model in which demographic and economic variables interact with education, in both directions, as the model runs.

In the figure below we display the major variables and components that directly determine education demand, supply, and flows in the IFs system.  We emphasize again the inter-connectedness of the components and their relationship to the broader human development system.  For example, during each year of simulation, the IFs cohort-specific demographic model provides the school age population to the education model.  In turn, the education model feeds its calculations of education attainment to the population model’s determination of women’s fertility.  Similarly, the broader economic and socio-political systems provide funding for education, and levels of educational attainment affect economic productivity and growth, and therefore also education spending.


Structure and Agent System: Education

National Education System
Organizing Structure: Various Levels of Education; Age Cohorts
Stocks: Educational Attainment; Enrollment
Flows: Intake; Graduation; Transition; Spending
Key Aggregate Relationships (illustrative, not comprehensive):
  • Demand for and achievement in education changes with income, societal change
  • Public spending available for education rises with income level
  • Cost of schooling rises with income level
  • Lack (surplus) of public spending in education hurts (helps) educational access and progression
  • More education helps economic growth and reduces fertility
Key Agent-Class Behavior Relationships (illustrative, not comprehensive):
  • Families send children to school
  • Government revenue and expenditure in education

Education Model Coverage

UNESCO has developed a standard classification system for national education systems called the International Standard Classification of Education (ISCED). ISCED 1997 uses a numbering system to identify the sequential levels of educational systems (namely pre-primary, primary, lower secondary, upper secondary, post-secondary non-tertiary and tertiary), which are characterized by curricula of increasing difficulty and specialization as students move up the levels. The IFs education model covers primary (ISCED level 1), lower secondary (ISCED level 2), upper secondary (ISCED level 3), and tertiary education (ISCED levels 5A, 5B and 6).

The model covers 186 countries that, as in any other sub-module of IFs, can be grouped into any number of flexible country groupings, e.g., UNESCO regions. Country-specific entrance age and school-cycle length data are collected and used in IFs to represent national education systems as closely as possible. For all of these levels, IFs forecasts variables representing student flow rates (e.g., intake, persistence, completion and graduation) and stocks (e.g., enrolment), with girls and boys handled separately within each country.

One important distinction among the flow rates is a gross rate versus a net rate for the same flow. Gross rates include all pupils, whereas net rates include only pupils who enter school at the right age, given the statutory entrance age in the country, and proceed without any repetition. The IFs education model forecasts both net and gross rates for primary education. For other levels we forecast gross rates only. It would be useful to look at net rates at least for lower secondary, as the catch-up continues up to that level. However, we could not obtain net rate data for lower secondary.
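To make the gross/net distinction concrete, the following minimal sketch computes both rates for enrollment. The function names and the numbers are illustrative assumptions, not IFs code or data; the point is only that a gross rate can exceed 100 percent because over-age pupils are counted in the numerator.

```python
def gross_rate(all_pupils: float, official_age_population: float) -> float:
    """Gross rate (%): pupils of any age relative to the official-age population."""
    return 100.0 * all_pupils / official_age_population


def net_rate(on_age_pupils: float, official_age_population: float) -> float:
    """Net rate (%): only pupils of the official age are counted."""
    return 100.0 * on_age_pupils / official_age_population


# Hypothetical country: 1,000 children of primary-school age, 900 of them
# enrolled, plus 250 over-age pupils (late entrants and repeaters).
population = 1000
on_age = 900
over_age = 250

gross = gross_rate(on_age + over_age, population)  # may exceed 100
net = net_rate(on_age, population)                 # cannot exceed 100
print(f"gross = {gross:.1f}%, net = {net:.1f}%")
```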

Additionally, for lower and upper secondary, the IFs model covers both general and vocational curriculum and forecasts the vocational share of total enrolment, EDSECLOWRVOC (for lower secondary) and EDSECUPPRVOC (for upper secondary). Like all other participation variables, these two are also disaggregated by gender.

The output of the national education system, i.e., school completion and partial completion by young people, is added to the educational attainment of the adults in the population. IFs forecasts four categories of attainment (the portions with no education, completed primary education, completed secondary education and completed tertiary education) separately for men and women above fifteen years of age, by five-year cohorts as well as in aggregate over all adult cohorts. The model software contains a so-called "Education Pyramid", a display of educational attainment mapped over five-year age cohorts, as is usually done for population pyramids.

Another aggregate measure of educational attainment that we forecast is the average years of education of adults. We have several measures: EDYRSAG15, average years of education for all adults aged 15 and above; EDYRSAG25, average years of education for those 25 and older; and EDYRSAG15TO24, average years of education for the youngest adults, aged 15 to 24.

The IFs education model also covers the financing of education. The model forecasts per student public expenditure as a share of per capita income. The model also forecasts total public spending on education and the share of that spending that goes to each level of education.

What the Model Does Not Cover

ISCED level 0, pre-primary, and level 4, post-secondary non-tertiary, are not common across all countries and are thus excluded from the IFs education model.

On the financing side, the model does not include private spending on education, which is a significant share of spending in many countries, especially for tertiary education, and in some countries even for secondary education. Scarcity of good data and the lack of any pattern in the historical unfolding preclude modelling private spending on education.

The quality of national education systems can also vary across countries and over time. The IFs education model does not forecast any explicit indicator of education quality. However, the survival and graduation rates that the model forecasts for all levels of education are implicit indicators of system quality. At this point IFs does not forecast any indicator of the cognitive quality of learners. However, the IFs database does have data on cognitive quality.

Sources of Education Data

UNESCO is the UN agency charged with collecting and maintaining education-related data from across the world. UNICEF collects some education data through its MICS surveys. USAID also collects education data as a part of its Demographic and Health Surveys (DHS). The OECD collects better data, especially on tertiary education, for its members as well as a few other countries.

We collected our student flows and per student cost data from UNESCO Institute for Statistics' (UIS) web data repository. (Accessed on 05/17/2013)

For educational attainment data we use estimates by Robert Barro and Jong-Wha Lee (2000). They have published their estimates of human capital stock (i.e., the educational attainment of adults) at the website of the Center for International Development of Harvard University. In 2001, Daniel Cohen and Marcelo Soto presented a paper providing another human capital dataset for a total of ninety-five countries. We collect that data as well in our database.

When needed, we also calculated our own series using underlying data from UNESCO. For example, we calculate an adjusted net intake rate for primary using the age-specific intake rates that UNESCO reports. We also calculated survival rates in lower and upper secondary (EDSECLOWRSUR, EDSECUPPRSUR) using a reconstructed cohort simulation method applied to grade-wise enrollment data for two consecutive years. The transition rate from lower to upper secondary is also calculated using grade data.
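The idea behind a reconstructed cohort calculation can be sketched as follows. This is a deliberately simplified version that assumes no repeaters, so that every pupil in grade g+1 in the second year is taken to have come from grade g in the first year; the function name and the enrollment figures are hypothetical, and the actual IFs pre-processor may treat repetition and data gaps differently.

```python
def survival_to_last_grade(enroll_year1, enroll_year2):
    """Survival rate (%) to the last grade from two consecutive years of
    grade-wise enrollment.

    Simplifying assumption: no repetition, so the grade-to-grade flow rate
    is enrollment in grade g+1 (year 2) over enrollment in grade g (year 1).
    """
    survival = 1.0
    for g in range(len(enroll_year1) - 1):
        flow = enroll_year2[g + 1] / enroll_year1[g]
        survival *= min(flow, 1.0)  # cap at 1 so noisy data cannot exceed 100%
    return 100.0 * survival


# Hypothetical three-grade lower secondary cycle.
year1 = [100.0, 90.0, 80.0]   # enrollment by grade in year t
year2 = [105.0, 95.0, 81.0]   # enrollment by grade in year t+1

rate = survival_to_last_grade(year1, year2)  # 0.95 * 0.90 = 85.5%
print(f"survival to last grade: {rate:.1f}%")
```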

Reconciliation of Flow Rates

Incongruities among the base year primary flow rates (intake, survival, and enrollment) can arise either from reported data values that, in combination, do not make sense, or from the use of “stand-alone” cross-sectional estimations used in the IFs pre-processor to fill missing data.  Such incongruities might arise among flow rates within a single level of education (e.g., primary intake, survival, and enrollment rates that are incompatible) or between flow rates across two levels of education (e.g., primary completion rate and lower secondary intake rate).

The IFs education model uses algorithms to reconcile incongruent flow values. They work by (1) analyzing incongruities; (2) applying protocols that identify and retain the data or estimations that are probably of higher quality; and (3) substituting recomputed values for the data or estimations that are probably of lesser quality. For example, at the primary level, data on enrollment rates are more extensive and more straightforward than either intake or survival data; in turn, intake rates have fewer missing values and are arguably more reliable measures than survival rates. The IFs pre-processor therefore reconciles student flow data for primary by using an algorithm that assumes enrollment numbers to be more reliable than entrance data, and entrance data to be more reliable than survival data.

Variable Naming Convention

All education model variable names start with a two-letter prefix of 'ED' followed, in most cases, by the three letter level indicator - PRI for primary, SEC for secondary, TER for tertiary. Secondary is further subdivided into SECLOWR for lower secondary and SECUPPR for upper secondary. Parameters in the model, which are named using lowercase letters like those in other IFs modules, also follow a similar naming convention.
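As an illustration of the convention, the sketch below assembles variable names from a level and a suffix. The helper and its dictionary are hypothetical conveniences, not part of IFs; the suffixes used in the examples (INT, SUR, VOC) are taken from variable names that appear elsewhere in this documentation.

```python
# Level indicators as described in the naming convention above.
LEVEL_CODES = {
    "primary": "PRI",
    "lower secondary": "SECLOWR",
    "upper secondary": "SECUPPR",
    "tertiary": "TER",
}


def variable_name(level: str, suffix: str) -> str:
    """Compose an education variable name: 'ED' + level indicator + suffix."""
    return "ED" + LEVEL_CODES[level] + suffix


print(variable_name("primary", "INT"))          # EDPRIINT
print(variable_name("lower secondary", "SUR"))  # EDSECLOWRSUR
print(variable_name("upper secondary", "VOC"))  # EDSECUPPRVOC
```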

Education: Dominant Relations

The dominant relationships in the model are those that determine various educational flow rates, e.g., intake rate for primary (EDPRIINT) or tertiary (EDTERINT), or survival rates in primary (EDPRISUR) or lower secondary (EDSECLOWRSUR). These rates are functions of per capita income. Non-income drivers of education are represented by upward shifts in these functions. These rates follow an S-shaped path in most cases. The flows interact with a stocks and flows structure to derive major stocks like enrollment, for the young, and attainment, for the adult.

On the financing side, the major dynamic is in the cost of education, e.g., cost per student in primary, EDEXPERPRI, the bulk of which is teachers' salaries and which thus goes up with rising income.

Public spending allocation in education, GDS(Educ), is a function of national income per capita, which proxies the level of economic development. Demand for educational spending (determined by initial projections of enrollment and of per student cost) and the total availability of public funds affect the base allocation derived from the function.

For diagrams see: Student Flow Charts; Budget Flow Charts; Attainment Flow Charts

For equations see: Student Flow Equations; Budget Flow Equations; Attainment Equations

Key dynamics are directly linked to the dominant relations:

  • Intake, survival and transition rates are functions of per capita income (GDPPCP). These functions shift upward over time representing the non-income drivers of education.
  • Each year flow rates are used to update major stocks like enrollment, for the young, and attainment, for the adult.
  • Per student expenditure at all levels of education is a function of per capita income.
  • Deficit or surplus in public spending on education, GDS(Educ), affects intake, transition and survival rates at all levels of education.

Education: Selected Added Value

The IFs education model is an integrated model. The education system in the model is interlinked with demographic, economic and socio-political systems, with mutual feedback within and across these systems. Schooling of the young is linked to the education of the population as a whole in this model.

The model is well suited for scenario analysis with representation of policy levers for entrance into and survival at various levels of schooling. Girls and boys are represented separately in this model.

The education budget is also endogenous to the model, with income-driven dynamics in cost per student for each level of education. Budget availability affects enrollment. Educational attainment raises income and the affordability of education at the individual and national levels.

Education Flow Charts


For each country, the IFs education model represents a multilevel formal education system that starts at primary and ends at tertiary. Student flows, i.e., entry into and progression through the system are determined by forecasts on intake and persistence (or survival) rates superimposed on the population of the corresponding age cohorts obtained from IFs population forecasts. Students at all levels are disaggregated by gender. Secondary education is further divided into lower and upper secondary, and then further into general and vocational according to the curricula that are followed.

The model represents the dynamics in education financing through per student costs for each level of education and a total public spending in education. Policy levers are available for changing both spending and cost.

School completion (or dropout) in the education model is carried forward as the educational attainment of the overall population. As a result, the education model forecasts population structures by age, sex, and attained education, i.e., years and levels of completed education.

The major agents represented in the education system of the model are households (represented by the parents who decide which of their boys and girls will go to school) and governments that direct resources into and across the educational system. The major flows within the model are student and budgetary, while the major stock is that of educational attainment embedded in a population. Other than the budgetary variables, all the flows and stocks are gender disaggregated.

The education model has forward and backward linkages with other parts of the IFs model. During each year of simulation, the IFs cohort-specific demographic model provides the school age population to the education model.  In turn, the education model feeds its calculations of education attainment to the population model’s determination of women’s fertility.  Similarly, the broader economic and socio-political systems provide funding for education, and levels of educational attainment affect economic productivity and growth, and therefore also education spending. 

The figure below shows the major variables and components that directly determine education demand, supply, and flows in the IFs system. The diagram emphasizes the inter-connectedness of the education model components and their relationship to the broader human development system.

[Figure: overview of the education flow (Overvieweducation flow.png)]

Education Student Flow

Student Flow

The IFs education model simulates grade-by-grade student flow for each level of education that the model covers. The grade-by-grade student flow model combines the effects of grade-specific dropout, repetition and reentry into an average cohort-specific grade-to-grade flow rate, calculated from the survival rate for the cohort. Each year the number of new entrants is determined by the forecasts of the intake rate and the entrance age population. In successive years, these entrants are moved to the next higher grades, one grade each year, using the grade-to-grade flow rate. The simulated grade-wise enrollments are then used to determine the total enrollment at the particular level of education. Student flow at a particular level of education, e.g., primary, culminates in rates of completion and of transition by some students to the next level, e.g., lower secondary.
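The grade-by-grade mechanics described above can be sketched in a few lines. The sketch assumes that the average grade-to-grade flow rate is the survival rate raised to the power 1/(n-1) for a cycle of n grades (n of at least 2), i.e., that the same rate applies to every grade transition; that functional form, the function names and all numbers are our assumptions for illustration rather than the IFs implementation.

```python
def grade_to_grade_flow_rate(survival_rate_pct, n_grades):
    """Average per-grade flow rate implied by survival to the last grade.

    Assumption: the same rate applies to each of the n_grades - 1 transitions.
    """
    return (survival_rate_pct / 100.0) ** (1.0 / (n_grades - 1))


def advance_one_year(enrollment_by_grade, intake_rate_pct,
                     entrance_age_population, flow_rate):
    """Move every grade up by one and add new entrants to grade 1."""
    new_entrants = intake_rate_pct / 100.0 * entrance_age_population
    # Pupils in the last grade leave the level (completion or dropout).
    promoted = [flow_rate * pupils for pupils in enrollment_by_grade[:-1]]
    return [new_entrants] + promoted


# Hypothetical three-grade cycle with an 81% survival rate.
flow = grade_to_grade_flow_rate(81.0, 3)                  # about 0.9 per grade
next_year = advance_one_year([100.0, 90.0, 80.0], 95.0, 120.0, flow)
total_enrollment = sum(next_year)
print(next_year, total_enrollment)
```

Summing the simulated grade-wise enrollments gives the total enrollment for the level, which is then divided by the relevant school-age population to obtain an enrollment rate.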

The figure below shows details of the student flow for primary (or, elementary) level. This is illustrative of the student flow at other levels of education. We model both net and gross enrollment rates for primary. The model tracks the pool of potential students who are above the entrance age (as a result of never enrolling or of having dropped out), and brings back some of those students, marked as late/reentrant in the figure, (dependent on initial conditions with respect to gross versus net intake) for the dynamic calculation of total gross enrollments.

A generally similar grade-flow methodology models lower and upper secondary level student flows. We use country-specific entrance ages and durations at each level. As the historical data available does not allow estimating a rate of transition from upper secondary to tertiary, the tertiary education model calculates a tertiary intake rate from tertiary enrollment and graduation rate data using an algorithm which derives a tertiary intake with a lower bound slightly below the upper secondary graduation rate in the previous year.


Education Financial Flow

Financial Flow

In addition to student flows, and interacting closely with them, the IFs education model also tracks financing of education. Because of the scarcity of private funding data, IFs specifically represents public funding only, and our formulations of public funding implicitly assume that the public/private funding mix will not change over time.

The accounting of educational finance is composed of two major components: per student cost and the total number of projected students. The latter is discussed in the student flows section. Spending per student at all levels of education is driven by average income. Given forecasts of spending per student by level of education and initial enrollment forecasts by level, an estimate of the total education funding demanded is obtained by summing across education levels the products of spending per student and student numbers.
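The funding demand calculation is a simple sum of products, sketched here with hypothetical figures. The function name, the dictionary layout and the numbers are assumptions made for the example.

```python
def fund_demand(spending_per_student, enrollment):
    """Total funding demanded: sum over levels of cost per student x students."""
    return sum(spending_per_student[level] * enrollment[level]
               for level in enrollment)


# Hypothetical per student spending (currency units) and enrollments.
spending = {"primary": 200.0, "lower secondary": 300.0,
            "upper secondary": 400.0, "tertiary": 1000.0}
students = {"primary": 1000.0, "lower secondary": 500.0,
            "upper secondary": 300.0, "tertiary": 100.0}

demand = fund_demand(spending, students)
print(demand)
```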

The funding needs are sent to the IFs sociopolitical model, where educational spending is initially determined from the patterns in such spending regressed against the level of economic development of the countries. A priority parameter (edbudgon) is then used to prioritize spending needs over spending patterns. This parameter can be changed by the model user within a range of values from zero to one, with the zero value awarding maximum priority to fund demands. Finally, total government consumption spending (GOVCON) is distributed among education and other social spending sectors, namely infrastructure, health, public R&D, defense and an "other" category, using a normalization algorithm.
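One way to picture these two steps is sketched below. We assume, purely for illustration, that edbudgon acts as a linear weight between the pattern-based and the demand-based spending figures, and that normalization rescales all sector requests proportionally so that they sum to GOVCON. Neither assumption is confirmed by this documentation, and the function names and numbers are hypothetical.

```python
def blend_education_spending(pattern_based, demand_based, edbudgon):
    """Blend pattern-based and demand-based spending.

    Assumed form: a linear weight, where edbudgon = 0 gives full priority
    to fund demand and edbudgon = 1 follows the cross-country pattern.
    """
    if not 0.0 <= edbudgon <= 1.0:
        raise ValueError("edbudgon must lie between 0 and 1")
    return edbudgon * pattern_based + (1.0 - edbudgon) * demand_based


def normalize_to_total(requests, govcon):
    """Scale sector requests proportionally so that they sum to govcon."""
    scale = govcon / sum(requests.values())
    return {sector: amount * scale for sector, amount in requests.items()}


education = blend_education_spending(pattern_based=40.0, demand_based=60.0,
                                     edbudgon=0.5)
requests = {"education": education, "infrastructure": 20.0, "health": 30.0,
            "public R&D": 5.0, "defense": 15.0, "other": 30.0}
allocation = normalize_to_total(requests, govcon=120.0)
print(allocation["education"])
```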

Government spending is then taken back to the education module and compared against fund needs. The budget impact, calculated as a ratio of the demanded and allocated funds, then modifies the initial projection of student flow rates (intake, survival, and transition). The positive (upward) side of the budget impact is non-linear, with the maximum boost to growth occurring when a flow rate is at or near its mid-point or, to be precise, within the range of the inflection points of an assumed S-shaped path. The impact of a deficit is more or less linear except at impact ratios close to 1, where the downward impact is dampened. Final student flow rates are used to calculate final enrollment numbers using population forecasts for the relevant age cohorts. Finally, per student costs are adjusted to reflect final enrollments and fund availability.


Education Attainment


The algorithm for tracking education attainment is straightforward. The model maintains the structure of the population not only by age and sex categories, but also by years and levels of completed education. In each year of the model’s run, the youngest adults pick up the appropriate total years of education and specific levels of completed education. The model advances each cohort in 1-year time steps after subtracting deaths. In addition to cohort attainment, the model also calculates the overall attainment of adults (15+ and 25+) as average years of education (EDYRSAG15, EDYRSAG25) and as the share of people 15+ with a certain level of education completed (EDPRIPER, EDSECPER, EDTERPER).
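The aggregate measures are population-weighted averages over cohorts. The sketch below shows that calculation for average years of education, in the spirit of EDYRSAG15 and EDYRSAG25; the cohort data are invented and the function is an illustrative assumption, not the IFs routine.

```python
def average_years_of_education(cohorts, min_age):
    """Population-weighted mean years of education for cohorts at or above min_age.

    cohorts: list of (lower_age_bound, population, mean_years_of_education).
    """
    selected = [(pop, years) for age, pop, years in cohorts if age >= min_age]
    total_population = sum(pop for pop, _ in selected)
    return sum(pop * years for pop, years in selected) / total_population


# Hypothetical five-year cohorts: (lower age bound, population, mean years).
cohorts = [(15, 100.0, 8.0), (20, 100.0, 9.0),
           (25, 200.0, 6.0), (30, 200.0, 5.0)]

avg_15_plus = average_years_of_education(cohorts, 15)  # analogous to EDYRSAG15
avg_25_plus = average_years_of_education(cohorts, 25)  # analogous to EDYRSAG25
print(avg_15_plus, avg_25_plus)
```

Because younger cohorts in this example are better educated, the 15+ average exceeds the 25+ average, which is the usual pattern in countries with expanding school systems.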

One limitation of our model is that it does not represent differential mortality rates associated with different levels of education attainment (generally lower for the more educated).[1] This leads, other things equal, to a modest underestimate of adult education attainment, growing with the length of the forecast horizon.  The averaging method that IFs uses to advance adults through the age/sex/education categories also slightly misrepresents the level of education attainment in each 5-year category.

[1] The multi-state demographic method developed and utilized by IIASA does include education-specific mortality rates.

Education Equations


The IFs education model represents two types of educational stocks: stocks of pupils and stocks of adults with a certain level of educational attainment. These stocks are initialized with historical data. The simulation model then recalculates each stock every year from its level in the previous year and the net annual change resulting from inflows and outflows.

The core dynamics of the model lie in the flow rates. These flow rates are expressed as a percentage of the age-appropriate population and thus have a theoretical range of zero to one hundred percent. Growing systems with a saturation point usually follow a sigmoid (S-shaped) trajectory, with low growth rates at the two ends, as the system begins to expand and as it approaches saturation. Maximum growth in such a system occurs at an inflection point, usually at the middle of the range or slightly above it, at which the growth rate stops rising and begins to fall. Some researchers (Clemens 2004; Wils and O’Connor 2003) have identified sigmoid trends in educational expansion by analyzing enrollment rates at the elementary and secondary levels. The IFs education model is not exactly a trend extrapolation; it is rather a forecast based on fundamental drivers, for example, income level. Educational rates in our model are driven by income level, a systemic shift algorithm and a budget impact resulting from the availability of public funds. However, there are growth rate parameters for most of the flows that allow the model user to simulate desired growth that follows a sigmoid trajectory. Another area that makes use of a sigmoid growth rate algorithm is the boost in flow rates as a result of budget surplus.
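The shape of a sigmoid trajectory can be illustrated with a simple logistic growth step. This is a generic textbook form chosen for illustration, in which the annual increase peaks exactly at the midpoint (50 percent); it is not the growth algorithm used in IFs, and the growth parameter value is arbitrary.

```python
def logistic_step(rate_pct, growth):
    """Advance a rate (0-100%) one year along a logistic (S-shaped) path.

    The annual increment, growth * rate * (1 - rate/100), is small near 0
    and near 100 and largest at the midpoint of the range.
    """
    return rate_pct + growth * rate_pct * (1.0 - rate_pct / 100.0)


def annual_increment(rate_pct, growth=0.1):
    """Increase in the rate over one year, starting from rate_pct."""
    return logistic_step(rate_pct, growth) - rate_pct


# Simulate 100 years starting from a 5% rate.
path = [5.0]
for _ in range(100):
    path.append(logistic_step(path[-1], 0.1))

print(annual_increment(10.0), annual_increment(50.0), annual_increment(90.0))
```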

Intake (or transition), survival, enrollment and completion are some of the rates that the IFs model forecasts. Rate forecasts cover elementary, lower secondary, upper secondary and tertiary levels of education, with separate equations for boys and girls for each of the rate variables. All of these rates are required to calculate pupil stocks, while the completion rate and dropout rate (the complement of the survival rate) are used to determine the educational attainment of adults.

On the financial side of education, IFs forecasts cost per student for each level. These per student costs are multiplied by enrollments to calculate fund demand. The budget allocation calculated in the IFs socio-political module is sent back to the education model to calculate final enrollments and cost per student as a result of fund shortage or surplus.

The population module provides cohort population to the education model. The economic model provides per capita income and the socio-political model provides the budget allocation. The educational attainment of adults calculated by the education module affects fertility and mortality in the population and health modules, affects productivity in the economic module, and affects other socio-political outcomes like governance and democracy levels.

Equations: Student Flow

Econometric Models for Core Inflow and Outflow

Enrollments at various levels of education (EDPRIENRN, EDPRIENRG, EDSECLOWRENRG, EDSECUPPRENRG, EDTERENRG) are initialized with historical data for the beginning year of the model. The net change in enrollment at each time step is determined by inflows (intake or transition) and outflows (dropout or completion). Inflows, i.e., entrance to the school system (EDPRIINT, EDTERINT) and transition from the lower level (EDSECLOWRTRAN, EDSECUPPRTRAN), and outflows, i.e., completion (EDPRICR) and dropout or its complement, survival (EDPRISUR), are some of the rates that are forecast by the model.

The educational flow rates are best explained by per capita income that serves as a proxy for the families' opportunity cost of sending children to school. For each of these rates, separate regression equations for boys and girls are estimated from historical data for the most recent year. These regression equations, which are updated with most recent data as the model is rebased with new data every five years, are usually logarithmic in form. The following figure shows such a regression plot for net intake rate in elementary against per capita income in PPP dollars.
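A logarithmic regression of this kind has the form rate = a + b·ln(GDPPCP). The sketch below fits such a curve by ordinary least squares in plain Python. The data points are invented for illustration, the function names are ours, and the clamping of predictions to the 0-100 range is an assumption added so that the example respects the theoretical range of a rate.

```python
import math


def fit_log_curve(gdppcp, rates):
    """Least-squares fit of rate = intercept + slope * ln(gdppcp)."""
    xs = [math.log(g) for g in gdppcp]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(rates) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, rates))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_rate(intercept, slope, gdppcp):
    """Predicted rate (%), clamped to the 0-100 range (our assumption)."""
    return min(100.0, max(0.0, intercept + slope * math.log(gdppcp)))


# Hypothetical cross-section: GDP per capita (thousand PPP dollars) and
# net primary intake rates (%) for six countries.
income = [1.0, 2.0, 5.0, 10.0, 20.0, 40.0]
intake = [48.0, 60.0, 68.0, 79.0, 85.0, 95.0]

a, b = fit_log_curve(income, intake)
print(f"rate = {a:.1f} + {b:.1f} * ln(GDPPCP)")
```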

While all countries are expected to follow the regression curve in the long run, the residuals in the base year make it difficult to generate a smooth path with a continuous transition from historical data to regression estimation. We handle this by adjusting the regression forecast for country differences using an algorithm that we call the "shift factor" algorithm. In the first year of the model run, we calculate for each country a shift factor (EDPriIntNShift) as the difference (or ratio) between historical data on the net primary intake rate (EDPRIINTN) and the regression prediction for the first year. As the model runs in subsequent years, these shift factors converge to zero, or to one in the case of a ratio (code routine ConvergeOverTime in the equation below), making the country forecast merge gradually with the global function. The period of convergence for the shift factor (PriIntN_Shift_Time) is determined through trial and error in each case.
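The additive version of the shift factor logic can be sketched as follows. The helper converge_linearly is a stand-in we wrote for this example; the text names a routine called ConvergeOverTime, but its signature and the exact convergence path are not given here, so linear convergence is an assumption, as are all the numbers.

```python
def converge_linearly(initial, target, years_elapsed, convergence_years):
    """Move linearly from initial to target over convergence_years (assumed form)."""
    if years_elapsed >= convergence_years:
        return target
    fraction = years_elapsed / convergence_years
    return initial + (target - initial) * fraction


def shifted_forecast(regression_prediction, initial_shift,
                     years_elapsed, convergence_years):
    """Regression prediction plus a country shift that decays to zero."""
    shift = converge_linearly(initial_shift, 0.0, years_elapsed, convergence_years)
    return regression_prediction + shift


# Hypothetical country: base-year data of 70% against a regression
# prediction of 80%, so the initial additive shift is -10 points.
initial_shift = 70.0 - 80.0
convergence_years = 20  # plays the role of PriIntN_Shift_Time

print(shifted_forecast(80.0, initial_shift, 0, convergence_years))   # base year
print(shifted_forecast(84.0, initial_shift, 10, convergence_years))  # halfway
print(shifted_forecast(90.0, initial_shift, 25, convergence_years))  # converged
```

In the base year the forecast reproduces the historical value, and once the convergence period has passed the country follows the global function exactly.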

The base forecast of flow rates resulting from this regression model with country shift is used to calculate the demand for funds. These base flow rates might change as a result of the budget impact, based on the availability or shortage of education budget, explained in the budget flow section.

Systemic Shift

Access to and participation in education increase with socio-economic developments that bring changes to people's perception of the value of education. These upward shifts are clearly visible in cross-sectional regressions done for two points in time that are sufficiently far apart. The next figure illustrates such a shift by plotting the net intake rate for boys at the elementary level against GDP per capita (PPP dollars) for two points in time, 1992 and 2000.