Year: 2023 | Volume: 6 | Issue: 3 | Pages: 126-132
Multilevel zero inflated and hurdle models for under five-child mortality in Indonesia
Madona Yunita Wijaya
Department of Mathematics, Faculty of Science and Technology, Syarif Hidayatullah State Islamic University Jakarta, South Tangerang, Indonesia
Date of Submission: 13-Feb-2023
Date of Decision: 20-Jul-2023
Date of Acceptance: 21-Aug-2023
Date of Web Publication: 18-Sep-2023
Madona Yunita Wijaya
Jl. Ir H. Juanda No.95, Ciputat, Tangerang Selatan, Banten, West Java
Source of Support: None, Conflict of Interest: None
Introduction: Despite notable progress in reducing child mortality, lowering the under-five mortality rate to the national target remains a great challenge for Indonesia. Understanding the risk factors of under-five mortality is therefore essential to enhance the health and well-being of children. This research investigates factors associated with under-five mortality in Indonesia using the 2017 Indonesia Demographic and Health Survey data. Methods: Multilevel zero-inflated and multilevel hurdle models are considered to handle unobserved heterogeneity that may occur at province level, and to model the prevalence and risk of child death as a joint process; results are reported in terms of odds ratio (OR) and incidence rate ratio (IRR), respectively. Results: Lower number of household members (IRR = 0.803, 95% confidence interval [CI]: 0.784–0.823), older mother's age at first birth (IRR = 1.020, 95% CI: 1.007–1.032), higher number of children ever born (IRR = 1.491, 95% CI: 1.450–1.533), lower mother's education (IRR = 1.224, 95% CI: 1.013–1.479), and lower father's education (IRR = 1.232, 95% CI: 1.015–1.495) are significantly associated with a higher total number of deaths in children before the age of 5 years. Furthermore, the odds of no child death are significantly higher among mothers who use a contraceptive method (OR = 11.088, 95% CI: 6.659–18.462) and among households with a higher wealth index (OR = 1.133, 95% CI: 1.005–1.277). Conclusion: This empirical evidence highlights priority risk factors that may help policymakers, health professionals, and the community in general to design appropriate interventions to reduce the burden of under-five mortality in the country.
Keywords: Hurdle, Indonesia demographic and health survey, multilevel, under-five mortality, zero-inflated
How to cite this article:
Wijaya MY. Multilevel zero inflated and hurdle models for under five-child mortality in Indonesia. Asian J Soc Health Behav 2023;6:126-32
Introduction
Child mortality is one of the important indicators of the public health condition and socioeconomic development of a country. According to a United Nations report in 2021, about 5 million children worldwide died before their fifth birthday, or approximately 38 deaths/1000 live births. In the South-East Asian region, child mortality has fallen significantly in the past two decades, and some countries, including Indonesia, had achieved the Sustainable Development Goals (SDGs) target by 2019. The mortality rate for children under 5 years old in Indonesia fell from 52/1000 live births in 2000 to 22/1000 live births in 2021. The country therefore meets SDG target 3.2.1 and is close to, but has not yet met, a new target in its medium-term development plan and national SDG roadmap to reduce the under-five mortality rate to 19/1000 live births by the year 2030.
Despite the significant decline, many children in Indonesia still die before the age of 5 years. In addition, disparities based on geography and other socio-economic factors may persist. Ongoing efforts to minimize the absolute gap require further attention to achieve the Indonesian goal of healthy life. Compared with other countries in the South-East Asian region, the under-five mortality rate in Indonesia is still two to three times the rate in Thailand and Sri Lanka. The effect of the pandemic may also threaten the hard-won progress, and a new commitment is needed to ensure the country stays on track. It is therefore imperative to understand the determinants of under-five mortality in Indonesia to inform public health policies and design strategies that accelerate progress toward the national target.
Few studies have examined risk factors for under-five mortality in Indonesia, and those that did used standard approaches such as logistic regression and survival analysis, which estimate prevalence alone (whether a household experienced a child death or not). Risk factors for child mortality can also be modeled in terms of severity (the total number of deaths reported per household) by using a count regression model. However, the occurrence of deaths and the total number of reported deaths in a household can be assessed simultaneously by treating them as a joint process. This is known as a two-part model, which handles the positive outcomes and the zero outcomes in a multi-index count model. No study in Indonesia has treated these as two simultaneous processes, and biased estimates may result if the joint nature of these processes is ignored. Furthermore, previous studies ignored the data structure, in which sample units are clustered within provinces. Failing to account for hierarchical structure may cause the standard errors of regression coefficients to be underestimated and thus lead to overstatement of statistical significance.
The zero-inflated and hurdle models, which have gained popularity in the past few years, can be applied in this setting. These procedures are often used in public health studies to assess the relationship of socio-economic, environmental, and other explanatory variables with response data in the form of counts or abundances. In addition, because the dataset used in this study has a hierarchical design in which participants are nested within provinces, it is important to model the inherent correlation structure and underlying heterogeneity. To address this problem, multilevel zero-inflated and multilevel hurdle models with cluster-specific random effects are proposed in this study.
Methods
This article examines secondary data from the 2017 Indonesia Demographic and Health Survey (IDHS). The data come from a nationally representative sample of women aged 15–49 years, collected in 34 provinces. As per the 2017 census, the population of women aged 15–49 years was 70.5 million, which comprised 53.6% of the total female population in Indonesia. Data collection was organized by the Central Bureau of Statistics Indonesia (BPS) in collaboration with the Ministry of Health (MoH) of Indonesia and the National Population and Family Planning Board (BKKBN). The data were collected through survey research using questionnaires and interviews. A two-stage stratified sampling method was used to ensure a representative sample of the country. A total of 50,730 women aged 15–49 years were identified as eligible participants, and 49,627 women completed the interviews, resulting in a response rate of 98%.
This study was based on an analysis of existing public-domain survey data sets that are freely available online, with all identifier information removed, from the Demographic and Health Survey (DHS) website (https://dhsprogram.com/), which allowed this study to be conducted with no additional ethical approval. The author sent a request to access the dataset to the DHS program after presenting the study purposes. Permission was then granted to download and use the data for the analysis in this paper. The IDHS dataset was collected in accordance with the relevant regulations and guidelines. The DHS survey received ethical approval from the ICF institutional review board and country-specific review boards. Informed consent was obtained from each participant before data collection.
Variable of the study
The variable of interest is a count variable representing the number of under-five mortalities per mother. Potential explanatory variables considered in this study include province, residence type, parental working status, parental education level, wealth index, number of household members, age of mother at first birth, tetanus toxoid injection, contraceptive use, and total children ever born.
The most commonly used methods to analyze zero-inflated count data are the so-called zero-inflated family of models, which include zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB). Due to a hierarchical study design in many health surveys, where participants are often clustered in provinces or regions, multilevel ZIP and ZINB regression models were proposed by introducing random effects in the model to overcome correlation that may occur within clusters.
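Let $Y_{ij}$, the number of under-five mortalities for the j-th mother in the i-th province, follow a ZIP distribution:
$$P(Y_{ij} = y_{ij}) = \begin{cases} \pi_{ij} + (1 - \pi_{ij})\, e^{-\mu_{ij}}, & y_{ij} = 0 \\ (1 - \pi_{ij})\, \dfrac{e^{-\mu_{ij}}\, \mu_{ij}^{y_{ij}}}{y_{ij}!}, & y_{ij} = 1, 2, \ldots \end{cases}$$
where $\mu_{ij}$ denotes the mean of the Poisson distribution and $0 \le \pi_{ij} \le 1$ is the probability of extra zeros. If overdispersion is present in the data, the ZINB distribution is preferable, defined as:
$$P(Y_{ij} = y_{ij}) = \begin{cases} \pi_{ij} + (1 - \pi_{ij}) \left( \dfrac{r}{r + \mu_{ij}} \right)^{r}, & y_{ij} = 0 \\ (1 - \pi_{ij})\, \dfrac{\Gamma(y_{ij} + r)}{\Gamma(r)\, y_{ij}!} \left( \dfrac{r}{r + \mu_{ij}} \right)^{r} \left( \dfrac{\mu_{ij}}{r + \mu_{ij}} \right)^{y_{ij}}, & y_{ij} = 1, 2, \ldots \end{cases}$$
where $r^{-1}$ represents the dispersion parameter. Regression models based on the ZIP and ZINB distributions were introduced to assess the effect of risk factors by relating the mean ($\mu_{ij}$) and the zero proportion ($\pi_{ij}$) to explanatory variables.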
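As a numerical illustration of the two distributions, the following minimal sketch evaluates the ZIP and ZINB probabilities using only the Python standard library. The function names and the values chosen for μ, π, and r are hypothetical, and the negative binomial is assumed to be parameterized by its mean μ with variance μ + μ²/r.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability P(Y = y) for mean mu > 0, computed on the log scale."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def nb_pmf(y, mu, r):
    """Negative binomial probability with mean mu and dispersion 1/r (assumed parameterization)."""
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)

def zip_pmf(y, mu, pi):
    """Zero-inflated Poisson: extra zeros occur with probability pi."""
    base = (1 - pi) * poisson_pmf(y, mu)
    return pi + base if y == 0 else base

def zinb_pmf(y, mu, pi, r):
    """Zero-inflated negative binomial: extra zeros occur with probability pi."""
    base = (1 - pi) * nb_pmf(y, mu, r)
    return pi + base if y == 0 else base

# Hypothetical parameter values, chosen only for illustration.
mu, pi, r = 0.4, 0.6, 2.0
support = range(100)  # the tail beyond 100 is negligible for these values

zip_total = sum(zip_pmf(y, mu, pi) for y in support)
zinb_total = sum(zinb_pmf(y, mu, pi, r) for y in support)
zip_mean = sum(y * zip_pmf(y, mu, pi) for y in support)

print("ZIP probabilities sum to", zip_total)
print("ZINB probabilities sum to", zinb_total)
print("ZIP mean", zip_mean, "vs (1 - pi) * mu =", (1 - pi) * mu)
print("P(Y = 0): Poisson", poisson_pmf(0, mu), "ZIP", zip_pmf(0, mu, pi))
```

The last line shows the point of zero inflation: for the same μ, the ZIP model places more probability on zero than the plain Poisson.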
In a random-effects (multilevel) setting, the ZIP and ZINB regression models are extended by introducing random components $a_i$ and $b_i$ (specific to the i-th cluster) to model the dependence of individuals within clusters. The resulting multilevel ZIP and ZINB models are expressed as follows:
$$\log(\mu_{ij}) = X_{ij}^{\top} \beta + a_i \qquad (1)$$
$$\operatorname{logit}(\pi_{ij}) = W_{ij}^{\top} \alpha + b_i \qquad (2)$$
The explanatory variables $X_{ij}$ and $W_{ij}$ do not have to be the same in both equations; the vectors $\alpha$ and $\beta$ are the regression coefficients; $a_i$ and $b_i$ are the cluster random effects, assumed to be independent and normally distributed with mean 0 and variances $\sigma_u^2$ and $\sigma_w^2$, respectively.
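To make the role of the cluster random effects concrete, the sketch below computes the Poisson mean and the extra-zero probability for the same mother profile in two provinces. It assumes the usual log link for the mean and logit link for the zero probability; the function name, coefficients, covariates, and random-effect values are all hypothetical and are not estimates from this study.

```python
import math

def expit(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def zip_parameters(x, w, beta, alpha, a_i, b_i):
    """Return (mu_ij, pi_ij) for one mother given covariates and province effects."""
    eta_count = sum(xk * bk for xk, bk in zip(x, beta)) + a_i   # log link (assumed)
    eta_zero = sum(wk * ak for wk, ak in zip(w, alpha)) + b_i   # logit link (assumed)
    return math.exp(eta_count), expit(eta_zero)

# Hypothetical coefficients for (intercept, children ever born).
beta = [-2.0, 0.4]    # count part
alpha = [1.5, -0.5]   # zero part
x = w = [1.0, 3.0]    # a mother with three children ever born

# The same mother profile placed in two hypothetical provinces.
mu_avg, pi_avg = zip_parameters(x, w, beta, alpha, a_i=0.0, b_i=0.0)
mu_high, pi_high = zip_parameters(x, w, beta, alpha, a_i=0.3, b_i=-0.2)

print("average province:    mu =", mu_avg, "pi =", pi_avg)
print("higher-risk province: mu =", mu_high, "pi =", pi_high)
```

Identical covariates give different expected counts and zero probabilities once the province effects differ, which is the within-province dependence the multilevel model is meant to capture.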
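A similar approach to handling excess zeros in the data is the hurdle model. In contrast to the zero-inflated model, which assumes that zeros come from two distinct sources, the hurdle model assumes that all zeros arise from a single process, while the nonzero outcomes are generated by a zero-truncated count distribution. Assuming a truncated Poisson count component for the nonzero outcomes, the probability mass function is expressed as:
$$P(Y_{ij} = y_{ij}) = \begin{cases} \pi_{ij}, & y_{ij} = 0 \\ (1 - \pi_{ij})\, \dfrac{e^{-\mu_{ij}}\, \mu_{ij}^{y_{ij}}}{y_{ij}!\, \left(1 - e^{-\mu_{ij}}\right)}, & y_{ij} = 1, 2, \ldots \end{cases}$$
where $\pi_{ij}$ is now the probability of a zero count. If the nonzero counts are assumed to follow a truncated negative binomial (NB) distribution, the probability mass function is expressed as:
$$P(Y_{ij} = y_{ij}) = \begin{cases} \pi_{ij}, & y_{ij} = 0 \\ (1 - \pi_{ij})\, \dfrac{\dfrac{\Gamma(y_{ij} + r)}{\Gamma(r)\, y_{ij}!} \left( \dfrac{r}{r + \mu_{ij}} \right)^{r} \left( \dfrac{\mu_{ij}}{r + \mu_{ij}} \right)^{y_{ij}}}{1 - \left( \dfrac{r}{r + \mu_{ij}} \right)^{r}}, & y_{ij} = 1, 2, \ldots \end{cases}$$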
The hurdle model can be extended to clustered counts by including random effects $a_i$ and $b_i$ in the linear predictors for the Poisson (or NB) and logistic components, in the same way as in equations (1) and (2). The resulting models are known as the multilevel hurdle Poisson (HP) and multilevel hurdle NB models.
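The difference between the hurdle and zero-inflated formulations can be checked numerically. The sketch below implements a hurdle Poisson probability function with a zero-truncated Poisson for the positive counts; the function name and the parameter values are hypothetical.

```python
import math

def hurdle_poisson_pmf(y, mu, pi):
    """Hurdle Poisson: P(Y = 0) = pi; positive counts follow a zero-truncated Poisson."""
    if y == 0:
        return pi
    poisson = math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))
    return (1 - pi) * poisson / (1 - math.exp(-mu))

# Hypothetical parameter values, chosen only for illustration.
mu, pi = 0.4, 0.9
support = range(100)

hurdle_total = sum(hurdle_poisson_pmf(y, mu, pi) for y in support)
hurdle_mean = sum(y * hurdle_poisson_pmf(y, mu, pi) for y in support)
expected_mean = (1 - pi) * mu / (1 - math.exp(-mu))

print("probabilities sum to", hurdle_total)
print("mean", hurdle_mean, "vs closed form", expected_mean)
print("P(Y = 0) =", hurdle_poisson_pmf(0, mu, pi))
```

In the hurdle model the zero probability is exactly π, whereas in a zero-inflated model the count component contributes additional zeros on top of π.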
Results
The analysis is based on a sample of 25,253 mothers aged 15–49 years residing in 34 provinces in Indonesia, with a mean age of 34.5 years (standard deviation = 7.79). Approximately 89.7% (n = 22,655) of the mothers did not report any under-five death, whereas the remaining 10.3% (n = 2598) reported one to seven under-five deaths, as follows: 8.3% (n = 2098) reported one death, 1.6% (n = 394) reported two deaths, and the remaining 0.4% (n = 106) reported more than two deaths. The distribution of the number of under-five deaths is therefore positively skewed, with a large share of zero outcomes. The proportion reporting at least one child death before the age of five was highest among uneducated mothers (31%) and fathers (31%), as shown in [Table 1].
Table 1: Frequency (%) distribution of under-five mortality across mother's characteristics
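The excess of zeros can be illustrated directly from the reported frequencies. The sketch below is a rough marginal check that ignores covariates and clustering: because the "more than two deaths" group is not broken down in the text, each of those mothers is counted as exactly three deaths, which gives a lower bound on the mean and hence an upper bound on the zero share that a single Poisson distribution could produce. The variable names are hypothetical.

```python
import math

n_mothers = 25253
n_zero = 22655
n_one, n_two, n_more_than_two = 2098, 394, 106

# Lower bound on the mean count: treat "more than two" as exactly three deaths.
mean_lower = (1 * n_one + 2 * n_two + 3 * n_more_than_two) / n_mothers

observed_zero = n_zero / n_mothers
# A Poisson with the true (larger) mean would have an even smaller zero share.
poisson_zero_upper = math.exp(-mean_lower)

print("observed share of zeros:", round(observed_zero, 4))
print("largest zero share under a single Poisson:", round(poisson_zero_upper, 4))
```

Even under this bound that is favorable to the Poisson, the observed share of zeros is larger, which is consistent with considering zero-augmented models.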
Various multilevel count regression models were fitted to the data, and the model selection criteria are presented in [Table 2]. The multilevel Poisson and NB models fit the data poorly, with larger Akaike information criterion (AIC) and Bayesian information criterion (BIC) values. This demonstrates that better fits can be obtained through zero-augmented (zero-inflated and hurdle) models because of the excess zeros. Among the multilevel zero-inflated and hurdle models, those based on the Poisson distribution fit better than those based on the NB distribution.
Table 2: Model assessment for various multilevel count regression models
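For readers unfamiliar with the criteria, the following sketch shows how AIC and BIC are computed from a fitted model's log-likelihood. The log-likelihood values and parameter counts are hypothetical placeholders, not the values in [Table 2], and using the number of mothers as the sample size in BIC is an assumption (other conventions exist for multilevel models).

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2 * log_lik

n_obs = 25253  # mothers in the analysis sample

# Hypothetical fits: (log-likelihood, number of estimated parameters).
fits = {
    "multilevel Poisson": (-8300.0, 16),
    "multilevel ZIP": (-8100.0, 32),
}

scores = {name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in fits.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])

for name, (a, b) in scores.items():
    print(name, "AIC =", round(a, 1), "BIC =", round(b, 1))
print("smallest AIC:", best_by_aic)
```

Smaller values indicate a better trade-off between fit and complexity; BIC penalizes additional parameters more heavily than AIC whenever ln(n) exceeds 2.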
[Table 3] shows that the multilevel HP and multilevel ZIP models identify broadly the same significant factors for the number of under-five mortalities. However, the multilevel ZIP model provides a better fit, since it has smaller deviance, AIC, and BIC values. In addition, the variance component of the random effects in the multilevel ZIP model explains much more of the between-province variability in the data. Thus, the multilevel ZIP model is preferred in this study. In summary, type of residence, parents' working status, and tetanus toxoid injection were not significant factors for the number of under-five mortalities in either the Poisson or the logistic part of the multilevel ZIP model after adjusting for province random effects.
Table 3: The estimated odds ratio (95% confidence interval) and incidence rate ratio (95% confidence interval) under logistic and Poisson parts, respectively, for multilevel zero-inflated Poisson and multilevel hurdle Poisson models
The results of the logistic part of the multilevel ZIP model showed that mother's age, age of mother at first birth, family size, wealth index, total children ever born, and contraceptive use were significant factors for the number of under-five mortalities in Indonesia. The estimated regression coefficients are reported as odds ratios (OR). The estimated odds of no under-five deaths among mothers who used a contraceptive method were about 11 times those of mothers who did not (OR = 11.088, 95% confidence interval [CI]: 6.659–18.462). Each additional household member increased the estimated odds of no under-five deaths by 28.2% (OR = 1.282, 95% CI: 1.199–1.371). Similarly, a one-unit increase in wealth index increased the estimated odds by 13.3% (OR = 1.133, 95% CI: 1.005–1.277). On the other hand, each additional child born to a mother decreased the estimated odds of no under-five deaths by 86% (OR = 0.140, 95% CI: 0.114–0.172). A 1-year increase in the age of the mother at first birth also decreased the estimated odds of no under-five deaths by 3.7% (OR = 0.963, 95% CI: 0.934–0.993).
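The percentage changes quoted for the odds ratios follow from the relation percent change = (OR − 1) × 100. A minimal sketch that reproduces them from the reported estimates (the function and dictionary names are hypothetical):

```python
def percent_change(ratio):
    """Percent change in the odds (or rate) implied by a ratio estimate."""
    return (ratio - 1.0) * 100.0

# Odds ratios for "no under-five deaths" as reported in the text.
odds_ratios = {
    "contraceptive use": 11.088,
    "household members (per person)": 1.282,
    "wealth index (per unit)": 1.133,
    "children ever born (per child)": 0.140,
    "age at first birth (per year)": 0.963,
}

changes = {name: round(percent_change(value), 1) for name, value in odds_ratios.items()}

for name, change in changes.items():
    direction = "increase" if change > 0 else "decrease"
    print(f"{name}: {abs(change)}% {direction} in the odds")
```

The same relation applies to incidence rate ratios, with "odds" replaced by "incidence rate".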
The results of the Poisson part of the multilevel ZIP model reveal that mother's age, parental educational level, number of household members, total children ever born, and mother's age at first birth had significant effects on the number of under-five mortalities in Indonesia. The estimated regression coefficients are reported as incidence rate ratios (IRR). Each additional household member was associated with a 19.7% decrease in the incidence rate of nonzero under-five mortality (IRR = 0.803, 95% CI: 0.784–0.823). On the other hand, a 1-year increase in the age of a mother at first birth was associated with a 2% increase in the incidence rate of nonzero under-five mortality (IRR = 1.020, 95% CI: 1.007–1.032). Furthermore, each additional child born to a mother was associated with a 49.1% increase in the incidence rate of nonzero under-five mortality (IRR = 1.491, 95% CI: 1.450–1.533). With regard to parents' educational level, mothers with secondary education had an 18.3% lower incidence rate of under-five mortality than mothers with no education (IRR = 0.817, 95% CI: 0.676–0.987). Similarly, fathers with secondary education (IRR = 0.812, 95% CI: 0.669–0.985) and higher education (IRR = 0.709, 95% CI: 0.536–0.937) had incidence rates 18.8% and 29.1% lower, respectively, than fathers with no education. This demonstrates that, in general, as the mother's and father's highest educational attainment increased, the incidence rate of nonzero under-five mortality decreased.
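The abstract reports the education effects in the opposite direction (lower education, IRR above 1). Those figures appear consistent with taking reciprocals of the secondary-education estimates above, which switches the reference category; this reading is an inference from the numbers, not something the text states. A minimal sketch, with a hypothetical function name:

```python
def flip_reference(ratio, lower, upper):
    """Re-express a ratio and its CI with the comparison and reference groups swapped."""
    # Taking reciprocals reverses the order of the interval limits.
    return 1.0 / ratio, 1.0 / upper, 1.0 / lower

# Secondary versus no education, as reported for the Poisson part.
mother = tuple(round(v, 3) for v in flip_reference(0.817, 0.676, 0.987))
father = tuple(round(v, 3) for v in flip_reference(0.812, 0.669, 0.985))

print("mother, no education vs secondary: IRR, lower, upper =", mother)
print("father, no education vs secondary: IRR, lower, upper =", father)
```

The results can be compared with the lower-education estimates in the abstract (IRR = 1.224, 95% CI: 1.013–1.479 for mothers; IRR = 1.232, 95% CI: 1.015–1.495 for fathers).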
Discussion
This study applied multilevel zero-inflated and hurdle models to the number of under-five mortalities to account for overdispersion resulting from excess zeros. According to Rose et al., model selection should be based on study objectives. Following the same logic, the endpoint of interest in this study may contain both structural and sampling zeros, which naturally leads to a zero-inflated modeling framework. Various model selection criteria also suggest that the multilevel ZIP model fits better than the multilevel HP model.
The best model indicates that paternal and maternal education are significant socio-economic factors of under-five mortality. Higher paternal and maternal education were linked to a lower number of under-five mortalities. The finding is in line with previous studies, which reported that the risk of under-five deaths was higher as parents' educational level decreased. This highlights the key role of education as a determinant of child health outcomes. Parents with higher education are more likely to use health services and to provide adequate medical care and better nutrition for their children.
Age of mothers at first birth is significantly related to the number of under-five mortalities in Indonesia. Mothers were at greater risk of having children die before the age of five as their age at first birth increased. This contrasts with previous studies in Ethiopia, Nigeria, Sri Lanka, and Bangladesh, which reported that babies born to younger mothers were at higher risk of death. The reasons include biological factors related to younger mothers, who are still growing and physically immature. However, a few studies have revealed that late childbearing was also associated with a higher risk of negative birth outcomes. There is an increased risk of maternal labor complications, fetal loss, and giving birth to babies with Down syndrome among mothers with a late age at first birth.
Household size is another important factor of under-five mortality in Indonesia. This study found that mothers in large households have a lower risk of having children die before the age of five, similar to previous studies. According to the 2012 IDHS, a household comprises all related and/or unrelated persons who share living space and food arrangements. This supports the finding, since larger households may have better resources, with more experienced adults to help with child care and more working-age adults contributing to household earnings.
In contrast, mothers with a higher number of children ever born are at greater risk of having children die before the age of five. Studies in Bhutan and Nepal reached a similar conclusion: the number of births a mother had was positively associated with the risk of having children die before the age of five. Higher parity increases the likelihood of short birth intervals, which raises the risk of under-five mortality through depletion of the mother's health and nutritional status. In addition, higher parity can be associated with a lack of knowledge and use of family planning methods. Family planning methods help lengthen the interval between births and hence improve the survival of the child. Therefore, improving health education and family planning services may improve child survival.
The influence of wealth status on under-five mortality was clearly evident in this study, supporting findings in other studies. Higher household wealth status was significantly associated with a reduction in the risk of under-five mortality per mother. Wealthier households are more likely to have better access to health services for children and may have better health behavior and higher-quality health practices.
This study identifies significant risk factors of under-five mortality in Indonesia using the most recent nationally representative sample database. Since it is based on national survey data, the key drivers of under-five mortality found in the study can be directly formulated into policy recommendations and can guide the MoH and policy-makers in designing appropriate intervention strategies to further enhance health outcomes for children. However, this study is not without limitations. The analysis is limited to the variables available in the dataset, some of which have many missing values. Other potential risk factors that might influence the number of under-five mortalities, such as genetic and environmental factors, are not available in the dataset. The data were based on participants' self-reports; participants may not accurately recall previous experiences, so the data are susceptible to recall bias.
Conclusion
Among the six models considered for analyzing the risk factors for the number of under-five mortalities per mother, the multilevel zero-inflated Poisson regression model was found to be the most appropriate. The model was applied to the 2017 IDHS data, with the number of under-five mortalities per mother as the response variable, to account for excess zeros. The study revealed that parents' educational level, wealth status, household size, number of children ever born, age of mother at first birth, and contraceptive use were significant factors for the number of deaths of children under the age of five in Indonesia. The implication of this study is that government interventions should focus on improving parental education and awareness, increasing the use of family planning, strengthening equal access to health care facilities, and expanding quality services to end the burden of under-five mortality in the country, in conformity with the Sustainable Development Goals.
Financial support and sponsorship
This study was supported by IDHS 2017 for providing open access for the data used in this research.
Conflicts of interest
There are no conflicts of interest.
References
Miladinov G. Socioeconomic development and life expectancy relationship: Evidence from the EU accession candidate countries. Genus 2020;76:2.
World Health Organization. East Asia Regional TAG Meeting to Accelerate Reduction in Newborn and Child Mortality towards Achieving SDG 2030 Targets: Virtual, 16-19 November 2021; 2022. Available from: https://apps.who.int/iris/handle/10665/354544. [Last accessed on 2023 Jul 29].
Rachmawati PD, Kurnia ID, Asih MN, Kurniawati TW, Krisnana I, Arief YS, et al. Determinants of under-five mortality in Indonesia: A nationwide study. J Pediatr Nurs 2022;65:e43-8.
Warrohmah ANI, Berliana SM, Nursalam N, Efendi F, Haryanto J, Has EMM, et al. Analysis of the survival of children under five in Indonesia and associated factors. IOP Conf Ser Earth Environ Sci 2018;116:012014.
Fenta SM, Fenta HM. Risk factors of child mortality in Ethiopia: Application of multilevel two-part model. PLoS One 2020;15:e0237640.
Su L, Tom BD, Farewell VT. Bias in 2-part mixed models for longitudinal semicontinuous data. Biostatistics 2009;10:374-89.
Kazembe LN. A Bayesian two part model applied to analyze risk factors of adult mortality with application to data from Namibia. PLoS One 2013;8:e73500.
Hox JJ, Moerbeek M, Schoot RV. Multilevel analysis: Techniques & applications. London: Taylor and Francis; 2018.
National Population and Family Planning Board (BKKBN), Statistics Indonesia (BPS), Ministry of Health (Kemenkes), ICF. Indonesia Demographic and Health Survey 2017. Jakarta, Indonesia: National Population and Family Planning Board (BKKBN); 2018.
Mullahy J. Specification and testing of some modified count data models. J Econom 1986;33:341-65.
Lambert D. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 1992;34:1-14.
Lee AH, Wang K, Scott JA, Yau KK, McLachlan GJ. Multi-level zero-inflated Poisson regression modelling of correlated count data with excess zeros. Stat Methods Med Res 2006;15:47-61.
Moghimbeigi A, Eshraghian MR, Mohammad K, Mcardle B. Multilevel zero-inflated negative binomial regression modeling for over-dispersed count data with extra zeros. J Appl Stat 2008;35:1193-202.
Bohning D, Dietz E, Schlattmann P, Mendonca L, Kirchner U. The zero-inflated Poisson model and the decayed, missing and filled teeth index in dental epidemiology. J R Stat Soc Ser A Stat Soc 1999;162:195-209.
Hur K, Hedeker D, Henderson W, Khuri S, Daley J. Modeling clustered count data with excess zeros in health care outcomes research. Health Serv Outcomes Res Methodol 2002;3:5-20.
Min Y, Agresti A. Random effect models for repeated measures of zero-inflated count data. Stat Model 2005;5:1-19.
Gurmu S. Generalized hurdle count data regression models. Econ Lett 1998;58:263-8.
Rose CE, Martin SW, Wannemuehler KA, Plikaytis BD. On the use of zero-inflated and hurdle models for modeling vaccine adverse event count data. J Biopharm Stat 2006;16:463-81.
Fenta SM, Fenta HM, Ayenew GM. The best statistical model to estimate predictors of under-five mortality in Ethiopia. J Big Data 2020;7:63.
Balaj M, York HW, Sripada K, Besnier E, Vonen HD, Aravkin A, et al. Parental education and inequalities in child mortality: A global systematic review and meta-analysis. Lancet 2021;398:608-20.
Rahman MS, Rahman MS, Rahman MA. Determinants of death among under-5 children in Bangladesh. J Res Opin 2019;6:2294-302.
Yaya S, Ekholuenetale M, Tudeme G, Vaibhav S, Bishwajit G, Kadio B. Prevalence and determinants of childhood mortality in Nigeria. BMC Public Health 2017;17:485.
Trussell J, Hammerslough C. A hazards-model analysis of the covariates of infant and child mortality in Sri Lanka. Demography 1983;20:1-26.
Khan JR, Awan N. A comprehensive analysis on child mortality and its determinants in Bangladesh using frailty models. Arch Public Health 2017;75:58.
Gebreegziabher E, Bountogo M, Sié A, Zakane A, Compaoré G, Ouedraogo T, et al. Influence of maternal age on birth and infant outcomes at 6 months: A cohort study with quantitative bias analysis. Int J Epidemiol 2023;52:414-25.
Andersen AMN, Wohlfahrt J, Christens P, Olsen J, Melbye M. Maternal age and fetal loss: population based register linkage study. BMJ 2000;320:1708-12.
Bouzaglou A, Aubenas I, Abbou H, Rouanet S, Carbonnel M, Pirtea P, et al. Pregnancy at 40 years old and above: Obstetrical, fetal, and neonatal outcomes. Is age an independent risk factor for those complications? Front Med (Lausanne) 2020;7.
Berelie Y, Yismaw L, Tesfa E, Alene M. Risk factors for under-five mortality in Ethiopia: Evidence from the 2016 Ethiopian demographic and health survey. S Afr J Child Health 2019;13:137.
Dendup T, Zhao Y, Dema D. Factors associated with under-five mortality in Bhutan: An analysis of the Bhutan national health survey 2012. BMC Public Health 2018;18:1375.
Bhusal MK, Khanal SP. A systematic review of factors associated with under-five child mortality. Biomed Res Int 2022;2022:1-19.
Adhikari R, Podhisita C. Household headship and child death: Evidence from Nepal. BMC Int Health Hum Rights 2010;10:13.
Kozuki N, Walker N. Exploring the association between short/long preceding birth intervals and child mortality: Using reference birth interval children of the same mother as comparison. BMC Public Health 2013;13 Suppl 3:S6.
Ekholuenetale M, Wegbom AI, Tudeme G, Onikan A. Household factors associated with infant and under-five mortality in sub-Saharan Africa countries. Int J Child Care Educ Policy 2020;14:10.
Asif MF, Pervaiz Z, Afridi JR, Safdar R, Abid G, Lassi ZS. Socio-economic determinants of child mortality in Pakistan and the moderating role of household's wealth index. BMC Pediatr 2022;22:3.