In a previous post I entertain with the data of my newborn daughter’s daily habits. After publishing the article I was asked if the number of barfing events per day was Poisson distributed. I want to do a small research about that. I’m also sorry if anybody is disguised by the subject, but the curious question cannot be left unanswered.

In my previous data I haven’t collected vomiting events because it has never been an issue with her. My daughter rather used to burp – only after being fed and in about 90% of the cases. However, such a data also has not been collected.

Let us focus now more on the statistical aspect of the analysis. I’m rather a frequentist, but given no proper data I cannot infer on the burping distribution. That is why I should perform the analysis in Bayesian style.

Our goal is to find a distribution of the number of burping events per day, π(θ). We know the distribution of daily burping events given the number of feedings. It is simply a Binomial distribution f(θ|n)=B(n, 0.9), as the probability of the single event is 90% after a feeding. The additional variable “feedings” in the data is a sample of the distribution of the number of daily feedings, π(n). Therefore, using Bayes’ Theorem we can write

π(θ)∝∫f(θ|n)·π(n)dn.

However, we don’t know the analytical density function of π(n), so we cannot calculate the posterior. We only possess a sample of the prior distribution consisting in 73 observations. Therefore, we can use a bootstrap simulation to get an estimate of the posterior. The procedure has a very simple idea and it might be hard to believe that it actually works. It is just a simple version of Monte Carlo simulation.

At first we pick some huge number T that is going to be the number of simulations. Each simulation step consists of:

• Take ni as a single sample with replacement from the data of daily feedings.
• Each observation ni is assumed to be from the prior distribution π(n).
• For the observation ni, simulate another variable yi from the distribution B(ni, 0.9).
• The histogram of variables yi is an estimate of the posterior density π(θ).

In figure 1 we can compare the posterior and prior distribution. The prior is a skewed distribution with mean 10.3 and variance 2.2. The posterior is more symmetrically distributed with mean 9.24 and variance 2.7.