## Binary molecular compound examples


A coauthor and I recently ran into some uncertainty about an underlying assumption of negative binomial regression (NBREG) and were wondering whether anyone had advice on how to proceed. Our question centers on whether the NBREG model can handle interdependence between counts and, if so, what kind of interdependence it is designed to capture. In the usual examples, overdispersion is attributed to one of two causal mechanisms. The first is unobserved heterogeneity. A common example is the number of papers an assistant professor publishes in a year.

We cannot assume the rate of publication is constant because professors will vary in their productivity for a number of reasons that are specific to each individual. A similar example has to do with how well sports teams perform across a season. Some teams will score at a higher rate than others because of a variable we cannot observe. In these examples, there is an interdependence within individual professors and within individual teams.
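The heterogeneity mechanism can be sketched with a small simulation. This is only an illustration under stated assumptions: the gamma distribution for individual rates, the mean rate of 3, the shape of 2, and the helper names `poisson_draw` and `simulate_counts` are all chosen here and do not come from the question. Each simulated subject gets a private rate, and the resulting counts are compared with counts drawn at a single common rate.

```python
import math
import random
import statistics


def poisson_draw(rate, rng):
    """Draw one Poisson count with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(n_subjects, mean_rate, shape, rng):
    """Give each subject its own gamma-distributed rate, then draw a Poisson count."""
    counts = []
    for _ in range(n_subjects):
        # Gamma(shape, scale = mean_rate / shape) has mean `mean_rate`
        # and variance mean_rate**2 / shape (assumed mixing distribution).
        rate = rng.gammavariate(shape, mean_rate / shape)
        counts.append(poisson_draw(rate, rng))
    return counts


rng = random.Random(42)
mixed = simulate_counts(20000, mean_rate=3.0, shape=2.0, rng=rng)
pure = [poisson_draw(3.0, rng) for _ in range(20000)]

mixed_mean, mixed_var = statistics.mean(mixed), statistics.variance(mixed)
pure_mean, pure_var = statistics.mean(pure), statistics.variance(pure)

# Moment estimate of alpha in Var(Y) = mu + alpha * mu**2; theory gives 1 / shape.
alpha_hat = (mixed_var - mixed_mean) / mixed_mean ** 2

print("heterogeneous rates: mean", mixed_mean, "variance", mixed_var)
print("common rate:         mean", pure_mean, "variance", pure_var)
print("moment estimate of alpha:", alpha_hat)
```

With a common rate the variance stays close to the mean, while the heterogeneous counts have a variance well above the mean even though every individual count is still Poisson given its own rate.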

The second mechanism is contagion. In this case, the individual counts are not independent of one another because success in one period might encourage the subject to make another attempt. For example, a successful sales pitch on Wednesday may encourage a door-to-door salesman to try again on Thursday. Another example might be the number of violent episodes that mentally ill patients experience in a given year.

Under this causal mechanism, the contagion effect, or interdependence, operates across time. One paper, in fact, went to great lengths to demonstrate why and how current NBREG models need to be modified to handle non-independence. If NBREG models can handle non-independence, which kind of non-independence are they meant to handle?

There are at least 12 distinct probabilistic processes that can give rise to a negative binomial distribution (Boswell and Patil). In my field, statistical ecology, three of these are often applicable: (1) heterogeneity in the Poisson intensity parameter (the negative binomial arises from a gamma mixing distribution over a heterogeneous Poisson intensity); (2) grid sampling from a clustered population (the negative binomial arises as a generalized Poisson model with Poisson-distributed clusters and log-series counts within a cluster); and (3) an outcome probability that changes depending on the process history (the negative binomial arises as a limiting distribution of a Polya-Eggenberger urn model).
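The history-dependent process can be illustrated with a rough urn simulation. It is a sketch under assumptions made here, not taken from the thread: 50 draws per subject, an urn starting with 1 success ball and 19 failure ball, one extra ball added per draw, and the function names `polya_urn_count` and `dispersion_index`. In this classic urn, failures reinforce failure just as successes reinforce success.

```python
import random
import statistics


def polya_urn_count(n_draws, successes, failures, added, rng):
    """Count successes in n_draws from a Polya-Eggenberger urn.

    After each draw the ball goes back with `added` extra balls of the same
    colour, so each success makes the next success more likely.
    With added=0 the draws are independent (plain binomial sampling).
    """
    count = 0
    for _ in range(n_draws):
        if rng.random() < successes / (successes + failures):
            count += 1
            successes += added
        else:
            failures += added
    return count


def dispersion_index(counts):
    """Variance-to-mean ratio; close to 1 for Poisson-like counts."""
    return statistics.variance(counts) / statistics.mean(counts)


rng = random.Random(7)
# Hypothetical setup: 50 attempts per subject, initial success probability 1/20.
contagious = [polya_urn_count(50, 1, 19, 1, rng) for _ in range(20000)]
independent = [polya_urn_count(50, 1, 19, 0, rng) for _ in range(20000)]

print("with contagion:    dispersion index", dispersion_index(contagious))
print("without contagion: dispersion index", dispersion_index(independent))
```

Both sets of counts have the same expected value, but only the contagious ones are strongly overdispersed. The marginal counts alone do not reveal whether heterogeneity or contagion produced them.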

The causal mechanisms you mention could be interpreted as examples of the first and third of these processes. Separate from these theoretical considerations, the use of a negative binomial model can also be motivated by the nature of the mean-variance relationship of the response. I discuss some of these issues in a course I teach.

I'm flying blind here (I don't have a copy with me), but the book 'Univariate Discrete Distributions' (Johnson, Kotz and Kemp) should be worth browsing for further information on the distribution.

It's one of those books which, in an ideal world, would be on every statistician's shelf.

In the sense that the interdependence can be dealt with as a hidden variable, yes, it deals with it. But we can do much better. The 'hidden variable' could be anything leading to overdispersion. If you look at nonlinear mixed models, then you can include the time variable as a random effect.

The book by Joseph Hilbe titled Negative Binomial Regression (Cambridge University Press) should answer some of the questions raised in this discussion.

I address this issue in my recently released book (Hilbe, Joseph M.). Basically, the negative binomial can be used to model unidentified correlation in the data, regardless of the cause. When the reason for the extra correlation can be identified, one can use a model appropriate for the data, which may or may not be a negative binomial.

Of course, there are a variety of negative binomial models, each of which addresses certain types of data situations. Note also that, like the Poisson, the negative binomial can itself be overdispersed. Typically in such situations one can use a random intercept or coefficient, or a host of other adjustments. I noticed that one of the statisticians commenting on this query asserts that the negative binomial is a type of Poisson-gamma mixture.

The NB-2 (traditional) version and the NB-1 (constant dispersion) version can be derived in that manner, but the negative binomial need not be considered in that manner at all. But this is all discussed in the book.
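For readers who want the mixture derivation spelled out, the following is a short sketch of how the two variance functions arise from a Poisson-gamma mixture. The notation is assumed here rather than taken from the thread: $\mu$ is the conditional mean, $\lambda$ the latent Poisson rate, and $\alpha$ and $\delta$ the NB-2 and NB-1 dispersion parameters.

```latex
% NB-2: gamma mixing distribution with mean \mu and variance \alpha\mu^2.
\begin{align*}
Y \mid \lambda &\sim \mathrm{Poisson}(\lambda), \qquad
\lambda \sim \mathrm{Gamma}\bigl(\text{shape } 1/\alpha,\ \text{scale } \alpha\mu\bigr), \\
\mathrm{E}[Y] &= \mathrm{E}[\lambda] = \mu, \\
\mathrm{Var}(Y) &= \mathrm{E}[\mathrm{Var}(Y \mid \lambda)] + \mathrm{Var}(\mathrm{E}[Y \mid \lambda])
  = \mu + \alpha\mu^{2}.
\end{align*}
% NB-1: gamma mixing distribution with mean \mu and variance \delta\mu
% (shape \mu/\delta, scale \delta), so the variance is proportional to the mean.
\begin{align*}
\mathrm{Var}(Y) &= \mu + \delta\mu = (1 + \delta)\,\mu.
\end{align*}
```

In both cases the mean is unchanged and only the variance function differs, which is why the choice between NB-1 and NB-2 is usually framed in terms of the mean-variance relationship.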