# Correlation of Duke Basketball Scores, in Crystal Ball (6/8)

Author: Karl - CB Expert/Thursday, February 3, 2011/Categories: Engineering Modeling


In our quest to simulate future Duke Basketball scores, we have taken historical data from individual games during the '09/'10 season and fitted probability distributions to that data. Two PDFs are generated: one for Duke's scores (offense) and one for their opponents' scores (defense). We have used both Crystal Ball and ModelRisk to perform this task. Is there something missing in our PDF formulations?

There is. Examine the data carefully. Does there appear to be a rough pattern or correlation between Duke's offense scores and the scores allowed by Duke's defense? In many of the games where the Duke score is lower than the average of all Duke offense scores, the opponent also scores less than the average of all Duke's opponents. In other words, there is a positive correlation between the sets of paired data samples. Calculated values for Pearson's, Spearman's and Kendall's Tau coefficients appear to be in agreement. (These values are in Cells C42:E42.) If this behavior were not modeled properly, then any future simulation of Duke Basketball outcomes would predict more Duke losses than the data really indicate. Last year's Duke squad did a good job on defense when their offense was not firing on all cylinders!
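For readers who want to see what sits behind coefficients like those in C42:E42, here is a minimal sketch in plain Python of Pearson's and Spearman's coefficients (Kendall's Tau is left out for brevity). The function names and the two short score lists are hypothetical placeholders of my own, not the spreadsheet's data or Crystal Ball's internals. Spearman's coefficient is simply Pearson's coefficient applied to the ranks of the data, with tied values sharing an average rank.

```python
from statistics import mean

def average_ranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical paired scores, for illustration only (NOT the Duke data).
duke_scores = [82, 76, 91, 68, 85, 74, 79, 88]
opponent_scores = [70, 66, 59, 61, 75, 58, 72, 64]
print("Pearson :", round(pearson(duke_scores, opponent_scores), 3))
print("Spearman:", round(spearman(duke_scores, opponent_scores), 3))
```

Because Spearman's coefficient works on ranks, it is less sensitive to a single blowout game than Pearson's, which is one reason a rank-based measure is a reasonable choice for correlating simulation inputs.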

To implement correlation behavior in CB between two CB Assumptions, the user must define a Spearman's coefficient through a series of CB windows. We will do this between the first best-fitting PDF for Duke scores and the first best-fitting PDF for opponent scores. After opening the Duke 09_10 Scores spreadsheet with Crystal Ball loaded within MS Excel, perform the following steps:

• Click on Cell I5.
• Select "Define Assumption" in the CB ribbon bar. (This will open the CB Assumption in a separate window; this first best-fitting PDF was previously defined by fitting to the Duke offense score data in Column C.)
• Select "Correlate…" in the lower left-hand corner.
• Select "Choose…" in the upper right-hand corner. (A list of CB Assumptions will come up with check boxes for all other CB Assumptions in your spreadsheet.)
• Check the box for the first best-fitting CB Assumption associated with the opponent score (a Negative Binomial).

The user has the option, at this time, to define a Spearman's coefficient in one of three ways:

1. typing a value in manually (done just below the "Choose…" button),
2. cell-referencing a cell that holds a pre-defined coefficient value (perhaps via an Excel formula), or
3. having CB calculate a Spearman's coefficient directly from data in the worksheet.

We will use the third approach as follows:

• Select the "Calc…" button.
• Using the cell-reference picker, identify the first cell range as C4:C37 and the second cell range as D4:D37.
• Select "OK."

As visualized in the data scatter chart, there is a positive correlation between Duke's offensive outputs and those of their opponents (see Fig. 6-1). The Spearman's coefficient value is approximately 0.35. Most analysts would term this a weak-to-intermediate correlation at best, so perhaps it won't have much impact on our simulation results. But it is strong enough for consideration. (Crystal Ball does have a feature to turn off all defined correlations in the CB Assumptions. Using it allows comparative MCA runs, with and without correlations, to gauge how much the correlations affect the variance of the results. I highly recommend this for all MCA models, regardless of the software used.)

• Select "OK."
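To make the with-and-without comparison concrete, here is a rough sketch in plain Python. It is not Crystal Ball's internal method: it assumes normal distributions with placeholder means and standard deviations (not the fitted PDFs from the spreadsheet) and induces the rank correlation with a Gaussian copula, using the standard relation r = 2 sin(πρ/6) between the Pearson correlation r of the underlying normals and the target Spearman's ρ. The function name and all parameter values are hypothetical.

```python
import math
import random

def simulate_loss_fraction(spearman_rho, trials=50_000, seed=1):
    """Fraction of simulated games in which the opponent outscores Duke."""
    rng = random.Random(seed)
    # Pearson correlation of the underlying normals that yields the
    # requested Spearman coefficient (exact for bivariate normals).
    r = 2.0 * math.sin(math.pi * spearman_rho / 6.0)

    # Placeholder marginals, for illustration only (NOT the fitted PDFs).
    duke_mean, duke_sd = 77.0, 11.0
    opp_mean, opp_sd = 62.0, 10.0

    losses = 0
    for _ in range(trials):
        z1 = rng.gauss(0.0, 1.0)
        # z2 is standard normal with correlation r to z1.
        z2 = r * z1 + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)
        duke = duke_mean + duke_sd * z1
        opp = opp_mean + opp_sd * z2
        if opp > duke:
            losses += 1
    return losses / trials

# Comparative runs: correlations "off" versus Spearman's rho of 0.35.
print("Loss fraction, no correlation:", simulate_loss_fraction(0.0))
print("Loss fraction, rho = 0.35    :", simulate_loss_fraction(0.35))
```

Under these assumptions the positive correlation narrows the spread of the winning margin, so the correlated run should show fewer losses than the uncorrelated one. Both runs use the same seed, which keeps sampling noise from obscuring the comparison.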