Copulas Vs. Correlation

Eric Torkia


Copulas and Rank Order Correlation are two ways to model and/or explain the dependence between 2 or more variables. Historically used in biology and epidemiology, copulas have gained acceptance and prominence in the financial services sector.

In this article we are going to untangle what correlation and copulas are and how they relate to each other. To prepare this summary overview, I had to read painfully dry material… but the result is a practical guide to understanding copulas and when you should consider them. I lay no claim to being a stats expert or mathematician… just a risk analysis professional, so my approach will be pragmatic. The tools used for the article and demo models are Oracle Crystal Ball 11.1.2.1 and ModelRisk Industrial 4.0.

 

What is correlation and how do I do it?


See on YouTube: PART 1: Understanding Copulas vs. Rank Order Correlation Overview

Correlation is used to assess the strength and direction of a linear relationship between 2 variables. Using the linear regression method, we can derive the magnitude of Pearson’s correlation coefficient (R) by taking the square root of R2; the sign of R is the sign of the regression slope. This is very easy to do in Excel. Alternatively, you can apply the CORREL() function to two arrays of data to see how they correlate. This is equivalent to plotting the data, applying a linear trend line in Excel, extracting the R2 value, taking its square root and attaching the sign of the slope.
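To make the link between CORREL() and the trend-line R2 concrete, here is a minimal Python sketch using only the standard library. The data set and the function names are made up for illustration; the point is simply that the square root of R2 recovers the size of Pearson's coefficient but not its sign.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient (the quantity Excel's CORREL() returns)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def trendline_r_squared(xs, ys):
    """R^2 of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Made-up, downward-sloping data (an assumption for illustration only).
x = [1, 2, 3, 4, 5, 6]
y = [10.1, 8.2, 6.8, 5.1, 2.9, 1.2]

r = pearson_r(x, y)
r2 = trendline_r_squared(x, y)
print("Pearson r     :", round(r, 4))              # negative for this data
print("sqrt(R^2)     :", round(math.sqrt(r2), 4))  # same size, sign lost
```

For a negatively related pair like this one, the square root alone would report a positive correlation, which is why the sign of the slope has to be carried over.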

The most widely known scale-invariant measures of association are the population versions of Kendall’s tau and Spearman’s rho. Both measure a form of dependence known as concordance. (Nelsen, 2002)

Rank order methods enable the modeler to dampen the influence of outliers by correlating the rankings of a data set instead of the values themselves… that is, the relative position of each data point within its dataset. In Spearman’s method we rank each variable, pair up the ranks and compute Pearson’s coefficient on those rank pairs.
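As a rough illustration of the rank-order idea, the sketch below computes Spearman's coefficient as Pearson's coefficient applied to ranks, using only the Python standard library. The data (with one deliberate outlier) and the helper names are assumptions made for this example.

```python
import math

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's coefficient computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Made-up data: y always rises with x, but the last point is an outlier.
x = [1, 2, 3, 4, 5, 6]
y = [1, 2, 3, 4, 5, 1000]

pearson_value = pearson_r(x, y)
spearman_value = spearman_rho(x, y)
print("Pearson :", round(pearson_value, 3))   # pulled down by the outlier
print("Spearman:", round(spearman_value, 3))  # ranks agree perfectly
```

Because the ranks of x and y line up exactly, the rank correlation is perfect even though the outlier distorts the value-based coefficient.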

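Kendall’s tau is a slightly different approach that looks at the probability of concordance minus the probability of discordance for a pair of observations (xi, yi) and (xj, yj) randomly chosen from the sample (Nelsen, 2002). A pair is concordant when the ordering of the two x values matches the ordering of the two y values, and discordant when the orderings are opposite.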
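The concordance idea can be sketched by brute force: look at every pair of observations, count the concordant and discordant ones, and take the difference as a share of all pairs. This is a minimal sample version (often called tau-a, with ties counted as neither); the data and function name are made up for illustration.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Sample Kendall's tau: (concordant - discordant) / number of pairs."""
    concordant = 0
    discordant = 0
    points = list(zip(xs, ys))
    for (xi, yi), (xj, yj) in combinations(points, 2):
        direction = (xi - xj) * (yi - yj)
        if direction > 0:
            concordant += 1   # x and y move in the same direction
        elif direction < 0:
            discordant += 1   # x and y move in opposite directions
        # direction == 0 means a tie; it counts as neither here
    total_pairs = len(points) * (len(points) - 1) // 2
    return (concordant - discordant) / total_pairs

# Made-up data: 10 pairs in total, 8 concordant and 2 discordant.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]

tau = kendall_tau(x, y)
print("Kendall tau:", tau)  # (8 - 2) / 10 = 0.6
```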


If we consider the tools, Oracle Crystal Ball, Palisade @RISK and Risk Solver Pro use Spearman’s method to correlate two variables. In a sense, they create a very basic bivariate copula in the background to correlate two items. This is why we get somewhat similar (but not identical) results when we compare ModelRisk’s explicit copula correlation with the other packages.

When you want to correlate two items in the packages above, you need to either estimate the correlation or, if you want to be technically correct, fit it (using Spearman, Kendall or some other recognized method). Then, depending on the tool, you will either apply it directly in a correlation matrix or, if you are using ModelRisk, be guided to model a copula.

Working with Copulas

A copula is a function which joins or “couples” a multivariate distribution function to its one-dimensional marginal distribution functions. (Clemen & Reilly 1999, Nelsen 2002)
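To show what "coupling" marginals means in practice, here is a minimal sketch of a Gaussian copula using only the Python standard library. It is not how any of the tools mentioned here implement correlation; the marginal distributions (a normal cost and an exponential delay), the parameter values and the function names are all assumptions chosen for illustration. The copula produces dependent uniform pairs, and each marginal is then applied through its inverse CDF.

```python
import math
import random
from statistics import NormalDist

def sample_gaussian_copula(n, rho, seed=42):
    """Return n pairs (u, v) of dependent uniforms from a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second standard normal with correlation rho to z1.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Pushing each normal through its own CDF gives uniforms on (0, 1).
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs

def simple_ranks(values):
    """Ranks 1..n (no tie handling; fine for continuous samples)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho via the sum of squared rank differences (no ties)."""
    n = len(xs)
    rx, ry = simple_ranks(xs), simple_ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))

uniforms = sample_gaussian_copula(2000, rho=0.7)
us = [u for u, _ in uniforms]
vs = [v for _, v in uniforms]

# Hypothetical marginals, applied through their inverse CDFs.
cost_dist = NormalDist(mu=100.0, sigma=15.0)      # cost ~ Normal(100, 15)
costs = [cost_dist.inv_cdf(u) for u in us]
delays = [-10.0 * math.log(1.0 - v) for v in vs]  # delay ~ Exponential, mean 10

print("rank correlation of uniforms :", round(spearman_rho(us, vs), 3))
print("rank correlation of variables:", round(spearman_rho(costs, delays), 3))
```

The two printed rank correlations match because the inverse CDFs are increasing functions and so do not change the ranks. That is the sense in which the copula carries the dependence and the marginals carry the shape of each variable.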

Generally, measuring and modeling dependencies has centered on correlation measures such as the ones mentioned above. Of course, it is rare for distributions to follow the strict spherical assumptions, with a constant dependence across the distribution, that correlation implies (Dorey & Joubert, 2005).

For this reason, copulas have gained great prominence as a method to model these non-constant correlations. Their flexibility in modeling non-linear relationships has been a great boon to the financial engineering field.

The one thing to remember is that to build a copula, you still need to assess the degree of association in some way. Clemen and Reilly (1999) suggest 3 methods to assess correlation:

  • Statistical Approaches: These techniques rely on an expert’s familiarity with statistical concepts related to correlation. For example, an expert might make a judgment regarding the “percentage of variance explained” (R2) that would result from regressing one variable on another.
  • Probability Concordance: This is consistent with decision analysis elicitation techniques. Assess conditional or joint probabilities and relate those to the required measure of dependence.
  • Conditional Fractile Estimates: This requires conditional estimates and uses these to derive Spearman’s r.

ModelRisk is the only tool that enables you to explicitly define a copula for the purpose of modeling relationships among variables. Only the professional and industrial versions enable you to fit & rank copulas (using Information Criteria, just as when fitting distributions, see previous article) to data to extract the correct parameters.  

As we mentioned earlier, in order to build a copula you need to assess the correlation or dispersion of the points using some method, such as Spearman’s rho, Kendall’s tau, covariance or some other accepted measure. Copulas use different parameters, i.e. Alpha and Theta, to define or configure how the data points of a copula disperse. There does not seem to be a method to manually derive the Alpha or Theta parameters from the data without going through the fitting tool or approximating the parameters using the Spearman correlation coefficient. Therefore, if you need to correlate using data, you will need to fit a copula using the wizard to get the parameters for a proper correlation equivalent.
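As a rough cross-check outside the fitting wizard, some one-parameter copula families do have well-known textbook relations between Kendall's tau and the copula parameter. The sketch below uses the standard forms for the Clayton, Gumbel and Gaussian families. Whether these match the Alpha or Theta that a given tool expects is an assumption: parameterizations differ between packages, so treat the output as a starting estimate and not as a replacement for fitting.

```python
import math

def clayton_theta_from_tau(tau):
    """Clayton copula: tau = theta / (theta + 2), so theta = 2*tau / (1 - tau)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("Clayton (standard form) needs 0 < tau < 1")
    return 2.0 * tau / (1.0 - tau)

def gumbel_theta_from_tau(tau):
    """Gumbel copula: tau = 1 - 1/theta, so theta = 1 / (1 - tau)."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("Gumbel needs 0 <= tau < 1")
    return 1.0 / (1.0 - tau)

def gaussian_rho_from_tau(tau):
    """Gaussian copula: tau = (2/pi) * asin(rho), so rho = sin(pi * tau / 2)."""
    if not -1.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [-1, 1]")
    return math.sin(math.pi * tau / 2.0)

# Example: a Kendall's tau of 0.5 measured on (hypothetical) data.
tau = 0.5
print("Clayton theta:", clayton_theta_from_tau(tau))           # 2.0
print("Gumbel theta :", gumbel_theta_from_tau(tau))            # 2.0
print("Gaussian rho :", round(gaussian_rho_from_tau(tau), 4))  # 0.7071
```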

Based on our understanding, we believe that ModelRisk uses Conditional Fractile Estimates to enable the creation of non-linear and non-constant correlation patterns. To use this method, you split your copula into bins known as fractiles (see the 10 pink and white bins in the diagram below). Each fractile then requires an assessment of the correlation of the points within it, using either Spearman’s rho or Kendall’s tau. Of course, we would need to validate this further with the people at Vose.


ModelRisk also enables you to construct bivariate copulas using distributions and correlation elicited from the risk analyst or a SME (Subject Matter Expert). This function is designed to assist risk analysts to define joint probability distributions with relative ease. Don’t forget we are talking about copulas… so relative is a very important word.

In the next section we are going to see how explicitly modeled copulas differ from traditional correlation methodologies. We will compare the results in the video.

 

Comparing the Results of  Correlation vs. Copula Models


See on YouTube: Understanding Copulas vs. Rank Order Correlation (Part 2: Demonstration in Excel)

Copulas are a very powerful and elegant way to accurately model correlation patterns; they do not assess them. One of the key reasons to seriously consider copulas is when the risk in the tails is of critical importance. This is why they are used in situations where the tail event is highly improbable but also highly catastrophic, if not fatal. We generally call these coconut risks because we are aware of the consequences but, due to the low probability of occurrence, decide to accept them. A real-world example is the nuclear plant disaster in Fukushima, where the operators accepted the (deemed) low risk of water damage to their electrical generators. This is why properly estimating risk in the tails is considered so important.
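To see why the choice of copula matters in the tails, the sketch below compares two copulas calibrated to the same Kendall's tau (0.5): a Gaussian copula and a Clayton copula, which concentrates dependence in the lower tail. It counts how often both variables fall in their worst 5% at the same time. Everything here (sample size, seed, the 5% threshold, function names) is an assumption for illustration and is not taken from the demo models in the videos.

```python
import math
import random
from statistics import NormalDist

def sample_gaussian_copula(n, rho, rng):
    """Dependent uniforms from a Gaussian copula with correlation rho."""
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs

def sample_clayton_copula(n, theta, rng):
    """Dependent uniforms from a Clayton copula (conditional inversion)."""
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # in (0, 1], avoids division by zero
        w = 1.0 - rng.random()
        # Invert the conditional distribution of v given u at level w.
        v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs

def joint_lower_tail_frequency(pairs, q):
    """Share of samples where both uniforms fall below q."""
    hits = sum(1 for u, v in pairs if u < q and v < q)
    return hits / len(pairs)

rng = random.Random(2024)
n, q, tau = 20000, 0.05, 0.5

rho = math.sin(math.pi * tau / 2.0)  # Gaussian copula with Kendall's tau = 0.5
theta = 2.0 * tau / (1.0 - tau)      # Clayton copula with Kendall's tau = 0.5

gaussian_tail = joint_lower_tail_frequency(sample_gaussian_copula(n, rho, rng), q)
clayton_tail = joint_lower_tail_frequency(sample_clayton_copula(n, theta, rng), q)

print("P(both in worst 5%), Gaussian copula:", round(gaussian_tail, 4))
print("P(both in worst 5%), Clayton copula :", round(clayton_tail, 4))
```

Both copulas have the same overall strength of association, yet the Clayton copula makes joint bad outcomes noticeably more frequent. A single correlation coefficient cannot express that difference.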


On the other hand, copulas are not for everybody or for every problem: they are very mathematically intensive and therefore require lots of computing horsepower as well as an above-average understanding of statistics. According to Chernih et al. (2007), "models that involve complicated copulas are by no means better than simple but robust and transparent models and do not always add value. However, building a simple as possible, but not too simple, model requires significant actuarial training and expertise."

Therefore, if the tails are not of significant consequence in your modeling application, you should consider the traditional methods employed by the other packages, both for performance reasons and, according to Chernih et al. (2007), for transparency as well.

Another important consideration when deciding whether you need to work with copulas is your modeling methodology. Given that copulas use arrays, you must identify most, if not all, of your model’s correlation relationships upfront in order to structure your data and models correctly. If you do not do this, you will run into trouble in short order when implementing your correlations.

Conversely, with the other packages, you can apply a more iterative approach for the implementation of correlation relationships.

Key Take-Aways:

  • Copulas are very powerful and useful risk analysis tools.
  • Copulas require an above average understanding of statistics.
  • Copulas are critical in the proper modeling of the risk in the tails of an output distribution.
  • Copulas enable the modeling of non-linear and non-constant correlation relationships.
  • Copulas require far more computing horsepower and can slow down a model quite a bit… so weigh performance against accuracy wisely.

If you have any questions or comments, please don't hesitate to drop me a line at 1-888-879-8440 or send me a note at  [email protected].

See the other videos in the series.

Article References:

  • Clemen, Robert T. and Reilly, Terrence, "Correlations and Copulas for Decision and Risk Analysis", Management Science, Vol. 45, No. 2, February 1999, pp. 208-224.
  • Nelsen, R., "Properties and Applications of Copulas: A Brief Survey," Lewis and Clark College / Mount Holyoke College, 2002.
  • M. Dorey, P. Joubert, "Modeling Copulas: An Overview", The Staple Inn Actuarial Society, 2007.
  • Chernih, Andrew, Maj, Mateusz and Vanduffel, Steven, "Beyond Correlations: The Use and Abuse of Copulas in Economic Capital Calculations" (April 2007). Belgian Actuarial Bulletin, Vol. 7, No. 1, 2007. Available at SSRN: http://ssrn.com/abstract=1359578
