Hello Abhishek:
#1: Unfortunately, there isn't any reference for the 'polishing heuristics', which is why I described them as proprietary. We came up with them while dealing with many different types of datasets.
#2: You can access the reference books I mentioned on Google Books, available at
http://books.google.com/. A few chapters/pages are visible for free there, which should meet your needs.
Thanks.
-Samik
On 9/1/2011 11:50 PM, Chatterjee, Abhishek wrote:
Hello All,
I’m sorry for replying late as I had no access to emails for the last 2 days. Please find my comments below.
- I have no doubt about the Crystal Ball estimates. I’m trying to understand how Crystal Ball produces such accurate estimates of the 3 parameters while those from Excel Solver are not that close. Conversations with you and Eric have helped me increase my knowledge in this particular area. I have no idea about the polishing heuristics; I’ll do some research before coming up with any questions. I’ve tried estimating Weibull parameters by moments, but not the way you’ve mentioned. I will try doing that as well. Meanwhile, can you suggest some material on the polishing heuristics as well?
- The Excel files I sent had a list of parameter estimates for 2- and 3-parameter distributions. The 2-parameter estimates are pretty close to Crystal Ball’s. The problem was with the Weibull parameters, which were estimated using MLE, not the moment estimators you mentioned. I guess I will have to try it the way you mentioned.
- I agree with your 2nd point on GoF statistics.
I’m still in the process of acquiring the books you mentioned.
Thanks a lot for all your guidance!
Abhishek
From: Samik Raychaudhuri
Sent: Tuesday, August 30, 2011 10:19 PM
To: Eric Torkia
Cc: Chatterjee, Abhishek
Subject: Re: Fwd: Crystal Ball distribution fitting enquiry
Thanks Eric - got the files. I will still wait to hear from Abhishek on my latest comments before investigating further.
-Samik
On 8/30/2011 10:34 AM, Eric Torkia wrote:
My Bad… I will send the files along
Eric
From: Samik Raychaudhuri
Sent: Tuesday, August 30, 2011 11:52 AM
To: Eric Torkia
Cc:
Subject: Re: Fwd: Crystal Ball distribution fitting enquiry
Hello Abhishek/Eric,
I think I have missed the Excel file you mention in your email. Can you please resend that - I can take a look.
Other than that, here are a few comments on your followup email:
#3: Weibull distribution parameter evaluation: To fit the parameters of the Weibull distribution, we first estimate the shape using an iterative scheme, since the shape parameter has a standalone equation. Next we calculate the estimates for scale and location using the moment equations. After that, we do quite a bit of proprietary polishing of the parameters based on the dataset and other indicators. In our internal tests, we have consistently got very good results from the parameter fitting routine, but it is not easy to replicate the calculations using Excel. What I would like to mention, however, is that if you think the quality of the Weibull fit you are getting from Crystal Ball is not as good as what you get from your own calculations, please don't hesitate to send in the dataset for me to take a look. You can use the GOF statistic of your choice to judge the quality of fit for the parameter set you have evaluated, and compare it with the GOF statistic CB returns. Do let me know what you find.
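To illustrate the shape-first idea described above, here is a minimal sketch of a plain method-of-moments fit for the 3-parameter Weibull. It is an assumption about what a textbook version of this step could look like, not Crystal Ball's routine, and it contains none of the proprietary polishing. It relies on the fact that the Weibull skewness depends on the shape alone, so the shape can be solved from a standalone equation; scale and location then follow from the standard deviation and the mean. The function names are hypothetical.

```python
import math


def weibull_skewness(shape):
    """Theoretical skewness of a Weibull distribution; depends on shape only."""
    g1 = math.gamma(1 + 1 / shape)
    g2 = math.gamma(1 + 2 / shape)
    g3 = math.gamma(1 + 3 / shape)
    return (g3 - 3 * g1 * g2 + 2 * g1 ** 3) / (g2 - g1 ** 2) ** 1.5


def weibull3_from_moments(mean, std, skew, lo=0.2, hi=50.0):
    """Return (shape, scale, location) matching the given mean, std and skewness."""
    if not weibull_skewness(hi) < skew < weibull_skewness(lo):
        raise ValueError("skewness is outside the range this sketch handles")
    # Skewness falls as shape grows, so bisection on the shape is enough.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if weibull_skewness(mid) > skew:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    g1 = math.gamma(1 + 1 / shape)
    g2 = math.gamma(1 + 2 / shape)
    scale = std / math.sqrt(g2 - g1 ** 2)  # match the standard deviation
    location = mean - scale * g1           # match the mean
    return shape, scale, location


def sample_moments(data):
    """Return (mean, std, skewness) of a sample, using population formulas."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    skew = sum((x - mean) ** 3 for x in data) / n / var ** 1.5
    return mean, math.sqrt(var), skew
```

A typical call would be `weibull3_from_moments(*sample_moments(data))`. Sample skewness is noisy for small datasets, so estimates from a sketch like this can differ noticeably from a polished fit.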
Regarding reference on Weibull fit, the books I mentioned have materials on Weibull distribution too, but the polishing heuristics are not covered.
#4a. You are right: in general, the parameters should be calculated first and the GoF statistics after that. We do not directly minimize a GoF statistic for any distribution; rather, we calculate the GoF statistic and use it as a check to guide the parameter estimation. That works well in the absence of a closed-form MLE.
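For comparing two candidate parameter sets on the same data, a GoF statistic can be computed after the parameters are fixed. The sketch below computes the Kolmogorov-Smirnov (K-S) statistic against a 3-parameter Weibull CDF. It is only an illustration with hypothetical function names; how Crystal Ball computes or reports its statistics is not shown here.

```python
import math


def weibull_cdf(x, shape, scale, location=0.0):
    """CDF of a 3-parameter Weibull; zero at or below the location."""
    if x <= location:
        return 0.0
    return 1.0 - math.exp(-(((x - location) / scale) ** shape))


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF of data and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d
```

For example, `ks_statistic(data, lambda x: weibull_cdf(x, 2.0, 3.0, 1.0))` can be evaluated once with the Excel Solver parameters and once with the CB parameters; the smaller value indicates the closer fit by this particular statistic.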
Hope this helps. As before, please don't hesitate to get back if you have more questions.
Thanks.
-Samik
On 8/30/2011 7:21 AM, Eric Torkia wrote:
Hi Samik:
Our friend Abhi still has a question or 2 about Weibull and some reference material...
Thanks for your help!
Eric
Begin forwarded message:
From: "Chatterjee, Abhishek"
Date: 30 August, 2011 8:01:22 AM EDT
To: Eric Torkia <etorkia@technologypartnerz.com>
Subject: RE: Crystal Ball distribution fitting enquiry
Hello Eric,
Hope this mail finds you in the best of health. Any updates on this?
Abhishek
From: Chatterjee, Abhishek
Sent: Friday, August 26, 2011 8:57 PM
To: 'Eric Torkia'
Subject: RE: FW: Crystal Ball distribution fitting enquiry
Hello Eric,
First of all, I would like to thank you for mentioning the site on the Weibull distribution, and Dr. Samik as well for mentioning the materials on the gamma distribution. I’ve gone through the answers provided by Dr. Samik.
Please find attached a few Excel files along with some clarifications and comments.
- I used the term bivariate by mistake. As correctly pointed out, I was referring to 2- and 3-parameter distributions.
- I’m able to estimate the gamma parameters using Excel Solver. I read in the Crystal Ball statistical documentation that solutions sometimes might not exist for the gamma and beta distributions. But, as I said, I’m able to derive estimates for the 3-parameter gamma as well using Excel Solver.
- I’m still facing an issue with the Weibull estimates. I’m attaching the Excel samples as well. The location parameter estimate is not close to the Crystal Ball estimate. But if I fix the location parameter and then estimate the other 2 parameters using Excel Solver, the solutions are close to Crystal Ball’s. I tried to estimate the location parameter by a median rank method mentioned in the Weibull analysis handbook, but the estimates still seem to differ (the details can be checked in the Excel file). I’m trying to search for possible iterative schemes as mentioned below by Dr. Samik. I request you to suggest some materials on this, if available, as was done in the case of the gamma distribution.
- Regarding the explanation on techniques used when a distribution does not have closed form MLE, I have some further doubts.
- I was of the opinion that first distribution parameters are estimated and then GoF statistics are calculated. Please correct me if I’m wrong.
- I’ll also read about MME (method of moments) and percentile fits and get back to you in case I’ve any doubts.
- As mentioned by you, my tests also show that the Excel Solver and Crystal Ball estimates are pretty close for 1- and 2-parameter distributions. I’ve tried the exponential, normal and lognormal distributions.
Abhishek
Hi Abhishek:
As usual, Oracle has come through for us with an answer! I hope this helps and if you have any other questions, please don’t hesitate to ask.
Best regards,
Eric
From: Samik Raychaudhuri
Sent: Thursday, August 25, 2011 5:16 PM
To: Eric Torkia
Subject: Re: FW: Crystal Ball distribution fitting enquiry
Hi Eric,
Now onto the questions from your client. Please find them in the same order as the questions.
1. Crystal Ball uses MLE to calculate the distribution parameters when possible. MLE can be used for most of the distributions, including Gamma distribution.
Gamma distribution: For the MLE equations of 2-parameter gamma distribution, check the following references:
(a) Wikipedia:
http://en.wikipedia.org/wiki/Gamma_di...
(b) Cohen and Whitten. Parameter estimation in reliability and lifespan models. CRC Press. 1st Ed. 1988. Chapter 6.
(c) D'Agostino and Stephens. Goodness-of-fit techniques. Marcel Dekker Inc. 1986. Chapter 4. Section 4.12.
Note: (a) deals with the 2-parameter gamma, and the remaining two references deal with the 3-parameter gamma. We, however, use a slightly different algorithm to fit the 3-parameter gamma distribution, which includes the 2-param MLE equation as well as an iterative scheme.
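For reference, the standard 2-parameter gamma MLE can be sketched as follows. The shape k solves ln(k) - digamma(k) = ln(mean(x)) - mean(ln(x)), and the scale is mean(x) / k. This is the textbook 2-parameter equation only, solved here by bisection; it is not the 3-parameter algorithm mentioned above. The digamma helper is a hand-rolled approximation (an assumption made to keep the sketch free of external libraries), and all function names are hypothetical.

```python
import math


def digamma(x):
    """Digamma function for x > 0 via recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # digamma(x) = digamma(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def solve_gamma_shape(s, lo=1e-3, hi=1e3):
    """Solve ln(k) - digamma(k) = s for k; the left side falls as k grows."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisect on a log scale
        if math.log(mid) - digamma(mid) > s:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def gamma_mle(data):
    """Return (shape, scale) of the 2-parameter gamma MLE for positive data."""
    n = len(data)
    mean = sum(data) / n
    s = math.log(mean) - sum(math.log(x) for x in data) / n
    shape = solve_gamma_shape(s)
    return shape, mean / shape
```

The bracket of 0.001 to 1000 for the shape is an arbitrary choice for this sketch; data that imply a shape outside that range would simply pin the result to one end of the bracket.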
Beta distribution: You are right, there is no closed form MLE for either the 2-param or the 4-param beta distribution. Refer to the answer to next question.
2. If a distribution does not have a closed-form MLE, there are a few ways to fit data to the distribution. Some of them are MME (method of moments), percentile fit, or even a fit that minimizes one of the goodness-of-fit statistics (like the A-D statistic or the K-S statistic). We sometimes use a combination of strategies to fit these distributions.
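As a small example of the method of moments, the sketch below fits a 2-parameter beta distribution on [0, 1] by matching the sample mean and variance. These are the standard textbook moment estimators; whether and how Crystal Ball combines them with other strategies is not covered here, and the function name is hypothetical.

```python
def beta_from_moments(mean, variance):
    """Method-of-moments estimates (alpha, beta) for a beta distribution on [0, 1]."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < variance < mean * (1.0 - mean):
        raise ValueError("variance must be positive and below mean * (1 - mean)")
    common = mean * (1.0 - mean) / variance - 1.0
    return mean * common, (1.0 - mean) * common
```

For instance, a mean of 0.4 and a variance of 0.04 correspond to alpha = 2 and beta = 3.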
3. When MLE is available, using Excel Solver to find the distribution parameters should get you pretty close results. In my tests, the best match with CB comes when you lock the location parameter and solve the 2-parameter MLE for distributions like gamma or Weibull. As I mentioned, we use a slightly different scheme when location is involved, so the results may not match exactly, but they should still be pretty close. A couple of notes here: (a) You mention the MLE of gamma here, which contradicts your first question about MLE not being available for gamma. (b) I am not sure I understand your use of the terms 'univariate and bivariate distribution'. Everything we do in distribution fitting is in the domain of univariate distributions. There is no joint probability distribution being considered, so no bivariate or multivariate distributions either. You probably meant distributions with one or two params. Which distributions were you able to match?
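The locked-location approach can be sketched for the Weibull case as follows: subtract the fixed location, solve the standard 2-parameter MLE equation for the shape, then compute the scale in closed form. This is the textbook 2-parameter MLE solved by bisection, roughly what a solver would be doing, and not CB's scheme for the location. The function name and the shape bracket of 0.01 to 100 are assumptions made for this sketch.

```python
import math


def weibull_mle_fixed_location(data, location=0.0):
    """2-parameter Weibull MLE (shape, scale) with the location held fixed."""
    logs = [math.log(x - location) for x in data]  # requires every x > location
    n = len(logs)
    mean_log = sum(logs) / n
    max_log = max(logs)

    def score(shape):
        # Weights are scaled by the largest value to avoid overflow.
        w = [math.exp(shape * (v - max_log)) for v in logs]
        weighted = sum(wi * v for wi, v in zip(w, logs)) / sum(w)
        return weighted - 1.0 / shape - mean_log

    lo, hi = 1e-2, 1e2
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisect on a log scale
        if score(mid) < 0.0:      # the score rises with the shape
            lo = mid
        else:
            hi = mid
    shape = math.sqrt(lo * hi)
    # scale**shape equals the mean of (x - location)**shape, done in log space.
    log_mean_pow = max_log * shape + math.log(
        sum(math.exp(shape * (v - max_log)) for v in logs) / n
    )
    return shape, math.exp(log_mean_pow / shape)
```

Running this with the location fixed at the CB estimate gives a shape and scale that can be compared directly with the CB values for the same dataset.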
Please follow up if the above notes do not answer your questions, or you need more clarification. Also, if you are not able to match your dist fitting result between CB and Excel, please send me the file containing the data, and I can take a closer look. Hope this helps.
Thanks
-Samik
--
Samik Raychaudhuri, Ph.D.
Principal Member of Technical Staff
Oracle Americas, Inc.
Crystal Ball Global Business Unit
My blog on Crystal Ball:
http://oraclecb.blogspot.com/
On 8/24/2011 6:53 PM, Eric Torkia wrote:
Hi Samik:
I hope this message finds you well… I was wondering if you could help me resolve the following customer query about MLE in CB… kind of question that makes you think ;-)
By the way did you check out the latest video series on correlating assumptions?
I look forward to hearing from you soon!
From: Chatterjee, Abhishek
Sent: Wednesday, August 24, 2011 2:57 AM
To: info@crystalballservices.com
Subject: Crystal Ball distribution fitting enquiry
Hello,
We are using Oracle Crystal Ball Fusion Edition 11.1.2.0.00 for our analytics and simulation scenarios.
We have a couple of questions regarding distribution parameter estimation in Crystal Ball.
- Does Crystal Ball use MLE to calculate all distribution parameters, given that MLE is not available for some distributions like the gamma and beta distributions?
- If not, what other standard estimation techniques does Crystal Ball use?
- If maximum likelihood is used, can Excel Solver also be used for estimating 3-parameter distributions? There seems to be a difference when we estimate parameters using Excel Solver and Crystal Ball for distributions like Weibull and gamma, but the results match for univariate and bivariate distributions.
Looking forward to your quick assistance. Please let me know if some other information is required from our side.
Regards,
Abhishek Chatterjee
Benchmarking & Analytics
Organizational Excellence