Decision Science Developer Stack
Julia Programming


Eric Torkia

What tools should modern analysts master for three-tier design after Excel?


Are you an analyst wondering what comes next? Trying to avoid being kicked to the curb because of AI? There are a few things you need in your bag to remain relevant and to build the first-principles knowledge that keeps you from being automated out of existence. As an advanced Excel analyst, you might find your workbooks becoming increasingly complex, slow, and hard to manage as they grow. By adopting a three-tier design that separates the data, business logic, and presentation layers, you can significantly improve the efficiency, scalability, and maintainability of your models.

 

Data Tier: Offload Data to SQL Databases and OLAP Cubes

As an advanced Excel analyst, you’ve likely encountered situations where the sheer volume of data starts to overwhelm Excel’s capabilities. Whether it’s handling large datasets, performing complex queries, or integrating data from multiple sources, you may have found Excel straining under the weight of these tasks. This is where SQL (Structured Query Language) comes in, offering a robust solution for data management and integration that can serve as the backbone of your analytical work.

SQL is the industry-standard language for interacting with relational databases, and learning it can significantly enhance your ability to manage and query data efficiently. By starting with SQL, you’ll lay a strong foundation that not only supports your current Excel-based models but also prepares you for the more advanced simulation work that lies ahead.

Why Start with SQL?

SQL is designed for the express purpose of querying and managing large datasets, which often reside in relational databases. Unlike Excel, which is best suited for smaller, static datasets, SQL excels at handling dynamic, relational data spread across multiple tables. Learning SQL will enable you to extract, filter, and manipulate data more efficiently, making it the perfect starting point for any advanced data analysis workflow.

With SQL, you can seamlessly pull data into your existing models, prepare it for analysis, and ensure that your simulations are based on accurate, up-to-date information. As you move towards more complex simulations and modeling tasks, SQL’s ability to handle large-scale data management becomes invaluable.

  • Learn SQL Basics: Begin by mastering the fundamentals of SQL, including SELECT statements, WHERE clauses, and JOIN operations. These core concepts will allow you to retrieve and manipulate data from multiple tables, ensuring that your models are built on a solid data foundation.

  • Integrate SQL with Excel: While learning SQL, practice integrating SQL queries with your existing Excel models. Excel’s ability to connect to external databases via ODBC or other methods allows you to pull in large datasets, run SQL queries, and update your data without leaving the familiar environment of Excel.

  • Prepare for Advanced Analytics: As you become more proficient in SQL, start exploring how it integrates with other programming languages like Julia and Python. This integration will enable you to streamline your workflow, moving data seamlessly from your databases into your simulation models.

Starting with SQL equips you with the tools to manage data at scale, setting the stage for more sophisticated modeling and simulation work.
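To make that concrete, here is a minimal, hedged sketch of running a SELECT/JOIN/WHERE query from Julia against a local SQLite database. The database file (sales.db), the orders and customers tables, and the column names are illustrative assumptions, not from the article; the sketch assumes SQLite.jl, DBInterface.jl, and DataFrames.jl are installed.

```julia
# Minimal sketch: querying a relational database from Julia.
# sales.db and its `orders` / `customers` tables are hypothetical.
using SQLite, DBInterface, DataFrames

db = SQLite.DB("sales.db")

# SELECT + JOIN + WHERE: the same core SQL concepts described above.
sql = """
    SELECT c.region, o.order_date, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    WHERE o.order_date >= '2024-01-01';
"""

orders = DBInterface.execute(db, sql) |> DataFrame

# A quick aggregation before handing the data to a model or to Excel.
by_region = combine(groupby(orders, :region), :amount => sum => :total_amount)
println(first(by_region, 5))
```

The same query could just as easily be run from Excel over an ODBC connection; the point is that the data lives in the database, and every tier pulls from the same source.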

Business Logic Tier: Automate with Julia and Other Tools



Once your data is managed externally, the next step is to automate the business logic that drives your models. This is where Julia and other programming tools come into play. By moving the complex calculations, simulations, and logic from Excel to a more powerful language like Julia, you can enhance the speed, reliability, and scalability of your models.

  • Why Julia: Julia is designed for high-performance numerical computation, making it ideal for running simulations, optimizations, and complex calculations that would otherwise slow down Excel. It integrates well with databases, allowing you to pull data, process it, and feed the results back to Excel seamlessly.
  • Automation Benefits: Automating your business logic reduces manual effort, minimizes the risk of human error, and ensures that your models can scale with your business needs. You can also use Python or other tools to further automate tasks, integrate with different systems, and enhance your workflow.

By handling the heavy computational tasks outside of Excel, you free up the spreadsheet to focus on what it does best: presenting data and insights.
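As a hedged illustration of what "business logic outside Excel" can look like, the sketch below runs a small Monte Carlo profit simulation in Julia and writes a summary to a CSV file that an Excel workbook can link to. The distributions, parameters, and output file name are illustrative assumptions, not taken from the article; it assumes Distributions.jl, DataFrames.jl, and CSV.jl are installed.

```julia
# Minimal sketch: run the heavy calculation in Julia, hand results back to Excel.
# Demand/cost distributions and the output file name are illustrative.
using Distributions, DataFrames, CSV, Statistics, Random

Random.seed!(42)

function simulate_profit(n_trials::Int)
    demand = rand(Normal(10_000, 1_500), n_trials)              # units sold
    price  = 25.0                                                # price per unit
    cost   = rand(TriangularDist(14.0, 22.0, 17.0), n_trials)   # unit cost
    fixed  = 60_000.0                                            # fixed costs
    return demand .* (price .- cost) .- fixed
end

profit = simulate_profit(100_000)

summary = DataFrame(
    metric = ["mean", "p5", "p95", "prob_loss"],
    value  = [mean(profit),
              quantile(profit, 0.05),
              quantile(profit, 0.95),
              mean(profit .< 0)],
)

# Excel (or Power BI) can link to this file in the presentation tier.
CSV.write("profit_summary.csv", summary)
```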

 

Julia for High-Performance Simulations

Once you have a solid grasp of SQL and are comfortable managing and querying large datasets, the next step in your journey is to transition to Julia. Julia is a high-performance programming language designed for numerical computation, making it an ideal choice for porting and enhancing your Excel-based models.

While SQL lays the groundwork by managing your data, Julia takes things further by providing the computational power needed to run complex simulations, optimization models, and advanced mathematical analyses. Julia’s intuitive syntax and performance capabilities make it particularly appealing for analysts who need to scale up their models beyond what Excel can handle.

Why Transition to Julia?

Julia was designed with simplicity and performance in mind, making it an excellent choice for analysts looking to build more complex, computationally intensive models. Unlike other programming languages that can be cumbersome for numerical tasks, Julia allows you to write code that is both easy to read and incredibly fast to execute.

With Julia, you can replicate the functionality of your Excel models, but with the added benefits of scalability and speed. Whether you’re working on optimization problems, financial simulations, or statistical analyses, Julia provides the tools you need to elevate your work to a new level.

  • Learn Julia Basics: Start by familiarizing yourself with Julia’s syntax and core concepts. Julia’s design makes it easy for newcomers to pick up, especially if you have experience with other data-centric tools like Excel and SQL.

  • Explore DataFrames.jl: DataFrames.jl is Julia’s equivalent to Excel’s data tables, allowing you to manipulate and analyze datasets with greater efficiency. Begin by learning how to import, clean, and transform data using DataFrames.jl, much like you would in Excel, but with far greater speed.

  • Dive into JuMP.jl for Optimization: If your Excel models involve optimization, Julia’s JuMP.jl package will be a game-changer. JuMP.jl allows you to model and solve complex optimization problems with ease, providing a powerful alternative to Excel’s Solver.

By transitioning to Julia, you gain access to a language that can handle the computational demands of large-scale simulations and complex models, all while maintaining an intuitive and accessible syntax.
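As an illustration of the DataFrames.jl and JuMP.jl bullets above, the hedged sketch below loads a small hypothetical product table into a DataFrame and solves a simple product-mix linear program, the kind of problem you might otherwise hand to Excel's Solver. It assumes the HiGHS.jl solver is installed; product names, margins, and the capacity limit are made up for illustration.

```julia
# Minimal sketch: DataFrames.jl for the data, JuMP.jl for the optimization.
# All numbers are illustrative, not from the article.
using DataFrames, JuMP, HiGHS

products = DataFrame(
    name        = ["A", "B", "C"],
    margin      = [30.0, 20.0, 25.0],   # profit per unit
    hours_per_u = [2.0, 1.0, 1.5],      # machine hours per unit
)
capacity = 400.0                        # total machine hours available

model = Model(HiGHS.Optimizer)
n = nrow(products)

@variable(model, x[1:n] >= 0)           # units to produce
@constraint(model, sum(products.hours_per_u[i] * x[i] for i in 1:n) <= capacity)
@objective(model, Max, sum(products.margin[i] * x[i] for i in 1:n))

optimize!(model)

products.units = value.(x)              # optimal production plan
println(products)
println("Total margin: ", objective_value(model))
```

Unlike Solver, the model here is plain code: it can be version-controlled, tested, and scaled to thousands of decision variables without touching the spreadsheet.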

When You Need to Deploy in the Cloud Easily: Python



After mastering SQL for data management and Julia for high-performance simulations, Python is the final piece of the puzzle. Python’s versatility and extensive library ecosystem make it a valuable complement to the work you’ve already done in SQL and Julia. While Julia provides the computational power for simulations, Python offers unmatched flexibility for data manipulation, workflow automation, and integration with other tools.

Python is widely used in the data science community, and its extensive range of libraries can help you extend the capabilities of your models even further. Whether you’re looking to perform advanced data wrangling, develop machine learning models, or automate repetitive tasks, Python has the tools you need.

Why Add Python to Your Toolkit?

Python’s strength lies in its versatility and the breadth of its applications. It’s particularly useful for tasks that involve extensive data manipulation, integration with various systems, or the need for rapid prototyping. While Julia excels in speed and performance for numerical tasks, Python provides the flexibility to handle a wide range of other tasks that you might encounter in your analytical work.

Python’s libraries, such as Pandas for data manipulation and NumPy for numerical operations, are essential tools for any analyst. Additionally, Python’s interactive environment, Jupyter Notebooks, allows you to document your work, combine code with narrative text, and share your findings with others in a seamless, interactive format.

  • Learn Python Basics: Begin with the basics of Python, focusing on understanding its syntax and core programming concepts. Python’s straightforward design makes it accessible even for those new to programming.

  • Master Pandas for Data Wrangling: Pandas is Python’s go-to library for data manipulation, and it’s a natural extension of the work you’ve done with DataFrames in Julia. Learn how to clean, filter, and transform data efficiently using Pandas.

  • Explore NumPy and SciPy for Advanced Computation: For tasks that require advanced mathematical computations, NumPy and SciPy provide powerful tools that complement Julia’s capabilities. These libraries are essential for extending your simulation models and performing complex analyses.

  • Utilize Jupyter Notebooks for Documentation: Jupyter Notebooks offer an interactive environment where you can combine code, text, and visualizations in a single document. This makes it easy to document your models, share your work, and collaborate with others.

By incorporating Python into your workflow, you gain a versatile tool that complements the strengths of SQL and Julia, enabling you to tackle a broader range of analytical challenges with confidence.
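Because this article leans on Julia for its examples, one lightweight way to start mixing the two ecosystems is to call Python libraries directly from Julia. The hedged sketch below uses PyCall.jl to reach Pandas; it assumes PyCall.jl is installed and configured with a Python environment that has Pandas available, and it reuses the illustrative profit_summary.csv file from the earlier sketch.

```julia
# Minimal sketch: reaching Python's Pandas from Julia via PyCall.jl.
# Assumes PyCall.jl is installed and its Python environment has pandas.
# "profit_summary.csv" is the illustrative file from the earlier sketch.
using PyCall

pd = pyimport("pandas")

# Load the simulation summary with Pandas and do a quick bit of wrangling.
df = pd.read_csv("profit_summary.csv")
df = df.set_index("metric")
println(df.describe())
```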

Presentation Tier: Use Excel and PowerBI as the User Interface


Finally, Excel remains an excellent tool for the presentation tier, where you interact with the data and results. Excel’s intuitive interface, powerful charting capabilities, and familiarity make it the ideal platform for presenting and interpreting the outputs of your automated models.

  • Why Excel: Excel’s flexibility and ease of use make it perfect for creating dashboards, reports, and visualizations. By linking Excel to your SQL databases and Julia-automated models, you can pull in the latest data, run analyses, and present results all within a familiar environment.
  • Interactive Dashboards: Excel’s capabilities can be leveraged to build interactive dashboards that update automatically based on data pulled from your external sources, making it easier for stakeholders to explore insights without compromising on performance.

Conclusion

Starting with SQL gives you a strong foundation for managing and integrating data, ensuring that your models are built on a solid base of accurate, well-organized information. From there, transitioning to Julia allows you to scale up your models, harnessing the computational power needed for complex simulations and analyses. Finally, adding Python to your toolkit provides the flexibility to handle a wide variety of tasks, from data wrangling to automation, rounding out your ability to build and deploy sophisticated models.

By adopting this three-tier design—offloading data management to SQL and OLAP cubes, automating business logic with Julia, and using Excel or PowerBI for presentation—you optimize your Excel models for better performance, scalability, and usability. This approach allows you to maintain the user-friendly interface of Excel while significantly enhancing the power and reliability of your underlying models.

