According to Techjury, humans are creating 2.5 quintillion bytes of data daily in 2020.
A quintillion has 18 zeroes, triple the number in a million. Every second of every day, every person is creating 1.7 MB of data, a number more easily conceptualised but still staggering. By the end of the year, the entire digital universe is expected to amount to 44 zettabytes.
It’s not surprising then that data science is becoming an increasingly strong driver in the asset management industry, not only in investment research but also in the overall investment process. Data science tools, investment philosophies, and creative and skilled minds make for a powerful combination. Investment managers who make the most of the data available to them stand to gain a convincing competitive advantage in this fast-developing digital era.
Data science gives investment analysts and portfolio managers access to platforms and tools to mine large datasets, enabling them to reach more scientifically determined and thus more reliable conclusions. Ultimately, decisions based on these insights can enhance investment results.
Various sources of data are currently being used worldwide in investment analysis, and “alternative” data has become extremely popular as an additional source of insight. Alternative data differs from traditional data (e.g. financial statements or a time series of share prices) in that it comes from more unconventional sources (e.g. web data, satellite images, or text data from Twitter).
Through various data science algorithms, this data can be used to generate information that wasn’t readily available before, which, in turn, can be used to create signals. An example is using natural language processing on Google search data to generate information about consumer sentiment. This concept encourages creativity. It also gives investment teams a more holistic and immediate view of what’s happening in the world and lets them quantify it in a way that informs investment decisions.
There’s a lot of hard work that goes into getting data into a form that is sufficiently sophisticated and useful. It is often estimated that about 80% of a project’s duration is spent on transforming and cleaning data before it’s suitable for use. Although this is an arduous process, it’s of the utmost importance that the data used in any analysis is of verifiable quality. As the saying goes, “garbage in, garbage out”.
Thankfully, data science has provided tools and algorithms that can sort through data quickly and accurately and, in so doing, identify outliers and significant anomalies. For example, to calculate the Value at Risk (VaR) for various assets over different periods, one must first collect pricing data for each asset from a data provider, then audit the data to ensure it is reliable (rules can be applied here for handling days with missing or incorrect values). The next stage is to combine and format the individual data series into one data structure for ease of use (for example, in either long or wide format). From this, the returns for each asset over the required periods can be calculated, and only then can the data be passed to the VaR calculation, a function for which probably already exists in a coding package.
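The VaR workflow described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the method any particular firm uses. The prices are made up, the cleaning rule (carry the last valid price forward) is just one possible choice, and the function names `to_wide`, `forward_fill`, `simple_returns` and `historical_var` are hypothetical. VaR is computed here by simple historical simulation; in practice a packaged implementation would normally be used.

```python
from collections import defaultdict

# Hypothetical long-format records: one (date, asset, price) row per observation.
# None marks a value that is missing from the data provider.
long_prices = [
    ("2020-01-01", "A", 100.0), ("2020-01-01", "B", 50.0),
    ("2020-01-02", "A", 98.0),  ("2020-01-02", "B", None),
    ("2020-01-03", "A", 101.0), ("2020-01-03", "B", 51.0),
    ("2020-01-06", "A", 97.0),  ("2020-01-06", "B", 49.0),
    ("2020-01-07", "A", 99.0),  ("2020-01-07", "B", 50.5),
]


def to_wide(records):
    """Pivot long records into a wide table: {date: {asset: price}}, sorted by date."""
    wide = defaultdict(dict)
    for date, asset, price in records:
        wide[date][asset] = price
    return {date: wide[date] for date in sorted(wide)}


def forward_fill(wide, assets):
    """Assumed cleaning rule: replace a missing or non-positive price with the last valid one."""
    last_valid = {}
    clean = {}
    for date, row in wide.items():
        clean_row = {}
        for asset in assets:
            price = row.get(asset)
            if price is None or price <= 0:
                price = last_valid.get(asset)  # stays None if there is no earlier price
            if price is not None:
                last_valid[asset] = price
            clean_row[asset] = price
        clean[date] = clean_row
    return clean


def simple_returns(clean, asset):
    """Period-on-period simple returns for one asset."""
    prices = [row[asset] for row in clean.values() if row[asset] is not None]
    return [later / earlier - 1.0 for earlier, later in zip(prices, prices[1:])]


def historical_var(returns, confidence=0.95):
    """Historical-simulation VaR: the loss at the (1 - confidence) empirical quantile."""
    ordered = sorted(returns)
    index = int((1.0 - confidence) * len(ordered))
    return -ordered[index]


assets = sorted({asset for _, asset, _ in long_prices})
clean = forward_fill(to_wide(long_prices), assets)
var_by_asset = {a: historical_var(simple_returns(clean, a)) for a in assets}

for asset, var in var_by_asset.items():
    print(f"{asset}: 1-day 95% VaR = {var:.2%}")
```

With only a handful of observations the 95% quantile is simply the worst observed return, so a real analysis would need a much longer price history.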
An exciting part of using data science tools in research is the range of visualisations available to you. What makes the visualisation process so powerful is its ability to process and summarise sizable amounts of data timeously in a visual format that is easy to assess. From simple indexed performance line charts and distribution plots to more complicated 3D plots and heat maps, you can view data from various angles for a variety of purposes, including analysing multivariate relationships, risk management and performance tracking.
Many visualisations are now also interactive, which provides a better user experience when trying to obtain a specific relevant view that could differ amongst analysts. For example, an analyst can look at various periods of an asset’s performance in a single graph, or choose which assets from a portfolio are displayed on a graph.
Figure: Float-adjusted market cap for each JSE Top 40 constituent over time
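The following graphic is a 3D surface plot of the implied volatility on the JSE Top 40 across months to maturity and moneyness (the position of an option’s strike price relative to the current level of the index, which indicates whether exercising the option would be profitable, break even or result in a loss).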
Figure: Implied volatility on the JSE Top 40
Incorporating data science into a systematic investment process has produced favourable outcomes by making the process more efficient (requiring less time-consuming, repetitive work to get to the point of making decisions), more scalable and more consistent.
Scenario analysis can be used in risk management to monitor the sensitivity of portfolios by showing how they would have behaved in past periods of crisis. Monitoring and analysis can also happen in real time, using current asset weights and compositions. This is valuable in that it enables portfolio managers to adjust portfolio positions ahead of expected changes in market conditions. Data science thereby provides the opportunity to make quick and proactive decisions to the benefit of the investment portfolio.
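As a rough illustration of scenario analysis, the sketch below applies a portfolio’s current weights to asset-class returns from past stress periods. All weights, scenario names and returns are hypothetical numbers chosen for the example, and `scenario_return` is an illustrative helper rather than a function from any library.

```python
# Hypothetical current portfolio weights (they sum to 1).
weights = {"equity": 0.6, "bonds": 0.3, "cash": 0.1}

# Hypothetical asset-class returns over two past crisis windows (illustrative only).
scenarios = {
    "crisis_A": {"equity": -0.40, "bonds": 0.05, "cash": 0.01},
    "crisis_B": {"equity": -0.20, "bonds": -0.10, "cash": 0.00},
}


def scenario_return(weights, asset_returns):
    """Portfolio return if today's weights had been held through the scenario."""
    return sum(weight * asset_returns[asset] for asset, weight in weights.items())


results = {name: scenario_return(weights, r) for name, r in scenarios.items()}
worst = min(results, key=results.get)  # scenario with the largest loss

for name, value in results.items():
    print(f"{name}: {value:.1%}")
print("Most damaging scenario:", worst)
```

Re-running the same calculation whenever weights change gives a quick view of how sensitive the current positioning is to each scenario.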
During portfolio optimisation processes, once the portfolio constraints and investment objectives have been decided, weights for chosen assets across a range of targeted risk buckets are generated. Some optimisation methods take considerable time to generate optimal results, but the more repetitions the optimisation runs, the more accurate the results.
There is no doubt that systematic investing is heavily data-driven and statistically based. To maximise the potential data science offers investment managers, companies will need to continue to develop and fully integrate data into the investment process and make additional investment in data storage and computing power.
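The trade-off between optimisation run time and accuracy mentioned above can be illustrated with a deliberately simple random search. This is a sketch under stated assumptions, not the optimisation method referred to in this article: the covariance matrix is hypothetical, the objective is plain minimum variance with long-only weights rather than targeted risk buckets, and `random_search` is an illustrative function name.

```python
import random

# Hypothetical annualised covariance matrix for three assets (illustrative numbers).
cov = [
    [0.0400, 0.0060, 0.0000],
    [0.0060, 0.0100, 0.0010],
    [0.0000, 0.0010, 0.0025],
]


def portfolio_variance(weights, cov):
    """Quadratic form w' * cov * w."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))


def random_weights(rng, n):
    """Draw long-only weights that sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [x / total for x in raw]


def random_search(cov, iterations, seed=0):
    """Keep the lowest-variance portfolio found among `iterations` random draws."""
    rng = random.Random(seed)
    best_weights, best_variance = None, float("inf")
    for _ in range(iterations):
        candidate = random_weights(rng, len(cov))
        variance = portfolio_variance(candidate, cov)
        if variance < best_variance:
            best_weights, best_variance = candidate, variance
    return best_weights, best_variance


for n in (100, 10_000):
    best_weights, best_variance = random_search(cov, n)
    print(n, [round(w, 3) for w in best_weights], round(best_variance, 6))
```

Because both runs use the same seed, the longer run examines every candidate the shorter run does and more, so its best result can only match or improve on the shorter one, at the cost of extra computing time.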
Eliminating human biases is another way that data science has improved the investment process. Cognitive and emotional biases like fear and overconfidence have been known to impact investment decisions.
Systematic investing has provided a way to combat these biases since it is evidence-based and relies on factor models and rules and not solely on judgement. Data science has made systematic investing more powerful and helps ensure that investment decisions are not skewed by human factors, but rather are made objectively, proactively and in alignment with the clients’ best interests.
This article was written by Shriya Roy, Quantitative Analyst at Prescient Investment Management.
Prescient Investment Management (Pty) Ltd is an authorised financial services provider (FSP 612).
The value of investments may go up as well as down and past performance is not necessarily a guide to future performance.
This representative is acting under supervision.