Venture Capital Investment – Is It Art or Science?

Based on an OpenForum discussion moderated by Lana Duong and Zhao Yiliang, PhD

VC investment has traditionally been more art than science. A recent paper from the Stanford University Graduate School of Business, “How Do Venture Capitalists Make Decisions?”, surveyed 885 institutional venture capitalists at 681 firms and reported that the average professional looking to finance the next unicorn relies more on the magic of gut instinct [1].

However, in the big data era, there is plenty of investment-related data from sources such as Crunchbase, CBInsights, and Tech in Asia. One may also obtain other information related to startups, such as the founders’ professional backgrounds and connections, social media discussions, market trends, and app usage, to name a few. The field of machine learning (ML) has matured significantly in the last decade and has been successfully applied in numerous industries to solve problems such as product recommendation, demand forecasting, fraud detection, credit scoring and facial recognition.

These successful use cases for ML prompt the question – can we utilize data and technologies to build predictive models to assess a company’s investability and make better investment decisions? If we could feed all the aforementioned data into an ML model which is able to provide answers to the key questions in VC, such as how promising a startup is, which founders have higher chances of success, perhaps even what the exit value would be, then investment decisions would become more automated. The Openspace team came together and had a vibrant debate on the future of venture capital – will it be more art or science? What will be the extent to which data science can help predict an investment outcome, and the extent to which venture capitalists will be needed for their skills and value-add? Below are some of our thoughts.

In a hypothetical perfect world where we had all the data, could we rely on it and build models to make investment decisions? If we are looking to data to fully predict outcomes without human intervention, it is perhaps too early. Despite the increasing availability of data and the rapid advancement of data science and machine learning, early-stage venture capital investing is inherently a high-risk, high-uncertainty discipline. We do not expect any model to predict a company’s success accurately, within a narrow margin of error and without human intervention, in the foreseeable future.

Take, for example, the U.S. stock market, which has the highest data transparency and richest historical records. It is still far from being an efficient or predictable market, and most quantitative hedge funds still do not achieve superior returns on a consistent basis despite their sophisticated trading algorithms. Moreover, when we say data, we mean measurable, quantifiable data. By this definition, we would be missing some important drivers of outcomes that are more abstract or subjective and not quantifiable, such as market sentiment or the founder’s brilliance.

Also, we need to look out for the outliers – companies that become very successful or unsuccessful in a way that does not align with the model’s prediction. This is where the skills, experience and judgement of a good investor come in. However, we do believe that a data-driven approach can help us narrow down the search space and flag interesting opportunities that deserve more attention.

What type of data is needed to assess a company’s investability? To determine whether a company is a good investment, one would need historical data on i) investment outcomes, e.g. exits, ii) companies’ characteristics, and iii) external factors that affect the outcomes (such as the sector’s readiness). With these, one can estimate how ii) and iii) relate to i), then apply that relationship to predict an investment outcome or derive an investability score for a prospective company based on its idiosyncratic characteristics and current external/environmental factors.
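To make the idea of an investability score concrete, here is a minimal sketch in Python. It assumes a logistic regression, which is only one of many possible model choices and not necessarily what any investor uses. The two features (one company trait, one external factor), the toy history and every function name are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by plain batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            predicted = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = predicted - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def investability_score(weights, bias, features):
    """Score between 0 and 1; higher means more similar to past successful exits."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, features)))

# Hypothetical history: [company trait (ii), external factor (iii)] -> exit outcome (i)
history = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.2, 0.3], [0.1, 0.5], [0.3, 0.1]]
exited = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic(history, exited)
strong = investability_score(weights, bias, [0.85, 0.7])
weak = investability_score(weights, bias, [0.15, 0.2])
print(f"strong prospect: {strong:.2f}, weak prospect: {weak:.2f}")
```

A model like this captures correlation in past data, not causation, and with only a handful of recorded exits its scores would be very noisy, which is exactly the data scarcity problem discussed next.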

One can argue that the data for ii) and iii) is more readily available, while the data for i) is scarce, particularly in Southeast Asia where there have not been many exits and many of those exits do not disclose their values or returns. As exit data becomes richer over time, we can hopefully start to find meaningful correlations and build predictive models to assess startups’ future potential. 

Gathering sufficient data entails a long and patient process, which sometimes may include manual efforts. We have some way to go in this regard.

But coming back to the current environment, characterized by a shortage of formal databases, what are some alternative sources where we can find meaningful data? In order to build an adaptive and evolving predictive model, we need a frequent, automated (or low-effort) feed of updated, meaningful data at scale. This means that even though we can call up an industry friend to get a private data point on a certain company’s valuation, that is not a scalable method to collect large amounts of data. So where can we find additional information besides the usual publicly available databases?

An interesting idea came from one of our team members – how about we invest in a large sample of companies (say a few hundred) at a very early stage with relatively small cheques in order to collect data on them, such as which ones become fast movers, which go on to raise significant funding, and what makes them successful? Would that be an effective shortcut to building up the dataset, or would it be a waste of investor funds?

In addition to basic information such as valuation and amount of funds raised, it would be even more useful to get periodic data feeds on their business performance, such as transaction data or server logs, which would give us a vastly larger dataset. However, given resource/time constraints and data unavailability, we don’t need data from every single startup. If we can narrow down to a smaller pool and make a better judgement based on the information from that pool, that is already very helpful.

What about social media? In the US, Twitter is a great source of information: many people are active on it, and we can follow influential people to see what they follow, what they post, and who they interact with. From this, we can get a lot of data on which sectors, technologies, and companies are trending in the market.

In China, there are Weibo and several other social media platforms. In Southeast Asia, Facebook and Instagram are popular among a large part of the population. These platforms provide rich and dynamic information on product reviews, customer comments, and the latest happenings in the ecosystem.

These can give a meaningful sense of the popularity, quality and characteristics of products and companies, and what is trending. However, we should be aware of positivity bias, where people are more inclined to post positive comments than negative ones on a public forum, and take this into account when forming a view based on social media content.
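One simple way to account for positivity bias is to compare a company’s share of positive comments against a peer baseline rather than reading it in isolation. The Python sketch below illustrates this under stated assumptions: the comment counts are invented, the function names are hypothetical, and the upstream step of classifying comments as positive or negative is assumed to have been done already.

```python
def positive_share(positive, negative):
    """Fraction of classified comments that are positive."""
    total = positive + negative
    return positive / total if total else 0.0

def baseline_adjusted_sentiment(company_counts, peer_counts):
    """Company's positive share minus the average positive share of its peers.

    A value near zero means the company merely matches the (already rosy)
    norm for its peer group; only a clearly positive value stands out.
    """
    baseline = sum(positive_share(p, n) for p, n in peer_counts) / len(peer_counts)
    return positive_share(*company_counts) - baseline

# Hypothetical (positive, negative) comment counts for comparable companies
peers = [(80, 20), (90, 10), (70, 30)]

typical = baseline_adjusted_sentiment((80, 20), peers)
standout = baseline_adjusted_sentiment((95, 5), peers)
print(f"typical: {typical:+.2f}, standout: {standout:+.2f}")
```

Here a company with 80% positive comments looks strong in absolute terms but scores roughly zero once the peer baseline is subtracted.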

Can deal sourcing be done by machines? As more and richer databases and data feeds become available, we can set up systems that can detect trending companies, products, or apps, and flag them to the investment team. Such systems can also mine insights from the data and detect companies that meet certain criteria or have certain characteristics that give them a significant chance of success. While these systems alone do not lead to final investment decisions, they can act as helpful deal sourcing and filtering channels to complement traditional human-led efforts.
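As an illustration of such a filtering channel, the following Python sketch flags companies whose user base is growing quickly and ranks them for the investment team. It is a rule-based example under our own assumptions, not a description of any production system: the `CompanySignal` fields, the 20% growth threshold and the company names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

@dataclass
class CompanySignal:
    name: str
    sector: str
    monthly_active_users: List[int]  # oldest month first

def growth_rate(series):
    """Average month-over-month growth across the series."""
    changes = [(b - a) / a for a, b in zip(series, series[1:]) if a > 0]
    return sum(changes) / len(changes) if changes else 0.0

def flag_for_review(signals, min_growth=0.20, sectors: Optional[Set[str]] = None):
    """Return companies that pass the screen, fastest growing first."""
    flagged = [
        s for s in signals
        if growth_rate(s.monthly_active_users) >= min_growth
        and (sectors is None or s.sector in sectors)
    ]
    return sorted(flagged, key=lambda s: growth_rate(s.monthly_active_users), reverse=True)

# Hypothetical data feed
signals = [
    CompanySignal("Alpha", "fintech", [1000, 1300, 1700]),
    CompanySignal("Beta", "logistics", [5000, 5100, 5150]),
    CompanySignal("Gamma", "fintech", [200, 300, 450]),
]

flagged = flag_for_review(signals)
for company in flagged:
    print(company.name, f"{growth_rate(company.monthly_active_users):.0%}")
```

The output is a shortlist for human review, not an investment decision; thresholds like `min_growth` would need to be tuned to the data actually available.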

Is there a place for venture capital investors in a data-driven future? Venture capital is a tremendously dynamic industry; some may even call it thrilling. The investment process alone comprises three aspects, all of which are important: sourcing the deal, assessing the deal, and winning the deal.

While scientific data approaches can help with deal sourcing and assessment, we believe it is still down to the people behind the game to win the deal. Good entrepreneurs have a choice of investors, and we as investors want to win the deal by being credible, value-adding partners to them and their companies. 

At Openspace, having worked with over 30 companies (and counting) in our portfolio, we believe that having reputable, knowledgeable and value-adding investors can help elevate a company to a different trajectory and significantly increase its chances of a successful outcome. Building a successful business is, at the end of the day, hard work performed persistently by a group of talented, committed and cohesive human beings – the kind of thing no predictive model can replace.

Our investment managers and in-house operations experts work tirelessly to assist our portfolio companies’ management teams on business strategy, follow-on financing, positioning for/executing an exit, new product/market expansion, introductions to strategic partners, senior personnel recruitment, employee satisfaction, technological build-out, product improvement, marketing initiatives, and you guessed it, data science.

We have seen that these hands-on efforts make a difference to a company’s success, and they are particularly relevant in Southeast Asia, where most things are still under development. This is why we do what we do – venture capital with a hands-on, helpful approach which embraces technology and data to complement our investment decision making and elevate outcomes for our portfolio.


It is a fascinating endeavor to infuse data science into the art of venture capital investment. VC is inherently a private sector that deals with young companies which have limited track records and are attempting to develop new technologies, new products, and sometimes new industries altogether. Hence it may be a long way until we can get close to having rich enough data and powerful enough models to replace the human magicians that are venture capitalists in this craft. 

However, data models can at least help detect early signals and highlight promising investment opportunities, prompting us to pay attention to them and apply our judgment on top of the models’ recommendations to arrive at the final investment decisions. 

Venture capital is also inherently a human-driven, relationship-based discipline where a big part of success comes from winning over good founders, and sweating together with them, supporting them in all aspects possible as they build their companies. This is indeed the part of the job that is most exciting and rewarding to us, venture capital investors. 

Having said that, technology and data are powerful forces that can and will transform every industry. As we brace ourselves for the future, Openspace is a believer and a pioneer in applying data science in our investment processes to enhance and enrich the efficiency and quality of our decisions. We believe that the combination of art and science will be the future of VC, and we can’t wait to see the results (i.e. superior returns) in the years to come.    


[1] Gompers, Paul A., et al. “How do venture capitalists make decisions?” Journal of Financial Economics 135.1 (2020): 169-190.