ANDREW: Our guest today is Matt Turck, Managing Director at FirstMark Capital, the largest early stage venture capital firm in New York with about $1.6 billion under management. Matt focuses on investments in Big Data, AI and Machine Learning. On top of that, he organizes two monthly events about data. Data Driven NYC is one of the largest AI and Big Data event series in the country. Matt, thanks for joining us.
MATT: Thanks for having me.
ANDREW: The topic today is big data and machine learning and artificial intelligence. We know that you have written, spoken on these topics extensively. I’d love, if we could start, in two areas. One is, do I understand it correctly that even though FirstMark is a venture capital firm focused in on early stage investing, based out of New York, your particular focus is identifying companies that are in these three categories?
MATT: The way it works is that different partners tend to have different majors, if you want, and one of my key majors is indeed the world of data, broadly defined, which the way I think about it includes any data-driven business, any business that has to do with big data, data infrastructure, data science, and artificial intelligence. So, I actively invest in the space.
ANDREW: When you say any business that’s driven by data, what businesses today are not driven by data?
MATT: Surprisingly, a large number. So, it’s tempting for us in the tech community to assume that because we are in a world of tech and analytics and we geek out about those topics, the rest of the world is very much on board with this. The paradox is that for us in the tech community, the term Big Data, for example, is starting to feel very much three years ago or five years ago even. It’s sort of not cool anymore to say, “Oh, I’m all about big data.”
“Big Data provides the pipes and AI provides the smarts”
ANDREW: Well, help us define that term for our audience. What does the term ‘Big Data’ refer to?
MATT: Yeah, absolutely. But just to finish that other thought quickly, I think the rest of the world in many ways is just starting to catch on to the fact that Big Data is a thing and that it actually could have a completely transformative effect on their business. So, the term Big Data is one of those buzzwords that’s almost by definition imperfect, and almost by definition a term that means something for a couple of years and then becomes pretty much devoid of meaning. But the fundamental idea is that we are all exposed to amounts of data that are many, many, many times larger than any datasets that previous generations were exposed to, and that there is a wealth of potential knowledge, insights and actions that you can derive from that massive amount of information. So, in other words, everything becomes instrumented. So, of course, anything that’s online has a data exhaust. But increasingly, and the whole separate but very related trend of the Internet of Things participates in that, the rest of the physical world starts emitting data as well. So, Big Data describes that phenomenon, and more specifically describes a set of technologies that essentially started, call it, 10 years ago, that have the ability to harness these massive amounts of data and increasingly extract meaning from the data.
ANDREW: So, when we talk about Big Data in healthcare or Big Data abstracted from devices in the home or Big Data in an automobile context, does that definition suggest that there is a certain amount of data, that there is a certain continuity in the gathering of the data? Maybe you can get more granular, so people can get a sense of, if I’m analyzing cancer patients, how much data do I need to gather before it’s Big Data? At what point does it go from medium data to Big Data?
JEREMY: Or is it even a function of volume at this point?
MATT: Yes. So, look, there are two separate concepts. There’s the concept of Big Data, so just to rewind to a few years ago, there was this concept of the 3Vs, right? Volume, variety and velocity. So you had to have the 3Vs to be considered Big Data. And, indeed, for a number of specific tasks, and I guess we’ll talk about this in a minute, but for a number of specific tasks including anything where machine learning and AI is involved, having very large amounts of data is at this stage of the game essential. So, that’s one concept. But there is another concept, which is that you can do plenty of interesting things with smaller datasets, in particular very vertically specialized, clean and usable datasets. And if you’re in one of the industries that you mentioned, you can absolutely do great things with fairly basic analytics. So, it’s not Big Data for the sake of Big Data, you can do plenty of things with smaller datasets. However, to jump back to the first point, if you want to cure cancer or if you want even to detect cancer, let’s say you want to do that through radiology, which is a very current topic, then you need to have a very large amount of radiology images, so that you can run AI, specifically deep learning algorithms, on those images so that essentially you see so many different patterns that you train the machine to recognize those patterns and then enable the machine to detect the less immediately noticeable patterns and the rarer forms. So for that you need tons of data.
ANDREW: So the idea is that if someone is generating cancerous cells, that would manifest itself in a way that heretofore was undetectable by the human eye. But if we were able to gather the collective data of millions and millions of people who had some progression of, I don’t know whether we’re talking about ultrasounds, or sonograms, or MRIs, but we would be able to recognize patterns. And those patterns could be used to definitively suggest whether someone is on their way to developing a disease. Is that the Big Data scenario you just described?
MATT: Yes, that’s essentially it. So this takes absolutely the various forms that you described. The hot topic of the moment, where you see a lot of startups focusing, is again the concept of images. And so specifically radiology, where if you see enough images, you can train your deep learning algorithms, which is a form of AI that works particularly well on images, to recognize patterns but most importantly to be able to spot rarer forms of diseases. So think about it that way. If you’re a doctor, a radiologist, you’re going to be exposed to only so many different cases in your life. What you’re essentially saying is that if you pool together all the images that dozens and dozens, and hundreds of radiologists will see in their lifetime, you basically train the software to recognize not just the more common and obvious images and forms of disease, but you expose the machine to all the very rare forms of disease that, almost by definition, a radiologist would only see a handful of times in their career.
JEREMY: It sounds like Big Data is a prerequisite really for a lot of those other applications of the technology. Just to take a step back for a second, what is the state of Big Data today within the marketplace? Are we fully deployed, in the sense that we used to use typewriters, now we use word processors, and everyone uses Microsoft Word today? Where are we on that lifecycle when we talk about Big Data?
MATT: I think we are in the early maturity of the market, meaning that if you follow the classic adoption cycle, the pioneers and the early majority have started not only trying out the technology, but actually going through actual deployments. And then there is somewhere, and I forget the exact classification, but the late majority is starting to look at it and try it out. So, in other words, if you think of the beginning of this whole Big Data phenomenon as when essentially Hadoop started appearing on the scene, which was I think 2006 or 2007, basically you had a few years where Hadoop became the buzzword in conferences, and then the CIOs would send some people to the data conferences. And then these people would come back and say we absolutely must try this Hadoop thing, and then the CIO, to look good in front of their board, would say, “Okay, of course we have a Hadoop pilot.” And then that whole phenomenon happened pretty much up until 2011, 2012 if you want. And then 2012, 2013, the years may not be perfectly exact, but that was the time when some of these early people actually started doing those pilots, those early departmental deployments. Paradoxically, we are now in 2017 just getting into the phase where even those early adopters are truly in production with these Big Data technologies. And of course, I’m not talking about the Googles and the Facebooks of the world, because those are the people that actually developed these technologies in the first place. So those people have been in production for a very long time. What I’m talking about is sort of the Fortune 1000 companies. So there’s a group of early adopters that again started piloting the technologies a few years ago, then doing these multi-departmental deployments, and are just now starting to reap the benefits of all these early efforts. There’s a whole different group of people that are essentially just starting to dip their toes in the water.
I think there was a large number of people in particular that were waiting to see if the market was going to consolidate, that were waiting to see if an IBM, for example, was going to come with a one-stop shop where all the various parts of the data were going to be provided by one single vendor. I think at this stage of the game it’s becoming clear that that’s not going to be the case, at least not for the immediate future. Therefore, the late majority is now in full experimentation mode around Big Data.
JEREMY: It sounds like what you’re talking about really is sort of the abstraction of the complexity around these technologies. Does that mean that things like being in the cloud, so to speak, are a prerequisite for beginning to deploy a Big Data solution?
MATT: Interestingly not. The cloud aspect and the Big Data aspect are two very related concepts, but they’re not always related. In fact I would say that most of the Big Data technologies that you hear people talk about these days are being deployed in an on-prem model, as opposed to in the cloud. That’s in the world of Fortune 1000 companies, right, where there are still a bunch of concerns around security in the cloud, and other reasons why these Fortune 1000 companies don’t want to deploy in the cloud. However, there is a very strong trend, probably a decade-long trend, of even Fortune 1000 companies moving to the cloud. On the other hand, you see a lot of the big cloud companies providing all the core elements of the big data stack, but increasingly also the machine learning and AI stack, in the cloud. So if you look at AWS, they have pretty much all the core pieces that you need to be able to do Big Data in the cloud, whether that’s the database system, the warehouse, anything you need, but they also have all the deep learning algorithms, and the machine learning, and the computer vision, and everything that you need to be able to do AI in the cloud.
ANDREW: Matt, we’ve talked a couple of times about machine learning and AI, artificial intelligence and it sounds like the Big Data is the precursor to those. But I’d love it if you define those terms for our audience.
MATT: Absolutely. So the way to think about the articulation between both of those concepts is that Big Data is essentially the pipes, so Big Data is really the set of technologies that enables you to process very large amounts of data at scale, cheaply and efficiently. And then the other part, which is machine learning and AI, is really what enables you to extract meaning and intelligence, in the form, for example, of automation, from all that data that you’re able to process.
ANDREW: Maybe you can distinguish between, let’s distinguish between machine learning and artificial intelligence.
MATT: Sure. So, very simply, machine learning is a subpart of AI. AI is a broader term, and there are different forms of AI that do not involve machine learning, but at this stage for most cases the two terms are used pretty much interchangeably, and probably wrongly so. But all the excitement around AI that’s been happening over the last couple of years has had to do with machine learning.
ANDREW: So in the example we were talking about before, we’ve got all of these different MRIs of an organ progressing over time from billions of patients. The aggregation of that data, that’s the Big Data component that you’re talking about. And the machine learning, the subset of AI which is being used interchangeably, is the intelligence, I think you’re describing, that’s applied against that to predict that you are a high likelihood candidate for progressing to cancer.
MATT: Yes, absolutely. Or in another case, you’re in any kind of a Fortune 1000 company, you want to figure out who among all your customers are the people who are most likely to churn, which is a very standard use case. So, the Big Data part would be, “Hey, let’s gather all the customer data that we have, whether that is customer data that comes from click stream, mobile, web, people registering, email, whatever channel you can think of, let’s get all that data, let’s clean it up, let’s organize it in a way that it lends itself to analysis, and let’s store it, let’s be able to query it.” All of this is Big Data. And then the minute you start analyzing the data and trying to extract meaning from the data, in that case, you know, trying to predict who among your customers are going to churn, you start getting into the analytics and data science part of the world, and machine learning and AI is a part of it. So data scientists spend a lot of time playing around with machine learning models. This is not all of what they do, but it’s a big part, and that’s where the articulation is. So the way I characterize it quite often is that I say Big Data provides the pipes and AI provides the smarts.
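[Editor’s illustration, not part of the conversation] The split Matt describes, with the data being gathered and organized first and a model then predicting who will churn, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the customer records, the two features (logins and support tickets) and the function names are all hypothetical, and the model is a small hand-written logistic regression, not any particular company’s method.

```python
import math

# The "pipes": hypothetical, already cleaned and organized customer records.
# Each row is (logins_last_30_days, support_tickets, churned) with churned = 1 or 0.
CUSTOMERS = [
    (25, 0, 0), (30, 1, 0), (18, 0, 0), (22, 2, 0),
    (3, 4, 1), (1, 3, 1), (5, 5, 1), (2, 2, 1),
]


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def scale(logins, tickets):
    """Bring both features into a roughly 0..1 range so training is stable."""
    return (logins / 30.0, tickets / 5.0)


def train(records, epochs=2000, learning_rate=0.5):
    """The "smarts": fit a logistic regression with stochastic gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for logins, tickets, churned in records:
            x = scale(logins, tickets)
            predicted = sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias)
            error = predicted - churned
            weights[0] -= learning_rate * error * x[0]
            weights[1] -= learning_rate * error * x[1]
            bias -= learning_rate * error
    return weights, bias


def churn_probability(model, logins, tickets):
    """Estimate how likely a customer with this activity is to churn."""
    weights, bias = model
    x = scale(logins, tickets)
    return sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias)


MODEL = train(CUSTOMERS)

if __name__ == "__main__":
    print("inactive customer:", round(churn_probability(MODEL, 2, 4), 3))
    print("active customer:", round(churn_probability(MODEL, 28, 0), 3))
```

On this toy data the inactive customer scores well above 0.5 and the active one well below it. In a real deployment the records would come out of the storage and query layer described above, and the model would be validated on customers it had not seen during training.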
“…it is probably one of the most exciting times ever to be starting a company”
ANDREW: Okay. So, let’s do this. I love the opportunity to talk with you and dream about how certain verticals are going to be totally transformed because of big data and AI. So we just talked about health care, we gave actually just one example of health care. We can predict that someone is a candidate, not even a candidate, someone that will progress to develop a certain disease. In the case of subscription businesses, we talked about how media companies or any subscription business could predict who’s going to churn. Let’s- other predictions, other verticals, I’d love to talk about what is the future with ML and AI for the automotive industry. I’d also like to talk about what does it mean for the future of war, and we’ve been hearing a lot from Elon Musk and Mark Zuckerberg, but Elon Musk talking about the dangers of AI. But maybe we can start with automotive, any other verticals that you think it would be exciting if you’re an entrepreneur and you’re listening to this and you want to imagine what the future looks like because of these technologies. Here’s Matt Turck making some predictions about what the future looks like and then we can turn to Musk and Zuckerberg.
MATT: As an entrepreneur, I think it is probably one of the most exciting times ever to be starting a company because you are now able to take pretty much any kind of industry and apply this now increasingly robust set of technologies around big data and machine learning to build a product or a company that is completely transformative. So I think we are at that inflection point where a whole range of opportunities are open around machine learning and AI, whatever you call it, applications. I think a lot of the hard work around infrastructure has been done, and now is the time when you can start dreaming, “Hey, what if we took this industry and that problem, and applied Big Data and machine learning?”
ANDREW: But what industries excite you? Are you excited about agriculture, where people have an idea that all of a sudden there’s a different way to plant based on big data? Are you excited about manufacturing? Are you excited about space? What excites you?
MATT: I’m excited about it all, which sounds like a possibly glib answer, but I think that’s actually the reality of it. I think that there is a whole aspect, and that’s part of the question, or we can certainly talk about the futuristic aspects. I mean, I think autonomous vehicles, space, genetic engineering, precision medicine, all of this is completely machine learning and AI driven, the future of it, and that’s incredibly exciting. And as an investor, I certainly love all of this, and I certainly want to invest in all of this, and FirstMark has investments in self-driving vehicles and all of those things. At the same time, I think equally interesting from an investment perspective, and from an entrepreneurial perspective if you want to start a company, is the fact that there is a whole range of much more pedestrian but also very concrete applications of machine learning and big data that are happening right now. So, for example, we have a company in the portfolio called Hyper Science that applies machine learning to the otherwise incredibly non-glamorous world of back office automation. Basically what that means is that they enable very large customers, whether they are financial services or government or enterprises of that type of size, to go through massive amounts of documents, and images, and scanned forms, and extract information and knowledge, and to process and organize those documents. It’s a prime example of a super dusty area that nobody really wants to touch. It’s in some ways tedious and complicated and everything takes time, but that’s one area where AI has a 10x, maybe 100x impact on how quickly and efficiently you can make things work. So that’s one example.
ANDREW: It’s interesting.
MATT: But I think the world of work is completely changing as well with machine learning and AI. Again, another example in our portfolio, which I’m bringing up because I happen to be very familiar with it, is a New York-based company, x.ai, which is taking a very specific problem, which is scheduling meetings, and applying very significant amounts of AI to it. And what the product helps you do is essentially schedule meetings with an AI-powered digital assistant. So, if you want to schedule a meeting you just copy the assistant the way you would copy a human assistant, and then the assistant, which in the non-branded version of the product is email@example.com or firstname.lastname@example.org, will take over and enable you to schedule a meeting. That is another example of taking a very sort of painful, tedious task that everybody experiences, especially as we all schedule more and more meetings, and enabling AI to take over. So AI will have an impact on the world of work as well, in a very sort of relatable, daily way.
ANDREW: Let me ask you a very personal question. It’s interesting, a lot of these sound like fantastic business opportunities. For example, the personal assistant, I love that business. But I’m curious, from a personal perspective, you’re sitting in a place where you have expertise about the future and the promise of AI and Big Data, and it sounds like AI and Big Data offer the promise of curing many of the world’s problems, not just increasing productivity but curing disease, improving output of agriculture, improving education for people that don’t have it. Some might argue Big Data can be used to solve poverty. I’m curious whether you think of it as part of your mission to accelerate entrepreneurs that are in this Big Data space, where you really have an appreciation of the dynamics, or whether you think about it in the context of maybe this is an opportunity to cure many of the world’s ills?
MATT: Yeah. So, to be clear we have a number of companies in the portfolio that have, not only a great business but also have a very positive impact on the world. And you mentioned education just as an example, we were early investors in a company called Knewton which pioneers the whole concept of adaptive learning, that basically uses big data to customize on the fly educational material to the individual learning preferences of each learner. That’s one example. So, that’s really something we take into account, and we think there’s going to be plenty of opportunities for businesses that both do very well for their investors and founders from a financial perspective, but also have a very positive impact on the world.
“…not Skynet type AI yet…”
ANDREW: Why is Elon Musk so afraid of artificial intelligence? Maybe you can characterize his fears. And for those people that aren’t familiar with this public discussion he’s been engaging in, maybe you can catch us up.
MATT: I think Elon Musk is an obviously incredibly visionary entrepreneur who lives in a future that a lot of us think of as complete science fiction, which is part of his genius. I think the whole concept of space, of living on Mars, and that type of thing, to many of us that’s what we go to the movies to see. I think for him that’s very much something that he thinks about as an absolute reality that can be achieved in his lifetime. Which is in many ways the trait of geniuses. And he has this incredible ability to take that vision that is decades, potentially centuries out, and make it a reality. I think when he thinks about AI, he thinks about what AI could be in that very far future, and then rewinds it very quickly to today’s reality. So in other words, he lives in a version of the future that from our perspective feels very, very far away.
ANDREW: His perspective is that the logical progression of AI, if not legislated against, is that governments will ultimately create the Terminator. Putin just spoke about the future of artificial intelligence. He said, “The future belongs to those that develop artificial intelligence,” and then left us a little bit wondering what the context of that was. Is the context the things we’re talking about, healthcare? Is part of the context of that war? I’d love for you to talk about Putin and Musk in this context of the Terminator.
MATT: I think it’s a fundamentally important discussion. I think we all need to be extremely cognizant and aware of the potential risks. I think it’s important to start the conversation early. At the same time, my perspective from the trenches, as someone who spends a lot of time with companies that are actually trying to build machine learning and AI, is that we [are] very, very, very far from any kind of Skynet type scenario. I think we are at the moment, and at least for the next few years, in a scenario where extremely smart people are feeling incredibly challenged making small bits of AI truly work. Whether that’s again back office automation, or certain scientific discovery, or a digital assistant, at this stage it takes enormous efforts by people who are incredibly smart, and spend a lot of time with a lot of venture capital money, to do anything that is remotely real with AI. So I would not get carried away about global artificial intelligence that would take over the world. I mean, from my perspective it feels very, very far away. It could accelerate, but we’re really not there.
ANDREW: But the idea would be to build robots that could increase productivity across a whole host of industries using AI. It’s just that you think in the context of building soldiers, when Putin and Musk say whoever masters AI will run the world, you think that particular context is the wrong one, or is just too far into the distance to be concerned with?
MATT: It sort of depends. Again, that’s the difficulty of this term AI, which is that there’s that constant dance between AI in the future, the Terminator, that type of thing, and the reality of AI today, and those are two very different things. So, long term, all of this is absolutely a threat. I think today there is a level of threat that absolutely needs to be thought through, and considered, and protected against. But that’s not the Terminator threat. So, the current threat in the context of war is everything that Russia has been doing, which is by now fully well-documented, in particular using Ukraine as a lab for all sorts of cyber terrorist attacks. All of this is incredibly scary, and by all reports, all accounts, the Russians are becoming incredibly strong at it. AI is involved in some of that. But AI is just the next step in the usual cat and mouse game of security, where one party starts arming up, the opposing party starts arming up as well to defend, and then each tries to be stronger in response. So, AI is part of this, but that’s not Skynet type AI yet, and it’s not going to be for a long time. So again, super important debate. But I would relax quite a bit for now. It’s Elon Musk, Elon Musk is a different animal. He’s an incredible, once in a generation or more type of entrepreneur. He has the luxury to talk about that type of thing. If somebody is going to talk about the dangers in the future, that should be Elon Musk. He plays a very important role. I think for the rest of us, we can collectively focus on just making whatever product we’re trying to build, addressing a very narrow problem with AI, just focusing on making that work, and that is hard enough.
ANDREW: Matt, I’ve greatly enjoyed this discussion. For those that want to follow you, you have a blog at mattturck.com, and on Twitter you are @mattturck. Any other data points I should provide before signing off?
MATT: That’s great. If anyone is interested in learning more about this world, if you happen to be in New York, you could come attend our Data Driven NYC events. All the videos from Data Driven NYC are online, both on the FirstMark website at firstmarkcap.com and on YouTube. So if you go on YouTube and just search for Data Driven NYC, you’ll see a fairly significant library of videos now, where we’ve interviewed a lot of the top CEOs and CTOs of many of the most interesting startups in the big data and AI space.
ANDREW: Thank you, Matt.
MATT: Thank you. Really enjoyed it. Thanks very much for having me.
JEREMY: Thanks, Matt.