Book Notes: AIQ
In the tech startup world, it seems like all you ever hear about these days is artificial intelligence and machine learning. Everywhere you look, there is an AI startup automating this, or an ML startup providing more clarity to that. But it’s hard to hear the real innovators through all of the noise. And even when you do find a startup that is using real, true-blue AI, it’s hard to tell exactly what they are talking about. At least, that’s how I feel. I did not get my master’s degree in data analytics or my PhD in neural networks, so I started looking for answers on my own.
Enter: AIQ, a book by Nick Polson and James Scott. This tome serves as an easily digestible explainer for all things Artificial Intelligence - not just in the startup world, but everywhere that technological advances are taking place. I found the book exceptionally enlightening. I would highly recommend it to anyone remotely interested in this space, or to anyone who wants to understand where our world is headed over the next 5, 10, or 30 years.
Polson is a professor of econometrics and statistics at the University of Chicago Booth School of Business. AI is his bread and butter, and he has published widely on the subject. He has a deep background in Bayesian statistics, which, it turns out, has a lot to do with the current state of AI. His co-author, James Scott, is a professor of statistics at the University of Texas at Austin (hook ’em). Scott also has a long record of scientific publications, and he spends his free time consulting with clients in various industries to help with their big data problems.
It’s no coincidence that a book about AI is written by two scholars who can go deep on statistics. The entire industry has been built on the shoulders of statisticians and mathematical wizards, because that’s essentially what AI boils down to: the effective use of highly developed statistical models, mathematical ingenuity, and big data. AI is also all about conditional probability - the chance that one thing will happen given that another thing has already happened. While this book doesn’t go super deep on the inner workings of some of these AI models, it does explain the logic behind how they are built.
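To make that concrete, here is a minimal worked example of conditional probability. The weather numbers are made up by me, not taken from the book:

```python
# A minimal worked example of conditional probability with made-up
# numbers: the chance of rain given that the morning was cloudy,
# computed from counts of past days.
rainy = 50              # days it rained, out of all days observed
cloudy = 60             # days that were cloudy in the morning
rainy_and_cloudy = 30   # days that were both cloudy and rainy
total_days = 200

p_rain = rainy / total_days                      # P(rain) = 0.25
p_rain_given_cloudy = rainy_and_cloudy / cloudy  # P(rain | cloudy) = 0.5
print(p_rain, p_rain_given_cloudy)  # knowing about the clouds doubles the estimate
```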
It’s clear that there is a lot of secret sauce that goes into building some of the best AI on the market today. But behind every super-complicated, highly esoteric neural network, there is a lot of straightforward statistical linear regression and a lot of human assumptions. Those human assumptions are key to why AI needs to be studied by everyone, not just computer engineers or math professors. Every AI has a human element built into it - and in order to understand the world around you in the near future, you are going to need to understand those assumptions as well as you understand your fellow humans.
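The connection between neural networks and plain regression is easy to demonstrate. Here is a minimal sketch of my own (not from the book) showing that a single "neuron" with no activation function is exactly a linear regression, fit by ordinary least squares:

```python
# A single linear "neuron" y = Xw + b is just linear regression.
# Made-up data; fit by ordinary least squares with NumPy.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                   # two input features
true_w, true_b = np.array([3.0, -1.5]), 0.5
y = X @ true_w + true_b + rng.normal(scale=0.1, size=100)  # noisy target

# Append a bias column and solve the least-squares problem directly.
X1 = np.column_stack([X, np.ones(len(X))])
w_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)
print(w_hat)  # ~ [3.0, -1.5, 0.5]: the "neuron's" weights and bias
```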
A few other key takeaways I had from the book can be found below:
Models, Models, Models. Artificial Intelligence is dependent on developing strong models, and every model is basically a way to predict something from assumptions. Even if you have the greatest data set in the world, it doesn’t matter if your model is terrible and predicting the wrong outcomes. Good models require good assumptions - if you are putting together an AI model, make sure you understand every single assumption and that each one is defensible. AI also relies on complete data, but solid assumptions and workarounds can be used to overcome incomplete data.
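One of the most common workarounds for incomplete data is imputation. Here is a minimal sketch of my own (the book doesn’t prescribe this) that fills in missing values with the mean of the observed ones - the assumption being made, which would need to be defensible, is that the values are missing at random:

```python
# Mean imputation: replace missing values with the average of the
# observed ones. Assumes the values are missing at random.
import numpy as np

ages = np.array([34.0, 51.0, np.nan, 28.0, np.nan, 45.0])

mean_age = np.nanmean(ages)   # mean over the observed values only
filled = np.where(np.isnan(ages), mean_age, ages)
print(filled)                 # missing entries replaced with 39.5
```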
Pattern Recognition. One of the cool things about being a human is our innate ability to recognize patterns; it is one of the key cognitive abilities that separates humans from other animals. Machine Learning, in turn, is the act of trying to replicate that pattern recognition ability in a scalable manner. Recognizing a pattern means fitting an equation to data - you have to have a solid understanding of the results from the past before you can build predictive results for the future.
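As a toy illustration (my own, not the book’s), fitting an equation to past data and then using it to predict the future can be as simple as fitting a line:

```python
# Fit a line to six months of made-up sales data, then use the
# fitted equation to predict month seven.
import numpy as np

months = np.arange(1, 7)                             # past observations
sales = np.array([10.0, 12.1, 13.9, 16.2, 18.0, 20.1])

slope, intercept = np.polyfit(months, sales, deg=1)  # fit a straight line
print(slope * 7 + intercept)                         # month-7 prediction, ~22
```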
Now is the Time. Neural Networks are basically super-sized predictive models. Models are larger than they used to be because they can be - we have more robust data sets, and more computing power to fit them, than ever before. More things are being tracked than ever, so we have more data points. All of this data can be stored efficiently in the cloud, and computing power is finally at a point where it can keep pace with predictive analytics on a super-sized scale.
Bayes’ Theorem. This theorem describes the probability of an event based on prior knowledge of conditions that might be related to the event. For example, if cancer is related to age, then, using Bayes’ theorem, a person’s age can be used to assess the probability that they have cancer more accurately than an assessment made without knowledge of the person’s age. Much of the modern practice of AI/ML is built on Bayes’ Theorem.
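Running the cancer-and-age example through the theorem, P(cancer | old) = P(old | cancer) × P(cancer) / P(old), looks like this - the numbers here are hypothetical, chosen only to show the mechanics:

```python
# Bayes' Theorem with hypothetical numbers.
p_cancer = 0.01            # prior: 1% of the population has this cancer
p_old = 0.20               # 20% of the population is in the older age group
p_old_given_cancer = 0.80  # 80% of cases occur in that age group

p_cancer_given_old = p_old_given_cancer * p_cancer / p_old
print(p_cancer_given_old)  # 0.04: knowing the age quadruples the estimate
```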
Vive la Révolution. We are currently living through the Natural Language Processing revolution. Natural Language Processing is the act of a computer interpreting human words - reversing the arrangement that held for the entire prior history of computing, when humans had to translate their intentions into computer language. A real breakthrough occurred with the development of statistical NLP, in which a computer predicts what a word is going to be based on the other corresponding words in a sentence. In other words, instead of computers having to know every rule in every human language, they just need to follow the trends in how humans actually use words. This boom time for NLP has opened the floodgates for a variety of ML/AI use cases.
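A bigram model is about the simplest version of this idea. Here is a minimal sketch of my own (not the book’s) that predicts the next word purely from how often word pairs appear in a tiny made-up corpus, with no grammar rules at all:

```python
# A bigram model: count how often each word follows each other word,
# then predict the most frequent follower. No grammar rules needed.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

# The most likely word after "the", based purely on usage trends.
print(following["the"].most_common(1))  # [('cat', 2)]
```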
This Model is Busted. Models can eventually develop “rust” if you don’t treat them correctly and don’t maintain them. Models go out of date if data is processed incorrectly or if the circumstances the model is being used to predict change significantly.
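One simple way to watch for rust - a sketch of my own, not a method from the book - is to compare a model’s error on recent data against the error it had when it was trained:

```python
# Flag model "rust" when recent prediction error has grown well
# beyond the error observed at training time. Numbers are made up.
import numpy as np

train_errors = np.array([1.1, 0.9, 1.0, 1.2, 0.8])   # errors at training time
recent_errors = np.array([2.9, 3.1, 2.7, 3.3, 3.0])  # errors on live data

if recent_errors.mean() > 2 * train_errors.mean():
    print("Model rust suspected: retrain or revisit the assumptions.")
```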
One of the stories used to frame a chapter is the tale of Admiral Grace Hopper and how she brought about the rise of Natural Language Processing in the 1950s and 1960s. Hopper served in the United States Navy during the Second World War and worked on the Harvard Mark I, a “super-computer” of its time that the military used to solve problems on large data sets and provide assistance to our armed forces. One of the things I found interesting about the Mark I was that newspapers of the time referred to it as a “robot brain”. That is compelling evidence that the hype around AI has always been misunderstood, and that credit is frequently placed where it shouldn’t be.

However, I think that AI is here to stay, and we are only going to see an increase in AI activity, especially in software startups. While it might not be a requirement for a startup to have an AI component, I do think it will become more and more common moving forward. What’s more, I think it’s important for consumers of software products to understand how AI works so they can better understand how the world around them works, because, pretty soon, it’s going to be everywhere.