TL;DR: Is AI and ML research in academia relevant and necessary? Yes.

Over the last few months (and at our very recent faculty retreat) I have had various discussions about the role of Artificial Intelligence and Machine Learning research (short: ML research) in academia, and about its relevance in light of large companies such as Google, Facebook, Microsoft, (Google’s) DeepMind, OpenAI, and Amazon pursuing their own research efforts in that space, at an unprecedented scale and with a resource backing that no academic institution will ever be able to replicate. A naïve first assessment might lead to the conclusion: we are done - let the industry guys take it from here. However, a more realistic assessment points to a synergetic relationship between industry and academia, located at very different stages of the research-to-product pipeline. Clearly, this post is (highly) biased; I am in academia after all (though I have worked in industry at various stages).

Industry ML research is valuable and important

ML research conducted in industry has had a huge impact over the last few years, with various high-profile examples including AlphaGo’s success at the game of Go as well as, more broadly, autonomous vehicles—although the latter have recently come under heightened scrutiny.


Often these successes are not so much fundamental advances in the underlying methodology as impressive demonstrations of transition-to-scale. In fact, several of these recent high-profile successes were made possible by an enormous amount of compute power for training, while the underlying methodology (e.g., policy gradients) has not fundamentally improved. This is good and bad at the same time. On the one hand, it demonstrates what our current methodology is capable of in the limit; that is good, as it helps us understand whether a methodology has an inherent shortcoming or whether it simply does not scale. On the other hand, it is bad, as it might reduce the perceived need to improve the underlying methodology, since we can somehow make it work.

Access to data and infrastructure

Another big advantage of industry ML research is that industry often has access to data and infrastructure that are not available in an academic setting. This allows industry to build ML systems that cannot be realized within an academic context, e.g., large-scale machine translation or systems such as Google’s Assistant. Moreover, these systems can be integrated into other large-scale systems, offering value to the user and society at large—though not necessarily for free. The impact of these systems on day-to-day life can be quite impressive.

Academic ML research is essential to society

I believe that academic ML research does/should/can/will play a very different role and can serve societal needs that are beyond the scope and interest of industry-driven ML research, as they do not bear an immediate or mid-term profit opportunity. I would like to stress a few things first though:

  • This applies to a large extent beyond the specifics of ML research; however, the current presence of large global industry players so close to academic entities in ML is (arguably) unprecedented.
  • The following topics are not exclusive to academic research, although, in my experience, they have been much more strongly represented in academia. Moreover, these topics come on top of the foundational research agendas in ML and AI that are found throughout top academic institutions and that I deem essential for the academic pursuit as a whole.

Conducting curiosity-driven high-risk research

In general, academic research is situated very differently. Not having to serve a company’s agenda and ultimately a profit motive allows for the exploration of fundamentally new methodologies and ideas that are higher risk but might ultimately lead to revolutionary approaches. After all, the basic ideas behind the approaches that we are riding on right now date back to roughly the 1940s and 1950s, and back then these ideas were considered crazy, unrealistic, or simply crackpot. It is precisely this curiosity-driven research that academia can provide and that is essential to society. Andrew Odlyzko provided an interesting and nuanced perspective on this in 1995, when he was at Bell Labs, in “The decline of unfettered research”:

We are going through a period of technological change that is unprecedented in extent and speed. The success of corporations and even nations depends more than ever on rapid adoption of new technologies and operating methods. It is widely acknowledged that science made this transformation possible. At the same time, scientific research is under stress, with pressures to change, to turn away from investigation of fundamental scientific problems, and to focus on short-term projects. The aim of this essay is to discuss the reasons for this paradox, and especially for the decline of unfettered research.

Open, transparent, and falsifiable

In contrast to industry research, which is often proprietary and only available in watered-down versions (lacking details, data, or both), academic research is typically made available to the public with enough detail to falsify a proposed approach. Staying true to Popper, this detail is extremely important: it allows for scientific discourse in which a hypothesis can be tested and rejected, and as such it ultimately advances science. It is also highly relevant in the context of the current “alchemy” discussion in ML research (see here for Ali Rahimi’s talk at NIPS, Yann LeCun’s response, and some background here, here, and an Addendum on Ben’s blog). I am with Ali, Ben, et al. on this one, especially if we really plan on deploying ML-based systems in the physical world and putting them at the center of life-and-death decisions, as, e.g., in autonomous vehicles… but that discussion is for some other time.

Tackling societal challenges

Academic (ML) research allows for tackling societal challenges that I believe deserve our attention even if they do not bear a short or mid-term profit opportunity. These include:

  • Infrastructure management. E.g., improving power systems, transportation systems, etc., especially given that many of those systems are beyond end-of-life or highly strained. One example that comes to mind is Jake Abernethy’s ML/data approach in the context of the Flint water crisis (see also here and here)—this is also a great example of synergies between industry and academia, as Google funded the research with $150k.
  • Healthcare. Beyond longevity, e.g., support systems for elderly healthcare and systems for improving health-related outcomes in resource-limited settings. These can, for example, include systems for the early detection of cognitive decline, as done at Riken AIP’s Cognitive Behavioral Assistive Technology Team.
  • Human impact on the earth. This includes understanding the human impact on global climate change, as well as mitigating the effects of severe weather events (through AI-based early warning systems) and potentially reversing them through a holistic understanding of the causal chains.
  • Broad societal challenges. Including mitigating the societal impact of unequal wealth distribution and the impact on the workforce of ever faster technology cycles.
  • Education. Having reached a point in time where technology cycles are so short that life-long learning is a necessity, ML approaches in teaching might significantly improve and speed up learning outcomes. This goes hand in hand with the sustainability question of online education and the resulting challenges from such decentralized approaches.
  • Protecting society against manipulation through data and ML. This includes things such as detecting deep fakes (now available as an app) (here is the SIGGRAPH video—check it out!), detecting manipulated news, as well as the detection of broader exposure to information bias, and many more.

Working with small, noisy data sets and unbounded losses

I believe another important challenge in learning that has received relatively little consideration in industry ML research is working with small, noisy, and potentially unlabelled data sets. Working with such data, which is often at the core of real-world applications, requires new approaches that interpolate dynamically between model-based approaches (which regularly incorporate deep domain knowledge) and model-free approaches, where the system dynamics are learned directly from data. For example:

  • Medical applications. Often it is hard (to impossible) to obtain the necessary amount of data for data-intensive learning approaches (such as, e.g., deep learning). Obtaining or ‘generating’ such data typically follows very complex and time-intensive processes requiring complex IRB approvals, and even when all formal requirements are met, a condition one would like to obtain data about is often so rare that the overall data availability and throughput is very limited.
  • Physical systems. Here the main challenge is that physical systems are bound by the limitations of physics, and as such the generation of data is often slow and expensive. To make things a bit more tangible, say you would like to build a reinforcement learning-based system for inventory management in a highly dynamic environment. For the necessary data collection, you either have to wait a long time, as you actually have to observe the system to obtain the data (a reality-in-the-loop approach), quite apart from the fact that testing in such systems is nearly impossible, or you have to run a simulation, but then you are likely to run into model-mismatch issues, as the simulation model does not quite match reality.
  • Unbounded losses. A standard approach for many learning problems is based on (regularized) empirical risk minimization (ERM), where we solve problems of the form $\min_{\theta} \frac{1}{n} \sum_{i \in [n]} \ell(f(x_i,\theta),y_i) + R(\theta)$ and $\theta$ is the parametrization of the learned model. ‘Getting it right on average’ (or some other form of probabilistic statement or risk measure), however, is often not good enough for real-world applications. A great example is (again) autonomous driving: we do not want to learn that crashing into a wall is bad by actually crashing into the wall; this is a typical scenario where losses are essentially unbounded, yet in the ERM problem their contribution to the average would be limited. Such applications require either a very different learning approach (here is a nice example from the MPC Lab @ Berkeley for safe learning; check out the video!) or, if an ERM formulation is desired or required, explicit consideration of the maximal loss (see this work of Shalev-Shwartz and Wexler, which can be nicely incorporated into many ERM approaches).
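To make the averaging issue above concrete, here is a minimal sketch in NumPy (all data, constants, and function names are synthetic and purely illustrative, not from any of the cited works): we fit a regularized ERM objective on a toy regression problem and then compare the averaged loss with a max-loss variant in the spirit of the maximal-loss idea mentioned above. The point is simply that the mean can look small while the single worst example remains large.

```python
import numpy as np

# Toy regularized ERM for linear regression with squared loss:
#   min_theta (1/n) * sum_i l(f(x_i; theta), y_i) + R(theta),
# with l the squared error and R(theta) = lam * ||theta||^2.

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
theta_true = np.array([1.0, -2.0, 0.5])          # synthetic ground truth
y = X @ theta_true + 0.1 * rng.normal(size=n)    # noisy observations

lam = 0.1  # regularization strength (illustrative choice)

def losses(theta):
    """Per-example squared losses l(f(x_i; theta), y_i)."""
    return (X @ theta - y) ** 2

def erm_objective(theta):
    """Standard averaged ERM objective."""
    return losses(theta).mean() + lam * theta @ theta

def max_loss_objective(theta):
    """Max-loss variant: penalizes the single worst example
    instead of washing it out in the average."""
    return losses(theta).max() + lam * theta @ theta

# Plain gradient descent on the averaged objective.
theta = np.zeros(d)
for _ in range(500):
    grad = (2.0 / n) * X.T @ (X @ theta - y) + 2.0 * lam * theta
    theta -= 0.05 * grad

# The averaged objective can hide a much larger worst-case loss.
print("avg objective:", erm_objective(theta))
print("worst per-example loss:", losses(theta).max())
```

Swapping `erm_objective` for `max_loss_objective` in the training loop (with a subgradient of the max) is one way to make the worst case explicit, at the price of a harder optimization problem; this is only a sketch of the idea, not the specific algorithm of the cited paper.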

Synergetic relationship between Academia and Industry

So how does this all come together? I strongly believe that the relationship between Academia and Industry has to be synergetic. Never has an ‘academic skillset’ had such a direct translation into an industry context. While this direct translation is a root cause of the current debate on the relevance of academic ML research, it is at the same time an opportunity for doing something together. Rather than outlining a very limited model of what one could do, I would instead mention two current themes that I think are not helpful for achieving synergies—of course, as always, there are exceptions.

The false god of co-employment

It is impossible to serve two masters with vastly different objectives. Some time back I talked to a researcher with a co-employment deal (similar to the one that Facebook would like to see), and I asked him about publishing. He told me about an argument that he had had with one of his superiors. Upon requesting time to publish (a pretty substantial methodological advance), he got the following answer (paraphrasing): “If it creates value for the company, why do you want to make it available to the public? If it does not create value, why do you waste time writing it up?” (For a much more nuanced and detailed discussion you should read: “You Cannot Serve Two Masters: The Harms of Dual Affiliation”.)

Tapping the talent pool disguised as university partnerships

Current interactions between academia and established industry players in the ML field are often reduced to treating the academic institution as a talent pool. This comes with many complications. Given the strong demand for ML talent, companies are vacuuming up whatever comes their way, including students who would have been brilliant academic researchers and who are much less suited for an industry R&D-type role. Often it is pure compensation numbers that lure students away, and while there might be a short-term benefit for industry, eventually this is akin to killing the golden goose. This is not to say that university partnerships with industry cannot be successful—I believe quite the opposite, actually—but the current predominant model is harmful to academic institutions (and society at large) and does not satisfy the equal-partners requirement; you know how the saying goes: if you cannot spot the sucker in the room, it is you.