Overview
The finale of the Deep Learning Workshop at ICML 2015 was the panel discussion on the future of deep learning. After a couple of weeks of extensive discussion and exchange of emails among the workshop organizers, we invited six panelists: Yoshua Bengio (University of Montreal), Neil Lawrence (University of Sheffield), Juergen Schmidhuber (IDSIA), Demis Hassabis (Google DeepMind), Yann LeCun (Facebook, NYU) and Kevin Murphy (Google). As the recent deep learning revolution has come from both academia and industry, we tried our best to balance the panel so that the audience could hear from experts on both sides. Before I say anything more, I would like to thank the panelists for having accepted the invitation!
Max Welling (University of Amsterdam) moderated the discussion, and personally, I found his moderation to be perfect. A very tight schedule of one hour, with six amazing panelists, on the grand topic of the future of deep learning; I cannot think of anyone who could’ve done a better job than Max. On behalf of all the other organizers (note that Max Welling is also one of the workshop organizers), I thank him a lot!
Now that the panel discussion is over, I’d like to leave a brief note here of what I heard from the six panelists. Unfortunately, only as the panel discussion began did I realize that I didn’t have a notepad with me. I furiously went through my backpack and found a paper I needed to review. In other words, due to the lack of space, my record here is likely neither precise nor extensive.
I’m writing this on the plane, so forgive me for any errors below (or above). I wanted to write it down before the heat from the discussion cools down. Also, almost everything inside quotation marks is not an exact quote but a paraphrased one.
On the present and future of deep learning
Bengio began by noting that natural language processing (NLP) has not been revolutionized by deep learning, though there has been huge progress over the last year. He believes NLP has the potential to become the next big thing for deep learning. He also wants more effort invested in unsupervised learning, a sentiment echoed by LeCun, Hassabis and Schmidhuber.
Interestingly, four out of the six panelists, LeCun, Hassabis, Lawrence and Murphy, all identified medicine/healthcare as a next big thing for deep/machine learning. Some of the areas they expressed interest in were medical image analysis (LeCun) and drug discovery (Hassabis). Regarding this, I believe Lawrence is already pushing in this direction (DeepHealth, from his earlier talk on the same day), and it’ll be interesting to contrast his approach with those of Google DeepMind and Facebook later.
LeCun and Hassabis both picked Q&A and natural language dialogue systems as next big things. In particular, I liked how LeCun put these in the context of incorporating knowledge-based reasoning, knowledge acquisition and planning into neural networks (or, for that matter, any machine learning model). This was echoed by both Hassabis and Schmidhuber.
Schmidhuber and Hassabis identified sequential decision making as the next important research topic. Schmidhuber’s example of Capuchin monkeys was both inspiring and fun (not only because he mistakenly pronounced it as a cappuccino monkey). In order to pick a fruit at the top of a tree, a Capuchin monkey plans a sequence of sub-goals (e.g., walk to the tree, climb the tree, grab the fruit, …) effortlessly. Schmidhuber believes that we will have machines with animal-level intelligence (like a Capuchin smartphone?) in 10 years.
Slightly different from the other panelists, Lawrence and Murphy are more interested in transferring the recent success of deep learning to tasks/datasets that humans cannot solve well (let me just call these kinds of tasks ‘non-cognitive’ tasks for now). Lawrence noted that the success of deep learning so far has largely been confined to tasks humans can do effortlessly, but the future may lie with non-cognitive tasks. For these non-cognitive tasks, Murphy noted, the interpretability of trained models will become more valuable.
Hierarchical planning, knowledge acquisition and the ability to perform non-cognitive tasks naturally lead to the idea of an automated laboratory, as Murphy and Schmidhuber explained. In this automated laboratory, a machine will actively plan its goals to expand its knowledge of the world (by observation and experiments) and provide insights into the world (interpretability).
On Industry vs. Academia
One surprising remark from LeCun was that he believes the gap between the infrastructures at industry labs and academic labs will shrink over time, not widen. This would be great, but I am more pessimistic than he is.
LeCun went on to explain the open research effort at Facebook AI Research (FAIR). According to him, there are three reasons why industry (not only FAIR specifically) should push open science: (1) this is how research advances in general, (2) this makes a company more attractive to prospective employees/researchers and (3) there’s competition among companies in research, and this is the way to stay ahead of the others.
To my surprise, according to Hassabis, Google DeepMind (DeepMind from here on) and FAIR have agreed to share a research software framework based on Torch. I vaguely remember hearing something about this being under discussion some weeks or months ago, but apparently it has happened. I believe this will further speed up research at both FAIR and DeepMind. Though, it remains to be seen whether it will be beneficial to other research institutions (like universities) for the two places with the highest concentration of deep learning researchers in the world to share and use the same code base.
Hassabis, Lawrence, Murphy and Bengio all believe that the enormous resources available in industry labs are not necessarily an issue for academic labs. Lawrence pointed out that, other than the data-driven companies (think of Google and Facebook), most companies in the world are suffering from an abundance of data rather than enjoying it, which opens a large opportunity for researchers in academic labs. Murphy compared research in academia these days to the Russians during the space race between the US and Russia. The lack of resources may prove useful, or even necessary, for the algorithmic breakthroughs that Bengio and Hassabis still consider important. Furthermore, Hassabis suggested finding tasks or problems where one can readily generate artificial data, such as games.
Schmidhuber’s answer was the most unique one here. He believes that the code for truly working AI agents will be so simple and short that eventually high school students will play around with it. In other words, there won’t be any worry of industry monopolizing AI and its research. Nothing to worry at all!
On the Hype and the Potential Second NN Winter
As he’s been asked this question about overhyping every time he is interviewed by a journalist, LeCun started off on this topic. Overhyping is dangerous, said LeCun, and there are four contributing factors: (1) self-deluded academics who need funding, (2) startup founders who need funding, (3) program managers of funding agencies who manage funding and (4) failed journalism (which probably also needs funding/salary). Recently in the field of deep learning, the fourth factor has played a major role, and surprisingly, not all news articles have been the result of the PR machines at Google and Facebook. Rather, LeCun would prefer that journalists call researchers before writing potential nonsense.
LeCun and Bengio believe that a potential solution, both to avoid overhyping and to speed up progress in research, is an open review system, where (real) scientists/researchers put up their work online and publicly comment on it so as to let people see both the upsides and downsides of a paper (and why this paper alone won’t cause the singularity). Pushing it further, Murphy pointed out the importance of open-sourcing research software, with which other people can more easily understand the weaknesses or limitations of newly proposed methods. Though, he also noted that it’s important for the authors themselves to clearly state the limitations of their approach whenever they write a paper. Of course, this requires what Leon Bottou asked for in his plenary talk (reviewers should encourage the discussion of limitations, not kill the paper because of them).
Similarly, Lawrence proposed that we, researchers and scientists, should slowly but surely approach the public more. If we can’t trust journalists, then we may need to do it ourselves. A good example he pointed to is the “Talking Machines” podcast by Ryan Adams and Katherine Gorman.
Hassabis agrees that overhyping is dangerous, but also believes that there will be no third AI/NN winter, because we now understand better what caused the previous AI/NN winters, and we are better at not promising too much. If I may add my own opinion here, I agree with Hassabis, and especially because neural networks are now widely deployed in commercial applications (think of Google Voice), it’ll be even more difficult to have another NN winter (I mean, it works!)
Schmidhuber also agrees with all the other panelists that there won’t be any more AI/NN winters, but for yet another reason: the advances in hardware technology toward “more RNN-like (hence brain-like) architectures,” where “a small 3D volume with lots of processors is connected by many short and few long wires.”
One comment from Murphy was my favourite: ‘it is simply human nature.’
On AI Fear and Singularity
Apparently Hassabis of DeepMind has been at the core of the recent AI fear expressed by prominent figures such as Elon Musk, Stephen Hawking and Bill Gates. Hassabis introduced AI to Musk, which may have alarmed him. In recent months, however, Hassabis has convinced Musk, and he also had a three-hour-long chat with Hawking about this. According to him, Hawking is less worried now. Still, he emphasized that we must prepare for, not fear, the future.
Murphy found this kind of AI fear and the discussion of singularity to be a huge distraction. There are so many other major problems in the world that require more immediate attention, such as climate change and growing inequality. This kind of AI fear is simply oversold speculation and needs to stop, a view with which both Bengio and LeCun agreed. Similarly, Lawrence does not find the fear of AI to be the right problem to worry about. Rather, he is more concerned with the issue of digital oligarchy and data inequality.
One interesting remark from LeCun was that we must be careful to distinguish intelligence from human qualities. Most of the problematic human behaviours, which make many fear human-like AI, are caused by human qualities, not intelligence. An intelligent machine need not inherit those human qualities.
Schmidhuber had a very unique view on this matter. He believes that we will see a community of AI agents consisting of both smart ones and dumb ones. They will be more interested in each other (just as ten-year-old girls are more interested in and hang out with other ten-year-old girls, and Capuchin monkeys are more interested in hanging out with other Capuchin monkeys), and may not be too interested in humans. Furthermore, he believes AI agents will be significantly smarter than humans (or rather, himself) while lacking those human qualities that he does not like about himself, which is in line with LeCun’s remark.
Questions from the Audience
Unfortunately, I was carrying around the microphone during this time and consequently couldn’t take any notes. There were excellent questions (for instance, from Tijmen Tieleman) and excellent responses from the panelists. If anyone reads this and remembers those questions and answers, please share them in the comment section.
One question I remember came from Tieleman. He asked the panelists for their opinions on active learning/exploration as an option for efficient unsupervised learning. Schmidhuber and Murphy responded, and before I recount their responses, let me say that I really liked them. In short (or as far as I can trust my memory), active exploration will happen naturally as a consequence of rewarding better explanations of the world. Knowledge of the surrounding world and its accumulation should be rewarded, and to maximize this reward, an agent or an algorithm will actively explore its surroundings (even without supervision). According to Murphy, this may reflect how babies learn so quickly without much supervision signal or even much unsupervised signal (their active exploration compensates for the lack of unsupervised examples by allowing a baby to collect high-quality unsupervised examples).
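To make that idea slightly more concrete, here is a minimal sketch of what “rewarding knowledge accumulation” could look like. This is purely my own illustration, not anything the panelists described: the toy model (Bayesian linear regression on random cosine features), the data, and the reward proxy (predictive variance standing in for expected information gain) are all assumptions made just for this example.

```python
# A toy "active exploration as reward maximization" loop (my own illustration,
# not anything the panelists described concretely). The agent queries whichever
# unlabeled input it currently understands least, i.e., where its expected
# knowledge gain (here approximated by predictive variance) is highest.
import numpy as np

rng = np.random.default_rng(0)

# A pool of "unlabeled" inputs and a hidden target function the agent probes.
pool = rng.uniform(-3.0, 3.0, size=(200, 1))
true_f = lambda x: np.sin(x)

# A tiny online model: Bayesian linear regression on random cosine features.
n_features = 50
W = rng.normal(size=(1, n_features))
phases = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
phi = lambda x: np.cos(x @ W + phases)

A = np.eye(n_features)    # posterior precision of the weights
b = np.zeros(n_features)  # precision-weighted mean of the weights

def expected_gain(x):
    # Proxy reward: predictive variance at x, a crude stand-in for how much
    # the model expects to learn by querying x.
    f = phi(x)
    return np.einsum('ij,jk,ik->i', f, np.linalg.inv(A), f)

for step in range(20):
    gains = expected_gain(pool)
    idx = int(np.argmax(gains))   # actively pick the most informative input
    x = pool[idx:idx + 1]
    y = float(true_f(x)[0, 0])    # "run the experiment": observe the outcome

    # Update the model with the chosen observation, i.e., accumulate knowledge.
    f = phi(x)[0]
    A += np.outer(f, f)
    b += y * f

    # Posterior-mean fit over the whole pool, just to watch knowledge accumulate.
    mean_w = np.linalg.solve(A, b)
    mse = float(np.mean((phi(pool) @ mean_w - true_f(pool)[:, 0]) ** 2))
    print(f"step {step:2d}: queried x = {x[0, 0]:+.2f}, "
          f"gain = {gains[idx]:.3f}, pool MSE = {mse:.3f}")
```

The point is only that selective querying (exploration) falls out of maximizing a knowledge-based reward rather than being hard-coded, which is roughly how I understood Schmidhuber’s and Murphy’s answers.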
I had the honor of asking the last question, directed mainly at Hassabis, LeCun and Murphy, on what their companies would do if they (accidentally or intentionally) built a truly working AI agent (in whatever sense). Would they conceal it, thinking that the world is not ready for it? Would they keep it secret because of potential opportunities for commercialization? Let me give a brief summary of their responses (as I remember them; again, I couldn’t write them down at the time).
All of them said that it won’t happen like that (a single accident resulting in a thinking machine). Because of this, LeCun does not find it concerning: it will happen gradually as a result of the joint efforts of many scientists in both industry and academia. Hassabis thinks similarly, and also couldn’t imagine that this kind of discovery, had it happened, could be contained (probably the biggest leak in human history). However, he argued for getting ready for a future in which we humans will have access to truly thinking machines, a sentiment I share. Murphy agreed with both LeCun and Hassabis. He and LeCun both made a remark about a recently released movie, Ex Machina (which is, by the way, my favourite this year so far): it’s a beautifully filmed movie, but nothing like that will happen.
I agree with all the points they made. Though, there was another reason behind my question, which was unfortunately not discussed (undoubtedly due to the time constraint). That is, once we have algorithms or machines that are “thinking”, and say the most important few pieces were developed in a couple of commercial companies (like the ones Hassabis, LeCun and Murphy are at), who will have the right to those crucial components? Will they belong to those companies or to individuals? Will they have to be made public (something like a universal right to artificial intelligence)? And, most importantly, who will decide any of this?
Conclusion?
Obviously, there is no conclusion. It is an ongoing effort, and I, or we the organizers, hope that this panel discussion has been successful at shedding at least a bit of light on the path toward the future of deep learning as well as general artificial intelligence (though Lawrence pointed out the absurdity of this term by quoting Zoubin Ghahramani: ‘if what a bird does is flying, then is what a plane does artificial flying?’)
But let me point out a few things that I’ve personally found very interesting and inspiring:
(1) Unsupervised learning as reinforcement learning and automated laboratory: instead of taking into account every single unlabeled example as it is, we should let a model selectively consider a subset of unlabeled examples to maximize a reward defined by the amount of accumulated knowledge.
(2) Overhyping can be avoided largely by the active participation of researchers in distributing latest results and ideas, rather than by letting non-experts explain them to non-experts. Podcasting, open reviewing and blogging may help, but there’s probably no one right answer here.
(3) I don’t think there was any single agreement on industry vs. academia. However, I felt that all three academic panelists as well as the industry panelists agreed that each side has its own role (sometimes overlapping) toward a single grand goal.
(4) Deep learning has been successful at what humans are good at (e.g., vision and speech), and in the future we as researchers should also explore tasks/datasets that humans are not particularly good at (or only become good at after years and years of special training). In this sense, medicine/healthcare seems to be one area in which most of the panelists were interested and are probably investing.
When it comes to the format of the panel discussion, I liked it in general, but of course, as usual with anything, there were a few unsatisfactory things. The most unsatisfactory one was the time constraint (1 hour) we set ourselves. We had gathered six amazing panelists who had so much to share with the audience and the world, but on average only 10 minutes per panelist was allowed. In fact, as one of the organizers, this is partly my fault in planning. It would’ve been even better if the panel discussion had been scheduled to last the whole day, with more panelists, more topics and more audience involvement (at least, I would’ve loved it!) But, of course, a three-day-long workshop would have been way out of our league.
Another thing I think could be improved is the one-off nature of the discussion. It may be possible to make this kind of panel discussion a yearly event. It could be co-located with a workshop, or even be held online. As many of the panelists pointed out, this could help us (and others) avoid overhyping our research results or the future of the whole field of machine learning, and it would be a great way to reach a much larger audience including both senior and junior researchers as well as the informed/interested public. Maybe I, or you who’s reading this, should email the hosts of “Talking Machines” and suggest it.
Comment from Juergen Schmidhuber
Schmidhuber read this post and emailed me his comment to clarify a few things. With his permission, I am putting his comment here as it is:
Thanks for your summary! I think it would be good to publish the precise transcript. Let me offer a few clarifications for now:
1. Why no additional NN winter? The laws of physics force our hardware to become more and more 3D-RNN-like (and brain-like): densely packed processors connected by many short and few long wires, e.g., http://www.reddit.com/r/MachineLearning/comments/2xcyrl/i_am_j%C3%BCrgen_schmidhuber_ama/coz9n6n. Nature seems to dictate such 3D architectures, and that’s why both fast computers and brains are the way they are. That is, even without any biological motivation, RNN algorithms will become even more important – no new NN winter in sight.

2. On AI fear: I didn’t say “Nothing to worry at all!” I just said we may hope for some sort of protection from supersmart AIs of the far future through their widespread lack of interest in us, like in this comment: http://www.reddit.com/r/MachineLearning/comments/2xcyrl/i_am_j%C3%BCrgen_schmidhuber_ama/cp1iusx

And in the near future there will be intense commercial pressure to make very friendly, not so smart AIs that keep their users happy. Unfortunately, however, a child-like AI could also be trained by sick humans to become a child soldier, which sounds horrible. So I’d never say “Nothing to worry at all!” Nevertheless, silly goal conflicts between robots and humans in famous SF movie plots (Matrix, Terminator) don’t make any sense.

Cheers,
Juergen
[UPDATE 13 July 2015 7.10AM: rephrased the Schmidhuber’s reply on AI/NN Winter as requested by Schmidhuber]
[UPDATE 14 July 2015 9.03AM: Schmidhuber’s comments added]