Tomaso Poggio: Brains, Minds, and Machines | MIT Artificial Intelligence (AI) Podcast

The following is a conversation with Tomaso Poggio. He's a professor at MIT and is the director of the Center for Brains, Minds and Machines. Cited over 100,000 times, his work has had a profound impact on our understanding of the nature of intelligence in both biological and artificial neural networks. He has been an advisor to many highly impactful researchers and entrepreneurs in AI, including Demis Hassabis of DeepMind, Amnon Shashua of Mobileye, and Christof Koch of the Allen Institute for Brain Science. This conversation is part of the MIT course on Artificial General Intelligence and the Artificial Intelligence Podcast. If you enjoy it, subscribe on YouTube, iTunes, or simply connect with me on Twitter at Lex Fridman, spelled F-R-I-D. And now, here's my conversation with Tomaso Poggio.

You've mentioned that in your childhood you developed a fascination with physics, especially the theory of relativity, and that Einstein was also a childhood hero to you. What aspect of Einstein's genius, the nature of his genius, do you think was essential for discovering the theory of relativity?

You know, Einstein was a hero to me, and I'm sure to many people, because he was able to make, of course, a major, major contribution to physics with, simplifying a bit, just a Gedankenexperiment, a thought experiment, you know, imagining communication with lights between a stationary observer and somebody on a train. And I thought, you know, the fact that just with the force of his thought, of his thinking, of his mind, he could get to something so deep in terms of physical reality, how time depends on space and speed, it was something absolutely fascinating. It was the power of intelligence, the power of the mind.

Do you think the ability to imagine, to visualize as he did, as a lot of great physicists do, do you think that's in all of us human beings, or is there something special to that one particular human being?

I think, you know, all of us can learn and have, in principle, similar breakthroughs. There are lessons to be
learned from Einstein. He was one of five PhD students at ETH, the Eidgenössische Technische Hochschule in Zurich, in physics, and he was the worst of the five, the only one who did not get an academic position when he graduated, well, finished his PhD, and he went to work, as everybody knows, for the patent office. And so it's not so much the work for the patent office, but the fact that obviously he was smart but he was not the top student, obviously he was the anti-conformist. He was not thinking in the traditional way that probably his teachers and the other students were doing. So there is a lot to be said about, you know, trying to do the opposite or something quite different from what other people are doing. That's actually true for the stock market: never buy if everybody is buying. And it is also true for science.

Yes. So you've also mentioned, staying on the theme of physics, that you were excited at a young age by the mysteries of the universe that physics could uncover, such as, I saw mentioned, the possibility of time travel. So, the most out-of-the-box question I think I'll get to ask today: do you think time travel is possible?

Well, it would be nice if it were possible right now. You know, in science you never say no.

But your understanding of the nature of time?

Yeah, it's very likely that it's not possible to travel in time. You may be able to travel forward in time, if we can, for instance, freeze ourselves or, you know, go on some spacecraft traveling close to the speed of light. But in terms of actively traveling, for instance, back in time, I find it probably very unlikely.

So do you still hold the underlying dream of engineering intelligence that will build systems that are able to do such huge leaps, like discovering the kind of mechanism that would be required to travel through time? Do you still hold that dream, or echoes of it, from your childhood?

Yeah. You know, I think that there are certain problems that probably cannot be solved, depending on what you believe about the
physical reality. Like, you know, it may be totally impossible to create energy from nothing or to travel back in time. But about making machines that can think as well as we do, or better, or, more likely, especially in the short and midterm, help us think better, which in a sense is happening already with the computers we have, and it will happen more and more, that I certainly believe. And I don't see, in principle, why computers at some point could not become more intelligent than we are, although the word intelligence is a tricky one, and one we should discuss, what I mean with that.

Intelligence, consciousness, yeah, words like love, all these, you know, need to be disentangled. So you've mentioned also that you believe the problem of intelligence is the greatest problem in science, greater than the origin of life and the origin of the universe. You've also, in the talk I've listened to, said that you're open to arguments against you. So what do you think is the most captivating aspect of this problem of understanding the nature of intelligence? Why does it captivate you as it does?

Well, originally, I think one of the motivations that I had as, I guess, a teenager, when I was infatuated with the theory of relativity, was really that I found that there was the problem of time and space and general relativity, but there were so many other problems of the same level of difficulty and importance that, even if I were Einstein, it was difficult to hope to solve all of them. So what about solving a problem whose solution allowed me to solve all the problems? And this was: what if we could find the key to an intelligence, you know, ten times better or faster than Einstein?

So that's sort of seeing artificial intelligence as a tool to expand our capabilities. But is there just an inherent curiosity in you in just understanding what it is in here that makes it all work?

Yes, absolutely, you're right. So I started saying this was the motivation when I was a
teenager, but, you know, soon after, I think the problem of human intelligence became a real focus of, you know, my science and my research, because I think it's, for me, the most interesting problem. It's really asking who we are, right? It's asking not only a question about science, but even about the very tool we are using to do science, which is our brain. How does our brain work? Where does it come from? What are its limitations? Can we make it better?

And that in many ways is the ultimate question that underlies this whole effort of science. So you've made significant contributions in both the science of intelligence and the engineering of intelligence. In a hypothetical way, let me ask: how far do you think we can get in creating intelligent systems without understanding the biological, without understanding how the human brain creates intelligence? Put another way, do you think we can build a strong AI system without really getting at the core, understanding the functional nature of the brain?

Well, this is a really difficult question. You know, we did solve problems like flying without really using too much of our knowledge about how birds fly. It was important, I guess, to know that you could have things heavier than air being able to fly, like birds. But beyond that, probably we did not learn very much. You know, the Wright brothers did learn a lot from observation of birds in designing their aircraft, but, you know, you can argue we did not use much of biology in that particular case. Now, in the case of intelligence, I think that it's a bit of a bet right now. If you ask, okay, we all agree we'll get at some point, maybe soon, maybe later, to a machine that is indistinguishable from my secretary, say, in terms of what I can ask the machine to do. I think we'll get there. And now the question is, and you can ask people: do you think we'll get there without any knowledge about, you know, the human brain, or that the best way to get there is to understand
better the human brain? Okay, this is, I think, an educated bet that different people with different backgrounds will decide in different ways. The recent history of the progress in AI in the last, I would say, five years or ten years has been that the main breakthroughs, the main recent breakthroughs, really started from neuroscience. I can mention reinforcement learning as one. It is one of the algorithms at the core of AlphaGo, which is the system that beat the kind of official world champion of Go, Lee Sedol, two, three years ago in Seoul. That's one, and that started really with the work of Pavlov in 1900, Marvin Minsky in the sixties, and many other neuroscientists later on. And deep learning, which is the core, again, of AlphaGo and of systems like autonomous driving systems for cars, like the systems that Mobileye, which is a company started by one of my ex-postdocs, Amnon Shashua, did. So that is the core of those things. And deep learning, really, the initial ideas in terms of the architecture of these layered hierarchical networks started with the work of Torsten Wiesel and David Hubel at Harvard, up the river, in the '60s. So recent history suggests that neuroscience played a big role in these breakthroughs. My personal bet is that there is a good chance it continues to play a big role, maybe not in all the future breakthroughs, but in some of them.

At least in inspiration.

At least in inspiration, absolutely, yes.

So you've studied both artificial and biological neural networks. You said these mechanisms underlie deep learning and reinforcement learning. But there are nevertheless significant differences between biological and artificial neural networks as they stand now. So between the two, what do you find is the most interesting, mysterious, maybe even beautiful difference, as it currently stands in our understanding?

I must confess that until recently I found the artificial networks too simplistic relative to real neural networks. But, you know, recently I've started to
think that, yes, they are a very big simplification of what you find in the brain, but on the other hand, they are much closer in terms of the architecture to the brain than other models that we had, that computer science used as models of thinking, which were mathematical logic, you know, Lisp, Prolog, and those kinds of things. So in comparison to those, they're much closer to the brain. You have networks of neurons, which is what the brain is about, and the artificial neurons in the models are, as I said, a caricature of the biological neurons, but they're still neurons, single units communicating with other units, something that is absent in, you know, the traditional computer-type models of mathematics, reasoning, and so on.

So what aspect would you like to see added to artificial neural networks over time, as we try to figure out ways to improve them?

So one of the main differences and, you know, problems in terms of deep learning today, and it's not only deep learning, and the brain is the need for deep learning techniques to have a lot of labeled examples. You know, for instance, for ImageNet you have a training set which is 1 million images, each one labeled by some human in terms of which object is there. And it's clear that in biology a baby may be able to see millions of images in the first years of life, but will not have millions of labels given to him or her by parents or caretakers. So how do you solve that? You know, I think there is this interesting challenge that today deep learning and related techniques are all about big data, big data meaning a lot of examples labeled by humans. So this big data is n going to infinity, that's the best, you know, n meaning labeled data. But I think the biological world is more n going to one. A child can learn from a very small number of, you know, labeled examples. Like, you tell a child, this is a car. You don't need to say, like in ImageNet, you know, this is a car, this is a car, this is
not a car, this is not a cat, 1 million times.

So, and of course with AlphaGo, or at least the AlphaZero variants, because the world of Go is so simplistic, you can actually learn by yourself through self-play, you can play against each other. And the real world, I mean, the visual system that you've studied extensively, is a lot more complicated than the game of Go. So, on the comment about children, which are fascinatingly good at learning new stuff, how much of it do you think is hardware and how much of it is software?

You know, that's a good, deep question. In a sense, it is the old question of nurture and nature: how much is in the genes and how much is in the experience of an individual. Obviously, it's both that play a role, and I believe that the way evolution puts prior information, so to speak, hardwired, it's not really hardwired, but that's essentially a hypothesis. I think what's going on is that evolution, as you know, almost necessarily, if you believe in Darwin, is very opportunistic. And think about our DNA and the DNA of Drosophila. Our DNA does not have many more genes than Drosophila, the fly, the fruit fly. Now, we know that the fruit fly does not learn very much during its individual existence. It looks like one of those machines that is really mostly, not a hundred percent, but, you know, 95 percent hard-coded by the genes. But since we don't have many more genes than Drosophila, evolution could encode in us a kind of general learning machinery and then had to give very weak priors. Like, for instance, let me give a specific example, which is recent work by a member of our Center for Brains, Minds and Machines. We know, because of work of other people in our group and other groups, that there are cells in a part of our brain, neurons, that are tuned to faces. They seem to be involved in face recognition. Now, this face area seems to be present in young children and adults. And one question is: is it there from the beginning, is it hardwired
by evolution, or, you know, is it somehow learned very quickly?

So what's your, by the way, for a lot of the questions I'm asking, the answer is we don't really know, but as a person who has contributed some profound ideas in these fields, you're a good person to guess at some of these. So of course there's a caveat before a lot of the stuff we talk about. But what is your hunch? The part of the brain that seems to be concentrated on face recognition, are you born with that, or is it just designed to learn that quickly, like the face of the mother?

My hunch, my bias, was the second one: learned very quickly. And it turns out that Marge Livingstone at Harvard has done some amazing experiments in which she raised baby monkeys, depriving them of faces during the first weeks of life. So they see technicians, but the technicians have a mask.

Yes.

And so when they looked at the area in the brain of these monkeys where you usually find faces, they found no face preference. So my guess is that what evolution does in this case is there is an area which is plastic, which is kind of predetermined to be imprinted very easily, but the command from the gene is not detailed circuitry for a face template. It could be, but this would require probably a lot of bits; you would have to specify a lot of connections of a lot of neurons. Instead, the command from the gene is something like: imprint, memorize what you see most often in the first two weeks of life, especially in connection with food, and maybe nipples, I don't know.

Right, well, source of food. And so then that area is very plastic at first and then solidifies. It'd be interesting if a variant of that experiment would show a different kind of pattern associated with food than a face pattern, whether that could stick.

There are indications that during that experiment what the monkeys saw quite often were the blue gloves of the technicians that were giving the baby monkeys the milk, and some of the cells, instead
of being face-sensitive in that area, are hand-sensitive.

That's fascinating. Can you talk about what are the different parts of the brain, in your view, sort of loosely, and how do they contribute to intelligence? Do you see the brain as a bunch of different modules that together come in the human brain to create intelligence, or is it all one mush of the same kind of fundamental architecture?

Yeah, that's, you know, an important question. And there was a phase in neuroscience, back in the 1950s or so, in which it was believed for a while that the brain was equipotential, this was the term. You could cut out a piece and nothing special happened, apart from a little bit less performance. There was a surgeon, Lashley, who did a lot of experiments of this type with mice and rats and concluded that every part of the brain was essentially equivalent to any other one. It turns out that that's really not true. There are very specific modules in the brain, as you said, and, you know, people may lose the ability to speak if they have a stroke in a certain region, or may lose control of their legs in another region. So they're very specific. The brain is also quite flexible and redundant, so often it can correct things and, you know, kind of take over functions from one part of the brain to the other. But really there are specific modules. So that is the answer that we know from this old work, which was basically based on lesions, either on animals or, very often, there was a mine of very interesting data coming from the war, from different types of injuries that soldiers had in the brain. And more recently, functional MRI, which allows you to check which parts of the brain are active when you are doing different tasks, as you know, can replace some of this. You can see that certain parts of the brain are involved or active in...

Vision, language, yeah.

Yeah, that's right.

But sort of taking a step back to that part of the brain that specializes
in the face and how that might be learned, what's your intuition behind, you know, is it possible that, sort of from a physicist's perspective, when you get lower and lower, it's all the same stuff, and when you're born it's plastic and it quickly figures out this part is going to be about vision, this is going to be about language, this is about common-sense reasoning? Do you have an intuition that that kind of learning is going on really quickly, or is it really kind of solidified in hardware?

That's a great question. So there are parts of the brain, like the cerebellum or the hippocampus, that are quite different from each other. They clearly have different anatomy, different connectivity. Then there is the cortex, which is the most developed part of the brain in humans. And in the cortex you have different regions that are responsible for vision, for audition, for motor control, for language. Now, one of the big puzzles of this is that in the cortex, the cortex is the cortex. It looks like it is the same in terms of hardware, in terms of type of neurons and connectivity, across these different modalities. So for the cortex, leaving aside these other parts of the brain like the spinal cord, hippocampus, cerebellum, and so on, for the cortex I think your question about hardware and software and learning and so on is rather open. And, you know, I find it very interesting, for instance, to think about an architecture, a computer architecture, that is good for vision and at the same time is good for language. They seem to be, you know, such different problem areas that you have to solve.

But the underlying mechanism might be the same, and that's really instructive for maybe artificial neural networks. So you've done a lot of great work in vision, in human vision, computer vision, and you mentioned the problem of human vision is really as difficult as the problem of general intelligence, and maybe that connects to the cortex discussion. Can you describe the human visual cortex and
how humans begin to understand the world through the raw sensory information? For folks who are not familiar, especially on the computer vision side, we don't often actually take a step back, except saying with a sentence or two that one is inspired by the other. What is it that we know about the human visual cortex that's interesting?

So we know quite a bit. At the same time, we don't know a lot. But the bit we know, you know, in a sense we know a lot of the details, and many we don't know, and we know a lot of the top level, the answer to the top-level question, but we don't know some basic ones. Even in terms of general neuroscience, forgetting vision, you know, why do we sleep? It's such a basic question, and we really don't have an answer to that.

So, taking a step back on that, sleep, for example, is fascinating. Do you think that's a neuroscience question? Or, if we talk about abstractions, what do you think is an interesting way to study intelligence, or the most effective, in terms of levels of abstraction? Is it chemical, is it biological, is it electrophysical, mathematical, as you've done a lot of excellent work on that side, or psychology? Sort of, at which level of abstraction do you think?

Well, in terms of levels of abstraction, I think we need all of them. You know, it's like if you ask me, what does it mean to understand a computer? That's much simpler. But in a computer I could say, well, I understand how to use PowerPoint. That's my level of understanding a computer. It is reasonable, you know, it gives me some power to produce slides, and beautiful slides. And now somebody else says, well, I know how the transistors work that are inside the computer. I can write the equations for, you know, transistors and diodes and circuits, logical circuits. And I can ask this guy, do you know how to operate PowerPoint? No idea.

So do you think, if we discovered computers walking amongst us, full of these transistors, that are also operating under Windows and have
PowerPoint, do you think, digging in a little bit more, how useful is it to understand the transistor in order to be able to understand PowerPoint and these higher-level intelligent processes?

Very good, I see. So I think in the case of computers, because they were made by engineers, by us, these different levels of understanding are rather separate on purpose. You know, they are separate modules, so that the engineer that designed the circuit for the chips does not need to know what is inside PowerPoint, and somebody can write the software translating from one to the other. And so in that case, I don't think understanding the transistor helps you understand PowerPoint, or very little. If you want to understand the computer, this question, you know, I would say you have to understand it at different levels, if you really want to build one, right? But for the brain, I think these levels of understanding, so the algorithms, which kind of computation, you know, the equivalent of PowerPoint, and the circuits, you know, the transistors, I think they are much more intertwined with each other. There is not, you know, a neat level of the software separate from the hardware. And so that's why I think in the case of the brain the problem is more difficult and, more than for computers, requires the interaction, the collaboration between different types of expertise.

The brain is a big mess. You can't just disentangle a level.

I think you can, but it is much more difficult, and it's not, you know, completely obvious. And, as I said, personally I think it is the greatest problem in science. So, you know, I think it's fair that it's a difficult one.

That said, you do talk about compositionality and why it might be useful. And when you discuss why these neural networks, in the artificial or biological sense, learn anything, you talk about compositionality. See, there's a sense that nature can be disentangled, or, well, all aspects of our cognition could
be disentangled a little, to some degree. So, first of all, how do you see compositionality, and why do you think it exists at all in nature?

I spoke about, I used the term compositionality when we looked at deep neural networks, multilayer, trying to understand when and why they are more powerful than more classical one-layer networks, like linear classifiers, kernel machines, so-called. And what we found is that, in terms of approximating or learning or representing a function, a mapping from an input to an output, like from an image to the label in the image, if this function has a particular structure, then deep networks are much more powerful than shallow networks at approximating the underlying function. And the particular structure is a structure of compositionality: if the function is made up of functions of functions, so that, when you are interpreting an image, classifying an image, you don't need to look at all pixels at once, but you can compute something from small groups of pixels, and then you can compute something on the output of this local computation, and so on. That is similar to what you do when you read a sentence: you don't need to read the first and the last letter, but you can read syllables, combine them in words, combine the words in sentences. So it is this kind of structure.

So that's part of the discussion of why deep neural networks may be more effective than the shallow methods. And is your sense that, for most things we can use neural networks for, those problems are going to be compositional in nature, like language, like vision? How far can we get in this kind of...

Right, so here it is almost philosophy.

Well, let's go there.

Yeah, let's go there. So a friend of mine, Max Tegmark, who is a physicist at MIT...

I've talked to him on this thing, yeah. And he disagrees with you, right?

Yeah, but, you know, we agree on most, but the conclusion is a bit different. His conclusion is that, for images, for instance, the compositional structure of
this function that we have to learn to solve these problems comes from physics, comes from the fact that you have local interactions in physics between atoms and other atoms, between particles of matter and other particles, between planets and other planets, between stars, that it's all local. And that's true. But you could push this argument a bit further, not this argument, actually; you could argue that, you know, maybe that's part of the truth, but maybe what happens is kind of the opposite: that our brain is wired up as a deep network, so it can learn, understand, solve problems that have this compositional structure, and it cannot solve problems that don't have this compositional structure. So the problems we are accustomed to, we think about, we test our algorithms on, have this compositional structure because our brain is made up that way.

And that's, in a sense, an evolutionary perspective. So the ones that weren't dealing with the compositional nature of reality died off?

Yes. It also could be, maybe the reason why we have this local connectivity in the brain, like simple cells in cortex looking only at a small part of the image, each one of them, and then other cells looking at a small number of these simple cells, and so on, the reason for this may be purely that it was difficult to grow long-range connectivity. So suppose, you know, for biology it's possible to grow short-range connectivity but not long-range, also because there is a limited number of long-range connections. And so you have this limitation from the biology, and this means you build something like a deep convolutional network. And this is great for solving a certain class of problems. These are the ones we find easy and important for our life, and, yes, they were enough for us to survive.

And you can start a successful business on solving those problems, right? With Mobileye, driving is a compositional problem. Right. So on the learning task, I mean,
we don't know much about how the brain learns in terms of optimization, but stochastic gradient descent is what artificial neural networks use for the most part to adjust the parameters in such a way that, based on the labeled data, they are able to solve the problem. So what's your intuition about why it works at all, how hard of a problem it is to optimize a neural network, an artificial neural network? Are there other alternatives? Just in general, what's your intuition behind this very simplistic algorithm that seems to do pretty well, surprisingly?

Yes, yes. So I find that, from neuroscience, the architecture of cortex is really similar to the architecture of deep networks. So there is a nice correspondence there between the biology and this kind of local connectivity, hierarchical architecture. Stochastic gradient descent, as you said, is a very simple technique. It seems pretty unlikely that biology could do that, from what we know right now about, you know, cortex and neurons and synapses. So it's a big open question whether there are other optimization learning algorithms that can replace stochastic gradient descent. And my guess is yes, but nobody has found yet a real answer. I mean, people are still trying, and there are some interesting ideas. The fact that stochastic gradient descent is so successful, this, it has become clear, is not so mysterious. And the reason is that, it's an interesting fact, you know, it's a change, in a sense, in how people think about statistics. And this is the following: typically, when you had data and you had, say, a model with parameters, and you were trying to fit the model to the data, you know, to fit the parameters, typically the kind of crowd-wisdom type of idea was you should have at least, you know, twice the number of data than the number of parameters, maybe 10 times is better. Now, the way you train neural networks these days is that they have 10 or 100 times more parameters than data,
exactly the opposite. And, you know, it has been one of the puzzles about neural networks: how can you get something that really works when you have so much freedom?

From that little data it can generalize somehow.

Right, exactly.

Do you think the stochastic nature is essential, the randomness?

So I think we have some initial understanding of why this happens. But one nice side effect of having this overparameterization, more parameters than data, is that when you look for the minima of a loss function, like stochastic gradient descent is doing, you find, I made some calculations based on some old basic theorem of algebra called Bézout's theorem, which gives you an estimate of the number of solutions of a system of polynomial equations. Anyway, the bottom line is that there are probably more minima for a typical deep network than atoms in the universe. Just to say, there are lots, because of the overparameterization.

More global minima, zero minima, good minima. So it's not just local minima.

Yeah, a lot of them. So you have a lot of solutions, so it's not so surprising that you can find them relatively easily. And this is because of the overparameterization.

The overparameterization sprinkles the entire space with solutions that are pretty good.

And so it's not so surprising, right? It's like, you know, if you have a system of linear equations and you have more unknowns than equations, then we know you have an infinite number of solutions. And the question is to pick one, that's another story, but you have an infinite number of solutions. So there are a lot of values of your unknowns that satisfy the equations.

But it's possible that there are a lot of those solutions that aren't very good. What's surprising...

So that's a good question: why can you pick one that generalizes well? Yeah, that's a separate question with separate answers.

One theorem that people like to talk about, that kind of inspires imagination of the power of neural networks, is the universal approximation
theorem: you can approximate any computable function with just a finite number of neurons and a single hidden layer. Do you find this theorem surprising? Do you find it useful, interesting, inspiring?

No, this one, you know, I never found very surprising. It was known since the 80s, since I entered the field, because it's basically the same as the Weierstrass theorem, which says that I can approximate any continuous function with a polynomial with a sufficient number of terms, monomials. So it's basically the same, and the proofs are very similar.

So your intuition was that there was never any doubt that neural networks, in theory, could be very strong approximators.

Right. The interesting question is this: the theorem says you can approximate, fine. But when you ask how many neurons, for instance, or in the case of a polynomial how many monomials, I need to get a good approximation, then it turns out that it depends on the dimensionality of your function, how many variables you have. And it depends on the dimensionality of your function in a bad way. For instance, suppose you want an error which is no worse than 10% in your approximation; you come up with a network that approximates your function within 10%. Then it turns out that the number of units you need is on the order of 10 to the dimensionality d, how many variables. So if you have two variables, d is 2, and you have a hundred units, and OK. But if you have, say, 200 by 200 pixel images, now this is 40,000, whatever.

And we go to the size of the universe pretty quickly.

Exactly, 10 to the 40,000. And so this is called the curse of dimensionality, quite appropriately.

And the hope is that with the extra layers you can remove the curse.

What we proved is that if you have deep layers, a hierarchical architecture with local connectivity of the type of convolutional deep learning, and if you are dealing with a function that has this kind of hierarchical architecture, then you avoid
completely the curse.

You've spoken a lot about supervised deep learning. What are your thoughts, hopes, views on the challenges of unsupervised learning, with GANs, with generative adversarial networks? Do you see the power of GANs as distinct from supervised methods in neural networks, or are they really all in the same representation ballpark?

GANs are one way to get an estimation of probability densities, which is a somewhat new way that people have not done before. I don't know whether this will really play an important role in intelligence. It's interesting; I'm less enthusiastic about it than many people in the field. I have the feeling that many people in the field are really impressed by the ability to produce realistic-looking images in this generative way.

Which explains the popularity of the methods. But you're saying that while that's exciting and cool to look at, it may not be the tool that's useful for... So you describe it kind of beautifully: current supervised methods go n to infinity in terms of the number of labeled points, and we really have to figure out how to go to n to 1.

Yeah.

And you're thinking GANs might help, but they might not be the right...

I don't think so, for that problem, which I really think is important. I think they may help; they certainly have applications, for instance in computer graphics. I did work long ago which was a little bit similar, in terms of saying: OK, I have a network and I present images, so the input is images and the output is, for instance, the pose of the image, of a face, how much it is smiling, whether it is rotated 45 degrees or not. What about having a network that I train with the same data set, but now I invert input and output? Now the input is the pose or the expression, a certain set of numbers, and the output is the image, and I train it. And we got pretty good, interesting results in terms of producing very realistic-looking images. It was a less
sophisticated mechanism than GANs, but the output was pretty much of the same quality. So I think for computer-graphics-type applications, yeah, definitely GANs can be quite useful, and not only for that. But for helping, for instance, on this problem of unsupervised learning, of reducing the number of labeled examples, I think people think they can get out more than they put in. There's no free lunch.

Yeah, right. So what's your intuition: how can we slow the growth of n to infinity in supervised learning? For example, Mobileye has very successfully, essentially, annotated large amounts of data to be able to drive a car. Now, one thought is: we're trying to teach machines in AI, so how can we become better teachers? Maybe that's one way.

Yeah, I like that, because, again, one caricature of the history of computer science, you could say, is: it begins with programmers, expensive.

Yeah.

Continues with labelers, cheap.

Yeah.

And the future would be schools, like we have for kids.

Yeah. Currently, with the labeling methods, we're not selective about which examples we teach networks with. So I think the focus of making one-shot networks that learn much faster is often on the architecture side, but how can we pick better examples with which to learn? Do you have intuitions about that?

Well, that's part of the core problem. But the other one is, if we look at biology, a reasonable assumption, I think, is in the same spirit as what I said: evolution is opportunistic and has weak priors. The way I think the intelligence of a child, a baby, may develop is by bootstrapping weak priors from evolution. For instance, you can assume that most organisms, including human babies, have built in some basic machinery to detect motion and relative motion. And in fact we know all insects, from fruit flies to other
animals, have this, even in the retinas, in the very peripheral part. It's very conserved across species, something that evolution discovered early. It may be the reason why babies tend to look, in the first few days, at moving objects and not at non-moving ones. Now, moving objects means, OK, they are attracted by motion, but motion also gives automatic segmentation from the background, because of motion boundaries. Either the object is moving, or the eye of the baby is tracking the moving object and the background is moving, right?

Yeah, so just purely on the visual characteristics of the scene, that seems to be the most useful.

Right. So it's like looking at an object without background. It's ideal for learning the object; otherwise it's really difficult, because you have so much stuff. So suppose you do this at the beginning, in the first weeks. Then after that you can recognize the object, now that it is imprinted, even in the background, even without motion.

By the way, I just want to ask about the object recognition problem. So there is this being responsive to movement, and edge detection essentially. What's the gap between being effective at visually recognizing stuff, detecting what it is, and understanding the scene? Is this a huge gap, with many layers, or is it close?

No, I think that's a huge gap. I think present algorithms, with all the success that we have, and the fact that a lot of them are very useful... I think we are in a golden age for applications of low-level vision and low-level speech recognition and so on, you know, Alexa and so on. There are many more things of a similar level to be done, including medical diagnosis and so on. But we are far from what we call understanding of a scene, of language, of actions, of people. That is, despite the claims, I think very far.

We're a little bit off. So in popular culture, and among many researchers, some of whom I've spoken with, Stuart Russell and Elon Musk, in and out of the
AI field, there's a concern about the existential threat of AI.

Yeah.

How do you think about this concern, and is it valuable to think about large-scale, long-term, unintended consequences of the intelligent systems we try to build?

I always think it is better to worry first, you know, early rather than late. So some worry is good.

Yeah.

I'm not against worry at all. Personally, I think that it will take a long time before there is real reason to be worried. But as I said, I think it is good to put in place, and think about, possible safety measures. What I find a bit misleading are things that have been said by people I know, like Elon Musk and, what is Bostrom's first name?

Nick. Nick Bostrom.

Nick Bostrom, right. And a couple of other people, that, for instance, AI is more dangerous than nuclear weapons.

Right.

I think that's really wrong. It's misleading, because in terms of priority we should still be more worried about nuclear weapons and what people are doing about them.

And you've spoken about Demis Hassabis and yourself saying that you think it will be about a hundred years out before we have a general intelligence system that's on par with a human being. Do you have any updates for those predictions?

Well, I think he said 20, I think.

He said 20, right. This was a couple of years ago.

I have not asked him again, so I should.

And your own prediction: what's your prediction about when you'll be truly surprised, and what's the confidence interval on that?

You know, it's so difficult to predict the future, and even the present.

It's pretty hard to predict.

But as I said, this is completely... I would be more like Rod Brooks. I think he's at about 200 years.

When we have this kind of AGI system, artificial general intelligence system, and you're sitting in a room with her, him, it, do you think the underlying design of such a system is something we'll be able to understand? Will it be simple? Do you think
it will be explainable, understandable by us? What's your intuition? Again, we're in the realm of philosophy a little bit.

Probably no, but again, it depends on what you really mean by understanding. So I think, you know, we don't understand how deep networks work. I think we're beginning to have a theory now. But in the case of deep networks, or even in the case of the simpler kernel machines or linear classifiers, we really don't understand the individual units. But we understand what the computation and the limitations and the properties of it are. It's similar to many things. What does it mean to understand how a fusion bomb works? Many of us understand the basic principle, and some of us may understand deeper details.

In that sense, understanding is, as a community, as a civilization, can we build another copy of it?

OK.

And in that sense, do you think there will need to be some evolutionary component where it runs away from our understanding, or do you think it could be engineered from the ground up, the same way you go from the transistor to PowerPoint?

All right. So many years ago, this was actually 40, 41 years ago, I wrote a paper with David Marr, who was one of the founding fathers of computer vision, of computational vision. I wrote a paper about levels of understanding, which is related to the question I discussed earlier about understanding PowerPoint, understanding transistors, and so on. In that kind of framework we had the level of the hardware and the top level of the algorithms. We did not have learning. Recently I updated it, adding levels, and one level I added to those three was learning. And you can imagine you could have a good understanding of how you construct a learning machine, like we do, while being unable to describe in detail what the learning machine will discover. Right? Now, that would still be a powerful understanding, if I can build the learning machine even if I don't understand in detail every time
it learns something.

Just like our children: if they start listening to a certain type of music, I don't know, Miley Cyrus or something, you don't understand why they came to that particular preference, but you understand the learning process. That's very interesting.

Yeah, yeah.

So on learning: for systems to be part of our world, one of the challenging things that you've spoken about is learning ethics, learning morals. How hard do you think is the problem of, first of all, humans understanding our own ethics? What is the origin, at the neural, low level, of ethics? What is it at a higher level? Is it something that's learnable for machines, in your intuition?

I think, yeah, ethics is learnable, very likely. I think it is one of these problems where understanding the neuroscience of ethics... You know, people discuss that there is an ethics of neuroscience.

Yes.

You know, how a neuroscientist should or should not behave. You can think of a neurosurgeon and the ethical rules he or she has to follow. But I'm more interested in the neuroscience of ethics.

You're blowing my mind right now. The neuroscience of ethics, that's very meta.

Yeah. And I think that would be important to understand, also for being able to design machines that are ethical machines in our sense of ethics.

And you think there is something in neuroscience, there are patterns, tools in neuroscience that can help us shed some light on ethics? Or is it mostly on the psychology, sociology, much higher level?

No, there is psychology, but there is also, in the meantime, evidence from fMRI of specific areas of the brain that are involved in certain ethical judgments. And not only this: you can stimulate those areas with magnetic fields and change the ethical decisions.

Yeah, wow.

So that's work by a colleague of mine, Rebecca Saxe, and there are other researchers doing similar work. I think this is the beginning, but ideally at some point we'll have an understanding of
how this works and why it evolved.

Right, the big "why" question.

Yeah, it must have some purpose.

Yeah, obviously it has some social purpose, probably. If neuroscience holds the key to at least illuminating some aspects of ethics, that means it could be a learnable problem.

Yeah, exactly.

And as we're getting into harder and harder questions, let's go to the hard problem of consciousness.

Yeah.

Is this an important problem for us to think about and solve on the engineering-of-intelligence side of your work, of our dream?

You know, it's unclear. Again, this is a deep problem, partly because it's very difficult to define consciousness. And there is a debate among neuroscientists, and philosophers of course, about whether consciousness is something that requires flesh and blood, so to speak, or whether we could have silicon devices that are conscious, up to statements like: everything has some degree of consciousness, and some things more than others. This is like Giulio Tononi and...

We just recently talked to Christof Koch.

OK, so Christof was my first graduate student.

Do you think it's important to illuminate aspects of consciousness in order to engineer intelligent systems? Do you think an intelligent system would ultimately have consciousness? Are the two interlinked?

You know, most of the people working in artificial intelligence, I think, would answer that we don't strictly need consciousness to have an intelligent system.

That's sort of the easier question, because it's a very engineering answer to the question.

Yes.

Pass the Turing test; we don't need consciousness. But if you were to go further, do you think it's possible that we need to have that kind of self-awareness?

We may, yes. So for instance, I personally think that when we test a machine or a person in a Turing test, in an extended Turing test, consciousness is part of what we require in that test, you know, implicitly, to say that this is
intelligent. Christof disagrees.

So he does?

Yeah.

Despite many other romantic notions he holds, he disagrees with that one.

Yes, that's right. So, you know, we will see.

As a quick question: Ernest Becker, fear of death. Do you think mortality and those kinds of things are important for consciousness and for intelligence, the finiteness of life, the finiteness of existence? Or is that just a side effect, an evolutionary side effect that is useful for natural selection? This kind of thing, that this interview is going to run out of time soon, that our life will run out of time soon, do you think that's needed to make this conversation good, and life good?

You know, I never thought about it. It is a very interesting question. I think Steve Jobs, in his commencement speech at Stanford, argued that having a finite life was important for stimulating achievement. So it was a different...

Yeah, live every day like it's your last.

Right, yeah. So rationally, I don't think you strictly need mortality for consciousness, but who knows. They seem to go together in our biological system.

Yeah. You've mentioned before, and your students are associated with, AlphaGo and Mobileye, the big recent success stories in AI, and I think they've captivated the entire world with what AI can do. So what do you think will be the next breakthrough? What's your intuition about the next breakthrough?

Of course, I don't know where the next breakthrough is. I think that there is a good chance, as I said before, that the next breakthrough would also be inspired by neuroscience. But which one, I don't know.

And MIT has this Quest for Intelligence, and there are a few moonshots in that spirit. Which ones are you excited about? Which projects?

Well, of course I'm excited about one of the moonshots, which is our Center for Brains, Minds and Machines, the one which is fully funded by NSF. It is about visual intelligence. It's
an area that is particularly about understanding visual intelligence, or visual cortex, and visual intelligence in the sense of how we look around ourselves and understand the world around us: what is going on, how we could go from here to there without hitting obstacles, whether there are other agents, people, in the environment. These are all things that we perceive very quickly, and it's something actually quite close to being conscious. Not quite, but there is this interesting experiment that was run at Google X, which in a sense is just a virtual reality experiment, but in which they had subjects sitting in a chair with goggles, like Oculus and so on, and earphones, and they were seeing through the eyes of a robot nearby: two cameras, and microphones for receiving, so their sensory system was there. And the impression of all the subjects, very strong, they could not shake it off, was that they were where the robot was. They could look at themselves from the robot and still feel they were where the robot was. They were looking at their body; their self had moved.

So some aspect of scene understanding has to include the ability to place yourself, to have a self-awareness about your position in the world and what the world is.

Right.

So we may have to solve the hard problem of consciousness on the way.

Yes, but it's quite a moonshot.

So you've been an adviser to some incredible minds, including Demis Hassabis, Christof Koch, and Amnon Shashua, who, like you said, all went on to become seminal figures in their respective fields. From your own success as a researcher, and from your perspective as a mentor of these researchers, having guided them, what does it take to be successful in science and engineering careers? Whether you're talking to somebody in their teens, 20s, or 30s, what does that path look like?

It's curiosity and having fun. And I think it is important also to have fun with other curious minds. It's the people you surround yourself with, too.
So, yeah, fun and curiosity. You mentioned Steve Jobs: is there also an underlying ambition that's unique that you saw, or does it really boil down to insatiable curiosity and fun?

Well, of course, it's being curious in an active and ambitious way, yes, definitely. But I think sometimes in science, there are friends of mine who are like this, there are some scientists who like to work by themselves and kind of communicate only when they complete their work or discover something. I always found that the actual process of discovering something is more fun if it's together with other intelligent and curious and fun people.

So if you see the fun in that process, the side effect of that process will be the elation of discovering something.

Yes.

So as you've led many incredible efforts here, what's the secret to being a good advisor, mentor, leader in a research setting? Is it a similar spirit? What advice could you give to people, young faculty and so on?

It's partly repeating what I said about an environment that should be friendly and fun and ambitious. I think I learned a lot from some of my advisers and friends, some of whom are physicists. And there was, for instance, this behavior that was encouraged: when somebody comes with a new idea in the group, unless it is really stupid, you are always enthusiastic, and you are enthusiastic for a few minutes, for a few hours. Then you start asking critically a few questions, testing it. But this is a process that I think is very, very good. You have to be enthusiastic. Sometimes people are very critical from the beginning. That's not...

Yes, you have to give it a chance.

Yes.

That seed, to grow. That said, with some of your ideas, which are quite revolutionary, I've witnessed, especially on the human vision side and the neuroscience side, that there could be some pretty heated arguments. Do you enjoy these? Is that a part of science and academic pursuits that you
enjoy?

Yeah.

Is that something that happens in your group as well?

Yeah, absolutely. I also spent some time in Germany. Again, there is this tradition in which people are more forthright, less kind than here. You know, in the U.S., when you write a bad letter you still say, "this guy is nice."

Yes, so here in America it's degrees of nice.

Yes.

It's all just degrees of nice, yeah.

Right, right. So as long as this does not become personal, and it's really like a football game with its rules, that's great.

So if you somehow found yourself in a position to ask one question of an oracle, like a genie, maybe a god, and you're guaranteed to get a clear answer, what kind of question would you ask?

In the spirit of our discussion, it could be: how could I become ten times more intelligent?

But see, you only get a clear, short answer. So do you think there's a clear, short answer to that?

No.

And that's the answer you'll get.

Yeah.

OK. So you've mentioned Flowers for Algernon.

Oh, yeah.

This is a story that inspired you in your childhood, this story of a mouse and a human achieving genius-level intelligence and then understanding what was happening while slowly becoming not intelligent again, this tragedy of gaining intelligence and losing intelligence. In that spirit, in the spirit of that story, do you think intelligence is a gift or a curse, from the perspective of happiness and the meaning of life? You try to create an intelligent system that understands the universe, but at an individual level, for the meaning of life, do you think intelligence is a gift?

It's a good question. I don't know.

As one of the people considered the smartest in the world, in some dimension at the very least, what do you think?

I don't know. It may be invariant to intelligence, the degree of happiness. It would be nice if it were.

That's the hope.

Yeah.

You could be smart and happy, and clueless and unhappy.

Yeah.

As always, the discussion of the meaning of life is probably a
good place to end. Tomaso, thank you so much for talking today.

Thank you, this was great.
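The unit-count arithmetic from the curse-of-dimensionality part of the conversation (roughly 10 to the power d units for a 10% error in d variables, so 100 units for two variables and 10 to the 40,000 for a 200 by 200 pixel image) can be checked with a few lines of Python. This is only a sketch of that back-of-the-envelope rule as stated in the conversation; the function names `units_needed` and `log10_units_needed` are made up for illustration, and the formula `(1 / error) ** d` is an order-of-magnitude reading of the remark, not a precise theorem statement.

```python
import math


def units_needed(num_variables: int, error: float = 0.1) -> int:
    """Order-of-magnitude unit count for a shallow network: (1 / error) ** d.

    Assumption: this is the rule of thumb quoted in the conversation,
    not an exact bound.
    """
    per_dimension = round(1 / error)  # 10 when the target error is 10%
    return per_dimension ** num_variables


def log10_units_needed(num_variables: int, error: float = 0.1) -> float:
    """Base-10 exponent of the same count, usable when d is very large."""
    return num_variables * math.log10(1 / error)


# Two variables at 10% error: a hundred units.
print(units_needed(2))

# A 200 x 200 pixel image has 40,000 variables, so we report only the exponent.
print(log10_units_needed(200 * 200))
```

The second function reports the exponent rather than the number itself, because for image-sized inputs the count is far too large to be worth writing out.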

26 thoughts on “Tomaso Poggio: Brains, Minds, and Machines | MIT Artificial Intelligence (AI) Podcast”

  1. Alcoholic lip. Dad had that, too. i suspect this AI generated. i don't remember ever subscribing to lex fridman's channel. Imagine my surprise when i noticed subscribed in grey.

  2. This was a fascinating conversation on many levels. I found Tomaso to be the ideal mentor I would like to have: openness, curiosity, ambition and fun in the journey of understanding and discovery. It would be great to have his points summarized; he shed some great light on a lot of questions. I especially enjoyed the ones on how we are beginning to understand NNs, like the high probability of finding a global minimum in a highly parameterized model due to the high number of parameters, and the fact that hidden layers somehow remove the need for N ^ Dimensions parameters in order to approximate the function. I wasn't completely sure I understood whether N was the error percentage boundary that you want, or whether 10 is a rule and 10% error was just a boundary. Anyway, thanks to you both.

  3. Lex, dude…you know the right questions. Do you have a bio online? I wanna know where you got yer smarts. I hope you can handle fame, lol. Your videos constitute a graduate seminar in AI for the proletariat. I'm gonna tell you one thing, kid…technology democratizes. Persist.

  4. – electrical engineers creating computer components could be likened to neurologists and neuroscientists on a low level
    – IT professionals could be equated to a psychologist or psychiatrist, helping us as the users to troubleshoot the software
    – what we are looking to do here as programmers is to bridge that gap, creating an emulator for the whole system, needing to understand both sides of the mind and how they fit together

    There is not yet an equivalent of an emulator programmer for the human mind to help us programmers understand this connection.

  5. Interesting to learn that the director of this new field has a physics background! Yes, Einstein is inspiring indeed..

  6. Gratitude for the interviews Lex. I am 23 and currently pivoting in the world of AI and mathematics. Your lectures are a great inspiration.

  7. The problem with Max Tegmark's argument is also that he believes his idea about the world, as physics describes it, is the reality itself. But it may really come down to the fact that we describe reality in those terms, in mathematical language based around the principle of equilibrium, etc., because this is what our brain allows us to do. It might be that if we had a "different kind" of brain, we would discover a different kind of "mathematics", that is to say a symbolic logic based on different principles, which are unimaginable to us.
    The common belief amongst many physicists that scientific equations are the description of reality cannot be supported logically (because of an extrapolation of the Gödel paradox). This is a kind of hubris a neuroscientist, or even a Lacanian psychologist, can easily point out. There is a difference between what's Real (which we never directly encounter) and our Symbolic imagination about it. When you approach the foundations of physics with a clear mind you must accept that ideas such as Force or Energy are metaphysical ideas, which function only within the linguistic structures of our minds, and are not found in nature. Every physicist should know the difference between a physical phenomenon and the description of that phenomenon. Nobody has ever observed a force or an energy, only, as we are used to putting it, the effects of force and energy. But to claim the World is an effect of something (like the energy of a singularity) is itself a philosophical a priori idea, coming directly from the Newtonian a priori definition of a Force. It's a very simple bit of reasoning which all physicists almost deliberately refuse to accept.
    The map is not the territory, no matter how good your map is. And let's be frank, physics accounts for a minuscule part of world phenomena; they simply decided to call everything they know nothing about "noise", or "chaos". It's a really cheap trick.

  8. 51:51 can anyone pls explain the part where Tomaso says one caricature of computer history is first we have expensive programmers , then cheap labelers and the future would be schools like we have for kids?

  9. What's that experiment with the VR-glasses, headphones, robot? Would love to get more info about that but right now I can't find it…….

  10. J Cogn Neurosci. 2016 Apr;28(4):558-74. doi: 10.1162/jocn_a_00919. Epub 2016 Jan 7.
    Are Face and Object Recognition Independent? A Neurocomputational Modeling Exploration.
    Wang P1, Gauthier I2, Cottrell G1.

  11. Brain Res. 2008 Apr 2;1202:14-24. Epub 2007 Jul 26.
    Why is the fusiform face area recruited for novel categories of expertise? A neurocomputational investigation.
    Tong MH1, Joyce CA, Cottrell GW.

  12. I know from my life and many other highly intelligent ones, being intelligent and faking emotions is the worst way to live. I'm not saying ignorance is bliss, but more you know, more you suffer. literally.
