Geoffrey Hinton – The Neural Network Revolution



Excuse me. Thank you very much, Jim. It's a great pleasure to be here. Geoff Hinton will be out here in just a second, so I'm just going to offer a few preliminary remarks, one being that it is really an extraordinary honor to have Geoff here. He was, you know, slightly modestly described as the founding pioneer of AI in Canada, but even more than that, he's truly one of the pioneers and giants in the field, so that is a great privilege. He has another, I don't know if you'd call it an achievement, but it explains a little bit the stage setup here: he told me that he has not actually sat down in a chair since 2005, maybe some kind of world record for standing up. He has a herniated disc which prevents him from sitting down, so he'll be standing here while we talk.

Now, AI, and this is part of what we'll get into at the beginning of this discussion, means a lot of different things to different people these days. It's thrown around a lot as what could really be thought of as a marketing term as much as anything, and so that's how we're going to start out the conversation here. First of all, welcome, Geoff, thank you so much for being here. And just to give you a sense of both his own personal importance in the field and the importance of this field in general, Geoff was also, he noted to me, recently named one of the 50 most influential people in the world, which is quite a statement, actually, so congratulations.

Now, just to start out, as I was just saying, AI, artificial intelligence, machine learning: these are terms that get thrown around a lot, especially these days, and it's often hard to discern what they really mean. I mean, computing by itself can be thought of as a form of AI. So what is the definition of artificial intelligence?

So the definition is doing things that would make a person
seem intelligent if a person did them. But that doesn't get you very far. What happened was, about 60 years ago, there were two schools of thought. There was a clash between two paradigms for how to make an intelligent system. One paradigm was logic: if I give you some true premises and some valid rules of inference, you can derive some true conclusions. The people who believed in logic thought that's the way the mind must work, and somehow the mind is using some funny kind of logic that can cope with the fact that sometimes you discover things you believed were false. Normal logic has problems with that. So one paradigm said we have these symbolic expressions in our head and we have rules for manipulating them, and the essence of intelligence is reasoning, and it works by moving around symbols in symbolic expressions.

There was a completely different paradigm that wasn't called artificial intelligence. It was called neural networks, and it said: we know about an intelligent system, it's the brain, and the way that works is you have lots of little processors with lots of connections between them, about 10 to the 14 connections between them, and you change the strengths of the connections, and that's how you learn things. So they thought the essence of intelligence was learning, and in particular how you change the connection strengths so that your neural network will do new things. They would argue that everything you know comes from changing those connection strengths, and those connection strength changes have to somehow be driven by data. You're not programmed; you somehow absorb information from data. Well, for 60 years this battle has gone on, and fortunately I can tell you that recently it was won, and neural nets was the right branch.

So when was the battle won? What was the tipping point? When did people realize that the neural net approach was a fundamentally better one than a sort of logic?

So I guess in about 2009 people doing neural nets showed
that you could make a better speech recognizer, and that quickly got taken up by the big companies. Android in 2012 was the first system to complete all the engineering to put that into a system, and when speech recognizers on your cell phone got a lot better in 2012, that was neural nets. So that was one sign. Another sign was that in 2012 people made neural nets much better at recognizing objects in images. So now in Google you can upload your photos and it'll tell you what's in them. You say, find me a photo of a hug, and it'll recognize a hug, and a hug is quite a complicated thing. Or a photo where there's jewelry: a small glittery thing isn't necessarily jewelry, but if you see a small glittery thing and then a woman's neck, that's jewelry. Neural nets are able to cope with all that kind of stuff, and you can't do it with rules. There are too many rules to write. You just have to learn it from data. In that case you'd have a big labeled data set of lots of images and labels that say what's in them, and you train your neural net. Until about 2012 people hadn't been able to train really big neural nets on millions of images, and when we first did that we suddenly got much better results than standard computer vision, and the whole field of computer vision flipped between 2012 and 2013. Back in 2011, if you submitted a paper about neural nets, it would be more or less automatically rejected, and by 2013, if you submitted a paper that wasn't about neural nets, it would be more or less automatically rejected. So it was a complete flip-over, because it worked so much better.

And you referred to this going back 60 years. I mean, so this had been going on for 60 years, and then suddenly in one year the whole thing flips over. What was that about?

I guess paradigm clashes are like that. Suddenly there was enough evidence that neural networks really would work really well given a lot of compute power and a lot of data, and at that point, for example, 10,000 smart
Chinese graduate students go into the field. We'd been laboring away with only a few hundred people really working at this stuff, and suddenly you get this big infusion of very smart young people who push the field forward, and that's what's happening. There was one other thing I didn't mention, which was machine translation. If ever there was a problem where the symbolic approach was gonna win, it was machine translation, because what comes in is a string of symbols and what comes out is a string of symbols, and there are all these linguists all over the place who will tell you how to manipulate symbols. They'll tell you what the rules are for how you find structured strings of symbols that are language. So that was what I think of as the final battle: if symbolic AI was ever gonna win, it was gonna be for machine translation. If you ask how Google does translation now, what you do is you take in words in one language, you break them up into 32,000 different fragments, which are things like "ing" but also all the common words, like "that", and then you produce the other language. It's done by a great big neural network, and the neural network has no hand-wired knowledge in it. It just takes these 32,000 alternative fragments of strings and it produces 32,000 alternative fragments in the output language, and it's just trained to do it from data. And if you ask how many linguists you needed to do that: well, you needed lots of linguists to prepare the data and understand that there are different languages and things like that, but how many linguists were involved in actually creating the network? None. You didn't need any linguists at all. You didn't need any prior knowledge. It was just all learned from data. What you needed was a lot of data.

So what does that look like, when you say you need a lot of data?

So you need millions and millions of pairs of a sentence in one language and a good translation in the other
language, and you take your neural network and you feed it the sentence in the first language. Initially it has random weights and it'll produce garbage in the second language. That is, what it'll produce each time is the probabilities of what the next word might be in the new language. Let's suppose we're doing English to French. You give it an English sentence, and it'll then produce probabilities for the first word of the French sentence. You pick one of those words according to those probabilities and say, okay, suppose that was the first word, what do you think the second word is? And it'll give you probabilities. You pick one of those and you say, okay, suppose that was the second one, what do you think the third word is? And it'll produce garbage, because you put random weights in. Then what you do is you say, okay, you thought the first word was "le" and actually the first word is "la", so what I'm gonna do is try and make you think there's higher probability for "la" and less probability for "le". So you inject an error signal that tells it, you know, you got that wrong, you were betting on "le", it should be lower. That error signal goes backwards through all the connections in the network to figure out how to change those connections so next time it'll say, whichever I said, "la" rather than "le". And you just keep doing that. It's that dumb: you just keep doing that, actually billions of times, and eventually it starts producing good strings in the other language. What's more, you can make one net that will translate all pairs of languages: you give it a language, you tell it what the output language should be, and it'll produce the other language. Just one net.

So what was the specific story of how this kind of machine translation went from your laboratory to the Android phone?

Okay, the machine translation wasn't done in my lab. That was done at the University of Montreal and Google. But the speech
recognition and the object recognition were first done in my lab. In 2009 we got a speech recognizer working, and it worked slightly better than the existing technology, but ours was done by two graduate students working over a summer and the existing technology was the result of 30 years of hard work. So it was very obvious that if you developed our system it would get much better, which it did. My graduate students went off to the various big labs, to IBM and Microsoft and to Google, and all those labs then switched to doing speech recognition using neural nets. Google was by far the fastest to actually do the engineering to get it into production. That's what Google does really well. It eventually came out in Siri and things, because IBM helps with the speech recognition for Siri. But yeah, with the speech recognition I was very impressed by how fast Google did the engineering.

Mm-hmm. So what are some of the most important applications for this when it comes to financial services? I know that's probably a very long list, but if you could give us a couple of examples of where this technology has really had a big impact on the financial system, and when you look over the horizon a little bit, what do you think are the big problems that could be solved with these technologies?

Okay, so I know about how you get a network of simulated neurons to change the connection strengths to work better. All I know about finance is that each time I look at my bank account the monthly fees have gone up. I do have a former student who knows a lot about finance. He's the CEO of a company called Renaissance, which is a successful hedge fund, and he tells me...

Bob Mercer, right?

No, no, that would be Peter Brown.

Okay.

There's two guys: there's Peter Brown and there's his evil twin, Bob Mercer. Peter Brown is a very nice guy. Everything I know about finance I learned from him, and I don't know much, so I probably should refrain from telling you
how this is going to be used in finance. But I can tell you something much more general: anytime you have a lot of data and you want to predict something, neural nets are now a very good way to do it. They work better than the alternative technologies at present. So all you need is lots and lots of data, and lots of information about what the right answer is, which might just be what happened next, and you'll be able to train a big neural net to do really well. Of course, actually, I know a tiny bit more than that about finance. I was once the technical guy on a little mutual fund for Nesbitt Burns that worked extremely well. We had a neural net in the 90s; actually, there was a neural net that decided what sort of phase of the market you were in, and there was another neural net that told you which stocks would do better than the market in six months' time. So it bought things and held them for six months, and it actually performed extremely well. A small fraction of that was because it worked and a bigger fraction of that was because we were lucky. Then it stopped working well, because the Ontario Teachers' Pension Fund copied it and started running a large amount of money, so it took all the signal out of the market. But back then it was very clear that the main issue in finance, in for example predicting which are good stocks to pick, is noise. There wasn't enough data then to be able to tell for sure what was signal and what was noise, and protecting yourself against thinking that noise is signal was really important. As you get more data, that becomes less important, and it becomes more important to be able to see structure in the data when it's complicated structure. Back in the 90s you could only find simple structure because the data sets weren't big enough. Now, as data sets get bigger, you're gonna be able to learn more and more complex structure. And I guess one thing I can say that's generally useful is this:
suppose I give you a great big data set, and there's lots of complicated structure, and you know for all these examples what the right answer should be historically. Can you find rules that will predict the right answer? Basically the answer is no, you can't. What's gonna go on in a big data set is there are gonna be millions of weak regularities. Now, among those millions of weak regularities, hundreds of thousands will be due to the fact that you've got that particular sample of data, and if you try and generalize those to the future it won't work. If you get another sample, even from the same data set, you'll get different regularities. They're what's called sampling error, just the particular quirks of the particular examples you've got, and your big neural net will model all those quirks. A statistician will tell you that's a complete disaster, and it is a complete disaster, unless your big neural net can also model lots of other weak regularities that are really there. The danger is that you can't tell the difference between regularities that are there as quirks of the sampling and regularities that are real. So what a statistician will tell you is to have a strong threshold, so you won't be confused by regularities that aren't real: you won't accept a regularity until there's enough evidence for it. But that means you can't use all these weak regularities. A much better way is to find gazillions of weak regularities and just pray that the ones that are true will outweigh the ones that are spurious and due to your limited data set. That's what these big neural nets are doing now. Particularly in areas like healthcare, where you're trying to predict medical things, you're really grabbing gazillions of regularities and hoping the correct ones will overwhelm the incorrect ones, rather than being a sort of uptight statistician who says, I'm gonna have a significance threshold for a rule and I'm not gonna accept any rule unless...

So something like predicting, you know, you're
talking about the power of prediction from these systems, right? And obviously in finance you want to predict future prices, but to the extent that one can predict prices, that affects the prices, right? So is it a logical goal to think, well, I can develop a neural network system that will enable me to predict future prices better than the other guy? Is that a realistic way to think about it?

Temporarily, yes.

Mm-hmm.

So I think back in the sixties, and I don't know much about finance, like I said, people started using linear models in finance, and they really cleaned up. Peter Brown told me that, anyway. They really cleaned up because they were using a more powerful tool than what other people were using. By the end of the sixties, sometime around then, linear models wouldn't buy you anything anymore. You've got to go into more complicated models. In fact, in the mid-90s we came up with a model called Gaussian processes, which I'm not gonna try and explain to you because it's complicated, but it's a very good way of protecting yourself against mistaking noise for signal. With the mutual fund I was managing for Nesbitt Burns, or giving technical input to, we showed that if you used Gaussian processes you could actually make it work much better. Instead of getting the predictions right sort of 51 and a half percent of the time, it would get the predictions right 53 percent of the time, which for noisy data was much better and allowed you to make much more money. But it turned out Nesbitt Burns wasn't really interested in that, because I couldn't explain how Gaussian processes worked. I still can't explain how they work. And so they couldn't sell this mutual fund. So one thing I discovered about the financial industry was that banks don't care whether the mutual fund works; they care whether they can sell it to people.

So you raise a very interesting point there, which is that these technologies and what you do with them are sort of
fundamentally not explainable, right? So the neural net is kind of doing something in there and coming out with a conclusion, but you don't know how it generated that result, right?

Yeah.

And so I think you suggested that that opaque quality made it hard to sell, because you couldn't really explain to people what the methodology was. It seems like that would also, in various contexts, raise a lot of regulatory issues, where you can't really tell the regulator why you're doing what you're doing; it's just, well, the box said we should do that. Is that gonna be an issue?

Yeah, it's gonna be a big issue, particularly in Europe, where they want to legislate this kind of thing. There's this paradoxical situation that we don't know how our brains work, right? So if I ask, why did you decide to ask me that question rather than some other question, you could make up a story. You could say, you know, because it seemed relevant at the time. And I can say, but why did it seem relevant at the time? And you would then sort of fish around, and your story would be about as plausible as a story by a press secretary trying to explain why somebody said some crazy thing. You'd be making it up to try and justify it, but you don't really know why it happened. So we're currently willing to accept people's opinions even though we have no idea why they said that, and we're not willing to accept neural nets' opinions when we also have no idea why they said that. We're gonna have to move to understanding that you have a choice. You can have something that's absorbed a lot of information and is using lots of weak rules, using the consensus of all these weak regularities it's discovered to come to a conclusion, where you can't justify the conclusion in terms of nice simple rules. Those are the systems that are going to work best. But if you don't want that, you can have a system that doesn't work nearly as well, that uses a decision tree or something that uses nice
clear rules. That's fine: you can have your system with clear rules, but it just won't work as well, because in reality the only way to make a good decision in a big messy world is to be sensitive to a gazillion regularities and take the consensus of what they imply. And you're never going to be able to give a simple explanation of that. The simplest explanation of the neural net that does machine translation, for example, is: well, it's got these billion connection strengths in it, and if you run these 32,000 fragments through these billion connection strengths, that's what comes out.

So you know it works, but you don't know why it works?

Yeah.

And you never will know why it works?

That's not quite true, because you can train a neural net to behave like a press secretary. You can train neural nets to say why they worked, and it'll be about as reliable as people, maybe a bit more reliable than people. We already do that for images. You can train a neural net to recognize things in images, and if you ask, yeah, but what's the neural net seeing, you take the neural net that was classifying the images and you hook it up to the second half of a machine translation system. So instead of putting in an English sentence you put in an image, and out comes an English sentence. Sorry, instead of putting in a French sentence you put in an image, and out will come an English sentence that describes the image. So now you've got a neural net and you can ask it, what do you see? And it'll say, I see a close-up of a baby holding a stuffed animal. That's as good as you can do, and that is what it sees, and that's the neural net saying that. So if you want to explain a neural net, get another neural net to explain it, but don't necessarily believe that's what's really going on.

So that's fascinating. So, you know, one of the great fears this raises is that you've got these machines kind of doing their own thing, essentially without us really knowing what they're doing
or how, exactly. Does that raise any other kinds of concerns? I mean, without getting into the more apocalyptic scenarios of, you know, HAL the computer locking you out of your spaceship, is it completely ridiculous to worry about those kinds of things?

No, it's not ridiculous, but I think we're gonna have to treat them the same as we do people. If I get in a taxi and I want to know, is this taxi driver gonna kill me, basically what I do is I don't ask for a printout of what's going on inside the taxi driver's head. I don't ask for a description of what rules he's using. I just look at the statistics: all my friends get in taxis and most of them are still alive, and I think it's probably a good bet. Certainly when you get on a plane, you know that because of the whole system at the airlines the pilot's probably not drunk, and it's a good bet that the plane will actually land safely. But you don't derive that conclusion from knowing exactly how the pilot's brain works. You do that from statistics. And it'll be the same with these AI systems. If you want to know that your driverless car is not going to mow down pedestrians, you just measure how often it mows down pedestrians, preferably in simulation to begin with, and that's going to give you a much better answer than trying to understand how its vision system works.

So you just kind of look at the results as the way you do what you do for understanding?

Yeah.

So you don't think there's a risk of things sort of producing kinds of nefarious conclusions, let's say, or, you know, leading us down...

You do have to worry about people manipulating these things. So with a big neural net that's discovered lots of weak regularities in the data, you can use a neural net like that for recognizing road signs, for example. You just train it up so when it sees a stop sign it says stop sign, and when it sees a speed limit sign it says speed limit sign, then
when it sees a school bus sign it says school bus, and stuff like that. But after you've trained it, the question is, can an adversary come along and figure out how to make something that looks just like a stop sign to you but looks like a no-speed-limit sign to something else, namely the neural net? We can't make something that looks just like a stop sign to you and looks like a no-speed-limit sign to another person, but we can make things that look pretty much like a stop sign to you and look like some other kind of sign altogether to a trained neural net. That's because we can sort of cash in on what weak regularities it's picked up on, and if you're clever you can make something that a person won't mistake for some other sign but these neural nets will. So a lot of work remains to be done on how to avoid that kind of adversarial attack on these systems.

You know, this is a bit of an aside, but since you're bringing up image recognition and stop signs, I think a big question a lot of people have is around self-driving vehicles. Are they sort of a next-year thing or a 20-years-from-now thing? Do you have a view on that?

I think they're definitely between next year and 20 years from now. I think you can predict the future about five years in advance, and after five years it's all just crazy. I mean, you've no idea.

Okay, so now I'm gonna pin you down on that one. So five years from now, are we gonna have self-driving cars?

Probably. I didn't say you can predict yes or no. You get a probability distribution over these.

Yeah, sure, I understand. All right, so what do you think we should be looking for? What are the next major things coming in terms of actual applications? We've talked about machine translation, voice recognition, pattern recognition. What are some big, really important breakthroughs
that we can look for, say in the next five years, out of these technologies?

So there's a database called PubMed that has abstracts of medical papers, and if you look in PubMed for the last year and you search it with "deep learning", you'll discover that there must be about a hundred abstracts that have come out on systems that use deep learning to understand medical images. Everybody in the field understands that the object recognition these neural nets can do is now good enough that you can make systems that are about as good as doctors and possibly a bit better. That's true, for example, if you take an image of your retina, called a fundus image, and you look to see if you've got diabetic retinopathy. There are five stages, and now a neural net can do slightly better than a doctor, not quite as good as the best doctors but slightly better than your average ophthalmologist, at telling you what stage of diabetic retinopathy it is. That's really important, because in India, for example, you could stop a lot of people going blind if you could cheaply figure out which ones to treat, and there aren't enough ophthalmologists. But this system, in a few years' time, and it's already in field trials and things, is gonna stop a lot of people going blind, because it's going to be able to do it fast and cheap. And that's going to be true for lots of other medical images. Like, when you get old you wake up one morning and you discover there's this funny patch, and you want to know what it is. So I go to the doctor and I say, I've got this funny patch and I'm fairly sure it wasn't there last week, and the doctor says, yes, that's actually a burn, you must have splattered hot fat on yourself when you were frying something. And you say, oh yeah, I remember that. Right, that's embarrassing. That's one kind of error. The other kind of error is you have this little black thing that you ignore, and it turns out that was a malignant melanoma, and if you'd got it a bit sooner
you would have lived. That's the opposite kind of error. Pretty soon we're going to be able to make something in your cell phone, and you just show it this thing and it tells you what it probably is and whether it's worth taking it to a doctor. In fact, pretty soon the cell phone is gonna be a lot better than the dermatologists. It won't be as good as taking a sample of it, putting it on a slide, and doing the pathology on the slide, which will also be done by a computer, which is going to be a lot better than the pathologist. That stuff is coming quite quickly. Obviously there's an enormous commercial incentive for it, because medicine is expensive and we'd like to make medicine better and cheaper. I doubt it'll make it cheaper; it'll just mean we get a lot more treatment and more efficient treatment. At present, the system trained on 130,000 skin lesions is about the same as a dermatologist, slightly better, but it was only trained on 130,000 skin lesions. Once it's been trained on 10 million, it'll be much better than dermatologists. A paper came out last week, for example, looking at CAT scans of the head, and they're looking for like 30 different things, aneurysms and hematomas and all sorts of things that I don't even know what they mean. The computer system now gets a clinically significant error rate of 0.03 percent, actually 0.037, so 0.04 percent, and the average doctor, the average board-certified person trained for looking at these CT scans, gets a clinically significant miss rate of 0.8 percent. So the computer is 20 times better; it has 20 times fewer misses than the doctors. What's more, the computer can be fast, and for a lot of these things, if you've got a hematoma or something, you want to diagnose it fast. Now, this is an arXiv paper, so you can't necessarily believe all the results, and I'm sure the referees will make them modify their claims, but that's just a sign of things to come. And the reason they
could get such good results was they put together the results of 29,000 studies, so they got themselves a database of three and a half million CAT scans. Normally these studies have just a few CAT scans, and that was the crucial thing. The crucial thing was getting big data, which you guys know about. Once you've got that big data, you can make systems that have much more experience than any doctor and therefore will be much better than doctors. One other thing, if I may. That's in image analysis, where it's dead obvious this is going to happen. I get into trouble with doctors for saying you're all going to be out of work in five years, so I won't say that. That's just radiologists. And they're not actually going to be out of work: they're going to be able to spend their time with patients, explaining what their options are and what this all means, which is a much better use for human empathy. So the other thing that's going to happen is this: there's a huge amount of data about every patient, and the amount of that data that doctors actually use to decide what to do next, to decide what tests to do, to decide how to treat you, is minuscule. There's your whole genome; pretty soon that'll be cheap to get. There's your whole medical history, not just the results of tests but all the things you said when you were talking to a doctor, in among which is all sorts of information the doctor didn't pick up on. There's your microbiome. There's your epigenome, which is how your genome has been messed with by environmental effects. All of that information could lead to much better medical treatment. It could lead to predicting what's gonna happen to this patient next fairly reliably, and treating it before it does. So things like screening will look very primitive. We'll have the equivalent of screening each person by taking into account all their properties and how that relates to all the other properties of all the other people for whom we know the results. It's going to be hugely
better and that's going to happen but that's going to be slower a lot of people realize that now there's going to be a lot of regulatory issues like how did you get hold of all this data but it's very clear that at present medicine's making almost no use of the data available and it's going to get much much better when it uses all this data and it's going to be much too much for a person to use so it's gonna have to be computers that's a great example very specific is there anything in the world of say business and finance similarly that you could point to like you know this thing is really gonna change in a big way like we do it this way and five years from now we're going to do it a totally different way most of what you do like I say I don't really know much about finance anytime you have to predict something you're gonna get better at predicting it I mean already things are happening like big neural networks reading everybody's tweets to try and figure out what this says about whether Facebook shares are going to go up or down much more of that sentiment analysis you know and so on that point of you know Twitter and such just to shift gears slightly as a Google employee you're certainly very much aware of a lot of the controversies around the role of the internet and fake news and you know accuracy of information and how it influences elections and all those things I mean and generally speaking I think there's a lot of criticism at the moment of Google and Facebook especially and Twitter of doing sort of a poor job of policing for fake news and other kinds of abuses that happen on these systems are neural nets going to be able to help us with that problem or are they already helping Google with that problem neural nets will certainly help with that problem so one nice example of that is with spam so it used to be that spam was a real problem and neural nets are now very good at detecting spam also the other day I got this very plausible thing saying 
I owed money to the University and I'd failed to pay this request for payment from the University and could I just click on this to find out what the payment should be and I almost fell for it in fact I did fall for it it was from my university it was from a plausible account and I thought well you know I must have just missed that so I clicked on it and Gmail said do you really want to open this link it's a highly suspicious link and then I looked and I realized yeah I was stupid to click on that it was clearly trying to get information out of me so neural nets are going to be very good at providing a sort of envelope of protection against these things mm-hmm what do you see as the kind of philosophical implications of neural nets in other words you know if you look even out beyond where we can predict you know how do they kind of change the society change the culture change the conception of self even yeah I think well one thing's obviously going to happen if we're right about this approach to making intelligent systems and it is related to how the brain works we can understand a lot more about how the brain works which is going to be better for fixing brains that have gone wrong and it's also going to be better for things like education but something even more important it's going to change our understanding of the nature of what we are that's why I'm in this field I want to understand sort of how brains work and what we are and the conception 50 years ago was that we're rational beings and we do reasoning and what's inside our heads these thoughts inside your head are big symbolic expressions and I think that's pretty much nonsense there was something that happened about a hundred years ago where the concept of people as rational beings was sort of undermined somewhat by Freud who pointed out that there's all these unconscious goings-on most of them to do with sex but that's not the main point the main point is that most of the 
reasoning we do is not conscious deliberate reasoning Freud said there's unconscious reasoning I wouldn't call it unconscious I'd just say we are devices that work by using analogies and that's much more basic to how we work than reasoning and I'll give you one piece of evidence for that you know from biology that there must be male cats and female cats and male dogs and female dogs otherwise they'd all be gone by now but if I say to you I'm going to give you a forced choice and the forced choice is you have to decide which of these two is more true all cats are female and all dogs are male or all cats are male and all dogs are female well at least in our culture everybody knows the answer to that cats are female and dogs are male and little kids know that right away they don't even notice there's a logical problem there that's clearly not logical reasoning that's something about the fact that inside your brain when you think of a cat there's a huge vector of features there's lots and lots of active neurons to represent different features of cats and also when you think of a woman or when you think of a man there's lots and lots of features and it turns out the big feature vector for cat is closer to woman than it is to man and the big feature vector for dog is closer to man than it is to woman and that's because these features come from experiencing these things in the real world and big stupid loud dogs chase small smart discreet cats that's the way it works that's not a political statement that's just how it is so that kind of understanding of the world is the primary understanding we have and it's not logical we absorb information from data and to explain the data we get concepts which are huge vectors of features and if you do that with a neural net if you're training a neural net for example to predict the next word in a sentence the way it works is it first turns each word symbol into a vector it learns to do that such that these 
vectors are good at predicting what vectors will come next and then if you look at these vectors you discover that without you telling it how to do analogical reasoning it can automatically do analogical reasoning so what you do is you take the vector that is extracted for king and you take the vector that is extracted for male and you actually subtract the two vectors this is vector algebra so you take this big bunch of numbers that represents king and another big bunch of numbers that represents male and you subtract the numbers for male from the numbers for king and then you take the big bunch of numbers that represents female and you add those numbers to what you've got left and so now you've got king minus male plus female and you look up the numbers you've got left and hey presto you've got a big vector that's very close to queen in other words it knows that king minus male plus female is queen and it learned that just from modeling which word comes next in strings of words similarly if you take Paris and you subtract France and you add on Italy you get Rome this kind of automatic analogy is how these systems work logic is something that comes much later on top of these systems doesn't happen till far far later this is happening in two and three-year-olds logic is happening later that's really the essence of how we work and it's very like Freud's idea of the unconscious that there's all these inferences you're making without thinking you're making inferences they just happen automatically and if you want to understand how to manipulate politics you need to understand that's how people work that's fascinating so you know I have a final question because we're about out of time here but you know these things you described are very very powerful and obviously extremely important and I think one thing that people wonder is like is all of this ultimately just going to reside all this kind of power really and knowledge going to reside in a few big companies you know 
Google Microsoft where all your students are now running these programs but I mean these companies are so giant and so powerful and seem to be so far ahead of most in having the capability to really advance these technologies so is that going to be the case that we really will be reliant on you know the big five companies to sort of develop and hand down you know these technologies or is it going to evolve differently than that I can't speak for the other companies but what's happening at Google is they developed this very advanced software for creating these neural net models and they've also developed very fast chips for making the models run really fast and they made the software public and they're putting all that on the cloud so what's gonna happen is people will be using a lot of Google cloud services but what will be available on the cloud is the same technology as Google has and in fact even within Google there's so many applications that it's hard to get people to design all these neural networks so we now have neural networks designing neural networks and those neural networks that design neural networks will be available on the cloud so if you're a medium-sized company that has lots of data suppose you're a small supermarket chain you've got a lot of data you'd like to predict what special offers will suck in the right customers who will then spend lots on perfume or whatever you'd like to model your data you don't want to have your own in-house team of neural net experts there aren't enough to go around you'll be able to use the software and use the neural nets for designing neural nets in the cloud so the ability to model data this way is going to be available to everybody of course Google would like you to use the Google cloud but I'm sure Microsoft would like you to use the Microsoft cloud right so we'll kind of have neural networks as a utility yeah I don't think an use yep okay well that's really fascinating well 
we are out of time now and want to thank you very much for an outstanding [Applause]
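
The vector arithmetic described in the talk (king minus male plus female lands near queen, Paris minus France plus Italy lands near Rome) can be sketched in a few lines of Python. This is only a minimal illustration, not a trained model: the six-dimensional vectors below are hand-built assumptions chosen so the arithmetic is easy to follow, and the names `VECTORS`, `cosine`, and `analogy` are hypothetical. In a real system the vectors are learned from predicting the next word, have hundreds of dimensions, and the analogy holds only approximately.

```python
import math

# Hand-built toy vectors, NOT learned embeddings (an assumption for illustration).
# Dimensions: royal, male, female, french, italian, capital.
VECTORS = {
    "king":   [1, 1, 0, 0, 0, 0],
    "queen":  [1, 0, 1, 0, 0, 0],
    "male":   [0, 1, 0, 0, 0, 0],
    "female": [0, 0, 1, 0, 0, 0],
    "france": [0, 0, 0, 1, 0, 0],
    "italy":  [0, 0, 0, 0, 1, 0],
    "paris":  [0, 0, 0, 1, 0, 1],
    "rome":   [0, 0, 0, 0, 1, 1],
}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    # Skip the three input words so the answer is a new word.
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, VECTORS[w]))


print(analogy("king", "male", "female"))    # queen
print(analogy("paris", "france", "italy"))  # rome
```

The lookup uses cosine similarity, which is one common way to decide which stored vector is "very close" to the result; the talk does not say which distance measure was used.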

29 thoughts on “Geoffrey Hinton – The Neural Network Revolution”

  1. Saying how cats are females and dogs masculine is actually tautology in Serbian language. FYI, we have genders for every single noun and surprisingly enough, cat (noun) is feminine, while dog is – masculine.

  2. He is a pure genius that has immense deep knowledge. I was amazed how he explained that our brains work based on building analogies and that "rational thinking" develops later in life. It's freaky to imagine that this field is exponentially growing and the outcome and the application potential of neuronal networks would be limitless and I hope that they would not be used for malignant motives. Many thanks for posting this great interview.

  3. This man is a national treasure and a virtual god. His portrait should be on our Canadian currency post haste.

  4. 42:07 That's the reaction to that magnificent speech? People should bring a little bit smarter hosts in front of geniuses like Hinton.

  5. computer technology is able to create a new language that can be understood by the human brain, animal brain and computer machine at once. machine learning technology can definitely do it, for example; what happens to synapses, axons and dendrites when Canadians learn Japanese or vice versa. The language coding of the brain can later be used for further technology.

  6. amazing talk …very insightful that we are not rational people but deriving our decisions from our unconscious mind or rather what he called vectors of feature representation that is more like an analogy based reasoning than logical reasoning. Very deep about how our brain works!

  7. Intelligence is a wave, not a particle… our brains have chemicals that regulate the signal to noise ratio of information being received, what sort of regulators do the silicon based neural network have?

  8. Thanks for uploading this in depth talk on ANN. Professor Geoffrey is undoubtedly at the forefront of science, the advances he and his team (including students) establish will expand the knowledge and understanding of science and thus man.

  10. Can someone give a link for paper on the so called 'press secretary network' which labels images by connecting up two networks

  11. A neural network in nature doesn't imply "intelligence" at least not human intelligence. It only implies a specialized collection of cells designed to accomplish a specific task. The key here is that these are specialized cells and not a one size fits all generic algorithm. Neural Networks in computer science are a general purpose algorithm but their value only comes as a result of specialized frameworks and code around them that makes sense of the results. And they are computationally very expensive because they are not specialized to a specific task. So the statistical results that they produce have no meaning and have no "value" outside of very custom code and logic built into the system.

    Now compare that to neural networks in nature. They are much more sophisticated and produce more accurate result based on years of evolution around a specific task. All nerves deal with sensory data and as a function of this, they all have some inherent behaviors common across the board. These common "building blocks" are what become advanced "cognitive" abilities based on evolution as in the human species. One of these common building blocks is the ability to say one set of sensory data is "similar" to another, just like in math where you have similar triangles. Therefore auditory nerves can tell at a very fundamental level that one sound is "similar" to another or even "the same". Similarly on a very basic level optical nerves are able to tell if one set of optical data is "similar" to or "the same" as another. It doesn't matter what the optical data is or how it is "labeled" , because that labeling is the function of another set of specialized cells. And right now current neural networks are not built around that same kind of computational building block. This is why humans and animals can recognize things after one or two times seeing it, while neural networks take a very long time.

    But the only reason for that is because the current state of Neural Nets does not take advantage of the most important aspect of Neural Nets and that is feature extraction. If you give a single image to a neural network, it should be able to generate hundreds if not thousands of "features" based on very simple concepts like "color value" (rgb) or shape. And then on top of that you build up more complex features like parts and whole, (meaning something is part of something else), then from there you build up more complex feature sets on top of that, all from pure visual input and not any sort of labeling. Evolving a neural network to become "specialized" at seeing then would be much more powerful, especially when combined with larger and faster memory by which you could run functions on to compare against the next image being processed.

    This has nothing to do with naming objects. Dogs and cats can see just fine and different species of animals see better than humans, but they can't talk and they don't "think" like we do. However, we know they "recognize" things when they see them.

    Language and labeling would be a function of another specialized set of neural networks, working in conjunction with the first, but specialized in other things.

    A good example of specialized "optical" Neural Networks would be something like the new AI tool called Sensei.

  12. Google Are Definitely NOT! Making proper tools available to public, Public Version Of Tensor-Flow is no where near the same as Google's internal use version.

  13. Hm, why people ask about autonomous cars, is stupid, because we have them already.
    Maybe not in mass production, and not as good as people, but they are and they are not so bad.

  14. It felt like a sizeable portion of this interview went like this:

    Interviewer: Dystopian fiction scares me, could you please make me less scared?
    Expert: Well, instead of talking about fiction, here's what's going on in the real world.

    Rinse & repeat.

  15. nn cant do cognition .. have to replicate symbol manipulation anyway… nn are good for perception. nn cant even do identity function.

  16. so Damn Interesting, …I'm glad you mentioned the solution of applying a parallel network for an understanding of the initial networks results. This is a Life and Death race to the Future, and what concerns me the most is what happens when we not only dont understand why the results are presented but we deliberately do not implement those results, what will that do to the AI's

  17. Very good talk. Apart from his knowledge of neural networks the man is apparently also highly skilled in circumventing political correctness issues.

  18. we'll be telling great stories about how we were the first to know about this tech… and how we were the ones prepared for the changes it brought..

  19. Gotta love Geoffrey Hinton since his insight into applications of neural nets is wonderful and always so well articulated.
