Click here to jump to my foreword and skip the background. If you want to read the foreword as a PDF, click here. If you’re interested in the tables of contents from the series, click here. Here’s my video message for their publication celebration: https://youtu.be/O78XdDYRZfc. Background: Right before COVID-19 hit NY hard this past spring, K-12 teachers from Busan, Korea stopped by NYC on their trip to the US to study various AI education strategies in the US, and asked me for a short meeting. Frankly, i was quite skeptical about this meeting and assumed it was a vacation in disguise.
[WARNING: there is nothing “WOW” or technical about this post, just a thought i had about GPT-3 and few-shot learning.] Many aspects of OpenAI’s GPT-3 have fascinated and continue to fascinate people, including myself. these aspects include the sheer scale, in terms of the number of parameters, the amount of compute and the size of data, the amazing infrastructure technology that has enabled training this massive model, etc. of course, among all these fascinating aspects, meta-learning, or few-shot learning, seems to be the one that fascinates people most. the idea behind this observation of GPT-3 as a
Update on October 23, 2020: After I wrote this post, i was invited to give a talk on the social impacts & bias of AI in the course <Ethics in AI> by Prof. Alice Oh at KAIST. I’m sharing the slide set here: Unreasonably shallow deep learning [slides]. There have been a series of news articles in Korea about AI and its applications that have been worrying me for some time. I’ve often ranted about them on social media, but I was told that my rant alone is not enough, because it does not tell others why I ranted about
[this post was originally posted here in March 2020 and has been ported here for easier access.] TL;DR: after all, isn’t $k$-NN all we do? in my course, i use $k$-NN as a bridge between a linear softmax classifier and a deep neural net via an adaptive radial basis function network. until this year, i’ve been considering only the special case of $k=1$, i.e., 1-NN, and moving on from there to the adaptive radial basis function network. i decided, however, to show them how $k$-NN with $k > 1$ could be implemented as a sequence of computational layers this year,
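To make the "sequence of computational layers" view of $k$-NN a little more concrete, here is a minimal NumPy sketch. It is only one possible reading of that idea, not the construction from the original post: the function name `knn_as_layers`, the use of negative squared Euclidean distance as the similarity, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def knn_as_layers(x, train_x, train_y, num_classes, k=3):
    """Classify x with k-NN written as a stack of simple layers.

    Hypothetical illustration; the layer split is an assumption,
    not the construction used in the original post.
    """
    # Layer 1: similarity of x to every stored training example
    # (negative squared Euclidean distance, so larger means closer).
    sims = -np.sum((train_x - x) ** 2, axis=1)

    # Layer 2: hard top-k "activation" that switches on the k most
    # similar training examples and switches off the rest.
    top_k = np.argsort(-sims, kind="stable")[:k]
    mask = np.zeros_like(sims)
    mask[top_k] = 1.0

    # Layer 3: linear layer whose weight matrix holds the one-hot
    # training labels, so the output counts votes per class.
    one_hot = np.eye(num_classes)[train_y]
    votes = mask @ one_hot

    # Layer 4: argmax over the class votes.
    return int(np.argmax(votes)), votes


# Toy data (assumed for illustration): three points of class 0 near
# the origin and two points of class 1 near (5, 5).
train_x = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0]])
train_y = np.array([0, 0, 0, 1, 1])

label_a, votes_a = knn_as_layers(np.array([0.2, 0.1]), train_x, train_y, num_classes=2)
label_b, votes_b = knn_as_layers(np.array([5.0, 5.4]), train_x, train_y, num_classes=2)
```

In this sketch, setting `k=1` reduces the top-k layer to picking the single nearest stored example, which is the 1-NN special case mentioned above.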
I am happy to share the news that Cristina Savin and I have been selected to receive the Google Faculty Research Award this year in the area of computational neuroscience, on the topic of <Online Meta-Learning>. See https://research.google/outreach/past-programs/faculty-research-awards/ for the list of awardees.