a short note on “Rebooting AI” by Marcus & Davis

Disclaimer: I received the hard copy of <Rebooting AI> from the publisher, although I had by then purchased the Kindle version of the book myself on Amazon. I only took a quick look at the book on my flight between UIUC and NYC and wrote this brief note on my flight back to NYC from Chicago. I also felt it would be good to have even a short note by a machine learning researcher to balance all the praise from “Noam Chomsky, Steven Pinker, Garry Kasparov” and others. <Rebooting AI> is a well-written piece (somewhat hastily) summarizing the current state of

Discrepancy between GD-by-GD and GD-by-SGD

The ICLR deadline is approaching, and of course, it’s time to write a short blog post that has absolutely nothing to do with any of my manuscripts in preparation. I’d like to thank Ed Grefenstette, Tim Rocktäschel and Phu Mon Htut for fruitful discussion. Let’s consider the following meta-optimization objective function: $$\mathcal{L}'(D'; \theta_0 - \eta \nabla_{\theta} \mathcal{L}(D; \theta_0))$$ which we want to minimize w.r.t. θ₀. Thanks to the success of MAML and its earlier and more recent variants, it has recently become popular to use gradient descent to minimize such a meta-optimization objective function. The gradient can be written down as* $$\nabla_{\theta_0} \mathcal{L}'(D'; \theta_0 - \eta \nabla_\theta \mathcal{L}(D; \theta_0)) =

Sharing some good news and some bad news

I have some news, both good and bad, to share with everyone around me, because I’ve always been a big fan of transparency and also because I’ve recently realized that it can easily become awkward when those who know this news and those who don’t are in the same place with me. Let me begin. The story, which contains all of this news, starts sometime in mid-2017, when I finally decided to apply for permanent residence (green card) after spending three years here in the US. As I’m already in the US, the process consists of two stages. In the first stage, I,

Best paper award at the AI for Social Good Workshop (ICML’19)

The extended abstract version of <Deep Neural Networks Improve Radiologists’ Performance in Breast Cancer Screening> has received the best paper award at the AI for Social Good Workshop co-located with ICML’19 last week in Long Beach, CA. Congratulations to the first author, Nan, who is a PhD student at the NYU Center for Data Science, the project lead, Krzysztof, who is an assistant professor at NYU Radiology, and all the other members of this project!

BERT has a Mouth and must Speak, but it is not an MRF

It was pointed out by our colleagues at NYU, Chandel, Joseph and Ranganath, that there is an error in the recent technical report <BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model> written by Alex Wang and me. The mistake was entirely mine, not Alex’s. There is an upcoming paper by Chandel, Joseph and Ranganath (2019) with a much better and correct interpretation and analysis of BERT, which I will share and refer to in an updated version of our technical report as soon as it appears publicly. Here, I would like to briefly point