Bye, Felix

Note: i wrote this on December 9, 2024 but did not dare to post it, because i did not want to and could not believe what had just happened. my heart still aches so much to even think about it, but i’m posting it on the last day of 2024 to remember Felix. It was sometime in early summer 2014. I was a postdoc in Montreal under the supervision of Yoshua Bengio, and Felix was a visiting student who had just arrived in Montreal. I was struggling with building a neural machine translation system that can handle long source/target sentences, and in

Amortized Mixture of Gaussians (AMoG): A Proof of Concept for “Learning to X”, or how I re-discovered simulation-based inference

here’s my final hackathon of the year (2024). there are a few concepts in deep learning that i simply love. they include (but are not limited to) autoregressive sequence modeling, mixture density networks, boltzmann machines, variational autoencoders, stochastic gradient descent with adaptive learning rate and more recently set transformers. so, as the final hackathon of this year, i’ve decided to see if i can put together a set transformer, an autoregressive transformer decoder and a mixture density network to learn to infer an underlying mixture of Gaussians. i’ve got some help (and also some misleading guidance) from Google Gemini (gemini-exp-1206),

Stochastic variational inference for low-rank stochastic block models, or how i re-discovered SBM unnecessarily

Prologue

a few weeks ago, i listened to Sebastian Seung’s mini-lecture at Flatiron Institute (CCM) about the recently completed fruit fly brain connectome. near the end of the mini-lecture, sebastian talked about the necessity of graph node clustering based on the type-level connectivity patterns instead of node-level connectivity patterns. i thought that would be obviously easy to solve with latent variable modeling and ChatGPT. i was so wrong, because ChatGPT misled me into every possible wrong corner of the solution space over the next two weeks or so. eventually, i implemented a simple variational inference approach to latent variable clustering,

i sensed anxiety and frustration at NeurIPS’24

last week at NeurIPS’24, one extremely salient thing was the anxiety and frustration felt and expressed by late-year PhD students and postdocs who were confused by a job market that looks and feels so different from what they expected, perhaps, when they were applying for PhD programs five or so years ago. and, some of these PhD students and postdocs are under my own supervision. this makes me reflect upon what is going on or what has been going on in artificial intelligence research and development. this post will be more or less a stream of thoughts rather

<The Atomic Human> by Neil Lawrence

i can’t recall exactly but it was sometime in 2013 when Neil Lawrence visited Aalto University (it was january, apparently!). he gave a talk in a pretty small lecture room which was completely packed (and i was there as well). he talked about his years-long effort in introducing a probabilistic interpretation (and thereby extensions) to (hierarchical) unsupervised learning, which was back then being consumed by deep learning based approaches. that’s when i first clearly learned the intuition and motivation behind the so-called GP-LVM (Gaussian process latent variable model). that was beautiful, or to be precise, how neil delivered his inspiration, motivation and
