I was trying to listen to classical Indian fusion on YouTube today and ran into a certain limitation in the process. Compared to Pandora (a music recommendation service like Last.fm), listening to songs on YouTube is limited: I don’t have the option of starting my own station and then just relaxing and listening.
The prime difference between YouTube and Pandora is the limited number and expressiveness of the tags that YouTube videos have, or which are currently being used. For instance, a classical song on YouTube is often tagged as just ‘music’ or ‘entertainment’, without any regard to its musical or video content. Pandora, on the other hand, has pretty descriptive tags for its songs, which makes it easy to find songs in the same genre even when those songs do not share the same artist or the usual features like name and view count. Moreover, at a quick glance I also observed that the suggestions YouTube provides while viewing a video have similar names or view counts, rather than being matched on content-based features like classical, rock, or other genres.
Thus in the future it will be interesting to see whether Google could use these noisy tags in an intelligent way so that a user can use YouTube according to their preference. Then one day I could use it like Pandora, which recommends music based on similar content, or like the current YouTube. Google has also been doing some related work here; one piece I came across was published at ICCV’11 by Thomas Leung and others, on using weakly supervised learning to handle noisy labels/tags (noise being another source of nuisance). Another piece of homework I should do is to read about the recommendation system that Google uses, and whether they are doing something similar to the approaches employed by online retail sites, which use algorithms like latent models, matrix factorization, etc. (mostly unsupervised). I will revisit this discussion in the future.
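To remind myself what matrix factorization for recommendation looks like, here is a minimal sketch. The toy rating matrix and hyperparameters are made up, and this is just the textbook SGD formulation, not necessarily what YouTube or any retail site actually runs:

```python
import numpy as np

# Toy users x items rating matrix; 0 marks an unobserved entry.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

n_users, n_items = R.shape
k = 2                                          # number of latent factors
rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(n_users, k))   # user latent factors
V = rng.normal(scale=0.1, size=(n_items, k))   # item latent factors

lr, reg = 0.01, 0.02
for epoch in range(2000):
    for i in range(n_users):
        for j in range(n_items):
            if R[i, j] > 0:                    # fit only observed ratings
                err = R[i, j] - U[i] @ V[j]
                U[i] += lr * (err * V[j] - reg * U[i])
                V[j] += lr * (err * U[i] - reg * V[j])

# Predicted ratings, including the previously unobserved entries,
# which can then be used for recommendation.
R_hat = U @ V.T
```

The unobserved entries of `R_hat` are the “recommendations”: they are filled in purely from the latent structure learned on the observed ratings, with no supervision on the missing cells, which is why such methods are called mostly unsupervised.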
It is an interesting way in which Google conducts research: targeting not artificial datasets but real-world data. They also published a paper in Communications of the ACM.
I came up with an interesting (nerdy) analogy between physical attractiveness and initial conditions in an optimization problem. Before making the statement, however, I would like to point out that it rests on some assumptions, may not hold in every case, and you are free to disagree with it.
The hypothesis is that when two people meet, say at a pub or a party, the first thing (in most cases) that makes them talk to or adore each other is the physical attractiveness of either or both. In other words, they started talking because they were physically attracted to each other (citing Freud). Let’s further suppose these two people start dating each other after some time; as the sages say (another assumption), physical attractiveness tends to matter less from then on, since the two may be past that initial phase of excitement.
Thus ‘physical attractiveness’ can be compared to the initial conditions of an optimization problem: if the initial conditions are not apt, the algorithm might never reach a good optimum. Also, the only use of the initial conditions is to initialize the algorithm; they don’t really matter afterwards.
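The analogy can be made concrete with a small sketch: gradient descent on a made-up non-convex function lands in a different optimum depending purely on where it starts, even though the starting point plays no role once the iterations are underway.

```python
def f(x):
    # A simple non-convex function with two local minima, near x = -1 and x = +1;
    # the one near -1 is the better (global) minimum.
    return (x**2 - 1)**2 + 0.2 * x

def grad(x):
    return 4 * x * (x**2 - 1) + 0.2

def gradient_descent(x0, lr=0.01, steps=2000):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

x_good = gradient_descent(-2.0)   # converges near the global minimum (x ~ -1)
x_bad = gradient_descent(+2.0)    # stuck in the worse local minimum (x ~ +1)
```

Both runs use the exact same algorithm and step size; only the initialization differs, yet `f(x_good) < f(x_bad)` — the “first impression” decided which basin the whole trajectory ended up in.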
I am leaving any further interpretations or cases (like convex optimization problems, where any initialization eventually leads to the same global optimum) open to the readers.
This is a nice comparison of different machine learning approaches highlighting their advantages and disadvantages.
While reading about wavelets I decided to go back and review the basics of the Fourier transform, since wavelet analysis on its own wasn’t satisfying me. So I decided to revisit signal processing, especially the Fourier transform. While doing this I came across an interesting tutorial where the writer starts with simple cases of a normed vector space and extends them to function spaces.
In one line: functions can be seen as points in a vector space. Since we now have a function space, we define an orthonormal basis for it, along with notions of a dot product and a norm (length).
This topic has always fascinated me, since I also came across this idea while working on multiple instance learning and reading about gradient boosting. Maybe my mind is still looking for a central theory (like vector algebra) that can explain or break down a lot of other things. I shall write more about these connections later. Here’s the link
One question that comes to my mind every now and then is whether it is necessary to be born with extra mental capabilities to become a great scientist. I have changed my opinion multiple times after making new observations; however, I think I have a consistent hypothesis now, according to which it is not necessary to have a great mind to be a great scientist.
I was recently reading a book by Feynman, given to me by my self-proclaimed intelligent roommate, and after a while I went and asked him this question: ‘Was Feynman an extra-intelligent guy?’ Following this, Yogesh forwarded me a video in which Feynman says that extra intelligence or talent is not required to be a great scientist. According to him, famous scientists are normal people who became interested in their fields and put in a lot of hard work, reading, writing, learning and problem solving to master them. His case makes sense, since the lives of great scientists tell the story of the hard work, time and commitment they put into their work. On the other side, I also believe that the mind is not something static but a dynamical system that evolves and forms neural connections as you make it work. For instance, if you train your brain to read, it can automatically learn patterns and reorganize itself to do well at reading new books, and it just gets better. I am not sure how much neuroscientists would agree with me on this. There is also a counter-argument here: try asking yourself to name any scientist who won the Nobel prize without working hard or struggling.
Hence, to be precise, what it takes to succeed in a field is motivation, interest, commitment and, finally, some hard work.
I am glad to share that my joint work with Abhinav Dhall on ‘Weakly supervised Pain Localization using Multiple Instance Learning’ got accepted today. We got some rave comments on this work. The idea is to learn from weakly supervised data, i.e. labels that only mention whether a person is in pain or not over a whole sequence. The learning paradigm is a variant of latent methods called multiple instance learning. I recently gave a talk on this work at Pixel Cafe, and here is the presentation.
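As a rough sketch of the multiple instance learning assumption (this toy max-pooling rule is purely illustrative, not the actual method from our paper): each sequence is a bag of frames, and the bag is labeled ‘pain’ if at least one frame is positive, ‘no-pain’ if every frame is negative.

```python
import numpy as np

# Hypothetical per-frame pain scores produced by some instance-level classifier.
# A sequence (bag) is positive if at least one of its frames (instances) is.
def bag_label(instance_scores, threshold=0.5):
    # Max-pooling over instances: the bag score is the score of the
    # most confident frame in the sequence.
    return int(np.max(instance_scores) > threshold)

pain_sequence = np.array([0.1, 0.2, 0.9, 0.3])     # one painful frame
no_pain_sequence = np.array([0.1, 0.2, 0.3, 0.2])  # no painful frames

print(bag_label(pain_sequence))     # 1 (pain)
print(bag_label(no_pain_sequence))  # 0 (no pain)
```

This is what makes the supervision ‘weak’: training only ever sees the bag label, yet the max over instances lets the learner localize which frames were actually responsible for it.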
However, I am sad about the other paper, which got rejected.
I should talk about latent models sometime.