Tag Everything

As mentioned in the previous post, given any visual information (an image or a video), it’s still difficult for computers to recognize the objects it contains. Although a lot of work has been done in this area, progress has been limited, and we can’t be too optimistic about a breakthrough in the near future. The good news is that we do have an easier solution: tag everything!



Design and Research

As a problem solver, I’m interested in the similarities and differences between design and research. Paul Graham has also written an article discussing this issue. He summarized it as follows:

Design doesn’t have to be new, but it has to be good. Research doesn’t have to be good, but it has to be new. I think these two paths converge at the top: the best design surpasses its predecessors by using new ideas, and the best research solves problems that are not only new, but actually worth solving.

I basically agree with him, but I would like to add some personal thoughts here. Having both an academic and an industrial background, I sometimes feel that academia doesn’t give enough recognition to the design approach. Even if you can solve a problem more elegantly than any previous work, say by inventing a web browser that’s better than any current browser, it’s hard to get published if your work lacks enough “technical originality” in terms of algorithms, etc.



High-level Knowledge on the Web

Letting computers “think” like a human being is probably the dream of almost every computer scientist. However, despite their extraordinary capability for repetitive computation, computers totally lack the high-level knowledge that humans possess. For instance, while a five-year-old child can easily tell what object a photo shows, image understanding is still an open problem in the computer vision community. The same is true of many other problems, e.g., translation between two human languages.



Vikipedia (Visual Wikipedia)

Imagine that you can move your cursor over a digital map, tap, and see a real-time photo or video of that physical location as if you were there. You could use this to check the traffic conditions between your home and your office, to check whether your favorite restaurant is open today, to check whether the maple trees in the mountains are at their best and worth the trip, etc. Life would be easy and awesome. How can we achieve it?
