Welcome! Please contribute your ideas for what challenges we might aspire to solve, changes in our community that can improve machine learning impact, and examples of machine learning projects that have had tangible impact.
Lacking context for this site? Read the original paper: Machine Learning that Matters (ICML 2012). You can also review the slides.
Object Detection
  • Object detection in images is a hugely impactful unmet challenge.
  • Can you elaborate and be more specific about the challenge? What level of performance, on what particular kind of task, would constitute an important and meaningful result?
  • ImageNet is a suitable dataset; it contains over 20,000 object categories. The best performance so far was in the press recently: Andrew Ng's 9-layer unsupervised neural network. However, it achieved only 16%, so getting to 100% is a challenge.

    The impact would be huge: robocars and service robots (domestic, in restaurants, etc.).

    Specific challenges

    - increasing the accuracy on ImageNet

    - increasing the size of ImageNet (categories, and number of images per category)

    - pooling computing resources to run algorithms on ImageNet

    - creating a central ImageNet datastore. Currently ImageNet consists of links to images, mostly hosted on Flickr, but Flickr can be shut down or have its API access limited at any time.
  • That's closer to a specific challenge. How would you connect performance on this data set to a "real" problem, though? Is there some application or endeavor that could actively integrate whatever algorithm did well on this data set? That is, "solving" object recognition on an isolated database, while impressive, isn't enough (in my view) to impact the real world, unless someone out there can benefit in some way. Let's think creatively, and figure out who that could be.
  • It's estimated that human vision can recognize between 20,000 and 100,000 object categories, so good performance on ImageNet would enable real use.

    Specific

    - collaborate with robotics researchers - one group that I can think of is Willow Garage

    - collaborate with economists to quantify the economic impact of robust object detection.

    - inventory-taking app on smartphones, i.e. anyone can take their smartphone to a Walmart and have it detect any category. This is basically what Google Goggles does, though it is not very good at present. This app could also do an inventory of e.g. a household kitchen or dressing table.
  • Also, have a computer transcribe videos/movies at a human performance level. It could then be used in a quality "judge" system (regression on a movie dataset), and so in an "AI director" type of role.

    Set up an industrial robot to cook a meal in a kitchen (this is the kind of thing Willow Garage wants to do if they could get good object detection technology).
  • Those are all really cool ideas!  Your more ambitious ones, of course, require a lot more than just reliable object recognition (transcription: NLG, cooking: robotics/planning/sensors), but that capability would be an important enabling piece of the system.
  • Well, with the movies I really meant just the vision part; the speech part has been worked on by companies such as Google, Apple and Microsoft.

    Also, with the robotics, there are already a lot of roboticists out there; they just need the object detection technology that's missing.

    I was just offering my 2 cents that object recognition or scene understanding is the highest-impact near-term (i.e. doable) area for machine learning. The challenge is really getting enough compute hours and data (the algorithms, sparse representation and deep learning, are showing good results).
  • You raised fundamental questions in your ICML paper and the slides, Kiri. Let me write down some of my thoughts after reading your paper.

    Personally I think that the main tasks and activity sketch you made should be seen as an ideal. I can hardly imagine a single ML researcher who manages to grasp the math/statistics AND domain knowledge + deployment issues in real life + business at the same time. I would put it a bit differently: our research should use domain data and expert input to guide our efforts, but we as ML researchers should focus on standardizing the developed algorithms and making them more accessible, allowing easy and seamless reproducibility, and being ready to run them on ANY other dataset. Beyond that, we need to concentrate our efforts on building teams of people who contribute their experience and ensure everybody understands the underpinnings, challenges and results of each party. The output will be much more than a pile of publications. The output will be a working system, maybe far from perfect, but accompanied by valuable research results and experience gained by the whole team.

    Coming back to object detection: I would rather defend maintaining and using those big public databases. As long as we can compare results on the same data, we can keep pushing the boundaries in computer vision and pattern recognition. Nevertheless, I could not agree more with you when it comes to an algorithm showing 75% precision on a dataset X when it is not clear what impact it will have in a real-world system doing a real job. Here I would be very interested to see an accompanying paper, written by an expert close to the application, which explains, concludes and points out the weaknesses and strong points of the system in field conditions. That's a nice idea! ;)
