Y-NODE Corporate Blog

October archive

Mysterious NoReverseMatch Error

A couple of days ago we faced quite a creepy issue: a Django-based site with tens of thousands of hits per day was throwing NoReverseMatch errors with no particular pattern. Basically, any URL on the site could be found in the error emails. These emails were flowing in at a very worrisome rate, and user complaints started to arrive as well.

I'm sharing our findings so that anyone who gets into the same trouble can find the info on Google.

All the credit for discovering the root of the problem goes to Vadim Shender, whom you might know for his OCaml posts if you read this blog.

If you are using Django, by now you are probably thinking "swallowed ImportError". If only it were that simple! Without going into too much detail, here's what we eventually found: django/core/urlresolvers.py is not thread-safe, and the server was running an Apache + mod_wsgi configuration with multiple threads per process. Once you know this is a thread-safety problem, it's trivial to find the relevant info, for instance: http://code.djangoproject.com/wiki/DjangoSpecifications/Core/Threading. The conclusion: Django is known to have thread-safety issues, so if you are using a multithreaded configuration, be prepared to see some magic under load and to react to it.

Possible solutions?

  1. Switch to a prefork config. This is the safest option, but you probably won't get as much performance.
  2. Patch your copy of Django. We will post a patch for this particular issue here.
  3. Increase the maximum number of requests per process. We haven't tried that yet, but it should help because the problem is in initialization, which is done only once per new process (see details below).
  4. Other suggestions are very welcome.

For the record: the site I mentioned uses Django SVN revision 7153. I just looked at the trunk head and it seems to have the same problem, but this needs testing. We are going to submit a ticket and probably a patch, which we'll also attach to this post.

For the curious, here's a more detailed description. The code that causes the problem:

    def _get_reverse_dict(self):
        if not self._reverse_dict and hasattr(self.urlconf_module, 'urlpatterns'):
            for pattern in reversed(self.urlconf_module.urlpatterns):
                if isinstance(pattern, RegexURLResolver):
                    for key, value in pattern.reverse_dict.iteritems():
                        self._reverse_dict[key] = (pattern,) + value
                else:
                    self._reverse_dict[pattern.callback] = (pattern,)
                    self._reverse_dict[pattern.name] = (pattern,)
        return self._reverse_dict

This is a method of the RegexURLResolver class, instances of which are shared globally. The check `not self._reverse_dict` becomes false as soon as the first entry is stored, so if one thread is inside this method filling the dict and another thread requests it, the second thread gets a partially populated dict and may well fail to find the view it is looking for. Since the dict is initialized only once per Python process, increasing the process lifetime (the number of requests it serves) should reduce the frequency of these failures.
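To make the race and one possible way around it more concrete, here is a minimal sketch. It is not the patch mentioned above and not Django code: `LazyReverseDict` and its `patterns` argument are hypothetical stand-ins for the resolver and its URL patterns. The idea is to fill a local dict first and publish it only when it is complete, with a lock so that only one thread does the building.

```python
import threading


class LazyReverseDict:
    """Hypothetical stand-in for a resolver with a lazily built lookup dict."""

    def __init__(self, patterns):
        self._patterns = patterns        # list of (name, value) pairs (assumed shape)
        self._reverse_dict = None        # None means "not built yet"
        self._lock = threading.Lock()

    def get(self):
        cached = self._reverse_dict
        if cached is None:
            with self._lock:
                # Re-check under the lock: another thread may have finished first.
                cached = self._reverse_dict
                if cached is None:
                    built = {}
                    for name, value in reversed(self._patterns):
                        built[name] = value
                    # Publish only the fully populated dict, never a partial one.
                    self._reverse_dict = cached = built
        return cached
```

The important part is that other threads either see `None` (and wait for the lock) or the complete dict; they never observe the dict while it is being filled.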

Posted by Alexander Tereshkin on 2008-10-30 14:29:25
Tags: django, python, web

Computer Vision: the evolution of one object detection approach

In the course of my recent work I've been investigating the computer vision field, and more specifically the object detection and tracking problem. Object detection is a classic problem in computer vision, with applications in many areas. In this post I'm going to share my findings about the latest research advances in this area.

Anyone with even a passing familiarity with the field knows the great paper “Robust Real-Time Face Detection” by Paul Viola and Michael J. Jones. Following their approach, it is possible to detect faces at 15 frames per second on a conventional Intel Pentium III (as they did). Our task is to achieve the same results on a conventional embedded processor. Also, their learning algorithm http://en.wikipedia.org/wiki/Machine_learning may run for days or even weeks. Our goal is to try to reduce this time.

Their paper makes three main contributions: the integral image http://en.wikipedia.org/wiki/Summed_Area_Table, a classifier that was simple and efficient for its time, and an approach for combining increasingly complex classifiers.

The integral image allows for extremely fast feature evaluation. Viola and Jones used a set of Haar-like features http://en.wikipedia.org/wiki/Haar-like_features. Using the integral image, any Haar-like feature can be computed at any scale or location in constant time. Haar-like features are similar to Haar wavelets http://en.wikipedia.org/wiki/Haar_wavelet. Each feature considers rectangular regions of the image and sums up the pixels in each region. The results are used to classify images. The Haar wavelets are a natural set of basis functions which encode differences in average intensities between different regions.
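Here is a small sketch of why the integral image makes this constant-time. It uses plain Python lists; the function names and the "left half minus right half" sign convention are my own choices for illustration, not taken from the paper.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column on the top/left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle at (x, y): four lookups, any size."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: left half minus right half (w even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

Building the table costs one pass over the image; after that, every rectangle sum is four array lookups regardless of the rectangle's size, which is what makes evaluating features at any scale cheap.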

Face detection is a rare-event binary classification problem, in the sense that among millions of sub-windows only a few contain faces. The classifier introduced by Viola and Jones filters out over 50% of the image while preserving 99% of the faces. As we will see later, we can go far beyond these limits. This classifier is built using the AdaBoost learning algorithm http://en.wikipedia.org/wiki/AdaBoost to select a small number of critical visual features from a very large set of potential features. The basic idea of AdaBoost is to train an ensemble of M weak classifiers in the form:

$F(x) = \sum_{m=1}^{M} \alpha_m f_m(x)$,

where $\alpha_m$ are voting coefficients. This is done by training $f_m$ for m from 1 to M. However, at the m-th stage the training examples are weighted differently using a weighting function based on the performance of the previous stages. The idea is to increase the weights of the misclassified examples, so that subsequent weak classifiers focus more on the wrongly classified examples.
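The boosting loop can be sketched in a few lines. This is a textbook discrete AdaBoost with one-dimensional threshold "stumps" as weak classifiers, written for illustration only; it is not the training code from the paper, and all names (`stump`, `train_adaboost`, `predict`) are hypothetical.

```python
import math


def stump(x, threshold, polarity):
    """Weak classifier: returns polarity if x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def train_adaboost(xs, ys, n_rounds):
    """xs: one feature value per example; ys: labels in {-1, +1}."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(n_rounds):
        # Pick the stump with the lowest weighted error on the current weights.
        best = None
        for threshold in sorted(set(xs)):
            for polarity in (1, -1):
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if stump(x, threshold, polarity) != y)
                if best is None or err < best[0]:
                    best = (err, threshold, polarity)
        err, threshold, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)      # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)    # voting coefficient of this stump
        ensemble.append((alpha, threshold, polarity))
        # Misclassified examples gain weight, correctly classified ones lose it.
        weights = [w * math.exp(-alpha * y * stump(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the weighted vote of all weak classifiers."""
    score = sum(alpha * stump(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1
```

For example, with `xs = [1, 2, 3, 4, 5, 6]` and `ys = [1, 1, -1, -1, 1, 1]` no single stump separates the classes, but the weighted vote of three stumps does.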

Each feature has one associated weak classifier, which classifies a sub-window by first evaluating the feature on the sub-window using the integral image and then comparing the value against a properly chosen threshold. The classifiers are then combined into a “cascade”, which allows useless regions to be discarded quickly. Sub-windows that are not rejected by the initial classifier of the cascade are processed by a sequence of classifiers, each slightly more complex than the last.
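The early-rejection logic of a cascade is simple enough to sketch. In this illustration a stage is just a pair of a scoring function and a threshold; that representation and the name `cascade_accepts` are assumptions of mine, not the structure used in the paper.

```python
def cascade_accepts(stages, window):
    """Run a sub-window through the stages, stopping at the first rejection.

    stages: list of (score_function, threshold) pairs, cheapest stage first.
    """
    for score, threshold in stages:
        if score(window) < threshold:
            return False   # rejected early; later (costlier) stages never run
    return True            # survived every stage: report a detection
```

Because the overwhelming majority of sub-windows contain no face, most of them are thrown away by the first cheap stages, and the expensive stages run only on the few promising candidates.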

That concludes our look at the Viola and Jones paper. Now let us examine some ideas for improving the performance of the learning algorithm and the detector.

Minh-Tri Pham and Tat-Jen Cham brought real new insight into the aforementioned method with their paper “Fast training and selection of Haar features using statistics in boosting-based face detection”. Their method is extremely fast and comparable in accuracy. They claim that “traditional techniques for training a weak classifier usually run in $O(NT \log N)$, with N examples... and T features”, and they present a novel approach that trains a weak classifier in $O(Nd^2 + T)$ time, where d is the number of pixels of the probed image sub-window, by using only the statistics of the weighted input data. Did you see it?! Rather than trying to reduce either N or T, they break up the NT factor. I do not want to reprint that exciting paper because it contains a lot of math, so just go and read it. By the way, this is currently the world's fastest method for training a face detector, so it is worth the reader's attention. If you need help with statistics, I strongly advise reading the first chapters of “Pattern Classification” by R. O. Duda, P. E. Hart, and D. G. Stork.
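To give a feeling for what "using only the statistics" can mean, here is my own simplified reading, not the authors' algorithm: a Haar-like feature is a linear function of the pixel values, so once the weighted mean vector and covariance matrix of the pixels are known, the mean and variance of any feature's value follow without touching the N examples again. The function names below are hypothetical.

```python
def weighted_stats(samples, weights):
    """Weighted mean vector and covariance matrix of d-pixel samples."""
    d = len(samples[0])
    total = sum(weights)
    mean = [sum(w * x[i] for x, w in zip(samples, weights)) / total
            for i in range(d)]
    cov = [[sum(w * (x[i] - mean[i]) * (x[j] - mean[j])
                for x, w in zip(samples, weights)) / total
            for j in range(d)]
           for i in range(d)]
    return mean, cov


def feature_stats(feature, mean, cov):
    """Mean and variance of the feature value f . x, from pixel statistics only."""
    d = len(feature)
    mu = sum(feature[i] * mean[i] for i in range(d))
    var = sum(feature[i] * cov[i][j] * feature[j]
              for i in range(d) for j in range(d))
    return mu, var
```

Here `feature` is a vector of per-pixel coefficients (for a Haar-like feature, +1 and -1 over its rectangles and 0 elsewhere). Computing the pixel statistics once is where the dependence on N goes; each feature is then handled from those statistics alone. How the paper gets the per-feature cost down further is in the paper itself.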

Let us look in another direction. The main limitation of the aforementioned detectors is the need to gather a representative training set. To overcome that limitation we can turn to H. Grabner, P. M. Roth, and H. Bischof. They have published an “inspiring” piece of work: “Is Pedestrian Detection Really a Hard Task?”.

The basic idea is to train a separate classifier for each image location, which only has to discriminate the object from the background at that specific location. Because the complexity is reduced, “we can use a simple update strategy that requires only a few positive samples and is stable by design”. Some limitations arise, though: the cameras must be fixed and always looking at the same scene. This approach uses on-line unsupervised learning methods http://en.wikipedia.org/wiki/Unsupervised_learning, which are usually prone to wrong updates that reduce the performance of the detector. The detector might start to drift and end up in an unreliable state. Several variations of the on-line learning algorithm suffer from the drifting problem:

  • Self-training. In a self-training framework the current classifier evaluates an input sample and predicts a label which is then directly used to update the classifier.

  • Co-training. In a co-training framework two classifiers are trained in parallel using different views of the data. The confidently predicted labels of each classifier are used to update the other one.

  • Autonomous supervision. The results obtained by the classifier are verified by an analyzer, and if the obtained labels are confident, the samples are used for updating the classifier.

Again, the authors overcome these problems and show that neither does the false positive rate increase nor does the recall decrease when the system runs for a longer period of time. Their method does not use the classifier's own response to decide on updates.

For a practical implementation, a fixed, highly overlapping grid (both in location and scale) is placed on the image. Each grid element corresponds to a classifier, which is responsible only for its underlying image patch.
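A rough sketch of such a grid follows. The parameters (`base_w`, `base_h`, `scales`, `overlap`) and the way the step is derived from the overlap are assumptions for illustration; the paper's actual grid layout may differ.

```python
def grid_patches(image_w, image_h, base_w, base_h, scales, overlap):
    """Enumerate overlapping patches (x, y, w, h) over location and scale.

    overlap is the fraction (0 <= overlap < 1) shared by neighbouring patches.
    """
    patches = []
    for s in scales:
        w, h = int(round(base_w * s)), int(round(base_h * s))
        if w > image_w or h > image_h:
            continue  # this scale does not fit into the image at all
        step_x = max(1, int(w * (1 - overlap)))
        step_y = max(1, int(h * (1 - overlap)))
        for y in range(0, image_h - h + 1, step_y):
            for x in range(0, image_w - w + 1, step_x):
                patches.append((x, y, w, h))
    return patches


def build_grid_classifiers(patches, make_classifier):
    """One independent classifier per grid element, keyed by its patch."""
    return {patch: make_classifier() for patch in patches}
```

Each classifier then only ever sees the pixels of its own patch, which is what keeps the per-classifier problem simple.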

To be continued...


References:

  1. Paul Viola, Michael Jones. Robust Real-Time Face Detection. 2004

  2. Minh-Tri Pham, Tat-Jen Cham. Fast training and selection of Haar features using statistics in boosting-based face detection. 2007

  3. R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley-Interscience. 2000

  4. H. Grabner, P. M. Roth, and H. Bischof. Is Pedestrian Detection Really a Hard Task? 2007

Posted by Ivan Chernetsky on 2008-10-23 17:39:02
Tags: computer vision

Copyright © Y-NODE Software 2008.

All rights reserved.