CS 385 Object Detection
Gurman Gill
EVIL DETECTION
CLASSIFICATION BY PHYSIOGNOMY
___
By Trevor Fox, Cody Mccants, Zack Urbany
___
Just like in face recognition, first impressions from faces are raw, spontaneous assessments that are a result of perceiving rather than reasoning. That is why it is a job better left to a computer, which has no subjective baggage or biases behind its reasoning when determining physiognomic classification. Automated inference on criminality eliminates the variable of meta-accuracy (the competence of the human judge) altogether. Besides the advantage of objectivity, sophisticated algorithms based on machine learning may discover subtle and elusive nuances in facial characteristics and structures that correlate to innate personal traits, yet hide below the cognitive threshold of most untrained nonexperts (this is at least a distinct theoretical possibility) [1]. Our work shows the balance of good and evil characteristics in a user-uploaded face.
Using a database of pictures of faces of criminals and of people generally accepted as 'good', our program uses techniques such as SIFT, Eigenfaces, Bag-of-Words, and nearest-neighbor classification with a support vector machine (SVM) to compute the similarity between a user-uploaded face and the faces in our database. The program then reports, as a percentage, how 'good' or 'evil' the user is based on the features matched against our database.
If social attributes and facial features are correlated, then the validity of face-induced social inference can be empirically validated by automated face classification. If the classification rate is low, then the validity of face-induced social inference can be safely negated.
Our database of faces is controlled: still images showing the neck and above, with the person facing forward with little to no smile or other occlusions, backgrounds removed, eyes centered in the same position in every picture, and all images converted to grayscale. We tried to find pictures with uniform frontal lighting. Taking these precautions lessens the degree of dissimilarity caused by irrelevant features. The features we care about most are the corners of the eyes, the mouth, the top of the nose, and the forehead. These are features found on every face, and they can be matched with a greater degree of accuracy. These strategic positions on a face are also invariant to the source camera. All images are normalized and aligned to size 112 × 92, and all were taken from the web.
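A rough sketch of this preprocessing in MATLAB (the helper name preprocess_face is ours for illustration only, not our exact code; background removal and eye centering were done by hand):

% preprocess_face.m -- minimal preprocessing sketch (illustrative).
function I = preprocess_face(filename)
    I = imread(filename);
    if size(I, 3) == 3
        I = rgb2gray(I);            % drop color information
    end
    I = imresize(I, [112 92]);      % normalize to the database size
    I = single(I);                  % the SIFT code below expects single precision
end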
Physiognomy: a person's facial features or expression, especially when regarded as indicative of character or ethnic origin.
The SIFT classifier performed well most of the time. However, it had some inconsistencies when matching features.
Here are some results that we expected to get from SIFT feature matching:
However, SIFT was not right all of the time. Here are some outlier (bad) results we got:
SIFT sometimes found random features such as the one highlighted in red shown here:
Usually, these seemingly random features matched to incorrect areas from the faces in our database.
We had to choose a consistent threshold for feature matching. We found that 2.0 was a good choice and led to relevant parts of the face being matched accurately. With too low a threshold, we received too many matches that were not relevant.
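The parameter names we mention match VLFeat's MATLAB interface, so here is a minimal sketch of one matching step under that assumption (vl_sift and vl_ubcmatch are VLFeat functions; the file names are placeholders):

% Match SIFT features between a user image and one database face.
Iu = preprocess_face('user.jpg');       % hypothetical helper sketched above
Id = preprocess_face('database_face.jpg');
[fu, du] = vl_sift(Iu);                 % keypoint frames and 128-D descriptors
[fd, dd] = vl_sift(Id);
% The third argument is the ratio-test threshold: a match is kept only if
% the second-best descriptor distance is at least 2.0 times the best one.
matches = vl_ubcmatch(du, dd, 2.0);
numMatches = size(matches, 2);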
Here is an example with 1.0 threshold:
Now, with threshold set to 2.0:
We also noticed that the background had some influence on our results when we encountered this image:
We then decided to remove the backgrounds from all images to ensure more relevant matches were found with SIFT.
Something I pondered was the fact that we got different results if we did not scale the images down to exactly 112 × 92. I couldn't see why that would make any difference, but it did influence our results; most likely, resampling changes which keypoints are detected and at what scales. Another thing we had to do was experiment with different values for the peak_threshold and the edge_threshold.
Here are features detected by SIFT with ranging peak_thresholds and edge_thresholds:
peak_threshold = 0, edge_threshold = 10 (the SIFT default values)
peak_threshold = 3, edge_threshold = 7
peak_threshold = 5, edge_threshold = 5
The best matches came with both the peak and edge thresholds set to 5. This reduces the number of small, less relevant features that led to erroneous matches.
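In VLFeat's MATLAB interface these parameters are passed as options to vl_sift; with the values we settled on, the call looks like this:

% Detect SIFT features while filtering out weak and edge-like keypoints.
% PeakThresh discards low-contrast extrema; EdgeThresh rejects keypoints
% on edges (a smaller value is stricter). The SIFT defaults are 0 and 10.
[f, d] = vl_sift(I, 'PeakThresh', 5, 'EdgeThresh', 5);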
I noticed a pattern of common feature matches detected by SIFT:
Combining the features matched from our database to the user's face returns two counts: the number of features matched from the 'good' people, and the number matched from the 'evil' people. We take the larger of the two counts and subtract the other from it; the difference determines the '%' of how 'good' or 'evil' a user is.
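The scoring itself is simple arithmetic. Judging from the sample output at the end of this report (97 good matches, 108 evil matches, "11% Evil"), the reported percentage is the raw difference between the two match counts; a sketch:

% goodCount / evilCount: total SIFT matches against each sample set.
goodCount = 97;                          % totals from the Hitler run below
evilCount = 108;
score = abs(evilCount - goodCount);      % 108 - 97 = 11
if evilCount > goodCount
    fprintf('You are %d%% Evil!\n', score);
else
    fprintf('You are %d%% Good!\n', score);
end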
Using the Bag-of-Words technique to compute whether or not someone is a criminal led to surprising results, with RPC and ROC curves in the high 90% range depending on the processed image size.
RPC and ROC curves for image sizes of 200, 300, and 400.
When the image size is increased we get more accurate RPC and ROC curves, but as the size grows larger and larger, so does the computation time. At an image size of 400, the total processing time could be five-plus minutes.
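Under the hood, a Bag-of-Words pipeline like ours boils down to the following steps (a sketch, not our exact code; the vocabulary size, the trainImages list, and the histogram matrix X with labels y are placeholders):

% 1. Pool SIFT descriptors from all training images.
numWords = 100;                            % placeholder vocabulary size
allDescr = [];
for i = 1:numel(trainImages)               % trainImages: cell array of file names
    [~, d] = vl_sift(preprocess_face(trainImages{i}));
    allDescr = [allDescr, single(d)];
end

% 2. Cluster the descriptors into a vocabulary of visual words.
vocab = vl_kmeans(allDescr, numWords);     % 128 x numWords cluster centers

% 3. Represent one image as a normalized histogram of its nearest words.
[~, d] = vl_sift(preprocess_face('some_face.jpg'));
words = knnsearch(vocab', single(d)');     % nearest word for each descriptor
h = histcounts(words, 1:numWords+1);
h = h / sum(h);

% 4. Train an SVM on the histograms (X: one row per image, y: good/evil).
model = fitcsvm(X, y);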
This technique did a decent job of correctly guessing whether or not someone is a criminal according to the guidelines we set. Ted Bundy was correctly classified as a criminal, and Trevor was correctly identified as a non-criminal. However, the Boston Marathon bomber was incorrectly classified as a non-criminal. This could be because our database is so small, or because it is not diverse enough.
When we used the Eigenfaces technique to classify images, the feature-matching results were horribly inaccurate. The RPC curve was down in the 60% range, so we decided to scrap it from the program. Here is the curve:
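For completeness, the Eigenfaces approach reduces to PCA on the vectorized 112 × 92 images; a minimal sketch of what we tried (the matrix A and the number of components kept are placeholders):

% Eigenfaces sketch: each column of A is one vectorized 112*92 face image.
meanFace = mean(A, 2);
B = A - meanFace;                          % center the data
[U, ~, ~] = svd(B, 'econ');                % columns of U are the eigenfaces
k = 20;                                    % placeholder: components kept
coeffs = U(:, 1:k)' * B;                   % k-D coefficients per face
% A new face is centered, projected the same way, and classified by
% nearest neighbor in coefficient space.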
Running the program on an image of Hitler returns:
>> main('hitler.jpg')
Number of Features Matched in GOOD Faces With USER Face: 97
Number of Features Matched in EVIL Faces With USER Face: 108
You are 11% Evil!
Criminal
From these results, we conclude that our project is indeed a success.
[1] Wu, Xiaolin, and Xi Zhang. "Automated Inference on Criminality Using Face Images." arXiv:1611.04135, Cornell University Library, 13 Nov. 2016. Web. 5 Dec. 2016.