Friday, March 24, 2017

The Mad Catter - Salesforce Predictive Vision Services


No animals were harmed in the creation of this blog post or the associated presentation. The standard disclaimer applies.

I've got a problem. A cat problem to be precise. While I'm more of a dog person, I don't particularly mind cats. However, recently there has been a bit of an influx of them with the neighbors. There are at least a half dozen of them that roam freely around the neighborhood. Not the end of the world, but they have a nasty habit of leaving presents on the lawn for the kids to find. Things like the following:

In short, they poop everywhere and leave the remains of birds lying around. Things that aren't so great to step on or for the kids to find when playing in the garden.

I spoke with the immediate neighbor who owns two of the cats, and he suggested spraying them with water to deter them away. While that did indeed prove to be a very effective, amusing, and satisfying approach to move them on it required 24 hour vigilance as they just kept coming back.

Get a Dog

My wife keeps saying we should just get a dog. I guess getting a dog to chase the cats away is an option. But it seems like training a dog to use the hose might be more trouble in the long run.

Technology to the Rescue

Thankfully I found a handy device the attaches to the end of the house and activates a sprinkler head when a built in motion detector is set off.

Perfect! Great! Cat comes into range, then a sudden noise and spray of water sends them off towards someone else's lawn to do their cat business. Problem solved and I can go back to doing more fun activities.

Except there was one small problem. The PIR motion sensor didn't particularly care what was moving in front of it. Cats, birds, the kids on their way to school, a courier with a parcel for me, a tree in the wind, the mother in law. It would spray them all regardless of whether I wanted it to or not.

Salesforce Predictive Vision Service

Technology wasn't solving my problem. I needed to use more of it!

I recalled a recent presentation by the Salesforce Developers team - Build Smarter Apps with New Predictive Vision Service. The short version of that presentation is you can train a deep learning image classifier with a set of training images. Then when you give it a new image it will give you probabilities about what is likely in the picture. I've create a quick start unmanaged package for it to save you going through most of the install steps.

To make this work I needed a large collection of cat images to train the dataset from a create a model. Luckily for me, providing pictures of cats is something that the internet excels at.

The second challenge with using the predictive vision services is managing how many images I am going to send through to the service. If I just point a web camera out the window it could be capturing 30+ frames per second. Not really practical to send off each frame to the service for identification when there might be nothing of interest happening 99% of the time.

Motion detection

I had a few options here.

Option One would to to stick with the basic PIR motion sensor, but it would still be generating a ton of false positives that would need to pass through the image recognition. A simple cool down timer would help, but the image captured immediately after the first motion is detected would likely get something as it is just entering the frame.

I figure since I'm going to need a camera to capture the initial image I might as well get some usage out of it in detecting the motion. Because of the initial processing step I can exclude motion from certain areas, such as a driveway or tree that often moves in the wind. There can also be a slight delay after the motion is detected and before the prediction image is captured. This gives the subject time to move into the image.

The prototype solution looks like this:

  1. A webcam that can be pointed at the area to be monitored.
  2. Motion Detection software to process the video feed and determine where the movement is and the magnitude. The magnitude is useful, as particularly small subjects like birds can be completely ignored.
  3. The ability for that software to pass frames of interest off to the Salesforce Predictive Vision Service. This is a simple REST POST request using an access token.
  4. If the probability from the frame indicates a Cat is present, send a signal to the Raspberry Pi.
  5. On the signal, the Raspberry Pi activates the GPIO pin connected to a relay.
  6. When activated, the relay supplies power to the existing automated sprinkler, which activates on initial power on when the sensitivity is set to maximum. Another option here is directly connecting a solenoid water value to the hose line.

When all put together the end result looks something like this:

The Einstein bust with terminator-esque glowing red eyes was part of the presentation I gave on this topic.

While filming that video I inadvertently live tested it on myself as well. An aging fitting on the hose connector to the sprinkler had come loose outside at the tap. So I went out to fix that, restored the water pressure to the sprinkler, then walked back to the laptop to resume the test. Only when I checked the motion detection screen did I realize it had captured my image passing in front of the sprinkler. Thankfully the predictive vision services came back indicating I didn't resemble a cat and the sprinkler didn't activate. Success!


It occured to me that there were further improvements that could be made.

The first and easiest change I made is to activate on things other than cats. It can be equally selective to activate on a wandering neighbors dog, squirrels, general wildlife, etc...

I needed a way to deal with unfortunate false positives, such as a person wearing something with a picture of a cat on it. These can partially be avoided by looking at all the probabilities that Einstein is returning and having thresholds against each label. I.e. Activate on any Cat prediction above 50% unless there is any prediction indicating a person in the field of view. Images kept from the activations could also be used to further refine the examples in the dataset.

These first two refinements are actually present in the video above. When using the general image classifier it typically identifies the stuffed cat as a teddy bear. So in the small section to the bottom right of the app you can mark labels to activate on and labels that will override and prevent the activation.

Other changes I might consider making:

The motion sensor could be maintained and introduced as the step prior to activating the video feed. This would increase the time between the target entering the area and the sprinkler activating, but would save a lot of endless processing loops looking at an unchanging image.

If I forgo some of the more processing intensive motion tracking the whole solution could be moved onto the Raspberry Pi. This would make it a much more economical solution.

However, another option with the motion detection still in place would be to crop the frame image to just the area where the motion was detected. This should lead to much higher prediction accuracy as it is only focusing on the moving subject.

When real world testing commences with live subjects I'll need to add a video capture option to cover the time from just before the sprinkler is activated till just after it switches off. I think the results will be worth the extra effort.

I have a range of other devices that could easily be activated via the relay attached to the Raspberry Pi. One such device is an ultrasonic pest repeller. Perhaps combined with a temperature sensor as a slightly kinder deterrent on cold nights.

User Group Presentation

I gave a talk to the Sydney developer user group on this project. The slides, as they were:

I still feel the need to settle on a name for the project. Options include:

  • The Mad Catter (after the elusive Catter Trailhead badge)
  • The Einstein Cannon
  • The Cattinator (After the general them of the presentation.)

See also: