A new paper by Paul Tar et al. published in Advances in Space Research looks at training computers, rather than people, to identify features on planetary landscapes.
The story of our solar system’s evolution is encoded in the landscapes of planets, moons and asteroids, and the images returned by exploration missions now make it available to us. In these images we see evidence of many processes familiar to us Earth dwellers.
Human beings are excellent at spotting features in landscapes, but the quantity of image data now available has outstripped the number of experts available to examine it. One way to overcome this is through citizen science projects, where members of the public are recruited to help classify images online. Another way is to train a computer to identify regions of images dominated by certain landforms; in this way the process of identifying images and extracting basic data might be automated. In addition, although they aren’t as good at spotting patterns as humans, computers are more consistent – they do the same thing every time, without fail (and right or wrong!). Potentially this consistency can be exploited when we want to test whether features in different areas of a planet are significantly different.
What sort of information might you want from a computer system that sifts through images? Bearing in mind that computers are nothing like as good at image analysis as human beings, and that some images will contain unanticipated features, one output you would want is something to tell you whether the computer’s interpretation is a satisfactory account of what’s in the image. If it is satisfactory, you would want to know what fraction of each image represented each of a set of landscape forms (for instance, what fraction of an image is covered by dune fields, what fraction by lava fields…). In this paper we develop a way of doing this.
We’re used to the idea of pixels, but how do you describe the patterns present in an image? A single black pixel might be part of zebra stripes or a bullseye. In this work we encode the pattern centred on each pixel by looking at the contrasts between pairs of pixels near it. This allows us to reduce the pattern to a single number based on whether the first or second member of each pair is the darker, one binary digit for each pair. We can then count how many examples of each pixel-centred pattern there are in each image and make a histogram.
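As a rough sketch of this encoding, each pixel’s neighbourhood can be reduced to one integer and the results counted into a histogram. The patch size and the particular pixel pairs below are invented for illustration; they are not the ones used in the paper.

```python
import numpy as np

def binary_code(patch, pairs):
    """Reduce a small patch to a single integer: one bit per pixel pair,
    set when the first member of the pair is darker than the second."""
    bits = 0
    for k, ((r1, c1), (r2, c2)) in enumerate(pairs):
        if patch[r1, c1] < patch[r2, c2]:
            bits |= 1 << k
    return bits

def pattern_histogram(image, pairs):
    """Compute the code centred on every interior pixel and count
    how many times each of the 2**len(pairs) patterns occurs."""
    counts = np.zeros(2 ** len(pairs), dtype=int)
    h, w = image.shape
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            counts[binary_code(image[r - 1:r + 2, c - 1:c + 2], pairs)] += 1
    return counts

# Four hypothetical pixel pairs within a 3x3 patch.
pairs = [((0, 0), (2, 2)), ((0, 2), (2, 0)),
         ((1, 0), (1, 2)), ((0, 1), (2, 1))]
image = np.random.default_rng(0).integers(0, 256, size=(32, 32))
hist = pattern_histogram(image, pairs)
# With 4 pairs there are 2**4 = 16 possible patterns, and the histogram
# gains one count per interior pixel: hist.sum() == 30 * 30 == 900
```

With more pairs the number of possible patterns grows quickly, so real systems must balance descriptive power against how thinly the counts are spread across histogram bins.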
The next step is to train the computer using a set of representative features classified by humans. For instance, we might get a set of images that humans identify as dune fields. Working from the histograms of the images, the computer can generate a reduced set of histograms that can be added together in various proportions to reconstruct the ones generated from the training data. It’s a bit like reducing the myriad varieties of cake to a limited set of ingredients from which they are all made. We can do the same for images identified as other terrain types (e.g. desiccated ground, lava flows). When the computer is given a new image, it can test how much (if any) of each terrain type’s ingredients is present.
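A minimal sketch of this unmixing step is shown below. The two “ingredient” histograms are made up for illustration, and ordinary least squares stands in for the paper’s proper Poisson-aware fit; only the idea of solving for mixing proportions carries over.

```python
import numpy as np

# Hypothetical "ingredient" histograms learned from training images
# (rows = components, columns = histogram bins).
dune_component = np.array([8., 1., 1., 0.])
lava_component = np.array([0., 2., 3., 5.])
components = np.stack([dune_component, lava_component])  # shape (2, 4)

# A new image's histogram: 70% dune-like plus 30% lava-like patterns.
new_hist = 0.7 * dune_component + 0.3 * lava_component

# Solve for the mixing proportions by linear least squares.
weights, *_ = np.linalg.lstsq(components.T, new_hist, rcond=None)
fractions = weights / weights.sum()
# fractions ≈ [0.7, 0.3]
```

In practice the fit must also respect that proportions cannot be negative and that histogram counts carry Poisson uncertainty, which is where the paper’s Linear Poisson Models come in.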
The advantage of this approach is that it allows us to decide whether the new image really can be made from patterns the computer has seen before. This is because of the counting element in generating histograms. Scientists understand that if you count instances of some event repeatedly, the counts follow a Poisson distribution, and the uncertainty in the answer will be the square root of the average number of counts. For instance, if cars pass a point on a road at a constant average rate of 36 every 10 minutes, a histogram of repeated 10-minute counts would have a width of 6 – quite often only 32 would pass in ten minutes, but it would be very rare that only 18 passed. Because we know how we expect the number of entries in each histogram bin to vary, we can test whether the combination of known patterns is a satisfactory match to the image being analysed. When the match is acceptable, we can say what fraction of the image is represented by each terrain type.
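This square-root rule is easy to check numerically. A quick simulation of the car-counting example above (the rate and window come from the text; the sample size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate counting cars over many independent 10-minute windows,
# with a constant average rate of 36 per window.
counts = rng.poisson(lam=36, size=100_000)

mean = counts.mean()    # close to 36
spread = counts.std()   # close to sqrt(36) = 6
```

It is this predictable spread in each histogram bin that lets the method judge whether a fitted combination of known patterns explains an image satisfactorily, or whether something unanticipated is present.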
In this paper we show how this new approach to histogram analysis, which we call “Linear Poisson Models”, can be used to classify planetary images. But it has many more potential applications. With funding from the Leverhulme Trust we are now exploring a range of them, including how we can use it to interpret data acquired in our laboratories more precisely.
The full citation for this paper is: Paul Tar, Neil Thacker, Jamie Gilmour and Merren Jones (2015) Automated quantitative measurements and associated error covariances for Planetary Image Analysis. Advances in Space Research 56, 92-105. doi:10.1016/j.asr.2015.03.043
This summary was written by Jamie Gilmour.