Elasticsearch spatial search tutorial

I'm going to use a trivial example to demonstrate Elasticsearch's spatial search capabilities: given any point, find the closest large US city. Here, "large" means any city with a population of more than 100,000.

The population and location data used in this example is from GeoNames.

Mapping type setup

curl -XPUT http://localhost:9200/us_large_cities -d '
{
    "mappings": {
        "city": {
            "properties": {
                "city": {"type": "string"},
                "state": {"type": "string"},
                "location": {"type": "geo_point"}
            }
        }
    }
}
'

Index population

I pre-generated a list of curl commands you can use to populate your index. You can download it from here.

Here's a small snapshot of what the commands look like:

curl -XPOST http://localhost:9200/us_large_cities/city/ -d '{"city": "Anchorage", "state": "AK","location": {"lat": "61.2180556", "lon": "-149.9002778"}}'
curl -XPOST http://localhost:9200/us_large_cities/city/ -d '{"city": "Birmingham", "state": "AL","location": {"lat": "33.5206608", "lon": "-86.8024900"}}'
curl -XPOST http://localhost:9200/us_large_cities/city/ -d '{"city": "Huntsville", "state": "AL","location": {"lat": "34.7303688", "lon": "-86.5861037"}}'
curl -XPOST http://localhost:9200/us_large_cities/city/ -d '{"city": "Mobile", "state": "AL","location": {"lat": "30.6943566", "lon": "-88.0430541"}}'

To run them, either copy and paste the commands into a terminal or run the downloaded script:

sh insert_big_cities.sh
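
If you'd rather generate the commands yourself, something along these lines would work. This is only a sketch in Python, not the script used to produce the download above: the CITIES list and the curl_command helper are hypothetical names, and the two records are copied from the sample commands.

```python
import json
import shlex

# Hypothetical input: (city, state, lat, lon) tuples, copied from the sample above.
CITIES = [
    ("Anchorage", "AK", "61.2180556", "-149.9002778"),
    ("Birmingham", "AL", "33.5206608", "-86.8024900"),
]

URL = "http://localhost:9200/us_large_cities/city/"


def curl_command(city, state, lat, lon):
    """Build one curl command that indexes a single city document."""
    doc = {"city": city, "state": state, "location": {"lat": lat, "lon": lon}}
    # shlex.quote keeps the JSON intact even if a city name contains a quote.
    return "curl -XPOST {} -d {}".format(URL, shlex.quote(json.dumps(doc)))


if __name__ == "__main__":
    for record in CITIES:
        print(curl_command(*record))
```

Redirecting the output to a file gives a script you can run with sh, as shown above.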

Searching

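Let's look for the big cities nearest to El Cerrito, CA, a city neighboring Berkeley in the San Francisco Bay Area.

El Cerrito's latitude and longitude are 37.9174, -122.3050. The query below uses a geo_distance filter to keep only the cities within 20 km of that point.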

curl -XGET 'http://localhost:9200/us_large_cities/city/_search?pretty=true' -d '
{
  "query": {
    "filtered" : {
        "query" : {
            "match_all" : {}
        },
        "filter" : {
            "geo_distance" : {
                "distance" : "20km",
                "location" : {
                    "lat" : 37.9174,
                    "lon" : -122.3050
                }
            }
        }
    }
  }
}'

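Sure enough, ES returns the four large Bay Area cities within 20 km. The hits are not ordered by distance, though: the filter only includes or excludes documents, so every hit gets the same score of 1.0. Berkeley is listed before Richmond even though Richmond is closer to El Cerrito.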

"hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "us_large_cities",
      "_type" : "city",
      "_id" : "dzOx9AXlQDKS7gzkRTt6dg",
      "_score" : 1.0, "_source" : {"city": "Berkeley", "state": "CA","location": {"lat": "37.8715926", "lon": "-122.2727470"}}
    }, {
      "_index" : "us_large_cities",
      "_type" : "city",
      "_id" : "Zl0UZ6NiQbukO_v-M92eEA",
      "_score" : 1.0, "_source" : {"city": "Richmond", "state": "CA","location": {"lat": "37.9357576", "lon": "-122.3477486"}}
    }, {
      "_index" : "us_large_cities",
      "_type" : "city",
      "_id" : "qfLjTtASTYOfM8Y6gVi3_w",
      "_score" : 1.0, "_source" : {"city": "Oakland", "state": "CA","location": {"lat": "37.8043722", "lon": "-122.2708026"}}
    }, {
      "_index" : "us_large_cities",
      "_type" : "city",
      "_id" : "EXVx26nDT2S_fsbTtACIQQ",
      "_score" : 1.0, "_source" : {"city": "San Francisco", "state": "CA","location": {"lat": "37.7749295", "lon": "-122.4194155"}}
    } ]
  }
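
To search around a different point or radius, only three values in the request body change. As a rough sketch (the geo_distance_query function name is hypothetical, and Python is used purely for illustration), the body could be built like this. It mirrors the filtered query syntax used above; newer Elasticsearch releases express the same thing with a bool query and a filter clause.

```python
import json


def geo_distance_query(lat, lon, distance="20km"):
    """Return a request body equivalent to the one passed to curl above."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        "location": {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # El Cerrito, CA; the printed JSON can be passed to curl with -d.
    print(json.dumps(geo_distance_query(37.9174, -122.3050), indent=2))
```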

Obtaining the distance

The response above doesn't include each hit's distance from the search point. Adding a _geo_distance sort on the same point returns the distance and orders the hits nearest first:

curl -XGET 'http://localhost:9200/us_large_cities/city/_search?pretty=true' -d '
{
  "sort" : [
      {
          "_geo_distance" : {
              "location" : {
                  "lat" : 37.9174,
                  "lon" : -122.3050
              },
              "order" : "asc",
              "unit" : "km"
          }
      }
  ],
  "query": {
    "filtered" : {
        "query" : {
            "match_all" : {}
        },
        "filter" : {
            "geo_distance" : {
                "distance" : "20km",
                "location" : {
                    "lat" : 37.9174,
                    "lon" : -122.3050
                }
            }
        }
    }
  }
}'

This time, each hit's sort field holds its distance from El Cerrito in kilometers, and the hits are ordered nearest first:

"hits" : {
    "total" : 4,
    "max_score" : null,
    "hits" : [ {
      "_index" : "us_large_cities",
      "_type" : "city",
      "_id" : "Zl0UZ6NiQbukO_v-M92eEA",
      "_score" : null, "_source" : {"city": "Richmond", "state": "CA","location": {"lat": "37.9357576", "lon": "-122.3477486"}},
      "sort" : [ 4.269137329311211 ]
    }, {
      "_index" : "us_large_cities",
      "_type" : "city",
      "_id" : "dzOx9AXlQDKS7gzkRTt6dg",
      "_score" : null, "_source" : {"city": "Berkeley", "state": "CA","location": {"lat": "37.8715926", "lon": "-122.2727470"}},
      "sort" : [ 5.827010863571717 ]
    }, {
      "_index" : "us_large_cities",
      "_type" : "city",
      "_id" : "qfLjTtASTYOfM8Y6gVi3_w",
      "_score" : null, "_source" : {"city": "Oakland", "state": "CA","location": {"lat": "37.8043722", "lon": "-122.2708026"}},
      "sort" : [ 12.921705300451462 ]
    }, {
      "_index" : "us_large_cities",
      "_type" : "city",
      "_id" : "EXVx26nDT2S_fsbTtACIQQ",
      "_score" : null, "_source" : {"city": "San Francisco", "state": "CA","location": {"lat": "37.7749295", "lon": "-122.4194155"}},
      "sort" : [ 18.7589675567265 ]
    } ]
  }
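
As a sanity check, the sort values can be reproduced approximately with the haversine formula. The sketch below is in Python and is not a description of how Elasticsearch computes distances internally; the haversine_km function and the 6371 km Earth radius are assumptions made for illustration.

```python
import math

# Mean Earth radius in km; an assumption, Elasticsearch may use a slightly different value.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


if __name__ == "__main__":
    el_cerrito = (37.9174, -122.3050)
    # Coordinates copied from the hits in the response above.
    hits = {
        "Richmond": (37.9357576, -122.3477486),
        "Berkeley": (37.8715926, -122.2727470),
        "Oakland": (37.8043722, -122.2708026),
        "San Francisco": (37.7749295, -122.4194155),
    }
    for city, point in hits.items():
        print(city, round(haversine_km(*el_cerrito, *point), 2))
```

The printed distances should land close to the sort values in the response, with Richmond nearest and San Francisco farthest.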