ElasticSearch Cookbook (Second Edition)

Adding a field with multiple mappings

Often, a field must be processed with several core types or in different ways. For example, a string field may need to be analyzed for searching and not_analyzed for sorting. To do this, you need to define a special property, called fields, on the field.

Note

In ElasticSearch versions prior to 1.x, there was a multi_field type, but this has now been deprecated and will be removed in favor of the fields property.

The fields property is a very powerful feature of mapping because it allows you to use the same field in different ways.

Getting ready

You need a working ElasticSearch cluster.

How to do it...

To define a field with multiple mappings, you need to:

  1. Define the main field type, as we saw in the previous sections.
  2. Define a dictionary, called fields, that contains the subfields. The subfield with the same name as the parent field is the default one.

If you consider the item of your order example, you can index the name in this way:

"name": {
    "type": "string",
    "index": "not_analyzed",
    "fields": {
      "name": {
        "type": "string",
        "index": "not_analyzed"
      },
      "tk": {
        "type": "string",
        "index": "analyzed"
      },
      "code": {
        "type": "string",
        "index": "analyzed",
        "analyzer": "code_analyzer"
      }
    }
},
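
To see where this fragment sits, the following minimal Python sketch places it under the properties of a mapping and serializes it to JSON. The mapping type name order is an assumption taken from the order example, and code_analyzer is assumed to be already defined in the index settings; neither is defined in this recipe.

```python
import json

# The "name" field definition from the recipe, as a Python dictionary.
name_field = {
    "type": "string",
    "index": "not_analyzed",
    "fields": {
        "name": {"type": "string", "index": "not_analyzed"},
        "tk": {"type": "string", "index": "analyzed"},
        # code_analyzer must already exist in the index settings (assumption).
        "code": {"type": "string", "index": "analyzed", "analyzer": "code_analyzer"},
    },
}

# Hypothetical mapping type "order"; field definitions live under "properties".
order_mapping = {"order": {"properties": {"name": name_field}}}

mapping_json = json.dumps(order_mapping, indent=2, sort_keys=True)
print(mapping_json)
```

The printed JSON is the kind of body you would save as the mapping; building it as a dictionary first makes it easy to check the structure before sending it.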

If you already have a mapping stored in ElasticSearch and want to migrate a field to use the fields property, it's enough to save a new mapping that contains the updated field definition; ElasticSearch merges it with the existing one automatically. New subfields can be added to the fields property at any moment without a problem, but they will be available only for newly indexed documents.
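
As a rough sketch of such a migration, the hypothetical helper below (with_subfield is not part of any ElasticSearch client) copies an existing field definition and adds one entry to its fields dictionary. The result is what you would then save as the new mapping.

```python
import copy

def with_subfield(field_def, subfield_name, subfield_def):
    """Return a copy of field_def with one extra entry under "fields"."""
    updated = copy.deepcopy(field_def)
    fields = updated.setdefault("fields", {})
    if subfield_name in fields:
        raise ValueError("subfield already defined: %s" % subfield_name)
    fields[subfield_name] = subfield_def
    return updated

# A plain field that was mapped without any subfields.
existing = {"type": "string", "index": "not_analyzed"}

# Add the tokenized subfield used in this recipe.
updated = with_subfield(existing, "tk", {"type": "string", "index": "analyzed"})
print(sorted(updated["fields"]))
```

The helper works on a copy, so the original definition stays available for comparison.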

How it works...

During indexing, when ElasticSearch processes a fields property, it reprocesses the same field for every subfield defined in the mapping.

To access the subfields of a multifield, we have a new path value built on the base field plus the subfield name. If you consider the earlier example, you have:

  • name: This points to the default subfield (the not_analyzed one)
  • name.tk: This points to the standard analyzed (tokenized) field
  • name.code: This points to a field analyzed with a code extractor analyzer
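
The path rule above can be sketched in a few lines of Python. The helper subfield_paths is hypothetical and only mirrors the naming convention described in this recipe; it does not query ElasticSearch.

```python
def subfield_paths(field_name, field_def):
    """Map each subfield name to the path used to reference it."""
    paths = {}
    for subfield in field_def.get("fields", {}):
        if subfield == field_name:
            # The subfield named like its parent is the default one.
            paths[subfield] = field_name
        else:
            paths[subfield] = field_name + "." + subfield
    return paths

name_def = {
    "type": "string",
    "index": "not_analyzed",
    "fields": {
        "name": {"type": "string", "index": "not_analyzed"},
        "tk": {"type": "string", "index": "analyzed"},
        "code": {"type": "string", "index": "analyzed", "analyzer": "code_analyzer"},
    },
}

paths = subfield_paths("name", name_def)
for subfield in sorted(paths):
    print(subfield, "->", paths[subfield])
```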

In the earlier example, the code subfield uses a custom analyzer, code_analyzer, which extracts the item code from a string.

Using the fields property, if you index a string such as "Good Item to buy - ABC1234", you'll have:

  • name = "Good Item to buy - ABC1234" (useful for sorting)
  • name.tk = ["good", "item", "to", "buy", "abc1234"] (useful for searching)
  • name.code = ["ABC1234"] (useful for searching and faceting)
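
The following Python sketch imitates these three results with plain string operations. It is only an approximation: the regular expressions stand in for the real analyzers, and the code pattern (three uppercase letters followed by four digits) is an assumption, since the definition of code_analyzer is not shown here.

```python
import re

# Assumed shape of an item code: three uppercase letters, then four digits.
CODE_PATTERN = re.compile(r"\b[A-Z]{3}\d{4}\b")

def simulate_subfields(value):
    """Approximate what each subfield of "name" would hold for one value."""
    return {
        # not_analyzed: the whole string is kept as a single term.
        "name": value,
        # Rough stand-in for standard analysis: split on non-word characters, lowercase.
        "name.tk": re.findall(r"\w+", value.lower()),
        # Rough stand-in for the code extractor analyzer.
        "name.code": CODE_PATTERN.findall(value),
    }

result = simulate_subfields("Good Item to buy - ABC1234")
for path in ("name", "name.tk", "name.code"):
    print(path, "=", result[path])
```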

There's more...

The fields property is very useful for data processing, because it allows you to define several ways to process a field's data.

For example, if you are working on document content, you can define analyzers to extract names, places, dates and times, geolocations, and so on as subfields.

The subfields of a multifield are standard core type fields; you can use them in any operation you want, such as searching, filtering, faceting, and scripting.
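
As an illustration, a single search request could query one subfield, sort on another, and facet on a third. The body below is a sketch built as a Python dictionary; the query text and the facet name codes are made up, and the exact request syntax should be checked against the ElasticSearch version in use.

```python
import json

search_body = {
    # Full-text search on the tokenized subfield.
    "query": {"match": {"name.tk": "good item"}},
    # Sort on the not_analyzed default field.
    "sort": [{"name": {"order": "asc"}}],
    # Terms facet on the extracted codes; "codes" is an arbitrary facet name.
    "facets": {"codes": {"terms": {"field": "name.code"}}},
}

print(json.dumps(search_body, indent=2))
```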

See also

  • The Specifying a different analyzer recipe in this chapter