Using explicit mapping creation
If you consider an index as a database in the SQL world, a mapping is similar to the table definition.
ElasticSearch is able to understand the structure of the document that you are indexing (reflection) and creates the mapping definition automatically (explicit mapping creation).
Getting ready
You will need a working ElasticSearch cluster, an index named test (see the Creating an index recipe in Chapter 4, Basic Operations), and basic knowledge of JSON.
How to do it...
To create an explicit mapping, perform the following steps:
- You can explicitly create a mapping by adding a new document in ElasticSearch:
- On a Linux shell:
    # create an index
    curl -XPUT http://127.0.0.1:9200/test
    # {"acknowledged":true}
    # put a document
    curl -XPUT http://127.0.0.1:9200/test/mytype/1 -d '{"name":"Paul", "age":35}'
    # {"ok":true,"_index":"test","_type":"mytype","_id":"1","_version":1}
    # get the mapping and pretty print it
    curl -XGET 'http://127.0.0.1:9200/test/mytype/_mapping?pretty=true'
- This is how the resulting mapping, autocreated by ElasticSearch, should look:
    {
      "mytype" : {
        "properties" : {
          "age" : { "type" : "long" },
          "name" : { "type" : "string" }
        }
      }
    }
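If you want to inspect such a mapping from a script instead of reading it by eye, a minimal sketch along these lines may help. It assumes the response has the flat shape shown above (every field carries an explicit type); the function name field_types is hypothetical and the response is embedded as a string so that no running cluster is required.

```python
import json

# Mapping response for the example document of this recipe,
# embedded as a string so the sketch runs without a cluster.
MAPPING_RESPONSE = """
{
  "mytype": {
    "properties": {
      "age": {"type": "long"},
      "name": {"type": "string"}
    }
  }
}
"""


def field_types(mapping_json, doc_type):
    """Return a {field name: field type} dict for one document type.

    Assumption: every field is a simple field with a "type" key
    (nested objects are not handled in this sketch).
    """
    mapping = json.loads(mapping_json)
    properties = mapping[doc_type]["properties"]
    return {name: definition["type"] for name, definition in properties.items()}


if __name__ == "__main__":
    for name, type_name in sorted(field_types(MAPPING_RESPONSE, "mytype").items()):
        print(name, "->", type_name)
```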
How it works...
The first command line creates an index named test, where you can configure the type/mapping and insert documents.
The second command line inserts a document into the index. (We'll take a look at index creation and record indexing in Chapter 4, Basic Operations.)
During the document's indexing phase, ElasticSearch checks whether the mytype type exists; if not, it creates the type dynamically.
ElasticSearch reads all the default properties for the fields of the mapping and then processes each field of the document as follows:
- If the field is already present in the mapping, and the value of the field is valid (that is, if it matches the correct type), then ElasticSearch does not need to change the current mapping.
- If the field is already present in the mapping but the value of the field is of a different type, the type inference engine tries to upgrade the field type (such as from an integer to a long value). If the types are not compatible, it throws an exception and the indexing process fails.
- If the field is not present, ElasticSearch tries to autodetect the type of the field and updates the mapping with the new field definition.
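The three cases above can be sketched as a small toy model. This is not ElasticSearch code: the names MappingError, UPGRADES, detect_type, is_valid, and index_document are hypothetical, the upgrade table contains only the integer-to-long example mentioned above, and the real engine is more lenient (for instance, it can coerce some values) than this simplified check.

```python
# Toy model of dynamic mapping updates. All names and rules here are
# illustrative assumptions, not the ElasticSearch implementation.


class MappingError(Exception):
    """Raised when a value cannot be reconciled with the existing field type."""


INT_MIN, INT_MAX = -2**31, 2**31 - 1

# Hypothetical upgrade rules: (current type, detected type) -> new type.
UPGRADES = {("integer", "long"): "long"}


def detect_type(value):
    """Guess a type for a value (bool is tested first: it subclasses int)."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise MappingError("unsupported value: %r" % (value,))


def is_valid(field_type, value):
    """Check whether a value fits an already mapped field type."""
    is_int = isinstance(value, int) and not isinstance(value, bool)
    if field_type == "boolean":
        return isinstance(value, bool)
    if field_type == "integer":
        return is_int and INT_MIN <= value <= INT_MAX
    if field_type == "long":
        return is_int
    if field_type == "double":
        return is_int or isinstance(value, float)
    if field_type == "string":
        return isinstance(value, str)
    return False


def index_document(properties, document):
    """Apply the three cases to each field; return what happened per field."""
    outcome = {}
    for name, value in document.items():
        if name not in properties:
            # Field not present: autodetect the type and extend the mapping.
            properties[name] = {"type": detect_type(value)}
            outcome[name] = "added"
        elif is_valid(properties[name]["type"], value):
            # Field present and value valid: the mapping stays as it is.
            outcome[name] = "unchanged"
        else:
            # Field present, different type: try an upgrade or fail.
            current = properties[name]["type"]
            upgraded = UPGRADES.get((current, detect_type(value)))
            if upgraded is None:
                raise MappingError("%s: cannot store %r in %s" % (name, value, current))
            properties[name]["type"] = upgraded
            outcome[name] = "upgraded"
    return outcome


if __name__ == "__main__":
    props = {}
    print(index_document(props, {"name": "Paul", "age": 35}))
    print(props)
```

Starting from an empty mapping, indexing the example document adds both fields, which mirrors the autocreated mapping of this recipe. Indexing a string into the age field afterwards raises MappingError in this model.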
There's more...
In ElasticSearch, the separation of documents into types is purely logical and is enforced by ElasticSearch: the core engine manages it transparently. Physically, all the document types go into the same Lucene index, so they are not fully separated. The user does not need to deal with this internal management, but in some cases, with a huge number of records, it has an impact on performance. Both reading and writing records are affected, because all the records are stored in the same index files.
Every document has a unique identifier within an index, called the UID; it's stored in the special _uid field of the document. It's automatically calculated by adding the type of the document to the _id value. (In our example, the _uid value will be mytype#1.)
The _id value can be provided at the time of indexing, or it can be assigned automatically by ElasticSearch if it's missing.
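As an illustration of how the _uid value relates to the type and the _id value, here is a minimal sketch. The function names make_uid and resolve_id are hypothetical, and uuid4 is only a stand-in for whatever ID generator ElasticSearch uses internally when no _id is supplied.

```python
import uuid


def make_uid(doc_type, doc_id):
    """Compose a _uid-style value: the type, a '#', then the _id."""
    return "{}#{}".format(doc_type, doc_id)


def resolve_id(doc_id=None):
    """Use the caller's _id if provided; otherwise generate one.

    Assumption: uuid4 merely stands in for ElasticSearch's own
    automatic ID generation, which works differently.
    """
    return str(doc_id) if doc_id is not None else uuid.uuid4().hex


if __name__ == "__main__":
    print(make_uid("mytype", resolve_id(1)))  # prints mytype#1
    print(make_uid("mytype", resolve_id()))   # random ID, differs on every run
```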
When a mapping type is created or changed, ElasticSearch automatically propagates the mapping change to all the nodes in the cluster, so that every shard is aligned and able to process that type.