Store and access your data #

Kuzzle uses Elasticsearch as its document-oriented storage layer.

All documents are stored in Elasticsearch indexes, whether they are internal documents (such as User, Profile or Role) or user documents.

Kuzzle's storage capabilities are therefore directly tied to Elasticsearch's capabilities and limits.

Data storage organization #

There are 4 hierarchical levels in data storage:

indexes
collections
documents
fields

An index brings together several collections, which in turn contain several documents, each of which is composed of several fields.

Comparison with a relational database #

Even if Elasticsearch is not, strictly speaking, a database, the way it stores data is very similar to that of document-oriented databases.

If you're more familiar with the way relational databases store data, here is how it compares:

Document-oriented storage	Relational database storage
index	database
collection	table
document	row
field	column

Comparing document-oriented storage with relational databases would require a more thorough analysis, but for the purposes of this guide, we reduce the list of differences to the following three items:

documents are identified by a unique identifier, which is stored separately from the document content (whereas primary and foreign keys are stored alongside the data they identify),
there is no advanced join system,
a typed mapping system defines how Elasticsearch should index the fields.

All these differences should be taken into account when designing your data model and your application.

Creating indexes and collections #

The creation of indexes and collections is done through the API via the methods index:create and collection:create.

For example, to create a nyc-open-data index:

curl -X POST "localhost:7512/nyc-open-data/_create?pretty"

Kuzzle API response:

{
  "requestId": "e9ab8d1a-ea1a-4fdd-ad50-07c82245d88c",
  "status": 200,
  "error": null,
  "controller": "index",
  "action": "create",
  "collection": null,
  "index": "nyc-open-data",
  "volatile": null,
  "result": {
    "acknowledged": true,
    "shards_acknowledged": true,
    "index": "nyc-open-data"
  }
}

It is recommended to specify a data mapping when creating a collection so that its content can be correctly indexed by Elasticsearch.

Then create a yellow-taxi collection in this index:

curl -X PUT "localhost:7512/nyc-open-data/yellow-taxi?pretty"

Kuzzle API response:

{
  "requestId": "1d5b7afe-9d81-4c0e-92bc-aa57b24c35eb",
  "status": 200,
  "error": null,
  "controller": "collection",
  "action": "create",
  "collection": "yellow-taxi",
  "index": "nyc-open-data",
  "volatile": null,
  "result": {
    "acknowledged": true
  }
}
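As a rough sketch of the recommendation above, a mapping can be sent as the request body when creating the collection. The field types below and the helper names (`yellow_taxi_mapping`, `create_collection_with_mapping`) are illustrative assumptions, and the exact body format accepted by collection:create may vary between Kuzzle versions, so check the API reference for your version.

```shell
#!/usr/bin/env bash
# Assumption: Kuzzle listens on localhost:7512, as in the examples of this guide.
KUZZLE_URL="${KUZZLE_URL:-http://localhost:7512}"

# Print the mapping used as the request body.
# The fields and their types are illustrative guesses for the yellow-taxi data.
yellow_taxi_mapping() {
  cat <<'EOF'
{
  "properties": {
    "driver": { "type": "keyword" },
    "arriveAt": { "type": "date" },
    "age": { "type": "integer" }
  }
}
EOF
}

# Create a collection with the mapping above (hypothetical helper).
# Needs a running Kuzzle server; it is only defined here, not called.
create_collection_with_mapping() {
  local index="$1" collection="$2"
  curl -X PUT -H "Content-Type: application/json" \
    -d "$(yellow_taxi_mapping)" \
    "$KUZZLE_URL/$index/$collection?pretty"
}
```

With a server running, `create_collection_with_mapping nyc-open-data yellow-taxi` would send the request. Declaring `arriveAt` as a date up front avoids relying on Elasticsearch guessing the type from the first document.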

It is also possible to define a set of indexes and collections in advance, then load them when Kuzzle starts, either with the --mappings option of the CLI or with the admin:loadMappings API method.

Writing documents #

The Kuzzle API offers several methods to create, modify or delete documents in its storage space.

Each of these methods has its own specificities. We can distinguish two main families of methods: those acting on a single document and those acting on multiple documents.

Methods acting on a single document:

document:create: creates a new document
document:createOrReplace: creates a new document or replaces an existing one
document:delete: deletes a document
document:replace: replaces an existing document
document:update: updates fields in an existing document

Methods acting on multiple documents:

document:deleteByQuery: deletes documents matching an Elasticsearch query
document:mCreate: creates multiple documents
document:mCreateOrReplace: creates or replaces multiple documents
document:mDelete: deletes multiple documents
document:mReplace: replaces multiple documents
document:mUpdate: updates fields of multiple documents

The bulk controller features low-level methods for injecting documents in collections.

For example, to create a new document in our collection:

curl -X POST -H "Content-Type: application/json" -d '{ "driver": "liia", "arriveAt": "2019-07-26" }' "http://localhost:7512/nyc-open-data/yellow-taxi/document-uniq-id/_create?pretty"

Kuzzle's response:

{
  "requestId": "e146e2a5-ff5b-4b6f-a603-8cde43f353fe",
  "status": 200,
  "error": null,
  "controller": "document",
  "action": "create",
  "collection": "yellow-taxi",
  "index": "nyc-open-data",
  "volatile": null,
  "result": {
    "_index": "nyc-open-data",
    "_type": "yellow-taxi",
    "_id": "document-uniq-id", // Document ID
    "_version": 1,
    "result": "created",
    "created": true,
    "_source": {                   // Document body
      "driver": "liia",
      "arriveAt": "2019-07-26",
      "_kuzzle_info": {            // Kuzzle metadata
        "author": "-1",
        "createdAt": 1561443009768,
        "updatedAt": null,
        "updater": null,
        "active": true,
        "deletedAt": null
      }
    }
  }
}

Using the document:update method allows us to add a new field while keeping the old ones:

curl -X PUT -H "Content-Type: application/json" -d '{ "car": "rickshaw" }' "http://localhost:7512/nyc-open-data/yellow-taxi/document-uniq-id/_update?pretty"

Kuzzle's response:

{
  "requestId": "1be6c9e6-2626-4f85-ad64-d1cc248c7bee",
  "status": 200,
  "error": null,
  "controller": "document",
  "action": "update",
  "collection": "yellow-taxi",
  "index": "nyc-open-data",
  "volatile": null,
  "result": {
    "_index": "nyc-open-data",
    "_type": "yellow-taxi",
    "_id": "document-uniq-id",
    "_version": 2,
    "result": "updated"
  }
}
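The multi-document methods listed earlier follow the same pattern. The sketch below builds a request body for document:mCreate from "id:driver" pairs. It assumes the `/_mCreate` HTTP route and a `documents` array whose items carry `_id` and `body`; both should be checked against the API reference for your Kuzzle version. The helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Assumption: Kuzzle listens on localhost:7512, as in the examples of this guide.
KUZZLE_URL="${KUZZLE_URL:-http://localhost:7512}"

# Build an mCreate body from "id:driver" pairs passed as arguments.
# Values are inserted as-is, so they must not contain quotes or backslashes.
build_mcreate_body() {
  local first=1 pair id driver
  printf '{"documents":['
  for pair in "$@"; do
    id="${pair%%:*}"      # text before the first colon
    driver="${pair#*:}"   # text after the first colon
    [ "$first" -eq 1 ] || printf ','
    first=0
    printf '{"_id":"%s","body":{"driver":"%s"}}' "$id" "$driver"
  done
  printf ']}'
}

# Send the body to Kuzzle (hypothetical helper; needs a running server).
create_many_drivers() {
  local index="$1" collection="$2"
  shift 2
  curl -X POST -H "Content-Type: application/json" \
    -d "$(build_mcreate_body "$@")" \
    "$KUZZLE_URL/$index/$collection/_mCreate?pretty"
}
```

For instance, `create_many_drivers nyc-open-data yellow-taxi taxi-1:liia taxi-2:sam` would create two documents in a single request rather than two separate calls.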

Reading documents #

There are two ways to retrieve documents:

by using the documents' unique identifiers,
by performing a search with an Elasticsearch query.

Getting documents #

To retrieve a document when you know its unique identifier, you have to use the document:get or the document:mGet method.

For example, to retrieve the document we created in the previous examples:

curl "http://localhost:7512/nyc-open-data/yellow-taxi/document-uniq-id?pretty"

Kuzzle's response:

{
  "requestId": "62af64c8-5dc6-48c1-942b-2604bf97686e",
  "status": 200,
  "error": null,
  "controller": "document",
  "action": "get",
  "collection": "yellow-taxi",
  "index": "nyc-open-data",
  "volatile": null,
  "result": {
    "_index": "nyc-open-data",
    "_type": "yellow-taxi",
    "_id": "document-uniq-id",
    "_version": 2,
    "found": true,
    "_source": {
      "driver": "liia",
      "arriveAt": "2019-07-26",
      "_kuzzle_info": {
        "author": "-1",
        "createdAt": 1561443222474,
        "updatedAt": 1561443279526,
        "updater": "-1",
        "active": true,
        "deletedAt": null
      },
      "car": "rickshaw"
    }
  }
}
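Several documents can be fetched in one request with document:mGet. The sketch below builds the list of identifiers to request. It assumes the `/_mGet` HTTP route and an `ids` array in the body, which should be confirmed in the API reference; the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Assumption: Kuzzle listens on localhost:7512, as in the examples of this guide.
KUZZLE_URL="${KUZZLE_URL:-http://localhost:7512}"

# Build an mGet body from document identifiers passed as arguments.
# Identifiers are inserted as-is, so they must not contain quotes or backslashes.
build_mget_body() {
  local first=1 id
  printf '{"ids":['
  for id in "$@"; do
    [ "$first" -eq 1 ] || printf ','
    first=0
    printf '"%s"' "$id"
  done
  printf ']}'
}

# Fetch the documents (hypothetical helper; needs a running server).
get_many_documents() {
  local index="$1" collection="$2"
  shift 2
  curl -X POST -H "Content-Type: application/json" \
    -d "$(build_mget_body "$@")" \
    "$KUZZLE_URL/$index/$collection/_mGet?pretty"
}
```

A call such as `get_many_documents nyc-open-data yellow-taxi document-uniq-id another-id` would request both documents at once.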

Searching documents #

Searching documents is performed using the Elasticsearch Query DSL.
As Elasticsearch is an indexing engine designed for document search, it offers a wide range of advanced search options like geo queries, full text queries, aggregations, and more.

Requests must be made through Kuzzle using the document:search method.

When a document is created or modified, its latest version is not immediately available in search results: you first have to wait until Elasticsearch has finished updating its index.
You can make Elasticsearch wait for indexing to complete before the response is sent by setting refresh=wait_for on a request. You can also enable this behavior for every request on an index with index:setAutoRefresh.
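As a minimal sketch of the refresh option, the helpers below append refresh=wait_for to the document creation route used earlier in this guide, so that the document should be searchable as soon as the response arrives. The helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Assumption: Kuzzle listens on localhost:7512, as in the examples of this guide.
KUZZLE_URL="${KUZZLE_URL:-http://localhost:7512}"

# Build the creation URL for a given document, with refresh=wait_for set.
create_url_wait_for() {
  local index="$1" collection="$2" id="$3"
  printf '%s/%s/%s/%s/_create?refresh=wait_for' \
    "$KUZZLE_URL" "$index" "$collection" "$id"
}

# Create a document and only return once it has been indexed
# (hypothetical helper; needs a running server).
create_and_wait() {
  local index="$1" collection="$2" id="$3" body="$4"
  curl -X POST -H "Content-Type: application/json" \
    -d "$body" \
    "$(create_url_wait_for "$index" "$collection" "$id")"
}
```

Waiting for the refresh makes each write slower, so it is usually better suited to tests and small scripts than to bulk imports.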

For example, to retrieve the drivers aged between 25 and 28:

# First create some documents
for i in {18..42}; do
  curl -X POST -H "Content-Type: application/json" -d "{ \"driver\": \"driver-$i\", \"age\": $i }" "http://localhost:7512/nyc-open-data/yellow-taxi/_create" &
  sleep 0.05
done
wait  # let every background request finish

# Search for drivers aged between 25 and 28
curl -X POST -H "Content-Type: application/json" -d '{
  "query": {
    "range": {
      "age": { "gte": 25, "lte": 28 }
    }
  }
}' "http://localhost:7512/nyc-open-data/yellow-taxi/_search?pretty"

Kuzzle's response:

{
  "requestId": "836768a4-0b46-447a-b4c5-8932101f24de",
  "status": 200,
  "error": null,
  "controller": "document",
  "action": "search",
  "collection": "yellow-taxi",
  "index": "nyc-open-data",
  "volatile": null,
  "result": {
    "took": 12,
    "timed_out": false,
    "hits": [
      {
        "_index": "nyc-open-data",
        "_type": "yellow-taxi",
        "_id": "AWuNXWff6MDMyQmSeEuT",
        "_score": 1,
        "_source": {
          "driver": "driver-27",
          "age": 27,
          "_kuzzle_info": {
            "author": "-1",
            "createdAt": 1561444837342,
            "updatedAt": null,
            "updater": null,
            "active": true,
            "deletedAt": null
          }
        }
      },
      {
        "_index": "nyc-open-data",
        "_type": "yellow-taxi",
        "_id": "AWuNXWd46MDMyQmSeEuR",
        "_score": 1,
        "_source": {
          "driver": "driver-25",
          "age": 25,
          "_kuzzle_info": {
            "author": "-1",
            "createdAt": 1561444837239,
            "updatedAt": null,
            "updater": null,
            "active": true,
            "deletedAt": null
          }
        }
      },
      {
        "_index": "nyc-open-data",
        "_type": "yellow-taxi",
        "_id": "AWuNXWgQ6MDMyQmSeEuU",
        "_score": 1,
        "_source": {
          "driver": "driver-28",
          "age": 28,
          "_kuzzle_info": {
            "author": "-1",
            "createdAt": 1561444837391,
            "updatedAt": null,
            "updater": null,
            "active": true,
            "deletedAt": null
          }
        }
      },
      {
        "_index": "nyc-open-data",
        "_type": "yellow-taxi",
        "_id": "AWuNXWer6MDMyQmSeEuS",
        "_score": 1,
        "_source": {
          "driver": "driver-26",
          "age": 26,
          "_kuzzle_info": {
            "author": "-1",
            "createdAt": 1561444837290,
            "updatedAt": null,
            "updater": null,
            "active": true,
            "deletedAt": null
          }
        }
      }
    ],
    "total": 4,
    "max_score": 1
  }
}
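The hits above come back in no particular order. As a sketch, the same range query can be parameterized and combined with an Elasticsearch `sort` clause to order results by age. The helper names are hypothetical, and it is assumed here that Kuzzle forwards the `sort` clause of the search body to Elasticsearch.

```shell
#!/usr/bin/env bash
# Assumption: Kuzzle listens on localhost:7512, as in the examples of this guide.
KUZZLE_URL="${KUZZLE_URL:-http://localhost:7512}"

# Print a search body matching drivers whose age is within [min, max],
# sorted by ascending age.
build_age_range_query() {
  local min="$1" max="$2"
  cat <<EOF
{
  "query": {
    "range": {
      "age": { "gte": $min, "lte": $max }
    }
  },
  "sort": [ { "age": "asc" } ]
}
EOF
}

# Run the search (hypothetical helper; needs a running server).
search_drivers_by_age() {
  local min="$1" max="$2"
  curl -X POST -H "Content-Type: application/json" \
    -d "$(build_age_range_query "$min" "$max")" \
    "$KUZZLE_URL/nyc-open-data/yellow-taxi/_search?pretty"
}
```

For example, `search_drivers_by_age 25 28` would run the query from the example above with an explicit ordering.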

What Now? #

Exploit the full capabilities of Elasticsearch with Data Mappings
Read our Elasticsearch Cookbook to learn more about how querying works in Kuzzle
Use document metadata to find or recover documents
Keep track of data changes using Real-time Notifications
