Export and import data

All Kuzzle data is stored in Elasticsearch and falls into two categories:

Public data, which consists of the documents created by users within collections.
It is accessible through the API via the index, collection, document and realtime controllers.

Internal data, stored in internal indexes and used by Kuzzle itself.
Users, profiles and roles are examples of internal documents.
Unlike public data, internal data is not directly accessible: it can only be reached through the methods exposed by the auth and security controllers.

Backup entire Kuzzle content

If you want to make a complete backup of all the data contained in Kuzzle (public and internal), you can use Elasticdump.

Elasticdump will generate JSON files containing:

  • the mappings of the Elasticsearch indexes
  • the documents
  • analyzers

The content of a dump makes it possible to completely restore the state of a Kuzzle cluster, for example after an incident.

$ multielasticdump --direction dump --input http://localhost:9200 --output ./dump
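
For recurring backups, it can help to write each dump to its own timestamped directory so that earlier backups are not overwritten. The following is a minimal sketch, assuming multielasticdump is on the PATH and Elasticsearch listens on http://localhost:9200 as in the command above. The function names `dump_dir_for` and `backup_kuzzle`, the `ES_URL` and `BACKUP_ROOT` variables, and the `kuzzle-<timestamp>` naming are hypothetical choices for illustration.

```shell
#!/bin/sh
# Hypothetical helper names; only the multielasticdump invocation
# comes from the command shown above.
ES_URL="${ES_URL:-http://localhost:9200}"
BACKUP_ROOT="${BACKUP_ROOT:-./backups}"

# Print the dump directory path for a given timestamp string.
dump_dir_for() {
  printf '%s/kuzzle-%s\n' "$BACKUP_ROOT" "$1"
}

# Create a fresh timestamped directory and dump every index into it.
backup_kuzzle() {
  dir="$(dump_dir_for "$(date +%Y%m%d-%H%M%S)")"
  mkdir -p "$dir" || return 1
  multielasticdump --direction dump --input "$ES_URL" --output "$dir"
}

# Usage: backup_kuzzle
```

Keeping the path construction in its own function makes the naming scheme easy to change without touching the dump command.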

Restore the entire Kuzzle content

If you want to restore all Kuzzle data (public and internal) from a full backup, you can use Elasticdump.

It is advisable to stop Kuzzle during the import: the operation is not atomic, and API requests handled while it runs could cause integrity problems.

$ multielasticdump --direction load --input ./dump --output http://localhost:9200
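
Since a restore writes into the live Elasticsearch cluster, a small guard can refuse to run when the dump directory is missing or empty. This is a sketch under assumptions, not an official procedure: `dump_is_usable`, `restore_kuzzle` and `ES_URL` are hypothetical names, and the check only looks at the directory, not at the validity of the files inside it.

```shell
#!/bin/sh
# Hypothetical helper names; only the multielasticdump invocation
# comes from the command shown above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Succeed only if the path is a directory containing at least one entry.
dump_is_usable() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Load a dump into Elasticsearch, refusing to run on a missing or empty directory.
restore_kuzzle() {
  dump_dir="$1"
  if ! dump_is_usable "$dump_dir"; then
    echo "refusing to restore: '$dump_dir' is missing or empty" >&2
    return 1
  fi
  multielasticdump --direction load --input "$dump_dir" --output "$ES_URL"
}

# Usage: restore_kuzzle ./dump
```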

Export public data

Public data can be exported with Kourou, the Kuzzle CLI.

To export your data with Kourou, you can use the commands index:export or collection:export.

These commands will extract all the data contained in the collections and save them in JSONL format.

The mapping associated with each collection will also be exported and saved next to the data file.

# Export index "nyc-open-data" to directory "./dump/nyc-open-data"
$ kourou index:export nyc-open-data ./dump/

 🚀 Kourou - Export an entire index content (JSONL format)
 
 [] Connecting to http://localhost:7512 ...
  Dumping index "nyc-open-data" in nyc-open-data/ ...
  Dumping yellow-taxi |==================================================== 100% ||| 42532/42532 documents
  Dumping green-taxi |==================================================== 100% ||| 21782/21782 documents
 [] nyc-open-data index exported

$ tree dump/

nyc-open-data
├── yellow-taxi
│   ├── documents.jsonl
│   └── mappings.json
└── green-taxi
    ├── documents.jsonl
    └── mappings.json
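
Because the export is written as JSONL (one JSON value per line), a quick sanity check is to count the lines of each documents.jsonl and compare them with the totals printed by Kourou. The sketch below assumes the directory layout shown above; `count_dump_lines` is a hypothetical helper, and the line count may differ from the document count if the file also contains non-document lines such as a header.

```shell
#!/bin/sh
# Print "<collection>: <non-empty line count>" for every documents.jsonl
# found one level below an exported index directory.
count_dump_lines() {
  index_dir="$1"
  for file in "$index_dir"/*/documents.jsonl; do
    [ -f "$file" ] || continue            # skip when the glob matches nothing
    collection="$(basename "$(dirname "$file")")"
    lines="$(grep -c . "$file" || true)"  # grep exits 1 on zero matches but still prints 0
    printf '%s: %s\n' "$collection" "$lines"
  done
}

# Usage: count_dump_lines ./dump/nyc-open-data
```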

Import public data

Public data can be imported with Kourou, the Kuzzle CLI.

To import your data, you can use the commands index:import or collection:import.

These commands use the files generated by the export commands to import the collections and their documents into Kuzzle.

It is possible to import only the data with the --no-mappings option.

$ kourou index:import ./nyc-open-data       
 
 🚀 Kourou - Import a previously dumped index
 
 [] Connecting to http://localhost:7512 ...
 [] Start importing dump from nyc-open-data in same index
 [] Successfully imported 42532 documents in "nyc-open-data:yellow-taxi"
 [] Dump directory nyc-open-data/yellow-taxi imported
 [] Successfully imported 21782 documents in "nyc-open-data:green-taxi"
 [] Dump directory nyc-open-data/green-taxi imported

Export internal data

To export all internal data, you can use Elasticdump and export all internal collections.

This will export:

  • data from the internal collections: roles, profiles, users, api-keys, validations and config.
  • plugin collections

You must export the plugin collections if you want to keep user authentication information, because it is stored in the internal storage space of each plugin.

# Export all internal collections
$ multielasticdump --direction dump --input http://localhost:9200 --output ./dump --match '%'.
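
Before running a filtered dump, it can be useful to preview which indexes a pattern would select. The sketch below is an approximation under assumptions: it lists index names through the Elasticsearch `_cat/indices` API and filters them with `grep -E`, whose regular expression syntax is close to, but not identical to, the one used by the `--match` option. `preview_match` and `ES_URL` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper; approximates --match with an extended regular expression.
ES_URL="${ES_URL:-http://localhost:9200}"

# Read index names on stdin and print those matching the given pattern.
preview_match() {
  grep -E -e "$1"
}

# Usage (requires curl and a reachable Elasticsearch):
#   curl -s "$ES_URL/_cat/indices?h=index" | preview_match '%'
```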

If you want to export only roles and profiles, you can use the commands kourou role:export and kourou profile:export.

Import internal data

To import internal data, you can use Elasticdump and the files generated by the export.

It is advisable to stop Kuzzle during the import: the operation is not atomic, and API requests handled while it runs could cause integrity problems.

$ multielasticdump --direction load --input ./dump --output http://localhost:9200