ThoughtSpot

About Backups

You can use a backup to restore a cluster to a prior state, to restore it onto a differently configured appliance, or to move it to another appliance. Some advanced administrative operations also use backups.

When to Use a Backup

Backups are created from an existing snapshot, but they differ from snapshots in the following ways:

  • Backups are stored on disk in a directory, while snapshots are stored in HDFS.
  • You can use a backup to recover from data loss or corruption, even if your cluster has been destroyed. Snapshots can be lost if the HDFS name node fails, you lose multiple disks at once, or the entire cluster is destroyed.
  • If you need to move data between two appliances, you must use a backup. A snapshot can only be used to restore the cluster it was taken from.

Backups can be full, lightweight, or dataless:

Full Backups

A full backup is a complete copy of the cluster, including all data, whether it was loaded through the web interface or through tsload. It is written to a directory, which can be moved between clusters, even if the cluster configuration is different. Full backups are very large, so before taking one, make sure the directory where it will be stored has enough free disk space. NAS (network-attached storage) is recommended for storing backups.
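The disk space check mentioned above can be scripted before you start a full backup. The following is a minimal sketch in Python, not part of the ThoughtSpot tooling: the function name, the 20% headroom, and the size estimate you pass in are all assumptions made for illustration. It relies only on the standard library call shutil.disk_usage.

```python
import shutil


def has_enough_space(backup_dir, estimated_backup_bytes, headroom=0.2):
    """Return True if the filesystem holding backup_dir has room for the backup.

    estimated_backup_bytes is your own estimate of the backup size (assumption:
    you obtain it yourself, for example from the size of a previous backup).
    headroom is an illustrative safety margin, not a ThoughtSpot recommendation.
    """
    free_bytes = shutil.disk_usage(backup_dir).free
    required_bytes = int(estimated_backup_bytes * (1 + headroom))
    return free_bytes >= required_bytes


if __name__ == "__main__":
    # Hypothetical values: check the current directory for a 500 GiB backup.
    target_dir = "."
    estimate = 500 * 1024**3
    if has_enough_space(target_dir, estimate):
        print("Enough free space for the backup.")
    else:
        print("Not enough free space; choose another directory or free up disk.")
```

If the backup directory is a NAS mount, point backup_dir at the mount point so that the check measures the remote volume rather than the local disk.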

Lightweight Backups

Lightweight backups contain everything that makes up a cluster, except for any data loaded through ThoughtSpot Loader (tsload). Any data loaded via tsload can be re-loaded after the cluster has been restored, using the same scripts or remote connections you used to load it initially.

Lightweight backups contain the following:

  • Cluster configuration (SSH, LDAP, etc.)
  • In-memory data cache
  • All data that is stored in HDFS
  • Data uploaded by users
  • Metadata for the data store
  • Users, groups, and permissions
  • Objects created by users (answers, pinboards, worksheets, and formulas) with their shares and permissions
  • Data model and row-level security rules

Dataless Backups

A dataless backup saves the schema only, with no data. It is provided mainly for support purposes: it lets you send a copy of your cluster metadata to ThoughtSpot Support for troubleshooting without compromising data security and privacy. When restoring from a dataless backup, you must supply the correct release tarball, because this type of backup does not include the software release.
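As a rough aid for choosing among the three backup types, the descriptions above can be condensed into a small decision helper. This is only a sketch of one reading of this article, not an official rule: the function, its parameters, and the returned labels are hypothetical and are not ThoughtSpot command names or flags.

```python
def choose_backup_type(for_support_only, tsload_data_reloadable):
    """Suggest a backup type based on the descriptions in this article.

    for_support_only: True if you only need to share cluster metadata
        (schema, no data) with ThoughtSpot Support.
    tsload_data_reloadable: True if everything loaded through tsload can be
        re-loaded with the same scripts or remote connections used initially.
    The returned strings are labels for this sketch only.
    """
    if for_support_only:
        # Schema only, no data; restoring requires the matching release tarball.
        return "dataless"
    if tsload_data_reloadable:
        # Everything except tsload data, which is re-loaded after the restore.
        return "lightweight"
    # All data is included; plan for a very large backup.
    return "full"


if __name__ == "__main__":
    print(choose_backup_type(for_support_only=False, tsload_data_reloadable=True))
```

The order of the checks reflects a preference for the smallest backup that still meets the need, which is a design choice of this sketch rather than something the article prescribes.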