Clarify Kusto delete commands on ADX

Christoph Thale
#Azure #ADX #KQL #Kusto #Delete

Azure Data Explorer (ADX) offers several ways to delete data, and they behave quite differently. To manage the data lifecycle well, you need to understand the difference between hard and soft deletes and the role retention policies play. This article explains these concepts and how they fit together.

Hard Deletes vs. Soft Deletes

In ADX, data deletion can be broadly categorized into two types: hard deletes and soft deletes. Each method serves different purposes and has distinct implications.

  • Hard Deletes: A hard delete is irreversible. Commands such as .purge permanently remove the data from disk, so once the operation has completed there is no way to recover it. Hard deletes are useful when sensitive or unnecessary data must be completely eradicated. Purge is not available by default: you first have to enable it in the settings of your ADX cluster.

  • Soft Deletes: A soft delete marks data as deleted without immediately removing it from disk. Commands such as .drop table and .delete (in full, .delete table ... records) initiate a soft delete. The data disappears from view right away, for instance in tools like Kusto Explorer, but it physically remains on disk for another 14 days. Once that period is over, a background job within ADX removes the data or the entire extent.
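
As a rough sketch, the two kinds of deletion could look like this in practice. MyTable, MyDatabase, the UserId column, and the value 'X' are placeholder names chosen for illustration. Each command is run on its own. The .purge commands have to be sent to the cluster's Data Management endpoint (the URI that starts with ingest-), and by default purge is a two-step process.

```kusto
// Soft delete, preview: with whatif=true the command only reports how many
// records would be deleted and does not delete anything.
.delete table MyTable records with (whatif=true) <|
    MyTable
    | where UserId == 'X'

// Soft delete, for real: run this once the preview looks right.
.delete table MyTable records <|
    MyTable
    | where UserId == 'X'

// Hard delete, step 1: returns the number of records to purge and a
// verification token. Nothing is purged yet.
.purge table MyTable records in database MyDatabase <|
    where UserId == 'X'

// Hard delete, step 2: repeat the same command with the token from step 1.
// From this point on the operation cannot be undone.
.purge table MyTable records in database MyDatabase with (verificationtoken=h'<token returned by step 1>') <|
    where UserId == 'X'
```

Running the preview before a soft delete, and keeping purge as a two-step operation, both give you a chance to check the predicate before any data is affected.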

The key difference between hard and soft deletes is the 14-day recovery window that soft deletes provide. This window is a safety net for accidental or premature deletions. If you soft-delete data by mistake, contact Microsoft Azure support immediately: the engineers can restore the data as long as you are still within the 14 days.

Retention Policies

In addition to hard and soft deletes, ADX uses retention policies to manage data lifecycle more comprehensively. Retention policies help in defining how long data should be retained and under what conditions it can be deleted.

Retention policies in ADX are characterized by two main attributes: SoftDeletePeriod and Recoverability.

  1. SoftDeletePeriod:

    • This attribute specifies how long data is kept available for query, measured from the time it was ingested. For example, a policy might be configured with a SoftDeletePeriod of 3650.00:00:00, a timespan of 3650 days, or roughly 10 years. This means that the data is kept for 10 years after ingestion before it becomes eligible for deletion.

  2. Recoverability:

    • The Recoverability attribute determines whether soft-deleted data can be recovered after the SoftDeletePeriod has elapsed. If set to Enabled, the data will undergo a soft delete process after the retention period ends, providing an additional 14 days for potential recovery. If set to Disabled, the data will be permanently deleted from the disk once the retention period is over, without any recovery window.
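
To make the timeline concrete, here is a small KQL sketch that adds the two periods to an ingestion time. The ingestion date is made up for illustration, and the results should be read as lower bounds: ADX does not delete data before these points, but the actual cleanup happens some time afterwards rather than exactly on them.

```kusto
// Hypothetical ingestion date, used only to illustrate the timeline.
let IngestedAt = datetime(2024-01-01);
let SoftDeletePeriod = 3650d;   // same value as the timespan 3650.00:00:00
let RecoveryWindow = 14d;       // only applies when Recoverability is Enabled
print
    QueryableUntil = IngestedAt + SoftDeletePeriod,
    RecoverableUntil = IngestedAt + SoftDeletePeriod + RecoveryWindow
```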

Consider the following retention policy configuration:

{
  "SoftDeletePeriod": "3650.00:00:00",
  "Recoverability": "Enabled"
}

In this scenario, data is retained for 10 years from ingestion. After the 10-year mark, a soft delete is performed, allowing for an additional 14 days for recovery. If the Recoverability attribute were set to Disabled, the data would be immediately and permanently deleted after the 10 years.
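
A policy like this could be applied and checked with the retention policy management commands. This is a minimal sketch; MyTable is a placeholder table name.

```kusto
// Set (or update) the retention policy on a table: keep data for 3650 days
// and keep a recovery window once the data has been soft-deleted.
.alter-merge table MyTable policy retention softdelete = 3650d recoverability = enabled

// Show the retention policy that is currently set on the table.
.show table MyTable policy retention
```

The same policy can also be set at database level, in which case it applies to tables that do not define their own.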

In my opinion, that is all you need to know about deletion. If you want to dive deeper into the topic, the official ADX documentation covers purge, soft delete, and retention policies in more detail.

Thank you for reading and we will see you in the next blog article.