Gracefully delete all pipelines from a Synapse workspace that has pipeline to pipeline dependencies

by Jon Shah | Aug 25, 2024 | Az Synapse, PowerShell

As you build out a Synapse workspace, you will inevitably start to create a dependency model between pipelines. When deploying artefacts to the workspace, you may need to clear down the existing content before the deployment. This article provides a code solution for deleting all pipelines in a Synapse workspace, including those that depend on one another.

You could use the CLI commands az synapse pipeline list and az synapse pipeline delete in a list-and-delete loop. This ‘brute force’ approach will work eventually, but it may need to loop many times over many hours (depending on the pipeline count and number of dependencies) before the pipelines are cleared down.
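For comparison, the brute-force loop might look something like the sketch below. It assumes the Azure CLI with its synapse commands is installed and signed in. The function name Remove-AllPipelinesBruteForce and the -MaxPasses safety limit are made up for this illustration.

```powershell
# Hypothetical helper (not part of the solution below): keep listing and deleting
# until the workspace reports no pipelines, or until a pass limit is reached.
function Remove-AllPipelinesBruteForce {
    param (
        [Parameter(Mandatory = $true)][string]$WorkspaceName,
        [int]$MaxPasses = 50   # assumed safety limit so the loop cannot run forever
    )

    $pass = 0
    while ($true) {
        # One pipeline name per line; @() guarantees an array even for zero or one result
        $names = @(az synapse pipeline list --workspace-name $WorkspaceName --query "[].name" --output tsv)

        if ($names.Count -eq 0) { return $pass }   # number of delete passes that were needed
        if ($pass -ge $MaxPasses) { throw "Pipelines still remain after $MaxPasses passes" }

        $pass++
        foreach ($name in $names) {
            # Expected to fail while another pipeline still references $name;
            # the failure is ignored and the pipeline is retried on the next pass
            az synapse pipeline delete --workspace-name $WorkspaceName --name $name --yes | Out-Null
        }
    }
}
```

Each pass can only remove pipelines that nothing else references at that moment, which is why a deep dependency chain needs many passes.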

A more graceful way is to identify the dependent pipelines during the deletion process and to move through the dependency chain, until a non-dependent pipeline is found that can be deleted.

In this simple example, Pipeline 2 is called from Pipeline 1, so Pipeline 2 cannot be deleted until Pipeline 1 has been removed.

The code does the following:

  • Build an initial list and count of workspace pipelines.
  • Iterate the pipeline list one at a time.
  • A deletion call is made for each pipeline.
  • Successfully deleted pipelines are added to a deleted pipelines list.
  • If the deletion is unsuccessful, the dependent pipeline is retrieved from the error message and a call is made to delete this pipeline if it isn’t in the deleted pipelines list. The process works through the dependency hierarchy until a non-dependent pipeline can be deleted. Pipelines that can’t be deleted are added to a locked pipelines list for this dependency hierarchy.
  • The process continues until the workspace pipeline list has been exhausted. The list is then refreshed and the process restarted, until all pipelines have been removed.
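# Note: DeletePipeline and RetrievePipelineName (both shown further down) must be defined before this block runs.
$deletedPipelines = [PSCustomObject]@()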
$pipelines = Get-AzSynapsePipeline -WorkspaceName ${{ variables.target_synapse_workspace_name }}
$totalPipelines = ($pipelines).count

Write-Host "`n> Starting workspace pipeline deletion process"
Write-Host "> $totalPipelines pipelines found in workspace"

if ($totalPipelines -eq 0) {
  Write-Host "> Nothing to do."
}
else {
  Write-Host "> This will take some time...`n"

  do {
    for ($i = 0; $i -lt $pipelines.count; $i++)  {
      $pipeline = $pipelines[$i].Name
      $lookup = ($deletedPipelines | Where-Object { $_.deletedPipeline -eq $pipeline }).count
      $pipelineNo = $i+1

      if ($lookup -eq 0) {
          Write-Host "> Processing pipeline $pipelineNo of $totalPipelines"
          $lockedPipelines = [PSCustomObject]@()
          DeletePipeline $pipeline
      }
    }

    $pipelines = Get-AzSynapsePipeline -WorkspaceName ${{ variables.target_synapse_workspace_name }}
    $totalPipelines = ($pipelines).count

    if ($totalPipelines -ne 0) {
      Write-Host "> Restarting Process Loop...`n"
    }
  } until (
      $pipelines.count -eq 0
  )
}
Write-Host ">> Complete"

function DeletePipeline {
    param (
        [Parameter(Mandatory=$true)][string]$pipeline
    )
    try {
        $isDeletedPipeline = ($deletedPipelines | Where-Object { $_.deletedPipeline -eq $pipeline }).count

        if ($isDeletedPipeline -eq 0) {
            write-host ">> Attempting to delete '$pipeline'"
        
            # Call the cmdlet directly; building a command string for Invoke-Expression
            # breaks on pipeline names that contain quotes or '$'
            Remove-AzSynapsePipeline `
                -WorkspaceName ${{ variables.target_synapse_workspace_name }} `
                -Name $pipeline `
                -Force `
                -ErrorAction Stop

            Write-Host ">> '$pipeline' deleted`n"
            
            # Update the script-scoped list; a plain += here would only change a function-local copy
            $script:deletedPipelines += [PSCustomObject]@{
                deletedPipeline = $pipeline
            }
        }
        else {
            write-host ">> '$pipeline' has already been deleted!`n"
        } 
    }
    catch {
        if ($_ -match '.*The document cannot be deleted since it is referenced by.*') {
          $errorString = ($_.Exception.Message.Replace('"','')).Replace('.','')
          $dependentPipeline = RetrievePipelineName `
                                -startString "referenced by" `
                                -endString "}}" `
                                -fullString $errorString

          if ($dependentPipeline.length -ne 0) {
              Write-Host ">> '$pipeline' is referenced by '$dependentPipeline'"

              $isDeletedPipeline = ($deletedPipelines | Where-Object { $_.deletedPipeline -eq $dependentPipeline }).count
              $isLockedPipeline = ($lockedPipelines | Where-Object { $_.lockedPipeline -eq $dependentPipeline }).count

              $lockedPipelines += [PSCustomObject]@{
                  lockedPipeline = $pipeline
              }

              if ($isDeletedPipeline -eq 0 -and $isLockedPipeline -eq 0) {
                  DeletePipeline $dependentPipeline
              }
              else {
                Write-Host ">> '$dependentPipeline' is dependency locked!  Requeuing it for deletion.`n"
              }
          }
        } else {
            Write-Host ">>> Error <<< '$pipeline' can't be deleted!"
            Write-Host ">>>"
            Write-Host $_.Exception.Message
            Write-Host "<<<`n"
        }
    }
}
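The way the recursion walks a dependency chain can be tried without touching a workspace. The sketch below is only a simulation: an in-memory table stands in for the Synapse service, and the pipeline names, the $referencedBy table and the Remove-SimulatedPipeline function are all invented for this example. It leaves out the locked pipelines list and assumes there are no circular references.

```powershell
# Hypothetical workspace: key = pipeline, value = the pipelines that call it
$referencedBy = [ordered]@{
    'Pipeline 3' = @('Pipeline 2')
    'Pipeline 2' = @('Pipeline 1')
    'Pipeline 1' = @()
}
$deleted = [System.Collections.Generic.List[string]]::new()

function Remove-SimulatedPipeline {
    param ([Parameter(Mandatory = $true)][string]$Name)

    if ($deleted.Contains($Name)) { return }

    # Stand-in for the dependency error: the first caller that still exists
    $caller = $referencedBy[$Name] |
        Where-Object { -not $deleted.Contains($_) } |
        Select-Object -First 1

    if ($null -eq $caller) {
        # Nothing references this pipeline any more, so the delete succeeds
        $deleted.Add($Name)
        Write-Host ">> '$Name' deleted"
    }
    else {
        # Move up the chain; $Name itself is picked up again later
        Write-Host ">> '$Name' is referenced by '$caller'"
        Remove-SimulatedPipeline -Name $caller
    }
}

$passes = 0
do {
    $passes++
    # Refresh the list of pipelines that still exist, then try each one
    $remaining = @($referencedBy.Keys | Where-Object { -not $deleted.Contains($_) })
    foreach ($name in $remaining) { Remove-SimulatedPipeline -Name $name }
} until ($deleted.Count -eq $referencedBy.Count)

Write-Host "Deletion order: $($deleted -join ', ') ($passes passes)"
```

As in the real function, a pipeline that was blocked is not retried inside the recursion. It is removed later in the same pass or after the list has been refreshed, which is why this three-level chain takes two passes.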

The attempt to delete Pipeline 2 generates a standard dependency error message, which contains the name of the dependent pipeline.

The pipeline name can be retrieved from the error message using the following PowerShell function, which is called by the DeletePipeline function above.

function RetrievePipelineName($startString, $endString, $fullString){
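    # Escape the delimiters so that any regex metacharacters they contain are matched literally
    $pattern = [regex]::Escape($startString) + "(.*?)" + [regex]::Escape($endString)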
    $dPipeline = [regex]::Match($fullString,$pattern).Groups[1].Value
    return $dPipeline.Trim()
}
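As a quick check of the extraction logic, the sketch below applies the same steps to a made-up error string. Only the phrase "referenced by" and the closing "}}" are taken from the code above. The rest of the message, including its JSON-like shape, is an assumption for illustration and may not match what the service actually returns.

```powershell
# Made-up error text; the real message returned by Synapse may be shaped differently
$sampleError = '{{"message":"The document cannot be deleted since it is referenced by Pipeline 1."}}'

# Same clean-up as DeletePipeline: strip double quotes and periods
$errorString = $sampleError.Replace('"', '').Replace('.', '')

# Same extraction as RetrievePipelineName: lazily capture the text between the two delimiters
$pattern = [regex]::Escape('referenced by') + '(.*?)' + [regex]::Escape('}}')
$dependentPipeline = [regex]::Match($errorString, $pattern).Groups[1].Value.Trim()

Write-Host "Dependent pipeline: '$dependentPipeline'"
```

Because periods are stripped before matching, a pipeline name that itself contains a period would come back without it, so names of that kind would need different handling.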