Azure blob download is incredibly slow using PowerShell (via Get-AzureStorageBlobContent), but fast via Azure Explorer, etc.?
With basic code that loops through a storage account and mirrors its containers and blobs to local disk, I'm finding the Get-AzureStorageBlobContent cmdlet incredibly slow. It seems to take a second or two of real time per blob regardless of blob size, which adds considerable overhead when we've got thousands of tiny files.
In contrast, on the same machine and network connection (even running simultaneously), Azure Explorer does the same bulk copy 10x to 20x faster, and AzCopy is literally 100x faster (async), so it's not a network issue.
Is there a more efficient way to use the Azure storage cmdlets, or are they just dog slow by nature? The help for Get-AzureStorageContainer mentions a -ConcurrentTaskCount option, which implies some ability to be async, but there's no documentation on how to achieve async, and given that it operates on a single item I'm not sure how it could.
This is the code I'm running:
$localContent = "c:\local_copy"
$storageAccountName = "myblobaccount"
$storageAccountKey = "mykey"

Import-Module Azure

$blob_account = New-AzureStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey -Protocol https

Get-AzureStorageContainer -Context $blob_account | ForEach-Object {
    $container = $_.Name
    Get-AzureStorageBlob -Container $container -Context $blob_account | ForEach-Object {
        $local_path = "$localContent\{0}\{1}" -f $container, $_.Name
        $local_dir = Split-Path $local_path
        if (!(Test-Path $local_dir)) {
            New-Item -Path $local_dir -ItemType Directory -Force
        }
        Get-AzureStorageBlobContent -Context $blob_account -Container $container -Blob $_.Name -Destination $local_path -Force | Out-Null
    }
}
I looked at the source code for Get-AzureStorageBlobContent on GitHub and found a few interesting things that may cause the slowness when downloading blobs (especially smaller ones):
Line 165:
ICloudBlob blob = Channel.GetBlobReferenceFromServer(container, blobName, accessCondition, requestOptions, OperationContext);
What this code does is make a request to the server to fetch the blob type, so it adds one request to the server for each blob.
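If that lookup is part of the cost, one thing that may be worth trying is piping the blob objects from Get-AzureStorageBlob straight into Get-AzureStorageBlobContent instead of passing -Blob by name. This is only a sketch and I have not measured it: I'm assuming the pipeline parameter set reuses the blob object it is handed rather than looking it up again, and the value 10 for -ConcurrentTaskCount is an arbitrary guess. The account name, key and local path are the placeholders from the question.

```powershell
# Sketch only: assumes the legacy Azure module from the question is installed.
Import-Module Azure

$localContent = "c:\local_copy"
$blob_account = New-AzureStorageContext -StorageAccountName "myblobaccount" -StorageAccountKey "mykey" -Protocol https

Get-AzureStorageContainer -Context $blob_account | ForEach-Object {
    $container = $_.Name
    $containerDir = Join-Path $localContent $container
    if (!(Test-Path $containerDir)) {
        New-Item -Path $containerDir -ItemType Directory -Force | Out-Null
    }

    # Pipe the blob objects instead of passing -Blob <name>.
    # Assumption (unverified): the cmdlet lays out blob names containing "/"
    # under the destination directory on its own.
    Get-AzureStorageBlob -Container $container -Context $blob_account |
        Get-AzureStorageBlobContent -Destination $containerDir -Context $blob_account -ConcurrentTaskCount 10 -Force |
        Out-Null
}
```

Timing both variants with Measure-Command on the same container would show whether this actually helps with your module version.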
Lines 252 - 262:
try
{
    DownloadBlob(blob, filePath);
    Channel.FetchBlobAttributes(blob, accessCondition, requestOptions, OperationContext);
}
catch (Exception e)
{
    WriteDebugLog(String.Format(Resources.DownloadBlobFailed, blob.Name, blob.Container.Name, filePath, e.Message));
    throw;
}
If you look at the code above, it first downloads the blob with DownloadBlob and then tries to fetch the blob attributes with Channel.FetchBlobAttributes. I haven't looked at the source code for the Channel.FetchBlobAttributes function, but I suspect it makes one more request to the server.
So to download a single blob, the code is effectively making three requests to the server, which could be the reason for the slowness. To be certain, you could trace the requests and responses through Fiddler and see how the cmdlet is interacting with storage.
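To get a feel for how much the extra round trips could matter, here is a back-of-the-envelope estimate. The numbers are illustrative assumptions, not measurements: 5000 blobs stands in for "thousands of tiny files", and 0.5 seconds per request is simply the 1 to 2 seconds per blob reported in the question split across three requests. Get-EstimatedDownloadSeconds is a hypothetical helper written for this example.

```powershell
# Hypothetical helper: total wall-clock time if every request runs one after another.
function Get-EstimatedDownloadSeconds {
    param(
        [int]$BlobCount,
        [int]$RequestsPerBlob,
        [double]$SecondsPerRequest
    )
    # Sequential downloads: each request waits for the previous one to finish.
    return $BlobCount * $RequestsPerBlob * $SecondsPerRequest
}

# Assumed inputs (not measured): 5000 blobs, 0.5 s per request.
$threeRequests = Get-EstimatedDownloadSeconds -BlobCount 5000 -RequestsPerBlob 3 -SecondsPerRequest 0.5
$oneRequest    = Get-EstimatedDownloadSeconds -BlobCount 5000 -RequestsPerBlob 1 -SecondsPerRequest 0.5

"Three requests per blob: {0} s" -f $threeRequests
"One request per blob:    {0} s" -f $oneRequest
```

Under these assumptions the per-request latency dominates rather than the blob size, which would match the observation that the time per blob does not depend on how big the blob is.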