Problem Statement
A file uploaded to Azure Blob Storage / Azure Data Lake Storage has multiple properties associated with it:
![Overcoming Limitations of Get Metadata Activity in Azure Data Factory/Synapse]()
Within a pipeline, the Get Metadata Activity can retrieve only the following subset of these properties:
![Overcoming Limitations of Get Metadata Activity in Azure Data Factory/Synapse]()
Is it possible to get other properties of the file, such as Creation Time or Content-Type, in Synapse / Data Factory pipelines?
Prerequisites
- Azure Data Factory / Synapse
- Azure Blob Storage / Azure Data Lake Storage
Solution
1. We will use the Get Blob operation of the Azure Blob Storage REST API to retrieve the blob's properties, which are returned as response headers.
2. Grant the Synapse / Data Factory managed identity the Storage Blob Data Reader role on the storage account, so that the pipeline can authenticate via Managed Identity.
![Overcoming Limitations of Get Metadata Activity in Azure Data Factory/Synapse]()
a) Go to Access Control (IAM) of the storage account, click Add, and select Add role assignment.
![Overcoming Limitations of Get Metadata Activity in Azure Data Factory/Synapse]()
b) Search for the Storage Blob Data Reader role, select it, and complete the assignment.
![Overcoming Limitations of Get Metadata Activity in Azure Data Factory/Synapse]()
![Overcoming Limitations of Get Metadata Activity in Azure Data Factory/Synapse]()
3. Create a pipeline in Synapse / Data Factory that uses a Web Activity to call the REST API.
![Overcoming Limitations of Get Metadata Activity in Azure Data Factory/Synapse]()
URL:
- In case of Azure Blob Storage: https://<<StorageAccountName>>.blob.core.windows.net/<<ContainerName>>/<<FileName>>
- In case of Azure Data Lake Storage: https://<<DataLakeStorageName>>.dfs.core.windows.net/<<ContainerName>>/<<FileName/DirectoryName>>
Method: GET
Authentication: System Assigned Managed Identity
Resource: https://storage.azure.com/
Headers:
- x-ms-version: 2017-11-09
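The settings above can also be sketched in code. The Python snippet below builds the request URL for either endpoint and assembles a Web Activity definition as a plain dictionary that can be printed as JSON. The helper names (`blob_url`, `web_activity`), the sample account, container, and file names, and the activity name `GetBlobProperties` are hypothetical. The JSON layout (in particular `"type": "MSI"` under `authentication`) is an assumption about how the pipeline JSON represents System Assigned Managed Identity, so compare it with the code view of your own pipeline before relying on it.

```python
import json


def blob_url(account, container, path, data_lake=False):
    """Build the request URL for Blob Storage (blob) or Data Lake Storage (dfs)."""
    endpoint = "dfs" if data_lake else "blob"
    return f"https://{account}.{endpoint}.core.windows.net/{container}/{path}"


def web_activity(name, url):
    """Return a Web Activity definition as a dict (assumed JSON layout)."""
    return {
        "name": name,
        "type": "WebActivity",
        "typeProperties": {
            "url": url,
            "method": "GET",
            # API version header taken from the settings listed above
            "headers": {"x-ms-version": "2017-11-09"},
            # Assumption: "MSI" denotes System Assigned Managed Identity
            "authentication": {
                "type": "MSI",
                "resource": "https://storage.azure.com/",
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical storage account, container, and file names
    url = blob_url("mystorageaccount", "mycontainer", "sample.csv")
    print(json.dumps(web_activity("GetBlobProperties", url), indent=2))
```

Passing `data_lake=True` switches the host to the `dfs` endpoint, which is the only difference between the two URL forms given above.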
Output
Get Metadata Activity output
![Overcoming Limitations of Get Metadata Activity in Azure Data Factory/Synapse]()
Web Activity Output (Azure Blob Storage)
![Overcoming Limitations of Get Metadata Activity in Azure Data Factory/Synapse]()
Here, [x-ms-creation-time] represents the file creation time.
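To illustrate how the creation time could be consumed downstream, here is a minimal Python sketch that extracts and parses the `x-ms-creation-time` header from a Web Activity output. It assumes the response headers appear under a key named `ADFWebActivityResponseHeaders` (falling back to the top level of the output if that key is absent) and that the value is an HTTP-style date string. The sample output and the helper name `creation_time` are made up for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def creation_time(web_output):
    """Return the blob creation time as a datetime, or None if the header is missing."""
    # Assumption: headers are nested under ADFWebActivityResponseHeaders;
    # otherwise fall back to looking at the top level of the output.
    headers = web_output.get("ADFWebActivityResponseHeaders", web_output)
    # Header names are case-insensitive, so normalise them before the lookup
    lowered = {key.lower(): value for key, value in headers.items()}
    raw = lowered.get("x-ms-creation-time")
    return parsedate_to_datetime(raw) if raw else None


if __name__ == "__main__":
    # Hypothetical Web Activity output, trimmed to the relevant headers
    sample = {
        "ADFWebActivityResponseHeaders": {
            "x-ms-creation-time": "Wed, 05 Jan 2022 10:15:30 GMT",
            "Content-Type": "text/csv",
        }
    }
    print(creation_time(sample).isoformat())
```

Inside the pipeline itself, the equivalent lookup would likely be an expression along the lines of `@activity('GetBlobProperties').output.ADFWebActivityResponseHeaders['x-ms-creation-time']`, where `GetBlobProperties` is a hypothetical activity name and the exact output path should be confirmed against your own activity output.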
Web Activity Output (Azure Data Lake Storage)
Directory Property
![Overcoming Limitations of Get Metadata Activity in Azure Data Factory/Synapse]()
Web Activity
![Overcoming Limitations of Get Metadata Activity in Azure Data Factory/Synapse]()