@lmazuel I received an answer from Microsoft engineers, in case you're interested:
"...neither the Python SDK nor the REST API support the creation of Azure Databricks services / workspaces at this time.
You can create an Azure Databricks workspace by using an ARM template. This is a JSON file that defines the infrastructure and configuration for your project.
You can deploy resources with ARM templates and the Resource Manager REST API: https://docs.microsoft.com/en-us/azure/azure-resource-manager/templates/deploy-rest
You can also deploy an ARM template by using one of the following options:
o Install Azure PowerShell
o Install Azure CLI on Windows, Linux, or macOS.
Documentation on these two options can be found here https://docs.microsoft.com/en-us/azure/azure-resource-manager/templates/template-tutorial-create-first-template?tabs=azure-powershell#command-line-deployment"
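For reference, an ARM template can also be deployed from Python itself with the azure-mgmt-resource package. Below is a minimal sketch, assuming a recent track-2 version of that package; the template file name, deployment name, and the workspaceName parameter are placeholders, not anything from the engineers' answer.
import json
from azure.identity import ClientSecretCredential
from azure.mgmt.resource import ResourceManagementClient

credential = ClientSecretCredential(tenant_id, client_id, client_secret)
client = ResourceManagementClient(credential, subscription_id)

# Load the ARM template that defines the Databricks workspace (placeholder file name).
with open("databricks_workspace.json") as f:
    template = json.load(f)

deployment = {
    "properties": {
        "mode": "Incremental",
        "template": template,
        # ARM parameter values are wrapped in {"value": ...}
        "parameters": {"workspaceName": {"value": "my-workspace"}},
    }
}

# On older (track-1) versions the method is create_or_update without the begin_ prefix.
poller = client.deployments.begin_create_or_update(resource_group_name, "databricks-deployment", deployment)
result = poller.result()  # block until the deployment finishes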
client.workspaces.create_or_update(
    {
        "managed_resource_group_id": "<subscription_id_to_replace>/resourceGroups/" + managedResourceGroupName,
        "sku": {"name": "premium"},
        "location": "westus"
    },
    resource_group_name,
    workspace_name
).wait()  # wait for completion
msrest.exceptions.SerializationError: Unable to build a model: Unable to deserialize to object: type, AttributeError: 'str' object has no attribute 'get', DeserializationError: Unable to deserialize to object: type, AttributeError: 'str' object has no attribute 'get'
Hi @tomarv2, I'm not directly working on Mgmt anymore, but I looked at the SDK and indeed you now need to wrap the string with the type StorageAccountCheckNameAvailabilityParameters:
availability = storage_client.storage_accounts.check_name_availability(StorageAccountCheckNameAvailabilityParameters(name="storage_account_name"))
If you find an incorrect sample, please create an issue on GitHub so I can tag the right team to fix it: https://github.com/Azure/azure-sdk-for-python/issues
Thanks!
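In case it helps, a self-contained version of that call might look like the sketch below; the subscription id and account name are placeholders, and DefaultAzureCredential is just one of several ways to authenticate.
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import StorageAccountCheckNameAvailabilityParameters

storage_client = StorageManagementClient(DefaultAzureCredential(), "<subscription_id>")

# Wrap the candidate account name in the parameters model instead of passing a bare string.
params = StorageAccountCheckNameAvailabilityParameters(name="mystorageaccount123")
availability = storage_client.storage_accounts.check_name_availability(params)
print(availability.name_available, availability.reason, availability.message)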
I am using the api-operations class. When I use the create_or_update function, the SDK gives an "Api not found" error. Instead of creating a new API instance, the function is trying to update the value. Can anyone help me with this? SDK link: https://azuresdkdocs.blob.core.windows.net/$web/python/azure-mgmt-apimanagement/0.1.0/azure.mgmt.apimanagement.operations.html#azure.mgmt.apimanagement.operations.ApiOperations
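Not an authoritative answer, but a typical call against that 0.1.0 (track-1) SDK looks roughly like the sketch below; the resource group, service name, api_id, and API fields are made-up placeholders. If the api_id does not exist yet, create_or_update is expected to create it rather than update an existing API.
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.apimanagement import ApiManagementClient

credentials = ServicePrincipalCredentials(client_id="<client_id>", secret="<secret>", tenant="<tenant_id>")
client = ApiManagementClient(credentials, "<subscription_id>")

result = client.api.create_or_update(
    resource_group_name="my-rg",
    service_name="my-apim-service",
    api_id="echo-api",  # identifier of the API inside the service
    parameters={
        "display_name": "Echo API",
        "path": "echo",
        "protocols": ["https"],
    },
)
# Depending on the SDK version this returns the ApiContract directly
# or a poller whose .result() yields it.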
There is a bug in ServiceBusAdministrationClient where forward_to requires additional bearer token headers to be provided. We have fixed the bug in PR Azure/azure-sdk-for-python#15610. The fix will be carried in our next release.
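For context, setting a forwarding target through the administration client looks roughly like the sketch below; the connection string and queue names are placeholders, and the forwarding destination is assumed to exist already.
from azure.servicebus.management import ServiceBusAdministrationClient

admin_client = ServiceBusAdministrationClient.from_connection_string("<connection_string>")

# forward_to points messages arriving on source-queue at an existing target entity;
# this is the path that needed the additional bearer token headers before the fix.
admin_client.create_queue("source-queue", forward_to="target-queue")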
I am using the azure-storage-file-datalake package to connect with ADLS Gen2.
from azure.identity import ClientSecretCredential
from azure.storage.filedatalake import DataLakeServiceClient

# service principal credential
tenant_id = 'xxxxxxx'
client_id = 'xxxxxxxxx'
client_secret = 'xxxxxxxx'
storage_account_name = 'xxxxxxxx'

credential = ClientSecretCredential(tenant_id, client_id, client_secret)
service_client = DataLakeServiceClient(
    account_url="{}://{}.dfs.core.windows.net".format("https", storage_account_name),
    credential=credential)  # I have also tried blob instead of dfs in account_url
The folder structure in ADLS Gen2 from which I have to read the parquet file looks like this: inside the ADLS Gen2 container there is folder_a, which contains folder_b, which contains the parquet file.
folder_a
|- folder_b
   |- parquet_file1
From Gen1 storage we used to read the parquet file like this.
from azure.datalake.store import lib
from azure.datalake.store.core import AzureDLFileSystem
import pyarrow.parquet as pq
adls = lib.auth(tenant_id=directory_id,
client_id=app_id,
client_secret=app_key)
adl = AzureDLFileSystem(adls, store_name=adls_name)
f = adl.open(file, 'rb')  # 'file' is the path of the parquet file, e.g. folder_a/folder_b/parquet_file1
table = pq.read_table(f)
How do we proceed with Gen2 storage? We are stuck at this point.
http://peter-hoffmann.com/2020/azure-data-lake-storage-gen-2-with-python.html is the link that we have followed.
Note: we are not using Databricks to do this.
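Not sure whether this is the intended pattern, but one approach that appears to work is to download the file bytes through the Gen2 file client and hand them to pyarrow via an in-memory buffer. A minimal sketch, assuming the container is named my-container (adjust the names to your setup):
import io
import pyarrow.parquet as pq
from azure.identity import ClientSecretCredential
from azure.storage.filedatalake import DataLakeServiceClient

credential = ClientSecretCredential(tenant_id, client_id, client_secret)
service_client = DataLakeServiceClient(
    account_url="https://{}.dfs.core.windows.net".format(storage_account_name),
    credential=credential)

file_system_client = service_client.get_file_system_client("my-container")
file_client = file_system_client.get_file_client("folder_a/folder_b/parquet_file1")

# Download the file contents into memory and let pyarrow read the buffer.
data = file_client.download_file().readall()
table = pq.read_table(io.BytesIO(data))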