Destination Azure Blob Storage: failing with container already exists #14662

Open
TamGB opened this issue Jul 13, 2022 · 13 comments
Labels
community · connectors/destination/azure-blob-storage · frozen (Not being actively worked on) · lang/java · team/destinations (Destinations team's backlog) · type/bug (Something isn't working)

Comments

@TamGB

TamGB commented Jul 13, 2022

Environment

  • Airbyte version: 0.39.34-alpha
  • Deployment: Kubernetes
  • Destination Connector and version: airbyte/destination-azure-blob-storage | 0.1.5
  • Step where error happened: Setup new connection

Current Behavior

Setting up an Azure Blob Storage destination on an existing container yields a non-JSON response error, which seems to be triggered in AirbyteRequestService.ts.

Attached is a screenshot of the actual behavior, where the bronze container already exists, with the remaining part /airbyte/ga being created by Airbyte successfully. The odd behavior is twofold:

  • On the one hand, the container subdirectories and the test files are created; only the response seems to be invalid.
  • On the other hand, specifying a non-existent container, or even a container created by Airbyte, works like a charm.

Could this be due to a change in the response when the container was not created by Airbyte? Unfortunately, no logs are produced.
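
One way a check step could tolerate a pre-existing container is to treat the "already exists" conflict as success rather than as a failure. The following is a minimal sketch of that pattern only, not the connector's actual code: `FakeBlobService`, `ContainerAlreadyExists`, and `ensure_container` are hypothetical names, and the in-memory service stands in for Azure.

```python
class ContainerAlreadyExists(Exception):
    """Hypothetical error standing in for the service's 'already exists' conflict."""


class FakeBlobService:
    """In-memory stand-in for a blob service; not a real Azure client."""

    def __init__(self, existing=()):
        self.containers = set(existing)

    def create_container(self, name):
        if name in self.containers:
            raise ContainerAlreadyExists(name)
        self.containers.add(name)


def ensure_container(service, name):
    """Create the container if needed; return True only if it was created here."""
    try:
        service.create_container(name)
        return True
    except ContainerAlreadyExists:
        # A pre-existing container is not an error for a destination check.
        return False


service = FakeBlobService(existing={"bronze"})
created_existing = ensure_container(service, "bronze")  # container was already there
created_new = ensure_container(service, "silver")       # container is created now
```

Whether the connector's failure actually comes from this code path is not established in this thread; the sketch only shows the idempotent behavior described under Expected Behavior.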

image

image

Expected Behavior

The setup should work as in any other case, regardless of whether the container exists beforehand.

Logs

No output in either the Server or the Scheduler logs.

Steps to Reproduce

  1. Create an Azure Blob Storage Container
  2. Set up an Azure Blob Storage destination on top of said container
  3. Wait for it to fail
@TamGB
Author

TamGB commented Jul 13, 2022

Question: the current connector (destination) is in Java. Are we allowed to submit a new version in Python, including additional features?

@natalyjazzviolin
Contributor

@TamGB that's an excellent question; most destination connectors share standard code and logic, so it might be best not to diverge from that. I'm looking further into this though!

@Guilherme-B

@natalyjazzviolin let me know if it's possible to do it in Python. If it is, I'm glad to create an MR with a new version including the changes solicited by @TamGB (#14248), since we need both of these as well :)

@burpeenerd

Both #14662 and #14248 hit us in the face immediately after taking our very first baby steps with Airbyte.

Our use case: syncing a source to a directory 'landing' within an Azure blob container named 'typo3' inside a Gen2 Storage Account (which feels like a very basic use case for a sync task, to be honest).
Using "typo3/landing" as the Bucket Name results in a successful test, with the test blob being created and deleted in the landing folder, but the connection fails during the first sync.

@natalyjazzviolin
Contributor

@grishick this is one of the issues I was referring to yesterday. What would be a good next step?

@burpeenerd

We have now achieved syncing into virtual subfolder(s) by using the subfolder name as the Connection Destination Stream Prefix, instead of appending it directly to the container name in the destination configuration.

So what works now for my above use case is: destination container name 'typo3' and connection destination stream prefix "landing/". Besides, this way we don't get any Destination errors when validating the config, even when the container 'typo3' already exists.
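
The workaround above amounts to keeping the container name and the virtual folder separate. A minimal sketch of that split follows. `split_destination` is a hypothetical helper, not part of Airbyte, and the name check reflects Azure's documented container naming rules (3 to 63 characters; lowercase letters, digits, and hyphens; no consecutive hyphens; starting and ending with a letter or digit).

```python
import re

# Lowercase letters and digits, with single hyphens allowed only between them.
_CONTAINER_RE = re.compile(r"^[a-z0-9](?:-?[a-z0-9])*$")


def split_destination(value):
    """Split 'container/some/folder' into (container, 'some/folder/')."""
    container, _, folder = value.strip("/").partition("/")
    if not (3 <= len(container) <= 63 and _CONTAINER_RE.match(container)):
        raise ValueError(f"invalid container name: {container!r}")
    # The prefix is empty when no folder is given, otherwise it ends with '/'.
    prefix = folder.strip("/") + "/" if folder else ""
    return container, prefix


# 'container' would go into the destination's container name field,
# 'prefix' into the connection's destination stream prefix.
container, prefix = split_destination("typo3/landing")
```

Because "/" is not a legal character in a container name, a value such as "typo3/landing" cannot be used as the container name itself, which is consistent with the behavior reported above.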

@natalyjazzviolin
Contributor

@TamGB @Guilherme-B regarding rewriting this connector in Python: we cannot support a Python AzureBlobStorage destination connector or include it in Airbyte Cloud. You could develop a custom connector in Python with the functionality you need and give it a different name, we just won't include it in Airbyte Cloud. Would you be interested in solving the current issue that way?

@marcmesh I'm glad you found a workaround.

@TamGB
Author

TamGB commented Jul 21, 2022

Hey @natalyjazzviolin, got it. As for solving the issue that way, personally it doesn't make sense: it's a bug, one that can impact several people, so we need to come up with a way to solve the current issue. Whether it's an open-source or a paid platform, accepting a bug and leaving it as-is is never a professional solution and just establishes the platform as a bug-fest.
Did you manage to understand what is causing the issue? Which response is invalid? Or can you guide me on how to work out what the issue is, since Airbyte is not producing logs on the topic?

@natalyjazzviolin
Contributor

@TamGB, we certainly plan on fixing the bug. I've requested that someone from the connector team take a look at this, so you should receive more guidance on debugging this soon!

@grishick grishick added the team/destinations Destinations team's backlog label Sep 27, 2022
@TamGB
Author

TamGB commented Oct 11, 2022

The problem persists even in Airbyte Cloud.
This effectively makes Airbyte unusable with Azure.

Whilst the file is created in the Azure Blob Storage for the connection checking, the connection fails on Airbyte.

The logs clearly show Airbyte is scanning directories it should not. The specified container is bronze/airbyte, yet the log shows Airbyte scanning all of the bronze container (e.g. ga/* etc.), which may be part of the problem in itself, possibly related to #16016.

@marcosmarxm marcosmarxm changed the title AzureBlobStorage Destination: Failing with container already exists Destination Azure Blob Storage: failing with container already exists Nov 30, 2022
@jfox-teine

@TamGB I'm connected and using Azure Blob Storage perfectly fine in a small use case. However, I spun up a test case to see whether I could reproduce this by creating a destination with thousands of virtual folders, and got this exact issue. It likely does have to do with scanning too many folders (which it shouldn't) and getting a response that's too large or that times out.
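
To illustrate why the scope of a listing matters, the sketch below compares walking every blob in a container with listing only the names under the configured prefix. It uses a plain in-memory list of blob names rather than a real Azure client, `list_blobs` is a hypothetical helper, and the blob names are made up for the example.

```python
def list_blobs(all_names, prefix=""):
    """Return the blob names that start with prefix (empty prefix = whole container)."""
    return [name for name in all_names if name.startswith(prefix)]


# Simulated 'bronze' container: thousands of unrelated virtual folders plus the target path.
names = [f"ga/folder-{i}/data.csv" for i in range(5000)]
names += ["airbyte/ga/test.csv", "airbyte/ga/stream_1.csv"]

full_scan = list_blobs(names)                   # what a container-wide scan returns
scoped = list_blobs(names, prefix="airbyte/")   # only what the destination needs
```

Here the filtering happens on the client for simplicity. With a real service, the prefix would normally be passed to the listing call so that the response itself stays small.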

@TamGB
Author

TamGB commented Feb 7, 2023

@jfox-teine any updates? Did you manage to make it work?

@leo-schick
Contributor

@TamGB Do you still have this issue? According to your initial message, you used 0.1.5. Can you check if it works with a newer version?
It seems to work for me. I use version 0.1.6 of the connector. There is version 0.2.0 available as well.

Regarding developing your own version of this connector: I found issue #5687, which IMO is the right thing to push for.

10 participants