Destination Azure Blob Storage: failing with container already exists #14662
Comments
Question: the current destination connector is written in Java. Are we allowed to submit a new version in Python that includes additional features?
@TamGB that's an excellent question; most destination connectors share standard code and logic, so it might be best not to diverge from that. I'm looking further into this though!
@natalyjazzviolin let me know if it's possible to do it in Python. If it is, I'm glad to create an MR with a new version including the changes requested by @TamGB (#14248), since we need both of these as well :)
Both #14662 and #14248 hit us in the face immediately after taking the very first baby steps with Airbyte. Our use case: syncing a source to a directory 'landing' within an Azure blob container named 'typo3' inside a Gen2 Storage Account (which feels like a very basic use case for a sync task, to be honest).
@grishick this is one of the issues I was referring to yesterday. What would be a good next step?
We have now achieved syncing into virtual subfolder(s) by using the subfolder name as the Connection Destination Stream Prefix, instead of appending it directly to the container name in the destination configuration. So what works now for my use case above is: destination container name 'typo3' and connection destination stream prefix "landing/". Besides, this way we don't get any destination errors when validating the config, even when the container 'typo3' already exists.
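For readers applying the same workaround, here is a minimal Python sketch of the split it relies on. It is not part of the connector: `is_valid_container_name` and `split_destination_path` are hypothetical helper names, and the sketch assumes, as this comment reports, that the stream prefix ends up as the leading part of the blob name. The regular expression encodes the documented Azure container naming rules (3 to 63 characters, lowercase letters, digits and hyphens, no consecutive hyphens), which is why a value such as 'typo3/landing' cannot be a container name.

```python
import re

# Azure container names: 3-63 characters, lowercase letters, digits and
# hyphens, starting and ending with a letter or digit, no "--" anywhere.
_CONTAINER_NAME = re.compile(r"^(?!.*--)[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")


def is_valid_container_name(name: str) -> bool:
    """Return True if `name` satisfies the Azure container naming rules."""
    return _CONTAINER_NAME.fullmatch(name) is not None


def split_destination_path(path: str) -> tuple[str, str]:
    """Split 'container/sub/folder' into (container, 'sub/folder/').

    Hypothetical helper: the first segment is the container name, the rest
    is a value that could be entered as the destination stream prefix.
    """
    container, _, rest = path.strip("/").partition("/")
    prefix = rest.strip("/")
    return container, (prefix + "/") if prefix else ""


container, stream_prefix = split_destination_path("typo3/landing")
print(container, stream_prefix)
```

With the values from this thread, the container field would hold the first element and the stream prefix field the second, so the container name itself never contains a slash.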
@TamGB @Guilherme-B regarding rewriting this connector in Python: we cannot support a Python AzureBlobStorage destination connector or include it in Airbyte Cloud. You could develop a custom connector in Python with the functionality you need and give it a different name; we just won't include it in Airbyte Cloud. Would you be interested in solving the current issue that way? @marcmesh I'm glad you found a workaround.
Hey @natalyjazzviolin, got it. As for solving the issue that way, personally it doesn't make sense: it's a bug that can impact several people, so we need to come up with a way to solve the current issue. Whether it's an open-source or paid platform, accepting a bug and leaving it as-is is never a professional solution and just establishes the platform as a bug-fest.
@TamGB, we certainly plan on fixing the bug. I've requested that someone from the connector team take a look at this, so you should receive more guidance on debugging this soon!
The problem persists even in Airbyte Cloud. Whilst the file is created in the Azure Blob Storage for the connection checking, the connection fails on Airbyte. The logs clearly show Airbyte is scanning directories it should not. The specified container is |
@TamGB I'm connected and using Azure Blob Storage perfectly fine in a small use case. However, I spun up a test case to see whether I could reproduce this by creating a destination with thousands of virtual folders, and I got this exact issue. It likely does have to do with scanning too many folders (which it shouldn't) and getting a response that's too large or that times out.
@jfox-teine any updates? Did you manage to make it work?
@TamGB Do you still have this issue? According to your initial message, you used 0.1.5. Can you check whether it works with a newer version? Regarding developing your own version of this connector: I found issue #5687, which IMO is the right thing to push for.
Environment
Current Behavior
Setting up an Azure Blob Storage destination on an existing container yields a `non-json response` error, which seems to be triggered at AirbyteRequestService.ts.
Attached is a screenshot of the actual behavior, where the `bronze` container already exists, with the remaining part `/airbyte/ga` being created by Airbyte successfully. The odd behavior is twofold:
Could this be due to a change in the response when the container was not created by Airbyte? Unfortunately no logs are produced.
Expected Behavior
The setup should work like any other case, regardless of whether the container existed beforehand.
Logs
No output in either the Server or the Scheduler logs.
Steps to Reproduce