-
Notifications
You must be signed in to change notification settings - Fork 548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First pass at Astra Doc Ingest logic #2969
Conversation
@erichare Let's continue working on this from this branch. Is this running successfully for you when you test it? |
@potter-potter i had something come up earlier and haven't had a chance to fully test, i will get back to you on that soon. Also sorry for the very undescriptive branch name here haha, I forgot to change the default 😄 I'll follow up soon |
I don't think we can fully sort of copy delta-table. I was just trying to point you to something that read from a "database table"-like source. There isn't really a good example of a source connector that parallels AstraDB as far as I can tell. |
Gotchya. I tried to pull out the logic that was specific (i.e., metadata fields, we dont really have a great notion of), but the basic idea of pulling the data, storing as CSV with pandas, and then using that as the input data seemed to make sense for Astra too. but with that said, yeah, let me go ahead and test it more before wasting your time looking at this further 😄 if you see anything obvious to adjust definitely let me know though |
It actually looks like this gets the entire table: In most of the connectors, since they are not table based, That being said, this is not a "docs" based connector (its table based) so it is a little different. I will look through our source connectors again to see if there is a better connector to model. |
That is a really great point. Yep, it will pull the full table. paginated_find itself can take a filter condition, if there's any way to pass kwargs to |
@erichare I haven't had time to really look at other examples. (mainly because I don't think there is one that maps quite right to this one) I'm going to redo the destination PR so we can get it merged and then dive into this source (Ingest) logic later. |
@potter-potter - Do you know if this PR is still being actively worked? |
@MthwRobinson I got pulled into some other stuff, but would like to get this merged... with that said, we sort of had two simultaneous PRs going, and i think this particular PR is out of date. If @potter-potter agrees, i'll open a new PR with just the Astra DB Source connector logic and we can close this one. I have some updates to make to it anyway as a result of recent changes on the Astra end. |
Thanks @erichare ! |
@erichare Thanks. If you could close this and open a new one. We merged in the updates to the Astra destination (via a different PR), but this source connector needed more work. |
I'm on it @potter-potter . Thanks! |
@potter-potter i don't have write access to your branch, so opened up a PR for my changes here