Queries about the GitLab Worker data model [To be implemented]
Model Schema expectation for GitLab Data Collection Worker
This question concerns the schema that would be created when implementing the GitLab Data Collection Worker. After researching GitLab’s internal API and how it interacts with their data model, I found that some commonly used data tables, such as issues, pull_requests, and commits, already exist for GitHub. The question: should we use the existing tables in the augur schema to store the data during collection (creating additional tables only for data that is not common to both), or create a new set of tables solely for GitLab?
I’d prefer creating gitlab_<data_model_name> tables under the augur_data schema, but thought it would be good to get some input from the community.
I’d also love to hear any suggestions, as I’m attempting to sketch a data model outline for the idea.
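To make the two options concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration only: the column names (including the platform discriminator) are hypothetical and not taken from the actual augur schema, and SQLite has no augur_data schema namespace, so the tables are created at the top level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Option A: reuse one shared table and tag each row with its source platform.
# The "platform" column is a hypothetical discriminator, not an existing augur column.
conn.execute("""
    CREATE TABLE issues (
        issue_id INTEGER PRIMARY KEY,
        repo_id  INTEGER NOT NULL,
        platform TEXT NOT NULL,
        title    TEXT
    )
""")

# Option B: a separate GitLab-only table following the
# gitlab_<data_model_name> naming pattern proposed above.
conn.execute("""
    CREATE TABLE gitlab_issues (
        gitlab_issue_id INTEGER PRIMARY KEY,
        repo_id         INTEGER NOT NULL,
        title           TEXT
    )
""")

# The same GitLab issue stored under each layout.
conn.execute("INSERT INTO issues (repo_id, platform, title) VALUES (1, 'github', 'Bug A')")
conn.execute("INSERT INTO issues (repo_id, platform, title) VALUES (2, 'gitlab', 'Bug B')")
conn.execute("INSERT INTO gitlab_issues (repo_id, title) VALUES (2, 'Bug B')")

# Option A needs a filter on the discriminator; Option B needs none.
shared_gitlab = conn.execute(
    "SELECT title FROM issues WHERE platform = 'gitlab'"
).fetchall()
separate_gitlab = conn.execute("SELECT title FROM gitlab_issues").fetchall()
```

With the shared table, every cross-platform query works on one table but each platform-specific query must filter on the discriminator. With separate tables, GitLab-only attributes never leave NULL-filled columns in GitHub rows, but cross-platform queries need a UNION.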
Issue Analytics
- State:
- Created 4 years ago
- Comments: 13 (12 by maintainers)
Okay, cool. I once used Stitch, a subscription-based (paid) data collection worker, with Lever and JIRA; it pushes data to BigQuery / AWS Redshift. It would add extra columns whenever new attributes appeared. I liked this idea: storing the extra data should not cause problems, and it could be useful later.
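The "add a column when a new attribute shows up" idea can be sketched as follows. This is not Stitch's implementation, only an assumed illustration with sqlite3; the helper name insert_with_new_columns, the table, and the weight attribute are all hypothetical, and every new column is simply typed as TEXT.

```python
import sqlite3


def insert_with_new_columns(conn, table, record):
    """Insert a record, first adding any columns the table does not have yet."""
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")

    # PRAGMA table_info returns one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}

    for name in record:
        # Names are interpolated into SQL, so accept plain identifiers only.
        if not name.isidentifier():
            raise ValueError(f"unsafe column name: {name!r}")
        if name not in existing:
            conn.execute(f'ALTER TABLE {table} ADD COLUMN "{name}" TEXT')

    columns = ", ".join(f'"{name}"' for name in record)
    placeholders = ", ".join("?" for _ in record)
    conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        list(record.values()),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gitlab_issues (id INTEGER PRIMARY KEY, title TEXT)")

insert_with_new_columns(conn, "gitlab_issues", {"title": "first"})
# "weight" is not a column yet, so it is added before the insert.
insert_with_new_columns(conn, "gitlab_issues", {"title": "second", "weight": "3"})

rows = conn.execute("SELECT title, weight FROM gitlab_issues ORDER BY id").fetchall()
```

Rows inserted before a column existed read back as NULL for that column, which is why keeping the extra data is harmless for older records.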
@mrsaicharan1 just gave your proposal a glance and it’s already looking great!! I will leave some detailed review comments in the near future but you’re off to a great start!