Postgres Ingestion Question

I have an existing Postgres database instance that I’m pointing a new Hasura Docker container at. Hasura reads and ingests the data successfully, but one odd quirk prevents me from automatically importing everything, and I can’t find any documentation on it.

I have a table called payload that is defined as follows:

class Payload(p.Model):
    # this is actually a sha256 from other information about the payload
    uuid = p.TextField(unique=True, null=False)
    # tag a payload with information like spearphish, custom bypass, lat mov, etc (indicates "how")
    tag = p.TextField(null=True)
    # creator of the payload, cannot be null! must be attributed to somebody (indicates "who")
    operator = p.ForeignKeyField(Operator, null=False)
    creation_time = p.DateTimeField(default=datetime.datetime.utcnow, null=False)
    # this is fine because this is an instance of a payload, so it's tied to one PayloadType
    payload_type = p.ForeignKeyField(PayloadType, null=False)
    # this will signify if a current callback made / spawned a new callback that's checking in
    #   this helps track how we're getting callbacks (which payloads/tags/parents/operators)
    pcallback = p.DeferredForeignKey("Callback", null=True)
    operation = p.ForeignKeyField(Operation, null=False)
    wrapped_payload = p.ForeignKeyField("self", null=True)
    deleted = p.BooleanField(null=False, default=False)
    # if the payload is in the build process: building, success, error
    build_container = p.TextField(null=False)
    build_phase = p.TextField(null=False, default="building")
    # capture error or any other info
    build_message = p.TextField(null=False, default="")
    # if there is a slack webhook for the operation, decide if this payload should generate an alert or not
    callback_alert = p.BooleanField(null=False, default=True)
    # when dealing with auto-generated payloads for lateral movement or spawning new callbacks
    auto_generated = p.BooleanField(null=False, default=False)
    task = p.DeferredForeignKey("Task", null=True)
    file = p.DeferredForeignKey("FileMeta", null=True)

The main thing to look at here is wrapped_payload = p.ForeignKeyField("self", null=True), a field in this table that references another Payload row. When Hasura imports the resulting database, I get the following relationships:

operator_id → operator . id - payload_operator_id_fkey
payload_type_id → payloadtype . id - payload_payload_type_id_fkey
operation_id → operation . id - payload_operation_id_fkey
wrapped_payload_id → payload . id - payload_wrapped_payload_id_fkey
wrapped_payload_id → payload . id - fk_payload_wrapped_payload_id_refs_payload
pcallback_id → callback . id - fk_payload_pcallback_id_refs_callback
task_id → task . id - fk_payload_task_id_refs_task
file_id → filemeta . id - fk_payload_file_id_refs_filemeta

I can’t find out why there are two instances of wrapped_payload_id or what the difference is between them. Because two are created, I can’t import automatically and have to delete one before the relationships can be added. I also can’t determine what distinguishes the foreign-key relationship whose name ends in fkey from the one whose name starts with fk and contains refs. What’s confusing me is that I have another table, below, that works just fine:

class FileBrowserObj(p.Model):
    task = p.ForeignKeyField(Task, null=False)
    timestamp = p.DateTimeField(null=False, default=datetime.datetime.utcnow)
    operation = p.ForeignKeyField(Operation, null=False)
    # this should be the fqdn of the host the info is from
    host = p.BlobField(null=False)
    permissions = p.TextField(null=False, default="")
    # this is the name of this file/folder
    name = p.BlobField(null=False)
    # this is the parent object
    parent = p.ForeignKeyField('self', null=True)
    # this is the full path for the parent folder
    # we need this to enable faster searching and better context
    parent_path = p.BlobField(null=False, default="")
    full_path = p.BlobField(null=False, default="")
    access_time = p.TextField(null=False, default="")
    modify_time = p.TextField(null=False, default="")
    comment = p.TextField(null=False, default="")
    # this is how we differentiate between files and folders of information
    is_file = p.BooleanField(null=False, default=False)
    size = p.TextField(null=False, default="")
    # indicates if we successfully pulled info about the object. False would be access denied for example
    success = p.BooleanField(null=True)
    deleted = p.BooleanField(null=False, default=False)

Here I also have parent = p.ForeignKeyField('self', null=True), a field that references a row of the same table. In this case, though, parsing the database creates only the following entries:

operation_id → operation . id - filebrowserobj_operation_id_fkey
task_id → task . id - filebrowserobj_task_id_fkey
parent_id → filebrowserobj . id - filebrowserobj_parent_id_fkey

So, what’s the difference between the two examples?

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
tirumaraiselvan commented, Dec 19, 2020

“But what’s causing it to appear twice, and what’s the difference? Half of those foreign-key constraints are fk_refs and half are *_id_fkey, but I don’t see any rhyme or reason to what makes the determination.”

This was already there in the database (i.e. Hasura wouldn’t know why it exists). You should check the method by which the table was created initially.
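
One way to follow this advice is to ask Postgres itself which foreign-key constraints exist on the table. The sketch below is a minimal illustration, not Hasura’s or the maintainer’s method: it reads the standard pg_constraint catalog through a psycopg2-style cursor (an assumption; any driver that uses %s placeholders would do) and groups constraint names by definition, so two names sharing one definition show up as a duplicate. The function names and the SAMPLE_ROWS definitions are hypothetical, written by hand to mirror the relationship list in the question. For context, Postgres gives a foreign key declared without a name the default name <table>_<column>_fkey, so a differently shaped name was most likely supplied explicitly by whatever created the constraint.

```python
from collections import defaultdict

# Lists every foreign-key constraint on one table together with its definition.
# pg_constraint and pg_get_constraintdef() are standard PostgreSQL catalog objects.
FK_QUERY = """
SELECT con.conname, pg_get_constraintdef(con.oid)
FROM pg_constraint AS con
WHERE con.contype = 'f'
  AND con.conrelid = %s::regclass
ORDER BY con.conname
"""

# Hand-written rows that mirror the question (illustrative, not real query output).
SAMPLE_ROWS = [
    ("payload_operator_id_fkey",
     "FOREIGN KEY (operator_id) REFERENCES operator(id)"),
    ("payload_wrapped_payload_id_fkey",
     "FOREIGN KEY (wrapped_payload_id) REFERENCES payload(id)"),
    ("fk_payload_wrapped_payload_id_refs_payload",
     "FOREIGN KEY (wrapped_payload_id) REFERENCES payload(id)"),
]


def fetch_foreign_keys(cursor, table_name):
    """Return (constraint_name, definition) rows for one table.

    `cursor` is assumed to be a psycopg2-style DB-API cursor.
    """
    cursor.execute(FK_QUERY, (table_name,))
    return cursor.fetchall()


def find_duplicate_foreign_keys(rows):
    """Group constraint names by definition; keep definitions with 2+ names."""
    by_definition = defaultdict(list)
    for name, definition in rows:
        by_definition[definition].append(name)
    return {
        definition: names
        for definition, names in by_definition.items()
        if len(names) > 1
    }


if __name__ == "__main__":
    for definition, names in find_duplicate_foreign_keys(SAMPLE_ROWS).items():
        print(definition, "is defined by:", ", ".join(names))
```

Against a live database, the rows would come from fetch_foreign_keys(cursor, "payload") instead of SAMPLE_ROWS. Grouping on the full definition text means two constraints count as duplicates only when their columns, target, and actions all match.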

0 reactions
its-a-feature commented, Dec 19, 2020

I think I figured it out. Apparently Postgres is OK with the same relationship being defined twice and doesn’t complain. So something was double-defining that relationship, which I guess didn’t matter with how I was using it before, but when Hasura imported the database and saw the double definition, it barfed.

Thanks!
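
The cleanup mentioned in the question, deleting one of the two constraints so the relationships can be imported, could be sketched roughly as follows. This is an assumption-laden illustration rather than a recommended procedure: the helper names are hypothetical, which constraint to keep is left to the caller, and the code only builds the ALTER TABLE ... DROP CONSTRAINT statements without executing them. It presumes the two constraints really are identical, so dropping either one leaves the column constrained.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def drop_constraint_sql(table_name, constraint_names, keep):
    """Build DROP CONSTRAINT statements for every duplicate except `keep`."""
    if keep not in constraint_names:
        raise ValueError(f"{keep!r} is not one of the duplicate constraints")
    return [
        f"ALTER TABLE {quote_ident(table_name)} "
        f"DROP CONSTRAINT {quote_ident(name)};"
        for name in constraint_names
        if name != keep
    ]


if __name__ == "__main__":
    # Constraint names taken from the relationship list in the question.
    duplicates = [
        "payload_wrapped_payload_id_fkey",
        "fk_payload_wrapped_payload_id_refs_payload",
    ]
    for statement in drop_constraint_sql(
        "payload", duplicates, keep="payload_wrapped_payload_id_fkey"
    ):
        print(statement)
```

Reviewing the generated statements before running them keeps the destructive step explicit.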
