elastic2-doc-manager: mongo-connector removes _id from _source.

Here is my mongo-connector config:

{
  "mainAddress": "localhost:27017",
  "oplogFile": "/var/log/mongo-connector/oplog.cs.timestamp",
  "noDump": false,
  "batchSize": -1,
  "verbosity": 1,
  "continueOnError": true,

  "logging": {
    "type": "file",
    "filename": "/var/log/mongo-connector/mongo-connector-cs.log",
    "format": "%(asctime)s [%(levelname)s] %(name)s:%(lineno)d - %(message)s",
    "__rotationWhen": "D",
    "__rotationInterval": 1,
    "__rotationBackups": 10,

    "__type": "syslog",
    "__host": "localhost:514"
  },

  "authentication": {
    "__adminUsername": "username",
    "__password": "password",
    "__passwordFile": "mongo-connector.pwd"
  },

  "__comment__": "For more information about SSL with MongoDB, please see http://docs.mongodb.org/manual/tutorial/configure-ssl-clients/",
  "__ssl": {
    "__sslCertfile": "Path to certificate to identify the local connection against MongoDB",
    "__sslKeyfile": "Path to the private key for sslCertfile. Not necessary if already included in sslCertfile.",
    "__sslCACerts": "Path to concatenated set of certificate authority certificates to validate the other side of the connection",
    "__sslCertificatePolicy": "Policy for validating SSL certificates provided from the other end of the connection. Possible values are 'required' (require and validate certificates), 'optional' (validate but don't require a certificate), and 'ignored' (ignore certificates)."
  },

  "__fields": ["field1", "field2", "field3"],

  "namespaces": {
    "include": ["fa.fa_new_data"],
    "mapping": {
      "fa.fa_new_data": "fa_new_data.new_data"
    },
    "__gridfs": ["db.fs"]
  },

  "docManagers": [
    {
      "docManager": "elastic2_doc_manager",
      "targetURL": "localhost:9200",
      "bulkSize": 5000,
      "uniqueKey": "_id",
      "__autoCommitInterval": null
    }
  ]
}
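
In the sample config shipped with mongo-connector, options whose keys start with `__` are disabled, so only some of the settings above are actually in effect. The following is a minimal sketch for listing the active ones, assuming that prefix convention. `strip_disabled` is a hypothetical helper written for this illustration, not part of mongo-connector, and the config below is a trimmed-down copy of the one in the question.

```python
import json


def strip_disabled(node):
    """Recursively drop keys that start with '__' (treated here as disabled)."""
    if isinstance(node, dict):
        return {
            key: strip_disabled(value)
            for key, value in node.items()
            if not key.startswith("__")
        }
    if isinstance(node, list):
        return [strip_disabled(item) for item in node]
    return node


# Trimmed-down copy of the config from the question.
config = json.loads("""
{
  "mainAddress": "localhost:27017",
  "logging": {"type": "file", "__type": "syslog", "__host": "localhost:514"},
  "__fields": ["field1", "field2", "field3"],
  "namespaces": {
    "include": ["fa.fa_new_data"],
    "mapping": {"fa.fa_new_data": "fa_new_data.new_data"},
    "__gridfs": ["db.fs"]
  },
  "docManagers": [
    {
      "docManager": "elastic2_doc_manager",
      "targetURL": "localhost:9200",
      "uniqueKey": "_id",
      "__autoCommitInterval": null
    }
  ]
}
""")

active = strip_disabled(config)
print(json.dumps(active, indent=2))
```

Read this way, the settings relevant to the question are the namespace mapping and the `elastic2_doc_manager` entry with `uniqueKey` set to `_id`.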

This is what I currently get:

{
  "_index": "fa_new_data",
  "_type": "new_data",
  "_id": "5497c1c4cbb648eb618b4568",
  "_version": 1,
  "_score": 1,
  "_source": {
    "checkbox": true,
    ...........................
  }
}

But I want “_id” inside “_source” as well, like this:

{
  "_index": "fa_new_data",
  "_type": "new_data",
  "_id": "5497c1c4cbb648eb618b4568",
  "_version": 1,
  "_score": 1,
  "_source": {
    "checkbox": true,
    "_id": "5497c1c4cbb648eb618b4568",
    ...........................
  }
}

How can I do this?

Issue Analytics

  • State: open
  • Created: 7 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

peterschrott commented, Jul 8, 2017 (1 reaction)

I am using elastic4s on my backend to perform full-text search on documents that are synced from MongoDB using mongo-connector.

Elastic4s provides play-json support, which means the _source is deserialised into case classes. If I want to include the MongoDB _id in the deserialised object, I have to resort to a very hacky workaround to fetch _id from the hit metadata.
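
The workaround described above amounts to copying the hit's metadata `_id` into the document on the client side after the search returns. A minimal sketch of that idea in Python rather than Scala, assuming the hit has the shape shown in the question. `hit_to_document` and the field name `mongo_id` are hypothetical names chosen for this illustration and are not part of elastic4s or mongo-connector.

```python
def hit_to_document(hit, id_field="mongo_id"):
    """Return a copy of hit['_source'] with the hit's metadata _id stored under id_field."""
    document = dict(hit["_source"])  # shallow copy, so the original hit is not mutated
    document[id_field] = hit["_id"]
    return document


# A search hit shaped like the one in the question.
hit = {
    "_index": "fa_new_data",
    "_type": "new_data",
    "_id": "5497c1c4cbb648eb618b4568",
    "_source": {"checkbox": True},
}

doc = hit_to_document(hit)
print(doc)
```

This keeps the index untouched, at the cost of repeating the merge in every client that reads from it.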

I do not see any reason why _id should not be contained in _source as it is originally a property of the source document.

Is it possible to make the removal, or rather the inclusion, configurable? Since _source cannot have a field called _id, the configuration should also cover renaming the field.
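
The requested behaviour can be sketched as a small transformation applied to each MongoDB document before it is indexed: use `_id` as the Elasticsearch document id and also keep a copy of it in the source under a different name. This is only an illustration of the idea, not how elastic2-doc-manager works. `prepare_for_index` and the default field name `mongo_id` are hypothetical, and where such a step would be hooked in (for example a custom doc manager) is left open.

```python
def prepare_for_index(mongo_doc, source_id_field="mongo_id"):
    """Split a MongoDB document into (es_id, source).

    The source omits the reserved '_id' key but keeps a copy of the id
    under source_id_field, so it survives into _source.
    """
    es_id = str(mongo_doc["_id"])
    source = {key: value for key, value in mongo_doc.items() if key != "_id"}
    source[source_id_field] = es_id
    return es_id, source


mongo_doc = {"_id": "5497c1c4cbb648eb618b4568", "checkbox": True}
es_id, source = prepare_for_index(mongo_doc)
print(es_id, source)
```

Making `source_id_field` a parameter mirrors the suggestion that the new field name should be configurable.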

Thanks

ShaneHarvey commented, Apr 25, 2017 (1 reaction)

The Elasticsearch docs claim you can search on the _id field (or _uid field): “The value of the _id field is accessible in certain queries (term, terms, match, query_string, simple_query_string), but not in aggregations, scripts or when sorting, where the _uid field should be used instead”
https://www.elastic.co/guide/en/elasticsearch/reference/5.3/mapping-id-field.html
https://www.elastic.co/guide/en/elasticsearch/reference/5.3/mapping-uid-field.html

Do the above search methods cover your use case? If not, what kind of search is not possible unless _id is also stored in _source?
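
As an illustration of the kind of search the quoted docs describe, a `terms` query on `_id` can be expressed as a plain request body. The sketch below only builds that body and does not contact a cluster. `ids_lookup_body` is a hypothetical helper name, and sending the body with a client of your choice is left out.

```python
import json


def ids_lookup_body(ids):
    """Build a search body matching documents whose metadata _id is one of ids."""
    return {"query": {"terms": {"_id": list(ids)}}}


body = ids_lookup_body(["5497c1c4cbb648eb618b4568"])
print(json.dumps(body, indent=2))
```

A query like this finds documents by id, but the matching hits still carry `_id` only in their metadata, not inside `_source`.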
