[Search] Improve and normalize the search data model

Things to keep in mind:

  • Normalize text input fields: text, inputs, and words must be normalized and follow a common pattern across all tasks.
  • Several ES analyzers for text fields: standard and whitespace(?) for fine-tuning searches, with standard as the default.
  • What about text fields in metadata? For now, only terms queries are supported. This means that metadata fields with large content cannot be queried with full-text search.
  • Created indices should contain mapping info only for their own fields. A text classification index should not include mapping info for tokens or predicted text (text2text).
  • Review filter fields and align them with UI names (if any).
  • What about nested fields, such as token or metrics info for token classification, or a label and its score for text classification? By default, the query string DSL does not support nested queries, but it would be nice to include some minimal support for that kind of query.
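
As a rough illustration of the analyzer and nested-query points above, here is a minimal sketch of an Elasticsearch text mapping with a `whitespace` sub-field next to the default `standard` analyzer, plus a helper that wraps a query string in a `nested` query. The sub-field name `whitespace`, the function names, and the example path `predicted` are assumptions made for illustration, not the actual Rubrix model.

```python
from typing import Any, Dict


def text_field_mapping() -> Dict[str, Any]:
    """Text field analyzed with `standard` by default, plus a `whitespace` sub-field."""
    return {
        "type": "text",
        "analyzer": "standard",
        "fields": {
            # Hypothetical sub-field name; it would be queried as "<field>.whitespace".
            "whitespace": {"type": "text", "analyzer": "whitespace"},
        },
    }


def nested_query(path: str, query_string: str) -> Dict[str, Any]:
    """Wrap a query_string query in a nested query scoped to `path`."""
    return {
        "nested": {
            "path": path,
            "query": {"query_string": {"query": query_string}},
        }
    }


# Hypothetical usage: match records whose nested prediction has a given label.
example = nested_query("predicted", "predicted.label:POSITIVE")
```

With a multi-field like this, a search can target the default field for standard tokenization or the `.whitespace` sub-field when punctuation and casing matter. Fields inside the nested query are referenced by their full path.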

@dvsrepo @dcfidalgo Anything to include here?

Tasks

To get this work done, we need to tackle the following tasks (each will be created as a separate issue and linked here):

  1. [Datasets] Avoid using global template for all indices
  2. [Datasets] Dataset migration mechanisms for each release
  3. [Datasets] New es document model per task with backward compatibility fields
  4. [Datasets] Apply migration to new es doc model
  5. [Datasets] Build searches and aggregations using new doc model
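
One possible shape for tasks 1 and 3 is to build the mappings per task from a small shared base instead of a single global template. The sketch below is only an assumption about how that could look: every field name and the `index_mappings` helper are hypothetical and are not taken from the Rubrix codebase.

```python
from typing import Any, Dict

TEXT = {"type": "text", "analyzer": "standard"}

# Fields shared by every task (hypothetical names).
COMMON_PROPERTIES: Dict[str, Any] = {
    "id": {"type": "keyword"},
    "status": {"type": "keyword"},
}

# Task-specific fields (hypothetical names); each index only gets its own.
TASK_PROPERTIES: Dict[str, Dict[str, Any]] = {
    "text_classification": {
        "inputs": TEXT,
        "predicted": {
            "type": "nested",
            "properties": {
                "label": {"type": "keyword"},
                "score": {"type": "float"},
            },
        },
    },
    "token_classification": {
        "text": TEXT,
        "tokens": {
            "type": "nested",
            "properties": {"text": {"type": "keyword"}},
        },
    },
    "text2text": {
        "text": TEXT,
        "predicted_text": TEXT,
    },
}


def index_mappings(task: str) -> Dict[str, Any]:
    """Return the mappings body for one task's index, without other tasks' fields."""
    if task not in TASK_PROPERTIES:
        raise ValueError(f"Unknown task: {task}")
    return {"properties": {**COMMON_PROPERTIES, **TASK_PROPERTIES[task]}}
```

Built this way, the text classification index carries no `tokens` or `predicted_text` mapping, in line with the point about indices containing mapping info only for their own fields.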

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

1 reaction
frascuchon commented, Jan 20, 2022

Not really. The only “problem” is that you cannot select which predicted sentence to use. It will search in all of them. But I think we can assume that.

0 reactions
frascuchon commented, May 26, 2022

Note: PR recognai/rubrix#1018 introduces breaking changes for versions <0.9.0, so we cannot include those changes until v0.11.0, in order to keep compatibility with at least the 2 versions prior to the release.
