Implement an Auditable feature for MongoDB Panache, similar to Hibernate Envers
Description
Hi guys, hope all is well.
I’m working on a project where we need audit capability for some entities. Our project uses MongoDB Panache.
My proposal is to implement something similar to what we have in Hibernate Envers: we annotate any @MongoEntity with @Auditable, and for every change we take a snapshot of the entity.
An entity could look like:
```java
@MongoEntity
@Auditable
public class Car {
}
```
By default we persist the snapshots in a collection with an _AUD suffix, as in Envers; in the previous scenario it would be Car_AUD. In addition, @Auditable accepts a @MongoEntity, for instance:

```java
@Auditable(@MongoEntity(collection = "CarAuditable", database = "anotherDB", clientName = "client2"))
```

This way we can define the collection, database, and clientName without reinventing the wheel.
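As a sketch of how the annotation and the default collection naming could work (the names and attributes here are assumptions, not an existing Quarkus API; a real implementation would likely reuse @MongoEntity itself as an annotation member, but mirroring its attributes keeps the example self-contained):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class AuditableDemo {

    // Hypothetical @Auditable annotation mirroring @MongoEntity's attributes.
    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME)
    @interface Auditable {
        String collection() default ""; // empty means "<entity name>_AUD"
        String database() default "";
        String clientName() default "";
    }

    @Auditable(collection = "CarAuditable", database = "anotherDB", clientName = "client2")
    static class Car {}

    @Auditable
    static class Bike {}

    // Resolves the audit collection name, falling back to the Envers-style "_AUD" suffix.
    static String auditCollection(Class<?> entity) {
        Auditable a = entity.getAnnotation(Auditable.class);
        if (a != null && !a.collection().isEmpty()) {
            return a.collection();
        }
        return entity.getSimpleName() + "_AUD";
    }

    public static void main(String[] args) {
        System.out.println(auditCollection(Car.class));  // CarAuditable
        System.out.println(auditCollection(Bike.class)); // Bike_AUD
    }
}
```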
The structure of the snapshots can be something like:

```java
public class AuditableEntity {

    public final Long createdTime;
    public final RevType revType;
    public Document content; // raw snapshot of the audited entity

    public AuditableEntity(Long createdTime, RevType revType, Document content) {
        this.createdTime = createdTime;
        this.revType = revType;
        this.content = content;
    }

    public enum RevType {
        ADD,
        MOD,
        DEL;
    }
}
```

RevType indicates which operation generated the snapshot.
Basically, that is it. I know we could use Javers, but I’d like to make this native to Quarkus/MongoDB Panache.
It is important to note that there is a kind of issue here: MongoOperations is what takes care of persist/persistOrUpdate/update/delete, and its method public void persistOrUpdate(Iterable<?> entities) ultimately executes a collection.bulkWrite(bulk), so there is no way to know which entities were inserted and which were updated (at least, I think there is no way). In this case we can ignore this method for audit.
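To make the snapshot bookkeeping concrete, here is a minimal in-memory sketch of the logic (plain Maps stand in for org.bson.Document and for the MongoDB collections; a real implementation would hook into MongoOperations and write to the *_AUD collection instead):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AuditSketch {

    public enum RevType { ADD, MOD, DEL }

    // Mirrors the AuditableEntity structure from the proposal.
    public static class Snapshot {
        public final long createdTime = System.currentTimeMillis();
        public final RevType revType;
        public final Map<String, Object> content; // stands in for org.bson.Document

        Snapshot(RevType revType, Map<String, Object> content) {
            this.revType = revType;
            this.content = content;
        }
    }

    private final Map<Object, Map<String, Object>> collection = new HashMap<>(); // main collection
    public final List<Snapshot> audit = new ArrayList<>();                       // the *_AUD collection

    public void persist(Object id, Map<String, Object> doc) {
        collection.put(id, doc);
        audit.add(new Snapshot(RevType.ADD, doc));
    }

    public void update(Object id, Map<String, Object> doc) {
        collection.put(id, doc);
        audit.add(new Snapshot(RevType.MOD, doc));
    }

    public void delete(Object id) {
        Map<String, Object> last = collection.remove(id);
        audit.add(new Snapshot(RevType.DEL, last));
    }

    public static void main(String[] args) {
        AuditSketch ops = new AuditSketch();
        Map<String, Object> car = new HashMap<>();
        car.put("model", "T");
        ops.persist(1L, car);
        car.put("model", "S");
        ops.update(1L, car);
        ops.delete(1L);
        for (Snapshot s : ops.audit) {
            System.out.println(s.revType); // ADD, MOD, DEL
        }
    }
}
```

Note that persistOrUpdate backed by bulkWrite has no per-entity hook here, which is exactly the limitation described above.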
What do you think?
I can work and code a prototype of this idea if you think it is doable.
Best regards.
Implementation ideas
No response
Issue Analytics
- Created 2 years ago
- Comments: 5 (3 by maintainers)
It’s an interesting proposal; worth pointing out, though, that anything done synchronously as part of write transactions will create more load on the database and impact latency.
Alternatively, one could look at an approach based on CDC (thanks for pinging me, @Sanne) and Debezium, where MongoDB change streams are used to track all changes and keep a log of the history of records, either in Kafka or materialized as a queryable collection in another MongoDB cluster, for instance (or another store such as Elasticsearch, or an OLAP system like Apache Pinot or Snowflake, depending on your querying needs for that data). You could still use a command listener or something like that to insert metadata, such as a business user or use-case identifier for a transaction, into a separate table, and then use downstream stream processing to enrich the actual log events with that metadata. This approach is described in great detail here. Although I suppose with MongoDB, where multi-document transactions are not as commonly used as far as I know, you could also add that metadata to the actual “business documents” themselves.
There are pros and cons to either approach. An advantage of CDC is that it doesn’t impact the write path and lets you easily materialize audit trails in other systems, which is interesting from a querying perspective, and it offloads the entire audit-log aspect from the operational DB. On the other hand, it’s a bit more complex in terms of the required set-up, and it is asynchronous by definition (which shouldn’t matter for typical audit-log use cases, though).
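For reference, a CDC setup along these lines could start from a Debezium MongoDB connector registration roughly like the following (a sketch only: property names follow recent Debezium versions and should be verified against the connector documentation; the connector name, connection string, topic prefix, and collection list are placeholders):

```json
{
  "name": "car-audit-connector",
  "config": {
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.connection.string": "mongodb://mongo:27017/?replicaSet=rs0",
    "topic.prefix": "audit",
    "collection.include.list": "mydb\\.Car"
  }
}
```

The resulting change events could then be consumed from Kafka and materialized wherever the audit trail needs to be queried.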
Hi,
You should be able to implement it via a MongoDB command listener; it’s not documented, but they are automatically added to the client configuration. You’ll need to use a transaction to be sure that the audit document is persisted together with the main document. I’m not sure the command listener will be part of the transaction, so this must be tested.
You can also create a CDI interceptor, but it will only work for repositories (which are CDI beans), not for entities.
Another idea would be to create a dedicated codec for your entity that automatically persists the audit document; I’m not sure it’s feasible, as you’d need to inject the client inside the codec, which is a bit strange and hacky.
Maybe @evanchooly has other ideas?