Posts don't show and can't be accessed when using MySQL DB
Auto-reviewers: @NiharikaRay @matthewwardrop @earthmancash @danfrankj
I’m trying to deploy Knowledge Repo with a database backend. The database runs on a different host than the knowledge_repo server and is accessed over an SSH tunnel.
It seems that Knowledge Repo can read from and write to this database just fine.
However, for whatever reason, no posts are showing up. It’s not just the feed: the posts also can’t be accessed directly via their path. I looked in the database itself and the posts are there in the repository table; however, the posts table that Knowledge Repo creates itself is empty. If I try to re-add the posts I see in the repository table, I get the error that their paths already exist, as expected.
I also tried setting SQLALCHEMY_DATABASE_URI to use sqlite, but that didn’t seem to help.
Another thing I tried was running a MySQL DB on a local docker container, and the same issue persists.
This is version 0.8.1 of Knowledge Repo, on a Mac, with Python 3, if that helps.
I’m sort of at a loss as to why this could be happening.
In the logs, I see stuff like this:
_mysql.connection.query(self, query)
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: 'SELECT index_metadata.id AS index_metadata_id, index_metadata.type AS index_metadata_type, index_metadata.name AS index_metadata_name, index_metadata.value AS index_metadata_value, index_metadata.updated_at AS index_metadata_updated_at \nFROM index_metadata \nWHERE index_metadata.type = %s AND index_metadata.name = %s \n LIMIT %s'] [parameters: ('lock', 'master_check', 1)] (Background on this error at: http://sqlalche.me/e/e3q8)
During startup of knowledge repo I also see this:
WARNING:knowledge_repo.app.index:Master indexing thread has died. Restarting...
Issue Analytics
- Created 5 years ago
- Comments:29 (14 by maintainers)
Top GitHub Comments
I explored this some more, and it’s actually a combination of minor issues and a lack of documentation.
When using `deploy` with MySQL 5.6, indexing fails with the Lost Connection messages. However, using `runserver` works. The catch is that, to make it work, you have to delete any data in the `index_metadata` table that may have been left behind by `deploy`.
Indexing also works with `deploy` when setting `--workers 0`, and indexing also works when triggered manually via `knowledge_repo`’s reindex command. This suggests that there is some issue specifically with the multiple workers. So a good workaround is to deploy normally, turn off automatic indexing, and do it manually via a cron job that calls `reindex`.
You also have to change the post status to `3` in the database (not sure how to do this in the UI or via the command line; `submit` only gets the status to `1`). I see no options in the `knowledge_repo` script, or direction in the webapp, about how to ‘review’ posts submitted for review and change their status. The `accept()` and `publish()` etc. methods seem not to be called by the current version of the web app or knowledge_repo script except specifically for webposts. One option is to manually navigate to `https://host:port/edit/project_name/path_to_post.kp`, but this requires that `project_name` posts are in the list of allowed posts to edit online (this can be specified in config, and only `webposts` is allowed by default).
There is an incompatibility with the MySQL 5.7 changes to Strict mode. This causes indexing to fail when using `runserver` (and also `deploy`), unlike MySQL 5.6, where it works with `runserver`.
Basically this is not currently ready to run reliably with a DB backend, but it also seems like it’s pretty close.
Finally some progress!
It appears `db.session.close()` doesn’t actually close connections when using a pool; it just returns them to the pool. Then they somehow collide with connections from other queries, like the one in `get_posts`, when using multiprocessing.
I added `db.engine.dispose()` at the start of all `IndexMetadata()` methods in `models.py`, which moved the error to the `revision()` method in `dbrepository.py`. I then added `self.engine.dispose()` at the start of that method too, and now indexing seems to work, including with the latest version of sqlalchemy.
As far as I can tell those are the only places where db connections are made during indexing.
While indexing works, I haven’t yet tested to see if this broke anything else.
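The pool behaviour described above can be seen in isolation with a small sketch. This is not Knowledge Repo code: it assumes SQLAlchemy is installed and uses an in-memory SQLite engine with an explicit `QueuePool` as a stand-in for the MySQL setup. The variable names are illustrative.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import QueuePool

# In-memory SQLite as a stand-in; QueuePool is requested explicitly so the
# pool's idle connections can be counted.
engine = create_engine("sqlite://", poolclass=QueuePool)
Session = sessionmaker(bind=engine)

session = Session()
session.execute(text("SELECT 1"))  # checks a connection out of the pool
session.close()                    # returns it to the pool, still open

idle_after_close = engine.pool.checkedin()

# dispose() closes the pooled connections and replaces the pool with a
# fresh, empty one; new connections are opened on demand.
engine.dispose()
idle_after_dispose = engine.pool.checkedin()
```

Here `idle_after_close` counts the connection that `close()` handed back to the pool, while `idle_after_dispose` reflects the emptied pool. SQLAlchemy's pooling documentation also discusses disposing of the engine in each child process when using multiprocessing, which seems consistent with the fix above.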