Change the architecture
Currently, the API server, the cache/database, the assets, and the workers (which generate the data) run on the same machine and share the same resources, which is the source of various issues:
- a worker that requires a lot of resources can block the server (https://grafana.huggingface.co/d/rYdddlPWk/node-exporter-full?orgId=2&refresh=1m&from=now-24h&to=now&var-DS_PROMETHEUS=HF Prometheus&var-job=node_exporter_metrics&var-node=datasets-preview-backend&var-diskdevices=[a-z]%2B|nvme[0-9]%2Bn[0-9]%2B)
- we have to kill the warming process if memory usage is too high to preserve the API resources, which requires manual supervision
- also related to resource limits: we currently run the warming and refreshing tasks on one dataset at a time, although they are logically independent and could be launched on different workers in parallel, which would reduce the duration of these processes
- also: I’m not sure if the current implementation of the database/cache (diskcache) really supports concurrent access (it does, but I’m not sure I used it adequately in the code, see http://www.grantjenks.com/docs/diskcache/tutorial.html / `cache.close()`)
- having everything in the same application also means that everything is developed in Python (since the workers have to be in Python), while managing a queue and async processes could be easier in node.js, for example
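To make the point about independent per-dataset tasks concrete, here is a minimal sketch of running them in parallel. It is not the current code: `warm_dataset` and `warm_all` are hypothetical names, the returned dictionaries are made up for illustration, and a thread pool stands in for what would more likely be separate processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def warm_dataset(dataset: str) -> dict:
    # Hypothetical stand-in for the real warming/refreshing work on one dataset.
    if not dataset:
        raise ValueError("empty dataset name")
    return {"dataset": dataset, "status": "valid"}


def warm_all(datasets: list[str], max_workers: int = 4) -> dict[str, dict]:
    """Run one independent task per dataset and collect the results by name."""
    results: dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit every dataset up front; the tasks do not depend on each other.
        futures = {executor.submit(warm_dataset, name): name for name in datasets}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as err:
                # One failing dataset does not stop the others.
                results[name] = {"dataset": name, "status": "error", "cause": str(err)}
    return results
```

Calling `warm_all(["glue", "squad"])` would process both datasets concurrently instead of one after the other, and a failure on one dataset is recorded without interrupting the rest.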
The architecture I imagine would have these components:
- API server
- queue
- database
- file storage
- workers
The API server would:
- deliver the data (`/rows`, `/splits`, `/valid`, `/cache-reports`, `/cache`, `/healthcheck`): directly querying the database. If not in the database, return an error.
- serve the assets from the storage
- command the queue (`/webhook`, `/warm`, `/refresh`) -> add authentication? Send new tasks to the queue
The queue would:
- manage the tasks sent by the API server
- launch workers for these tasks
- add/update/delete the data in the database and the assets in the storage
The database would:
- store the datasets’ data
The storage would:
- store the assets (image files for example)
The workers would:
- compute the data for one dataset
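As a rough illustration of how these components could fit together, here is a single-process sketch. Everything in it is an assumption made for the example: in-memory dictionaries stand in for the database and the storage, `queue.Queue` stands in for a real task queue, a thread plays the role of the queue runner, and the function names, the payloads and the 404 status code are hypothetical.

```python
import queue
import threading

database: dict[str, dict] = {}       # stands in for the database
storage: dict[str, bytes] = {}       # stands in for the file storage
tasks: queue.Queue = queue.Queue()   # stands in for the queue


def compute_dataset(dataset: str) -> tuple[dict, dict[str, bytes]]:
    # Worker: compute the data (and the assets) for one dataset.
    data = {"splits": ["train", "test"]}  # made-up payload
    assets = {f"{dataset}/image.jpg": b"fake image bytes"}
    return data, assets


def run_queue() -> None:
    # Queue: take the tasks sent by the API server, launch a worker for each,
    # then write the worker's output to the database and the storage.
    while True:
        dataset = tasks.get()
        if dataset is None:  # sentinel used by this sketch to stop the loop
            break
        data, assets = compute_dataset(dataset)
        database[dataset] = data
        storage.update(assets)


def start_queue() -> threading.Thread:
    runner = threading.Thread(target=run_queue, daemon=True)
    runner.start()
    return runner


def post_refresh(dataset: str) -> dict:
    # API server, /refresh: only sends a new task to the queue.
    tasks.put(dataset)
    return {"status": "queued"}


def get_splits(dataset: str) -> tuple[int, dict]:
    # API server, /splits: query the database directly, error if missing.
    if dataset not in database:
        return 404, {"error": "dataset not in the database"}
    return 200, database[dataset]
```

The point of the split is that `get_splits` never computes anything: it reads from the database or returns an error, so a heavy worker cannot block the API server.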
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (1 by maintainers)
Partially done in https://github.com/huggingface/datasets-preview-backend/releases/tag/0.14.0:
Done