This issue is critical from an infrastructure perspective. Please let me know what you think:
Some of the Docker images are unexpectedly large (e.g., 7-8 GB or more). Because of that:
- the system fails to execute these jobs because pulling the image times out
- even if we work around the timeout by pre-downloading the huge image, the cluster nodes' storage fills up very quickly, rendering the system unusable
I think we should reconsider our policy on Docker image sizes and impose a size limit. If we don't, we will keep seeing failures and our storage will be consumed very quickly. Also, if we run the system on a commercial cluster, the infrastructure cost will rise very quickly and it will become nearly impossible to scale.
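As a concrete starting point, here is a minimal sketch of how such a limit could be enforced, e.g. in CI right after an image is built, or on a node after the pull step. The 2 GB threshold is only a placeholder, and the script simply shells out to `docker image inspect`, so it assumes the image is already present locally; for rejecting images before they are ever pulled we would need to query the registry instead.

```python
import subprocess
import sys

# Placeholder limit; the actual threshold is something we would need to agree on.
MAX_IMAGE_SIZE_BYTES = 2 * 1024**3  # 2 GB


def image_size_bytes(image: str) -> int:
    """Return the local size of a Docker image in bytes via `docker image inspect`."""
    result = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Size}}", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    image = sys.argv[1]
    size_gb = image_size_bytes(image) / 1024**3
    limit_gb = MAX_IMAGE_SIZE_BYTES / 1024**3
    if size_gb > limit_gb:
        print(f"{image}: {size_gb:.1f} GB exceeds the {limit_gb:.0f} GB limit")
        sys.exit(1)
    print(f"{image}: {size_gb:.1f} GB is within the limit")
```

Running something like `python check_image_size.py some/image:tag` as a gate would reject oversized images early, instead of letting them time out or fill up the nodes.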
I also wonder how these images get so big. What kind of software requires so many gigabytes, considering that desktop OSes loaded with heavy applications fit in just 5 to 10 GB? Is it just software, or do we allow people to bundle data into the images as well?
Putting any kind of data inside the images is a highly inefficient practice because it makes our system too expensive to run (I can explain this further if you like).
What do you think we should do?