DEPOSITORY_DIR will indefinitely collect files #6304

Open
grantfitzsimmons opened this issue Mar 7, 2025 · 2 comments
Labels
2 - Exporting Data Issues that are related to exporting data to DwC, GBIF, IPT, Web Portal, etc.

Comments

@grantfitzsimmons
Member

Describe the bug
When exporting a query or building a DwCA export, Specify writes the resulting file to the DEPOSITORY_DIR set for the installation in the specify_settings.py file:

# Asynchronously generated exports are placed in
# the following directory. This includes query result
# exports and Darwin Core archives.
DEPOSITORY_DIR = '/home/specify/specify_depository'

If Specify 7 is installed via Docker, you can open a shell in a container from the host system and find the depository at /volumes/static-files/depository, for example (here's where the dir is created and here's where it is set):

$ docker exec -it specifycloud-herb-rbge-1 bash

Inside the container, you can see how much space is being used here:

specify@138cc69fa32c:/opt/specify7$ du -ch /volumes/static-files/depository/* | grep total$
818M	total

Just for this instance, 818 MB of space is being used up by query result exports dating back to 2023!
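
For anyone who wants the same numbers without `du`, here is a minimal Python sketch that totals the file sizes under a depository directory and reports the oldest modification time. The function name `depository_usage` is hypothetical and not part of Specify. It sums apparent file sizes, so the total can differ slightly from what `du` reports for disk blocks.

```python
import os
from datetime import datetime, timezone


def depository_usage(depository_dir):
    """Return (total_bytes, oldest_mtime) for all files under depository_dir.

    Hypothetical helper, not part of Specify. oldest_mtime is a UTC
    datetime, or None when the directory contains no files.
    """
    total_bytes = 0
    oldest = None
    for root, _dirs, files in os.walk(depository_dir):
        for name in files:
            # lstat does not follow symlinks, so broken links cannot raise here.
            st = os.lstat(os.path.join(root, name))
            total_bytes += st.st_size
            if oldest is None or st.st_mtime < oldest:
                oldest = st.st_mtime
    oldest_dt = None if oldest is None else datetime.fromtimestamp(oldest, tz=timezone.utc)
    return total_bytes, oldest_dt
```

Inside a container this could be called as `depository_usage('/volumes/static-files/depository')`, using the example path above.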

To Reproduce
Steps to reproduce the behavior:

  1. Navigate to the DEPOSITORY_DIR location on the container/system hosting Specify 7
  2. See that there are many export files, dating back to when this Specify instance was first set up

Expected behavior
There should be a way to clear this directory out automatically on a regular schedule, with a default retention period (7 days) and the option to configure any other limit (30 days, 6 months, 1 year).
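
As a rough illustration of the requested behavior, and not an existing Specify feature, the sketch below deletes files in a depository directory whose modification time is older than a configurable number of days. The function name `purge_old_exports` and its parameters are assumptions made for this example. It could be run from cron or a periodic task until a built-in option exists.

```python
import time
from pathlib import Path


def purge_old_exports(depository_dir, max_age_days=7, now=None, dry_run=False):
    """Remove regular files in depository_dir older than max_age_days.

    Hypothetical helper, not part of Specify. Returns the paths that were
    removed (or, with dry_run=True, that would have been removed).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # retention window in seconds
    removed = []
    for entry in Path(depository_dir).iterdir():
        # Only touch regular files; leave subdirectories and symlinks alone.
        if entry.is_symlink() or not entry.is_file():
            continue
        if entry.stat().st_mtime < cutoff:
            if not dry_run:
                entry.unlink()
            removed.append(entry)
    return removed
```

Calling it with `dry_run=True` first lists what would be deleted, which is a sensible check before enabling it on a production depository.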

Version
This is happening in v7.9.6.2 and v7.10.0, but has almost certainly been an issue since the system was introduced.

Reported By
Bill Kuntz at Florida Museum of Natural History, University of Florida on the Speciforum

@grantfitzsimmons grantfitzsimmons added the 2 - Exporting Data Issues that are related to exporting data to DwC, GBIF, IPT, Web Portal, etc. label Mar 7, 2025
@grantfitzsimmons
Member Author

grantfitzsimmons commented Mar 7, 2025

I made a script that iterates through all Specify 7 containers and reports the space usage of static files for each region (not just each instance). Just as a start, I found the following totals from the big 3:

Europe:
1280.00 MB

United States:
5780.40 MB

Canada:
2490.30 MB

@specifysoftware

This issue has been mentioned on Specify Community Forum. There might be relevant details there:

https://discourse.specifysoftware.org/t/depository-dir-cleanup/2403/2
