Skip to content

Thumbnails API for SourceImages#1306

Open
loppear wants to merge 47 commits into
mainfrom
feat/thumbnail-source-images
Open

Thumbnails API for SourceImages#1306
loppear wants to merge 47 commits into
mainfrom
feat/thumbnail-source-images

Conversation

@loppear

@loppear loppear commented May 14, 2026

Copy link
Copy Markdown
Collaborator

Summary

Provide thumbnails for SourceImages, stored in default_storage, generated on image request and paths "cached" in new table SourceImageThumbnail.

List of Changes

  • Settings THUMBNAILS to define storage prefix and pre-defined sizes.
  • New model SourceImageThumbnail
  • New API read-only Viewset at /api/v2/captures/thumbnails/<source_image_id> (that only implements detail) for proxying image requests and perform thumbnailing
    • If a cached record for the requested size (requested by label, queried by dimension) does not exist, retrieves SourceImage from deployment datasource and resizes to save to default storage, returning a redirect to the media URL.
  • Modifies captures API responses to include a thumbnails property with urls keyed by thumbnail size label, e.g. .thumbnails.small.

Update 2026-06-10 (follow-up commits, several promoted from the #1331 proposal so this PR is deployable on its own):

  • Thumbnail rows now record the requested spec width instead of the encoder's output, so Pillow's aspect-ratio rounding (239px for a 240px spec) no longer causes affected thumbnails to regenerate on every request.
  • The freshness check trusts the cached row without issuing a storage existence check (an S3 HEAD request) per warm request.
  • Thumbnails are encoded as progressive JPEGs (optimize pass, quality 82) so browsers can paint a coarse preview while bytes arrive.
  • The thumbnail redirect carries Cache-Control: private, max-age=300, letting browsers reuse it across page views instead of re-paying the round trip per thumbnail.
  • Rows with a blank stored path (failed or interrupted generation) regenerate instead of redirecting to the storage root.

Also landed during review (earlier follow-up commits, not in the original description):

  • First pass of frontend wiring: the captures table and gallery use the small thumbnail, the session detail viewer and session-list example captures use the medium one, all falling back to the full-size image URL when no thumbnail is present.
  • Deleting a thumbnail row (directly or via cascade from its capture) also deletes the blob from storage, so removed captures don't leave dead thumbnail files behind.
  • New Django admin panel for thumbnails; the capture admin marks size, last_modified, checksum, and checksum_algorithm read-only since they are populated from the source file during sync/upload.
  • Image fetching during generation now has connect/read timeouts (5s/30s) so a stalled upstream storage can't tie up web workers (fetch_image_content moved to ami/utils/media.py).

Related Issues

Implements #236

Detailed Description

Thumbnails are not generated during capture requests, but delayed until frontend requests image, at which point we check if there is a valid thumbnail already or need to generate.

(Update 2026-06-10, design context worth recording: lazy generation is a deliberate fit for how captures are accessed, not just an implementation convenience. A project can hold on the order of a million captures, but users review occurrence crops, not raw captures — only a few hundred captures are ever opened, typically while troubleshooting or checking the intermediate context around a detection. Generating thumbnails eagerly (at upload/sync) would spend storage and compute on images that are never viewed; generating on first request means cost scales with what people actually look at.)

This does NOT implement front-end changes to use this thumbnails property of captures. Happy to include a first stab at that for the captures list gallery before this is accepted.

(Update 2026-06-10: the first stab at the frontend is now included — captures list/gallery and session views use the new thumbnails with a fallback to the original image URL.)

Other considerations implicated:

  • Monitoring default media storage usage, as this is the first larger use of it?
  • Monitoring response time and django worker availability in performing thumbnailing in-request.

How to Test the Changes

Adds tests (manage.py test -k thumbnail) for the new API view and modifications to serializing captures.

Deployment Notes

Adds database migration for new SourceImageThumbnail table (now migrations 0091 and 0092 after a renumber following the merge of main)

Checklist

  • [ x ] I have tested these changes appropriately.
  • [ x ] I have added and/or modified relevant tests.
  • [ x ] I updated relevant documentation or comments.
  • [ x ] I have verified that this PR follows the project's coding standards.
  • [ x ] Any dependent changes have already been merged to main.

Summary by CodeRabbit

  • New Features

    • Added thumbnail support with configurable sizes and storage prefix; capture list/detail include thumbnail URLs and a new thumbnail retrieval endpoint that redirects to stored images.
    • Admin UI: thumbnails and related source/deployment attributes are shown and filterable in the admin list view.
  • Behavior

    • Thumbnails are cached and automatically regenerated when sources or size settings change.
  • Tests

    • Added tests for generation, retrieval, size selection, caching, and regeneration.

Review Change Stack

@netlify

netlify Bot commented May 14, 2026

Copy link
Copy Markdown

Deploy Preview for antenna-ssec canceled.

Name Link
🔨 Latest commit 63d103f
🔍 Latest deploy log https://app.netlify.com/projects/antenna-ssec/deploys/6a2a148aba1e9d0007f72dd0

@netlify

netlify Bot commented May 14, 2026

Copy link
Copy Markdown

Deploy Preview for antenna-preview canceled.

Name Link
🔨 Latest commit 63d103f
🔍 Latest deploy log https://app.netlify.com/projects/antenna-preview/deploys/6a2a148ab870a9000846e75b

@loppear loppear requested a review from mihow May 14, 2026 16:41
@coderabbitai

coderabbitai Bot commented May 14, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds end-to-end thumbnail support: THUMBNAILS settings, SourceImageThumbnail model and migration, SourceImage.find_or_generate_thumbnail_for_label, a thumbnail retrieval viewset (retrieve → redirect), serializer thumbnails field, admin registration, router entry, and tests for generation and API behavior.

Changes

SourceImage Thumbnails

Layer / File(s) Summary
Thumbnail settings
config/settings/base.py
Adds THUMBNAILS config with STORAGE_PREFIX and SIZES (small, medium).
Model: generation and thumbnail model
ami/main/models.py
Adds SourceImage.find_or_generate_thumbnail_for_label, imports to read source images, and SourceImageThumbnail model with metadata and nullable FK to SourceImage.
DB migration
ami/main/migrations/0085_sourceimagethumbnail.py
Creates SourceImageThumbnail table with timestamp and metadata fields, index, and unique constraint.
API: thumbnail viewset
ami/main/api/views.py
Adds SourceImageThumbnailViewSet (list → 404, retrieve → validates label, delegates to model generator, redirects to storage URL).
Serializer: include thumbnail URLs
ami/main/api/serializers.py
Adds thumbnails SerializerMethodField to SourceImageListSerializer and builds per-label detail URLs using settings.THUMBNAILS.
Admin: register thumbnail model
ami/main/admin.py
Registers SourceImageThumbnailAdmin with list_display/list_filter and optimized get_queryset.
Tests: thumbnail view behavior
ami/main/tests.py
Adds TestImageThumbnailViews, NEW_THUMBNAIL_SETTINGS, fixture usage, and tests for list/detail responses, label validation, regeneration triggers, and serializer thumbnail links.
API router registration
config/api_router.py
Registers captures/thumbnails route to SourceImageThumbnailViewSet with basename="sourceimagethumbnail".

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

A rabbit hops and crafts a thumb,
From big bright captures, pixels come,
It stores a tiny trail of light,
Redirects the viewer to delight,
Small previews dance — a joyous hum 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main feature added in this PR: a new thumbnails API for SourceImages.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description covers all major required sections: Summary, List of Changes, Related Issues, Detailed Description, How to Test, Deployment Notes, and Checklist. All required information is present and complete.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/thumbnail-source-images

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@loppear loppear requested a review from annavik May 14, 2026 16:41

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (8)
ami/main/models.py (2)

2401-2401: ⚖️ Poor tradeoff

Consider CASCADE instead of SET_NULL for the source_image foreign key.

The current on_delete=models.SET_NULL with null=True will orphan thumbnail records when the parent SourceImage is deleted. Since thumbnails are derived artifacts with no value without their source image, on_delete=models.CASCADE would be more appropriate to automatically clean up thumbnails when their source is deleted.

♻️ Proposed fix
-    source_image = models.ForeignKey(SourceImage, on_delete=models.SET_NULL, null=True, related_name="thumbnails")
+    source_image = models.ForeignKey(SourceImage, on_delete=models.CASCADE, related_name="thumbnails")

Note: This would require a new migration and removing null=True may require a two-step migration if there are existing NULL records.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/models.py` at line 2401, Change the Thumbnail model's source_image
ForeignKey to use on_delete=models.CASCADE instead of SET_NULL and remove
null=True (i.e., update the source_image field referencing SourceImage and the
related_name "thumbnails"); then create the Django migration(s) to apply this
change and, if there are existing NULL source_image rows, perform a two-step
migration (first backfill or delete orphaned thumbnails, then make the field
non-nullable) so the DB transition is safe.

2390-2402: ⚡ Quick win

Add unique constraint and index for (source_image, label).

The model allows duplicate thumbnail records for the same source_image and label combination. Consider adding a unique constraint to prevent duplicates and an index to optimize thumbnail lookups by the ViewSet.

♻️ Proposed fix
 `@final`
 class SourceImageThumbnail(BaseModel):
     """A thumbnail cache of a SourceImage"""
 
     path = models.CharField(max_length=255, blank=True)
     label = models.CharField(max_length=255, blank=True, null=True)
     width = models.IntegerField(null=True, blank=True)
     height = models.IntegerField(null=True, blank=True)
     size = models.BigIntegerField(null=True, blank=True)
     last_modified = models.DateTimeField(null=True, blank=True)
 
     source_image = models.ForeignKey(SourceImage, on_delete=models.SET_NULL, null=True, related_name="thumbnails")
+
+    class Meta:
+        constraints = [
+            models.UniqueConstraint(
+                fields=["source_image", "label"],
+                name="unique_source_image_thumbnail"
+            ),
+        ]
+        indexes = [
+            models.Index(fields=["source_image", "label"]),
+        ]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/models.py` around lines 2390 - 2402, The SourceImageThumbnail model
allows duplicates for the (source_image, label) pair; add a database-level
uniqueness constraint and an index to speed lookups by that pair: update the
SourceImageThumbnail class to include a Meta with a
UniqueConstraint(fields=["source_image","label"], name="uniq_sourceimage_label")
and an Index(fields=["source_image","label"], name="idx_sourceimage_label"),
then create and run the Django migration so the DB enforces uniqueness and
provides the lookup index.
config/settings/base.py (1)

592-603: ⚡ Quick win

Consider adding explicit format and quality settings.

The configuration only specifies width for each size preset. Consider adding explicit settings for image format (JPEG/PNG/WebP) and quality/compression to make the thumbnail generation behavior more predictable and configurable.

💡 Example enhancement
 # Sizes for Source Image Thumbnails
 THUMBNAILS = {
     "STORAGE_PREFIX": "thumbnails/",
+    "FORMAT": "JPEG",  # or "WebP" for better compression
+    "QUALITY": 85,  # JPEG quality 1-100
     "SIZES": {
         "small": {
-            "width": 240
+            "width": 240,
+            # height calculated to preserve aspect ratio
         },
         "medium": {
-            "width": 1024
+            "width": 1024,
         }
     }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config/settings/base.py` around lines 592 - 603, The THUMBNAILS config
currently only defines width for each preset (THUMBNAILS -> SIZES ->
"small"/"medium") which makes output format and compression unpredictable;
extend each preset to include explicit format (e.g., "format":
"jpeg"|"png"|"webp") and quality (e.g., "quality": 75) keys, and add top-level
defaults under THUMBNAILS (e.g., "DEFAULT_FORMAT" and "DEFAULT_QUALITY") so
thumbnail generation code can read format/quality from THUMBNAILS or fall back
to defaults; update any thumbnail generator (references: THUMBNAILS, SIZES,
"small", "medium") to use these new keys when producing images.
ami/main/tests.py (1)

226-226: 💤 Low value

Remove unnecessary f-string prefix.

The f-string on line 226 contains no placeholders and should use a regular string literal.

♻️ Proposed fix
     def test_thumbnail_no_list(self):
-        response = self.client.get(f"/api/v2/captures/thumbnails/")
+        response = self.client.get("/api/v2/captures/thumbnails/")
         self.assertEqual(response.status_code, 404)

As per coding guidelines: Ruff static analysis detected F541 (f-string without placeholders).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/tests.py` at line 226, The test uses an unnecessary f-string when
calling self.client.get(f"/api/v2/captures/thumbnails/"); change it to a regular
string literal by removing the f-prefix so the call becomes
self.client.get("/api/v2/captures/thumbnails/") to eliminate the F541 warning
and satisfy Ruff static analysis.
ami/main/api/serializers.py (1)

1103-1112: ⚡ Quick win

Add error handling for THUMBNAILS settings access.

The get_thumbnails method accesses settings.THUMBNAILS["SIZES"] without checking if the setting exists. If THUMBNAILS is not configured (e.g., in a test environment or during migration), this will raise a KeyError during serialization, breaking the entire captures API response.

Consider adding a guard to return None or an empty dict if the setting is missing:

🛡️ Proposed fix
     def get_thumbnails(self, obj: SourceImage) -> dict | None:
+        if not hasattr(settings, 'THUMBNAILS') or 'SIZES' not in settings.THUMBNAILS:
+            return None
         return {
             label: reverse_with_params(
                 "sourceimagethumbnail-detail",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/api/serializers.py` around lines 1103 - 1112, The get_thumbnails
serializer method currently assumes settings.THUMBNAILS["SIZES"] exists and will
raise if THUMBNAILS is missing; modify get_thumbnails to guard access to
settings.THUMBNAILS (and/or the "SIZES" key) and return None or an empty dict
when the config is absent, e.g., check for getattr(settings, "THUMBNAILS", None)
and/or use .get("SIZES") before iterating; keep the existing reverse_with_params
logic for when sizes are present so callers still receive the thumbnail URLs.
ami/main/api/views.py (3)

719-719: 💤 Low value

Clarify queryset model choice or update naming.

The queryset is SourceImage.objects.all() but the viewset is named SourceImageThumbnailViewSet and registered with basename sourceimagethumbnail. This means the URL PK parameter refers to a SourceImage ID (not a thumbnail ID), which may confuse API consumers who expect /thumbnails/<thumbnail_id>.

If this design is intentional (thumbnails accessed by source image ID + label), consider adding a docstring clarifying that the PK is a source image ID, or renaming the viewset to better reflect this pattern.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/api/views.py` at line 719, The viewset SourceImageThumbnailViewSet
currently uses queryset = SourceImage.objects.all() while being registered under
basename "sourceimagethumbnail", which makes the URL PK refer to a SourceImage
ID rather than a thumbnail ID; either update the viewset name/registration to
reflect it operates on SourceImage (e.g., rename to
SourceImageThumbnailProxyViewSet or change basename to "sourceimage") or keep
the name and add a clear docstring on SourceImageThumbnailViewSet stating that
the PK is a SourceImage ID and that thumbnails are accessed by source image ID +
label; update any related API docs/registration to match the chosen naming so
consumers aren’t confused.

771-771: 💤 Low value

Verify deletion strategy for existing thumbnails.

Line 771 deletes all prior thumbnails matching the label before creating a new one. This prevents stale thumbnails when settings change (e.g., dimensions updated), but creates a small window where no thumbnail exists if the subsequent create fails. Consider whether this is acceptable or if the deletion should happen after successful creation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/api/views.py` at line 771, Current code deletes existing thumbnails
via obj.thumbnails.filter(label=label).delete() before creating a new one,
creating a window with no thumbnail if creation fails; change the flow to delete
only after a successful creation by first creating/saving the new thumbnail (the
block that constructs/saves the new thumbnail object where save() or create(...)
is called) and then calling
obj.thumbnails.filter(label=label).exclude(pk=new_thumb.pk).delete(), or wrap
creation + swap in a transaction (transaction.atomic) so deletion happens only
on success; ensure you reference obj.thumbnails.filter(label=label) and the new
thumbnail instance (new_thumb) when implementing this change.

727-727: 💤 Low value

Remove unnecessary f-string prefix.

The string has no placeholders, so the f prefix is redundant.

🔧 Suggested fix
-        raise api_exceptions.NotFound(detail=f"No collection of thumbnails")
+        raise api_exceptions.NotFound(detail="No collection of thumbnails")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/api/views.py` at line 727, The raise statement uses an unnecessary
f-string; replace the f-prefixed string in the api_exceptions.NotFound call so
the detail argument is a normal string (change
api_exceptions.NotFound(detail=f"No collection of thumbnails") to
api_exceptions.NotFound(detail="No collection of thumbnails")), leaving the
exception type and detail parameter unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ami/main/admin.py`:
- Around line 314-319: get_queryset is calling select_related with an invalid
path "deployment__data_source" which causes a FieldError because
SourceImageThumbnail has no direct deployment FK; change the select_related call
in get_queryset to traverse via source_image (e.g. use "source_image",
"source_image__deployment", and "source_image__deployment__data_source" or
combine into a single list without the incorrect "deployment__data_source") so
the relationships are resolved through source_image__deployment__data_source.

In `@ami/main/api/views.py`:
- Around line 759-766: The BytesIO buffer is left at EOF after img.save so
default_storage.save reads nothing; before calling default_storage.save (where
thumbnail_key and thumbnail_path are used) reset the buffer position to the
start (seek to 0) or pass the earlier captured contents instead so the saved
file contains the JPEG data; update the code around buffer, img.save and
default_storage.save to ensure buffer.seek(0) is called (or use contents) prior
to saving.
- Around line 746-748: Replace the unconventional except clause that references
ami.utils.s3.botocore.exceptions.ClientError with the standard import pattern:
import botocore.exceptions (or from botocore.exceptions import ClientError) at
the top of views.py, catch botocore.exceptions.ClientError (or ClientError) in
the except block that currently logs via logger.error for obj.path, and re-raise
the api_exceptions.NotFound using "raise ... from e" to preserve the original
exception context while keeping the existing logger.error and NotFound detail
message.

In `@ami/main/tests.py`:
- Around line 260-269: The test test_thumbnail_settings_change_regenerates
mutates settings.THUMBNAILS["SIZES"]["small"]["width"] directly which can
pollute other tests; change it to save the original value before mutation and
restore it after the test (either by restoring in tearDown of the TestCase that
contains test_thumbnail_settings_change_regenerates or by using a
try/finally/context-manager inside the test) so the original settings are always
reinstated; reference the test method name
test_thumbnail_settings_change_regenerates and the settings key
THUMBNAILS["SIZES"]["small"]["width"] when implementing the save/restore logic.

---

Nitpick comments:
In `@ami/main/api/serializers.py`:
- Around line 1103-1112: The get_thumbnails serializer method currently assumes
settings.THUMBNAILS["SIZES"] exists and will raise if THUMBNAILS is missing;
modify get_thumbnails to guard access to settings.THUMBNAILS (and/or the "SIZES"
key) and return None or an empty dict when the config is absent, e.g., check for
getattr(settings, "THUMBNAILS", None) and/or use .get("SIZES") before iterating;
keep the existing reverse_with_params logic for when sizes are present so
callers still receive the thumbnail URLs.

In `@ami/main/api/views.py`:
- Line 719: The viewset SourceImageThumbnailViewSet currently uses queryset =
SourceImage.objects.all() while being registered under basename
"sourceimagethumbnail", which makes the URL PK refer to a SourceImage ID rather
than a thumbnail ID; either update the viewset name/registration to reflect it
operates on SourceImage (e.g., rename to SourceImageThumbnailProxyViewSet or
change basename to "sourceimage") or keep the name and add a clear docstring on
SourceImageThumbnailViewSet stating that the PK is a SourceImage ID and that
thumbnails are accessed by source image ID + label; update any related API
docs/registration to match the chosen naming so consumers aren’t confused.
- Line 771: Current code deletes existing thumbnails via
obj.thumbnails.filter(label=label).delete() before creating a new one, creating
a window with no thumbnail if creation fails; change the flow to delete only
after a successful creation by first creating/saving the new thumbnail (the
block that constructs/saves the new thumbnail object where save() or create(...)
is called) and then calling
obj.thumbnails.filter(label=label).exclude(pk=new_thumb.pk).delete(), or wrap
creation + swap in a transaction (transaction.atomic) so deletion happens only
on success; ensure you reference obj.thumbnails.filter(label=label) and the new
thumbnail instance (new_thumb) when implementing this change.
- Line 727: The raise statement uses an unnecessary f-string; replace the
f-prefixed string in the api_exceptions.NotFound call so the detail argument is
a normal string (change api_exceptions.NotFound(detail=f"No collection of
thumbnails") to api_exceptions.NotFound(detail="No collection of thumbnails")),
leaving the exception type and detail parameter unchanged.

In `@ami/main/models.py`:
- Line 2401: Change the Thumbnail model's source_image ForeignKey to use
on_delete=models.CASCADE instead of SET_NULL and remove null=True (i.e., update
the source_image field referencing SourceImage and the related_name
"thumbnails"); then create the Django migration(s) to apply this change and, if
there are existing NULL source_image rows, perform a two-step migration (first
backfill or delete orphaned thumbnails, then make the field non-nullable) so the
DB transition is safe.
- Around line 2390-2402: The SourceImageThumbnail model allows duplicates for
the (source_image, label) pair; add a database-level uniqueness constraint and
an index to speed lookups by that pair: update the SourceImageThumbnail class to
include a Meta with a UniqueConstraint(fields=["source_image","label"],
name="uniq_sourceimage_label") and an Index(fields=["source_image","label"],
name="idx_sourceimage_label"), then create and run the Django migration so the
DB enforces uniqueness and provides the lookup index.

In `@ami/main/tests.py`:
- Line 226: The test uses an unnecessary f-string when calling
self.client.get(f"/api/v2/captures/thumbnails/"); change it to a regular string
literal by removing the f-prefix so the call becomes
self.client.get("/api/v2/captures/thumbnails/") to eliminate the F541 warning
and satisfy Ruff static analysis.

In `@config/settings/base.py`:
- Around line 592-603: The THUMBNAILS config currently only defines width for
each preset (THUMBNAILS -> SIZES -> "small"/"medium") which makes output format
and compression unpredictable; extend each preset to include explicit format
(e.g., "format": "jpeg"|"png"|"webp") and quality (e.g., "quality": 75) keys,
and add top-level defaults under THUMBNAILS (e.g., "DEFAULT_FORMAT" and
"DEFAULT_QUALITY") so thumbnail generation code can read format/quality from
THUMBNAILS or fall back to defaults; update any thumbnail generator (references:
THUMBNAILS, SIZES, "small", "medium") to use these new keys when producing
images.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4960fc79-aeb3-4154-9954-83da5c3b9892

📥 Commits

Reviewing files that changed from the base of the PR and between aeb57c1 and 8077c2a.

📒 Files selected for processing (9)
  • ami/main/admin.py
  • ami/main/api/serializers.py
  • ami/main/api/views.py
  • ami/main/migrations/0085_sourceimagethumbnail.py
  • ami/main/models.py
  • ami/main/tests.py
  • ami/utils/s3.py
  • config/api_router.py
  • config/settings/base.py
💤 Files with no reviewable changes (1)
  • ami/utils/s3.py

Comment thread ami/main/admin.py Outdated
Comment thread ami/main/api/views.py Outdated
Comment thread ami/main/api/views.py Outdated
Comment thread ami/main/tests.py
Comment thread ami/main/api/serializers.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
ami/main/tests.py (1)

238-238: 💤 Low value

Remove unnecessary f-string prefix.

The f-string has no placeholders.

📝 Suggested change
-        response = self.client.get(f"/api/v2/captures/thumbnails/")
+        response = self.client.get("/api/v2/captures/thumbnails/")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/tests.py` at line 238, In tests.py replace the unnecessary f-string
used in the self.client.get call by removing the "f" prefix so the literal URL
string passed to self.client.get("/api/v2/captures/thumbnails/") is a normal
string; locate the call to self.client.get(f"/api/v2/captures/thumbnails/") (the
assignment to response) and change it to use a plain string.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ami/main/api/views.py`:
- Around line 771-775: The thumbnail regeneration only deletes DB rows
(obj.thumbnails.filter(label=label).delete()) but not the underlying storage
blob; update the logic in the thumbnail creation flow (the block around
obj.thumbnails.filter(label=label).delete() and obj.thumbnails.create(...)) to
first iterate existing thumbnail instances for that label, call
default_storage.delete(existing.path) for each existing.path (checking
storage.exists if desired), then remove the DB rows and create the new thumbnail
via obj.thumbnails.create(path=thumbnail_path, ...). Ensure you use the same
identifiers: default_storage, obj.thumbnails.filter(label=label), existing.path,
and thumbnail_path so old media files are removed from storage when
regenerating.
- Around line 741-742: The thumbnail cache currently ignores source changes;
modify the thumbnail lookup in the view (where it checks thumb and
default_storage.exists(thumb.path) and in the similar block at the other
occurrence) to also compare the cached row's stored last_modified (or size)
against obj.last_modified (which is updated by sync) and treat the cache as a
miss if they differ; persist obj.last_modified into the thumbnail/cache model
when creating/updating a thumbnail so future requests can compare and invalidate
stale thumbs, and on mismatch either delete the stored thumbnail
(default_storage.delete(thumb.path)) or skip returning the redirect so a fresh
thumbnail is generated.
- Around line 753-758: The code assigns new_size only inside the "if not height"
branch causing an UnboundLocalError when size includes height; to fix it, always
build new_size: retrieve width = size["width"] and height = size.get("height",
None), compute height = int(orig_height * (width / float(orig_width))) only when
height is falsy, then set new_size = (width, height) outside/after that
conditional so img.thumbnail(new_size) always has a defined tuple (refer to
variables width, height, new_size, orig_width, orig_height and the img.thumbnail
call).
- Around line 723-724: _class attributes `_sizes` and `_prefix` are read at
import time and must be changed to per-request properties that read
settings.THUMBNAILS each access so runtime changes (e.g. via override_settings)
take effect; replace the static attributes `_sizes` and `_prefix` with `@property`
methods (or equivalent instance properties) that return
settings.THUMBNAILS["SIZES"] and settings.THUMBNAILS["STORAGE_PREFIX"]
respectively, update any references in the class (e.g., inside the thumbnail
generation methods) to use self._sizes and self._prefix so the view (tested by
test_thumbnail_settings_change_regenerates) reads fresh values on every request.

---

Nitpick comments:
In `@ami/main/tests.py`:
- Line 238: In tests.py replace the unnecessary f-string used in the
self.client.get call by removing the "f" prefix so the literal URL string passed
to self.client.get("/api/v2/captures/thumbnails/") is a normal string; locate
the call to self.client.get(f"/api/v2/captures/thumbnails/") (the assignment to
response) and change it to use a plain string.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 841ee2b6-dc36-4a0e-94fd-826bffbf030b

📥 Commits

Reviewing files that changed from the base of the PR and between 8077c2a and 04cb3b3.

📒 Files selected for processing (6)
  • ami/main/admin.py
  • ami/main/api/serializers.py
  • ami/main/api/views.py
  • ami/main/models.py
  • ami/main/tests.py
  • config/settings/base.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • config/settings/base.py
  • ami/main/api/serializers.py

Comment thread ami/main/api/views.py Outdated
Comment thread ami/main/api/views.py Outdated
Comment thread ami/main/api/views.py Outdated
Comment thread ami/main/api/views.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ami/main/api/views.py`:
- Around line 740-744: Change the thumbnail cache lookup so it doesn't filter by
configured dimensions (obj.thumbnails.get(**size))—instead fetch the thumbnail
by its label (e.g. obj.thumbnails.get(label=label) or
filter(label=label).first()) and then treat a width/height mismatch as a cache
miss in the same check that uses thumb.last_modified and default_storage.exists;
specifically compare thumb.width (and thumb.height when height is explicitly
configured in the SIZES entry) against the requested size and proceed to
delete/recreate only when dimensions differ or when last_modified/exists
indicate staleness. Ensure you still handle SourceImageThumbnail.DoesNotExist
and preserve the existing delete/upload flow.

In `@ami/main/migrations/0085_sourceimagethumbnail.py`:
- Around line 25-33: Change the ForeignKey on SourceImageThumbnail named
source_image to use on_delete=django.db.models.deletion.CASCADE and remove
null=True from the field definition in the SourceImageThumbnail model so
thumbnails are deleted with their SourceImage; add a pre_delete signal (or model
delete override) for SourceImageThumbnail that calls
django.core.files.storage.default_storage.delete(instance.path) to remove the
file from storage when a thumbnail is deleted; ensure you create a follow-up
migration to update existing rows that have source_image_id IS NULL (or
delete/associate them) before removing nullability.

In `@ami/main/tests.py`:
- Line 237: The test uses a no-op f-string in the GET request (response =
self.client.get(f"/api/v2/captures/thumbnails/")) which triggers Ruff F541;
change it to a plain string by removing the leading f (response =
self.client.get("/api/v2/captures/thumbnails/")) to eliminate the warning and
keep behavior identical.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f36689e1-7ffe-40f8-9061-c2db396a8a2d

📥 Commits

Reviewing files that changed from the base of the PR and between 04cb3b3 and 1a150a2.

📒 Files selected for processing (7)
  • ami/main/admin.py
  • ami/main/api/serializers.py
  • ami/main/api/views.py
  • ami/main/migrations/0085_sourceimagethumbnail.py
  • ami/main/models.py
  • ami/main/tests.py
  • config/api_router.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • ami/main/admin.py
  • ami/main/models.py
  • ami/main/api/serializers.py

Comment thread ami/main/api/views.py Outdated
Comment thread ami/main/migrations/0091_sourceimagethumbnail_and_more.py
Comment thread ami/main/tests.py Outdated
Comment thread config/settings/base.py
Comment thread ami/main/api/views.py
Comment thread ami/main/api/views.py Outdated
Comment thread ami/main/api/serializers.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ami/main/api/views.py`:
- Around line 726-740: The retrieve view currently issues a permanent redirect
which causes caching of the media URL; update the redirect in retrieve (the
redirect(...) call) to be temporary instead of permanent so browsers/CDNs won't
cache the target and lazy-regeneration still works (i.e., set permanent=False or
otherwise use a temporary redirect in the redirect(...) invocation in retrieve).

In `@ami/main/models.py`:
- Around line 2224-2262: The code only generates thumbnails when
self.deployment.data_source exists, causing uploaded captures (saved to
default_storage by create_source_image_from_upload()) to 404; change the
conditional in the thumbnail generation method to handle a fallback when
deployment.data_source is missing but self.path exists: attempt to read the
image via read_image(config=config, key=self.path) when deployment.data_source
is present, and otherwise open the file directly from default_storage (e.g.,
default_storage.open(self.path) and PIL.Image.open()) with the same error
handling and thumbnail creation flow (keep existing uses of read_image,
default_storage.save, self.thumbnails.create and the prior thumbnail cleanup),
so both data-source-backed and uploaded images produce thumbnails.
- Around line 2451-2463: Add a DB-level uniqueness constraint and index on the
(source_image, label) pair to prevent concurrent-creation duplicates: update the
SourceImageThumbnail model (class SourceImageThumbnail) Meta to declare a
UniqueConstraint on ('source_image', 'label') and add a matching Index for the
same columns (or use unique_together plus an Index) so lookups performed by
find_or_generate_thumbnail_for_label() are safe and fast; then create and run
the corresponding Django migration to apply the constraint.
- Around line 2247-2258: The code writes the new thumbnail to thumbnail_key via
default_storage.save() before removing existing thumbnails, which causes the
just-saved file to be deleted when iterating
self.thumbnails.filter(label=label); change the order so you first delete
existing storage objects and DB rows (iterate
self.thumbnails.filter(label=label) and call default_storage.delete(t.path) then
t.delete()) and only then call default_storage.save(thumbnail_key, buffer) and
create the new thumbnail record (thumb = self.thumbnails.create(...)). Also add
a uniqueness constraint on the SourceImageThumbnail model for (source_image,
label) (and adjust any .get(label=label) usage) to enforce the
one-thumbnail-per-label assumption.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3208e6c1-5e4e-4068-8721-2eb36d86e717

📥 Commits

Reviewing files that changed from the base of the PR and between 1a150a2 and c7484ab.

📒 Files selected for processing (2)
  • ami/main/api/views.py
  • ami/main/models.py

Comment thread ami/main/api/views.py Outdated
Comment thread ami/main/models.py Outdated
Comment thread ami/main/models.py Outdated
Comment thread ami/main/models.py
…edirects, and add index for (source_image, label).

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (3)
ami/main/tests.py (1)

237-237: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the no-op f-string to fix Ruff F541.

The f-prefix serves no purpose here since there are no placeholders.

🔧 Proposed fix
-        response = self.client.get(f"/api/v2/captures/thumbnails/")
+        response = self.client.get("/api/v2/captures/thumbnails/")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/tests.py` at line 237, The failing Ruff F541 is caused by a no-op
f-string in the test where you call
self.client.get(f"/api/v2/captures/thumbnails/"); remove the unnecessary
f-prefix so the call becomes self.client.get("/api/v2/captures/thumbnails/").
Update the line in ami/main/tests.py (the test containing the response =
self.client.get(...) call) to use a plain string literal to eliminate the lint
error.
ami/main/migrations/0085_sourceimagethumbnail.py (1)

25-33: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Migration mirrors the SET_NULL FK that orphans thumbnail rows.

Same issue as raised on the model definition in ami/main/models.py: when a SourceImage is deleted, this leaves SourceImageThumbnail rows with source_image=NULL and dangling default_storage blobs. Switch to CASCADE here in lockstep with the model change so the two stay consistent.

🐛 Proposed migration change
                 (
                     "source_image",
                     models.ForeignKey(
-                        null=True,
-                        on_delete=django.db.models.deletion.SET_NULL,
+                        on_delete=django.db.models.deletion.CASCADE,
                         related_name="thumbnails",
                         to="main.sourceimage",
                     ),
                 ),

If any rows have been created with source_image_id IS NULL during testing, you'll need a follow-up data migration to delete them before the null=True can be dropped.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/migrations/0085_sourceimagethumbnail.py` around lines 25 - 33,
Migration 0085_sourceimagethumbnail.py currently defines the
SourceImageThumbnail field "source_image" as models.ForeignKey(..., null=True,
on_delete=SET_NULL) which leaves orphaned thumbnails and blobs when a
SourceImage is deleted; change the FK to use
on_delete=django.db.models.deletion.CASCADE (and remove null=True if you also
intend to make the column non-nullable) so thumbnails are deleted together with
SourceImage, and if any records with source_image_id IS NULL exist add a small
follow-up data migration to delete those orphaned SourceImageThumbnail rows
before removing null=True.
ami/main/models.py (1)

2454-2475: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

SET_NULL orphans thumbnail rows and their storage objects.

A SourceImageThumbnail is meaningless without its SourceImage (the viewset only resolves thumbnails through a SourceImage pk), so on_delete=SET_NULL leaves rows that are unreachable from the API while still occupying the DB and the underlying default_storage blob. The unique constraint also won't deduplicate orphans — Postgres treats NULLs as distinct, so multiple orphaned (NULL, "small") rows can coexist. Prefer CASCADE (drop null=True) and pair it with file cleanup via a pre_delete signal that calls default_storage.delete(instance.path).

Note: this also requires updating the migration in ami/main/migrations/0085_sourceimagethumbnail.py to match, and any production data with source_image_id IS NULL will need to be cleaned up before the migration can drop nullability.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/models.py` around lines 2454 - 2475, Change SourceImageThumbnail to
not orphan thumbnails: set source_image ForeignKey on_delete=models.CASCADE and
remove null=True (and blank if present) on the source_image field in the
SourceImageThumbnail model; add a pre_delete signal handler for
SourceImageThumbnail that calls default_storage.delete(instance.path) to remove
the underlying file when a thumbnail row is deleted; update the migration
ami/main/migrations/0085_sourceimagethumbnail.py to reflect the nullability and
on_delete change, and ensure any production rows with source_image_id IS NULL
are deleted or remapped before applying the migration.
🧹 Nitpick comments (3)
ami/main/tests.py (1)

275-275: ⚡ Quick win

Use timezone-aware datetime for Django compatibility.

datetime.datetime.now() creates a naive datetime, which can cause comparison warnings or errors in Django projects with USE_TZ=True. Use timezone.now() from django.utils instead to ensure timezone-aware datetimes.

🔧 Proposed fix

First, ensure the import at the top of the file:

from django.utils import timezone

Then update line 275:

-        self.first_capture.last_modified = datetime.datetime.now()
+        self.first_capture.last_modified = timezone.now()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/tests.py` at line 275, Replace the naive datetime assignment to
self.first_capture.last_modified with a timezone-aware timestamp: import
timezone from django.utils (add "from django.utils import timezone") and set
self.first_capture.last_modified = timezone.now() so tests use Django-compatible
aware datetimes.
ami/main/models.py (2)

2232-2235: 💤 Low value

Close the storage file handle and handle missing-file errors symmetrically.

For the uploaded-capture branch, default_storage.open(self.path) returns a file object that is never closed (PIL keeps the underlying stream alive but the storage wrapper isn't released), and a missing key raises a generic FileNotFoundError/backend-specific error that isn't translated to ObjectDoesNotExist like the S3 branch does. Use a with block and surface a consistent ObjectDoesNotExist so the viewset can map this to a 404 instead of a 500.

♻️ Proposed refactor
-            elif self.path:
-                img = PIL.Image.open(default_storage.open(self.path))
+            elif self.path:
+                try:
+                    with default_storage.open(self.path) as fh:
+                        img = PIL.Image.open(fh)
+                        img.load()  # force decode before fh closes
+                except (FileNotFoundError, OSError) as e:
+                    logger.error(f"Could not read uploaded image for {self.path}: {e}")
+                    raise ObjectDoesNotExist(
+                        f"SourceImage with id {self.pk} media not found"
+                    ) from e
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/models.py` around lines 2232 - 2235, The uploaded-capture branch
opens a storage file with default_storage.open(self.path) and never closes it
and it raises backend-specific FileNotFoundError instead of ObjectDoesNotExist;
change the code in the SourceImage loading path (the branch that checks
self.path and uses PIL.Image.open) to open the storage file inside a with
default_storage.open(self.path) as f: block, pass f (or f.file/stream) into
PIL.Image.open so the file handle is released, and catch FileNotFoundError (and
any storage-specific missing-file exceptions) to re-raise
django.core.exceptions.ObjectDoesNotExist(f"SourceImage with id {self.pk} media
config not found") so missing files are reported symmetrically with the S3
branch.

2462-2462: 💤 Low value

last_modified with auto_now_add=True duplicates BaseModel.created_at.

find_or_generate_thumbnail_for_label always deletes the existing row and create()s a fresh one on regeneration, so last_modified is set to "now" on every (re)create and never updates afterwards — semantically identical to created_at from BaseModel. If the intent is "when was this cached thumbnail last regenerated", use auto_now=True instead (and drop null=True, blank=True, which never apply when auto-managed). Otherwise consider dropping the field and using created_at in the staleness check.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/models.py` at line 2462, The last_modified DateTimeField is declared
with auto_now_add=True, making it duplicate BaseModel.created_at because
thumbnails are deleted and recreated by find_or_generate_thumbnail_for_label via
create(); change last_modified to auto-managed "last updated" semantics by
setting auto_now=True and remove null=True, blank=True (i.e., last_modified =
models.DateTimeField(auto_now=True)), or if you prefer to rely on
BaseModel.created_at for staleness checks, drop the last_modified field entirely
and update any references in find_or_generate_thumbnail_for_label to use
created_at instead.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ami/main/api/views.py`:
- Around line 727-730: The code assumes settings.THUMBNAILS["SIZES"] (_sizes) is
non-empty and uses next(iter(_sizes)) which raises StopIteration if empty;
update the code that computes label/size (the block using _sizes, label, size
and self.request.query_params) to first check if _sizes is falsy/empty and
return a clear error (e.g., raise an APIException/ValidationError or return a
Response with a descriptive message about missing thumbnail size configuration)
instead of calling next(iter(_sizes)); then only compute label =
query_params.get("label", next(iter(_sizes))) and size = _sizes.get(label, None)
when _sizes is non-empty.

In `@ami/main/models.py`:
- Around line 2218-2223: The thumbnail re-generation check can raise TypeError
because SourceImage.last_modified is nullable (set by
create_source_image_from_upload), so guard the datetime comparison in the
condition that uses thumb.last_modified < self.last_modified: ensure both
thumb.last_modified and self.last_modified are truthy datetimes before comparing
(e.g. only perform the < check when both are non-None), keeping the existing
checks for thumb, width, and default_storage.exists; update the conditional
around thumb, size, and default_storage.exists in the method that currently
contains the if (not thumb or thumb.width != size["width"] or
thumb.last_modified < self.last_modified or not
default_storage.exists(thumb.path)) to avoid comparing None with datetime.
- Around line 2236-2248: The JPEG save fails for non-RGB source images (e.g.,
RGBA, P) because img.save(buffer, format="JPEG") cannot write those modes;
before saving the thumbnail, convert the PIL Image to an appropriate mode (e.g.,
call img.convert("RGB") when img.mode is not already JPEG-compatible) so that
the subsequent img.save(buffer, format="JPEG") succeeds; update the code around
the thumbnail creation/BytesIO block (the img.thumbnail(...), buffer =
BytesIO(), img.save(...), contents = buffer.getvalue()) to perform the
conversion on the img object prior to saving.

---

Duplicate comments:
In `@ami/main/migrations/0085_sourceimagethumbnail.py`:
- Around line 25-33: Migration 0085_sourceimagethumbnail.py currently defines
the SourceImageThumbnail field "source_image" as models.ForeignKey(...,
null=True, on_delete=SET_NULL) which leaves orphaned thumbnails and blobs when a
SourceImage is deleted; change the FK to use
on_delete=django.db.models.deletion.CASCADE (and remove null=True if you also
intend to make the column non-nullable) so thumbnails are deleted together with
SourceImage, and if any records with source_image_id IS NULL exist add a small
follow-up data migration to delete those orphaned SourceImageThumbnail rows
before removing null=True.

In `@ami/main/models.py`:
- Around line 2454-2475: Change SourceImageThumbnail to not orphan thumbnails:
set source_image ForeignKey on_delete=models.CASCADE and remove null=True (and
blank if present) on the source_image field in the SourceImageThumbnail model;
add a pre_delete signal handler for SourceImageThumbnail that calls
default_storage.delete(instance.path) to remove the underlying file when a
thumbnail row is deleted; update the migration
ami/main/migrations/0085_sourceimagethumbnail.py to reflect the nullability and
on_delete change, and ensure any production rows with source_image_id IS NULL
are deleted or remapped before applying the migration.

In `@ami/main/tests.py`:
- Line 237: The failing Ruff F541 is caused by a no-op f-string in the test
where you call self.client.get(f"/api/v2/captures/thumbnails/"); remove the
unnecessary f-prefix so the call becomes
self.client.get("/api/v2/captures/thumbnails/"). Update the line in
ami/main/tests.py (the test containing the response = self.client.get(...) call)
to use a plain string literal to eliminate the lint error.

---

Nitpick comments:
In `@ami/main/models.py`:
- Around line 2232-2235: The uploaded-capture branch opens a storage file with
default_storage.open(self.path) and never closes it and it raises
backend-specific FileNotFoundError instead of ObjectDoesNotExist; change the
code in the SourceImage loading path (the branch that checks self.path and uses
PIL.Image.open) to open the storage file inside a with
default_storage.open(self.path) as f: block, pass f (or f.file/stream) into
PIL.Image.open so the file handle is released, and catch FileNotFoundError (and
any storage-specific missing-file exceptions) to re-raise
django.core.exceptions.ObjectDoesNotExist(f"SourceImage with id {self.pk} media
config not found") so missing files are reported symmetrically with the S3
branch.
- Line 2462: The last_modified DateTimeField is declared with auto_now_add=True,
making it duplicate BaseModel.created_at because thumbnails are deleted and
recreated by find_or_generate_thumbnail_for_label via create(); change
last_modified to auto-managed "last updated" semantics by setting auto_now=True
and remove null=True, blank=True (i.e., last_modified =
models.DateTimeField(auto_now=True)), or if you prefer to rely on
BaseModel.created_at for staleness checks, drop the last_modified field entirely
and update any references in find_or_generate_thumbnail_for_label to use
created_at instead.

In `@ami/main/tests.py`:
- Line 275: Replace the naive datetime assignment to
self.first_capture.last_modified with a timezone-aware timestamp: import
timezone from django.utils (add "from django.utils import timezone") and set
self.first_capture.last_modified = timezone.now() so tests use Django-compatible
aware datetimes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 331b6a24-7049-49b5-963a-a068cdf22aaf

📥 Commits

Reviewing files that changed from the base of the PR and between c7484ab and 182c035.

📒 Files selected for processing (4)
  • ami/main/api/views.py
  • ami/main/migrations/0085_sourceimagethumbnail.py
  • ami/main/models.py
  • ami/main/tests.py

Comment thread ami/main/api/views.py
Comment thread ami/main/models.py Outdated
Comment thread ami/main/models.py
mihow and others added 4 commits June 4, 2026 14:37
Two F541 (Ruff: f-string without placeholders) hits, plus a small
behavioural cleanup on the list action.

* `SourceImageThumbnailViewSet.list` now raises
  `MethodNotAllowed` (405) instead of `NotFound` (404). The route exists
  for `/captures/thumbnails/<pk>/?label=...` only; listing has no
  defined semantics. 405 sets the Allow header and is the correct REST
  response.
* Updated `test_thumbnail_no_list` to expect 405 and to use a plain
  string literal (the previous test used an f-string with no
  placeholders).

Closes review thread on ami/main/tests.py L237 (Ruff F541).

Co-Authored-By: Claude <noreply@anthropic.com>
When `settings.THUMBNAILS['SIZES']` is empty (misconfiguration), the
default-label fallback `next(iter(_sizes))` raised `StopIteration` and
surfaced as a 500 with no useful detail. Replace with an explicit empty
check that returns 404 with a clear API error message, and use
`next(iter(_sizes))` only after we know the dict is non-empty (also
switch the fallback expression to `or next(...)` since the previous
form passed `next(iter(...))` as the default arg unconditionally — fine
for non-empty dicts but still worth tidying alongside the guard).

Adds a test that overrides THUMBNAILS with an empty SIZES dict and
asserts the endpoint returns 404 with the configured-error message.

Co-Authored-By: Claude <noreply@anthropic.com>
`requests.get(url)` with no timeout blocks indefinitely if the upstream
storage is slow or unreachable. The thumbnail-generation path
(`SourceImage.find_or_generate_thumbnail_for_label`) runs this
synchronously inside a Django web request, so a stuck upstream ties up
a gunicorn worker; a gallery's worth of stuck calls drains the pool.

Default timeout is `(connect=5s, read=30s)`. Callers that need
different limits (e.g. very large originals on a slow link) can override
by passing `timeout=`. Behaviour for fast / responsive upstreams is
unchanged.

Tests verify the default is finite and that explicit overrides are
honored.

Co-Authored-By: Claude <noreply@anthropic.com>
…rent-gen race

Two related bugs in `find_or_generate_thumbnail_for_label`:

1. Under concurrent gen for the same (source_image, label) pair the loser
   could either (a) hit the new UniqueConstraint and 500 with
   IntegrityError when its delete-loop ran before the winner's INSERT,
   or (b) destroy the winner's row + storage blob when the delete-loop
   ran after. Either outcome is wrong; both were possible because the
   code did `filter(label=label).delete()` then `thumbnails.create(...)`
   instead of a single atomic upsert.

2. The pre-save delete-loop also deleted the storage blob BEFORE the
   new `default_storage.save()`, which on storage backends that suffix
   colliding keys (FileSystemStorage default) left the row pointing at
   a path that the next save could overwrite in the same way the racer
   would. Belt-and-braces failure where the cleanup order itself was the
   bug.

Fix:

* Replace the delete-then-create with `update_or_create` keyed on
  `label`. Django serializes via the UniqueConstraint and, on collision,
  catches IntegrityError on INSERT and re-runs as UPDATE. No 500s, no
  destructive deletes of concurrent rows.
* Snapshot the prior `path` before the new save and clean it up after
  the row update only when the backend gave us a new key
  (`prior_path != thumbnail_path`). Backends that overwrite in place
  (boto3 with `AWS_S3_FILE_OVERWRITE=True`) skip the cleanup and avoid
  the file-deletion race entirely.
* Force-bump `thumb.last_modified` on the UPDATE branch via
  `QuerySet.update` because `auto_now_add=True` only fires on INSERT —
  without this, regen via source-mtime bump would re-trigger regen on
  every subsequent request as the field's pre_save callback preserved
  the existing (now-stale) value.

Tests:

* `test_thumbnail_regen_reuses_row_via_upsert`: delete the storage blob
  out from under a cached row, re-request, assert the row's PK is
  preserved (proving the UPDATE path was taken, not a fresh INSERT).
* `test_thumbnail_exists_newer_modified_source` updated to assert row
  reuse + freshness bump rather than the previous (incidental) PK
  change.

The `SourceImageThumbnail.last_modified` field semantics are still
muddled (see PR thread; rename or drop tracked separately).

Co-Authored-By: Claude <noreply@anthropic.com>
mihow and others added 6 commits June 4, 2026 15:12
…w delete

Thumbnails are pure derivatives of (source_image, label). The previous
`on_delete=SET_NULL, null=True` left orphan rows AND their storage blobs
behind forever whenever the parent SourceImage was deleted, with no
other reaper in the codebase. Silent storage growth and orphan-row
accumulation; CodeRabbit flagged this in PR #1306 and the thread was
closed without a fix.

Changes:

* `SourceImageThumbnail.source_image` → `on_delete=CASCADE`, no longer
  nullable. Deleting a SourceImage now takes its thumbnail rows with it.
* New `pre_delete` signal in `ami.main.signals` does best-effort
  `default_storage.delete(instance.path)` so the blob doesn't outlive the
  row. Fires on explicit row deletes AND CASCADE deletes (signals fire
  for both), so this covers parent SourceImage deletion automatically.
  Storage failures are logged and swallowed — a transient backing-store
  error must not block the delete chain.
* Migration 0090 is two-step: reap pre-existing rows where
  `source_image_id IS NULL` (the SET_NULL aftermath) first, then
  AlterField the FK. Orphan-row reap runs through the same pre_delete
  signal so their blobs are cleaned alongside.

Tests:

* `test_source_image_delete_cascades_to_thumbnails` — create 2 thumbnail
  rows for a capture, delete the capture, assert both rows gone via
  CASCADE.
* `test_thumbnail_delete_removes_storage_blob` — plant a real blob in
  storage, wire a row to it, delete the row, assert the blob is gone
  (proving the pre_delete signal fired and cleaned up).

Co-Authored-By: Claude <noreply@anthropic.com>
`size`, `last_modified`, `checksum`, `checksum_algorithm` all record
metadata read from the source image file in storage during sync or
upload, not row-level mtime. Document that in help_text so the field's
intent shows up in the admin UI and in API schema output (the four
fields previously had no help_text at all).

Mark the cluster read-only in the SourceImage admin so operators can't
clobber the cached metadata by mistake through the change form. Left
writable on the API so a project owner can still create SourceImage
entries by hand via an API client when the sync/upload flows don't fit;
the sync endpoint at POST /api/v2/deployments/<pk>/sync/ remains the
default population path.

Adds migration 0091 for the help_text changes (no schema change beyond
the AlterField metadata).

Co-Authored-By: Claude <noreply@anthropic.com>
…0089

0089 was renamed-into-existence this branch (originally 0088, bumped
when main added 0088_detection_det_srcimg_created_idx). The follow-up
0090 only existed to switch the just-created FK from SET_NULL to
CASCADE — same model, same concern, accidentally split because 0089
shipped wrong on the first cut.

This commit:

* Rewrites 0089 to create the FK with CASCADE from the start (drops
  null=True and the SET_NULL on_delete). The RunPython orphan-reaper
  from 0090 is gone — a fresh table can't have orphans.
* Deletes 0090.
* Updates 0091's dependency to point at 0089 (was 0090).

End DB state is identical to the old 0089+0090 pair, so anyone who
already migrated past 0090 just pulls — Django sees 0089 as
already-applied (same name) and skips. 0090 will appear as an applied
migration with no matching file; warning only, no breakage. Local devs
who want a clean django_migrations table can manually
DELETE FROM django_migrations WHERE app='main' AND name='0090_sourceimagethumbnail_cascade_source_image'.

Safe because this migration only landed on the in-flight feature
branch — no production state to preserve. CI re-runs from scratch
each time and passed (TestImageThumbnailViews 18/18 with --noinput on
a fresh DB).

Co-Authored-By: Claude <noreply@anthropic.com>
The collapse rationale is unnecessary in-tree — the branch was never
merged, so no production state ever ran the SET_NULL form. Anyone
mid-rebase on this branch will just see the new migration replace the
old one cleanly.

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow

mihow commented Jun 5, 2026

Copy link
Copy Markdown
Collaborator

@loppear I pushed a few small fixes related to data integrity checks, clarification and formatting. The CI is passing now. I made one bigger change, but pushed it to a separate branch/PR to see what you think. This avoids n+1 queries and hitting the Django server and S3 API for every thumbnail URL. Which is mostly important for any of the list views: #1331

I re-deployed the arctia environment with the latest feat/thumbnail-source-images branch

@loppear

loppear commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

Cleanup looks fine to me, CASCADE and update_or_create changes seem right.

mihow and others added 2 commits June 9, 2026 06:36
…iginals

The sessions list and gallery rendered example captures from capture.url —
the full-resolution original (often 9000px+ wide, fetched straight from
object storage). The events API has provided a thumbnails dict on
example_captures since this branch; use the medium (1024px) thumbnail with
a fallback to the original when thumbnails are absent, matching the
pattern in capture.ts. The medium size also keeps the og:image meta tag on
session details at a reasonable resolution.

Co-Authored-By: Claude <noreply@anthropic.com>
mihow added a commit that referenced this pull request Jun 10, 2026
* fix(ci): namespace concurrency group by workflow

The `test.backend.yml` and `test.frontend.yml` workflows used the same
concurrency group (`head_ref || run_id`), so a push that touched both
`ui/**` and non-`ui/**` paths would race the two workflows against the same
group, and `cancel-in-progress: true` killed the one that started second.

Observed on PR #1306 head `0d05fb47`: Backend Tests succeeded, Frontend Tests
got cancelled three seconds after starting. The FE run was re-run manually
and passed.

Namespace the group by `${{ github.workflow }}` so each workflow has its own
group. Cancel-on-new-push within the same workflow stays as before.

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: restructure agent instructions — checklists, invariants, split reference files

Slim CLAUDE.md (.agents/AGENTS.md) from 899 to ~600 lines and add the guidance
that two years of PR review history shows is missing:

- Definition-of-done checklists (endpoint/queryset, model change, pre-review)
  mapping 1:1 to the most frequent review findings: missing permission checks
  and default filters, 500s from unvalidated params, PII in serializers,
  N+1-blind single-row test fixtures, WIP debris, stale PR descriptions.
- Domain invariants section: one feature extractor per clustering query,
  timezone.now() rule, ML-backend schema boundary, raise-don't-return,
  no assert in prod code, migration-in-same-PR rule.
- Automated review bots section: verify bot findings against actual lint
  configs, push back with evidence, 150-file CodeRabbit limit.
- Split bulk reference out of the always-loaded file:
  - DB schema table + query guide + QuerySet catalog -> docs/claude/reference/query-patterns.md
  - Worktree-vs-main-stack testing -> docs/claude/reference/worktree-testing.md
  - Chaos testing section -> pointer to existing docs/claude/debugging/chaos-scenarios.md
- New docs/claude/reference/canonical-patterns.md: reuse-before-reinventing
  table with file:line refs (SingleParamSerializer, ProjectMixin, permissions,
  stats pattern, fixtures).
- New ui/CLAUDE.md: frontend conventions (i18n via language.ts, no any in
  data-services models, mutation button states, naming, which lint rules are
  actually configured).
- New docs/claude/INDEX.md: searchable index of all agent docs.

Co-Authored-By: Claude <noreply@anthropic.com>

* ci: fail backend tests when migrations are missing

Add makemigrations --check --dry-run step before the test run. Forgotten
migrations have repeatedly shipped to main and required standalone follow-up
PRs (#722, #897, #946); this catches them at PR time.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(jobs): add missing migration for job_type_key choices

Job.job_type_key builds its choices from VALID_JOB_TYPES, so registry changes
(post_processing and data_export job types added since the last migration)
alter the field definition. The migration is state-only — no database schema
change — but Django requires it, and the new makemigrations --check CI step
fails without it. Documented the gotcha in the agent instructions.

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: address Copilot review on agent-docs

- ui/CLAUDE.md: React Query is pinned to v4, so mutation in-flight state is
  isLoading, not the v5 isPending.
- docs/claude/INDEX.md: fix stray space in the title heading.
- docs/claude/reference/worktree-testing.md: use repo-relative paths instead
  of developer-specific absolute paths in the override example.
- docs/claude/reference/canonical-patterns.md: correct the url_boolean_param
  usage reference to ami/jobs/views.py:165.
- .github/workflows/test.backend.yml: add --noinput to makemigrations --check
  so a model change needing a default fails deterministically instead of
  blocking on a prompt.

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: add a comment/PR-text confidence-calibration rule to agent instructions

Adds a "Writing Comments, Docstrings, and PR Text" section: keep the
load-bearing fact and cut unverified mechanism, label hunches as hunches,
keep comments short and link to the PR/issue for depth, and distinguish
measured from inferred in PR descriptions. A confident but wrong comment is
read as fact and built on, so confidence should match evidence.

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: make ui/AGENTS.md canonical with ui/CLAUDE.md symlinked

Match the repo-root convention (CLAUDE.md -> .agents/AGENTS.md): the frontend
conventions file is now ui/AGENTS.md, with ui/CLAUDE.md a symlink to it. Other
agent tools read AGENTS.md; Claude Code auto-loads the nested CLAUDE.md symlink
when editing under ui/. Updated references in the instructions and index.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(jobs): regenerate job_type_key migration to cover the full registry

After merging main, VALID_JOB_TYPES includes regroup_events (added with the
configurable session-gap work). Regenerated 0023 so the job_type_key field
choices in the migration match the current registry; makemigrations --check
passes against the merged tree.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
mihow and others added 9 commits June 9, 2026 17:05
Main now has 0089_add_regroup_permissions and 0090_session_time_gap_seconds_help_text
(PR #1292), which made the branch's 0089/0091 migrations a second leaf in the
migration graph and failed CI. Renumber to 0091/0092 on top of main's 0090.

Co-Authored-By: Claude <noreply@anthropic.com>
TestRolePermissions hardcoded taxon_id="5" in the identification
create payload. Any test class that runs earlier and creates taxa
advances the id sequence, so pk 5 stops existing and the POST fails
validation with 400 before the permission check returns 403. The new
TestImageThumbnailViews class in this branch sorts alphabetically
before TestRolePermissions and triggered exactly that. Use the
occurrence's own determination pk instead.

Co-Authored-By: Claude <noreply@anthropic.com>
PIL.Image.thumbnail() preserves aspect ratio and can emit a width 1-2px
below the requested spec (239 for a 240 spec was observed in a dev
deployment). The regen gate compares the stored width against the spec
with strict equality, so storing the encoder output made affected rows
regenerate on every request and download the original each time.

Store the requested spec width instead — the row identifies which
configured size it satisfies, matching how derivative-image systems key
their caches (sorl-thumbnail/easy-thumbnails hash the requested
geometry, WordPress keys by registered size name). The actual encoded
height is still stored; it is informational and never compared.

Existing rows that hold an encoder-output width self-heal with a single
regeneration on their next request.

Co-Authored-By: Claude <noreply@anthropic.com>
…uest

The regen gate called default_storage.exists() on every warm hit, which
is a HEAD request against the object store (or a stat on filesystem
backends) per thumbnail per page view. The cached row is already the
warm signal; the existence check only protected against blobs deleted
out of band, which currently has no code path. Orphaned rows now show a
broken image until the row is removed and the next request regenerates.

Co-Authored-By: Claude <noreply@anthropic.com>
Progressive encoding lets browsers paint a blurry-to-crisp preview as
bytes arrive instead of filling in top-to-bottom. optimize adds an extra
Huffman pass for slightly smaller files and quality=82 raises Pillow's
default of 75 for screen-sized thumbnails. All three flags only cost
time on the cold generation path.

Co-Authored-By: Claude <noreply@anthropic.com>
302 responses are not cached by browsers unless explicitly allowed, so
every page view re-paid a full Django round trip (auth, object lookup,
freshness check) per thumbnail, even when nothing changed. A five-minute
private Cache-Control covers typical back-and-forth browsing within a
session while staying well below the shortest presigned-URL lifetime
(django-storages AWS_QUERYSTRING_EXPIRE defaults to 3600s), so a cached
redirect can never point at an expired signature.

Co-Authored-By: Claude <noreply@anthropic.com>
A row whose path is empty (failed or interrupted generation) passed the
regen gate and redirected the browser to the storage root. Treat a blank
path as cold and regenerate.

Co-Authored-By: Claude <noreply@anthropic.com>
…facts

Several comment blocks added during review explained why a change was
correct relative to code that no longer exists. Keep only the constraints
a future reader needs (spec-width vs encoder output, auto_now_add bump,
presign-lifetime bound); drop the change-justification prose, which lives
in the commit history.

Co-Authored-By: Claude <noreply@anthropic.com>
Comment thread ui/src/data-services/models/capture.ts
… sweep

Co-Authored-By: Claude <noreply@anthropic.com>
Comment thread ami/main/models.py
label=label,
defaults={
"path": thumbnail_path,
"width": size["width"],

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @loppear, I went ahead with this approach for fixing the issue where some thumbs are generated with plus or minus one pixel width off from the spec. this changes the saved width on the thumbnail record to use the width from the spec instead of the actual generated width. that way the thumbnail isn't re-generated every time due to a cold hit.

@mihow mihow self-requested a review June 11, 2026 03:13

@mihow mihow left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested multiple times in the arctia environment. I like the enhancement in #1331 to keep traffic off of the main Django web server. But I'm okay to deploy that as a follow up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants