@chr @nightpool ever since the v3.3.0 update about 5-10% of media posts on my timeline from other instances show up as "Not available" and clicking on the image opens it in a new tab instead of the media modal thing

i don't recall ever seeing that happen before the update


@violet @chr @nightpool ...I... I clicked on the image... to try to... see what it was...

@LilyVers i realized as soon as i posted it that that might be an issue

@violet @nightpool not sure if it’s correlated with the update, but this usually means that at the time we received the post, our server couldn’t cache the media, either due to a problem with our media backend or the remote server’s copy of it being unavailable. not much we can do about this unless it turns out it was an issue with mastodon not fetching stuff correctly or something, i’m afraid

@chr @violet @nightpool given how much it happens regardless of instance, i'm inclined to think it's a local problem. also iirc glitchbot posted about having some uploading media failures earlier today. so maybe there's something up with the media server

@haskal @chr @nightpool it's just weird because i don't remember ever seeing it happen before the update and it seems to have been getting worse over the last few days

like, half of the media posts on my timeline with multiple images fail to load at least one of the images

technical details 

@violet @haskal @nightpool i think the problem's down to this commit: (which is new in mastodon 3.3). i added some logging to cybrespace to confirm we're hitting the case on line 255

if our storage backend errors out while we're caching a remote image, it'll hit this Seahorse::Client::NetworkingError, which will cause it to just give up on fetching the image.

before 3.3, it would have thrown all the way up to where the request came in, which would probably have resulted in the remote server seeing a 500 error from us and failing to deliver the post. it then would have gone into retry logic and probably come in successfully a few minutes later.

@violet @haskal @nightpool
long story short is in 3.3 we're now seeing posts with missing images because mastodon is handling errors from the media store, whereas before it would have forced the remote server to retry.

the real solution would be to reduce the error rate from our media store, but a temporary solution is to immediately retry media store requests if they fail. i'll try that and see if it helps.
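
rough sketch of the "retry immediately" idea, not the actual patch. `StorageNetworkError` is a made-up stand-in for `Seahorse::Client::NetworkingError` so this runs without the aws sdk, and `with_immediate_retry` is a hypothetical helper name, not something in mastodon:

```ruby
# Hypothetical stand-in for Seahorse::Client::NetworkingError so the sketch
# runs without the AWS SDK loaded.
class StorageNetworkError < StandardError; end

# Run the block, retrying immediately on a storage networking error.
# `attempts` is the total number of tries; the final failure is re-raised.
def with_immediate_retry(attempts: 3)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StorageNetworkError
    retry if tries < attempts
    raise
  end
end

# Usage: a fake upload that fails twice, then succeeds on the third try.
calls = 0
result = with_immediate_retry(attempts: 3) do
  calls += 1
  raise StorageNetworkError, "connection reset" if calls < 3
  :stored
end
puts "result=#{result} after #{calls} calls"
```

there's no backoff here on purpose, since the point is just to paper over one-off failures. if the store is down for longer than a few tries, the error still comes out the other end.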

technical details 

@chr @violet @haskal hmm... wouldn't the s3 fetch/upload be happening in a background job though? even if it did error, that shouldn't impact the response status (and it should get retried by sidekiq I think?)

technical details 

@nightpool @violet @haskal good point. the ProcessingWorker job would have failed instead, and gotten retried by sidekiq. might be best to just restore that behavior tbh.

technical details 

@chr @violet @haskal yeah I think that specific commit is designed as, like, a workaround for wasabi going down for days at a time lol. definitely not the case here, and the logic seems over-sensitive (and we should probably relax the timeouts in general)

@nightpool @violet @haskal okay i patched to just log and re-raise that error. let's see if it reduces the number of posts with unavailable media
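
for anyone curious, "log and re-raise" looks roughly like this. it's a minimal sketch and not the real diff: `StorageNetworkError` stands in for `Seahorse::Client::NetworkingError`, and `cache_remote_media`, the logger setup, and the example url are all hypothetical:

```ruby
require "logger"
require "stringio"

# Hypothetical stand-in for Seahorse::Client::NetworkingError.
class StorageNetworkError < StandardError; end

# Log to an in-memory buffer so the sketch is self-contained.
LOG_OUTPUT = StringIO.new
LOGGER = Logger.new(LOG_OUTPUT)

# Hypothetical wrapper around the caching step: record the failure, then
# re-raise so the surrounding job fails and the job runner can retry it.
def cache_remote_media(url)
  yield url
rescue StorageNetworkError => e
  LOGGER.warn("media store error while caching #{url}: #{e.message}")
  raise
end

# Usage: simulate a failing store and confirm the error still propagates.
propagated = false
begin
  cache_remote_media("https://remote.example/image.png") do |_url|
    raise StorageNetworkError, "timeout"
  end
rescue StorageNetworkError
  propagated = true
end
puts "propagated=#{propagated}"
```

the bare `raise` inside the `rescue` re-raises the same exception, so the caller sees the original error instead of a post that quietly has no media.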
