Welp, just discovered I somehow managed to compose a runaway `jq` script that has been running for the last 24 hours, happily generating a 292GB (and counting) summary of 1.8MB of source data 😅

We'll give that one another go then, shall we?

Upon inspection, it appears that `.arr[].field1 as $var1 | .arr[].field2 as $var2` is actually a nested loop, and I was doing four levels of that... meaning I turned 1,000 items into 1,000,000,000,000 items. That's a trillion, aka turning a kilobyte into a terabyte.

So, uh...yeah, that math works out. Whoopsie daisy!
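
A rough Python analogy of why this explodes, assuming each `.arr[]` acts like a generator that gets re-run for every variable binding (the `items` data and its size are made up for illustration, not taken from the real script):

```python
from itertools import product

# Hypothetical stand-in for the source data: 3 items instead of 1000.
items = [{"field1": i, "field2": i * 10} for i in range(3)]

# Each `.arr[].fieldN as $varN` re-iterates the whole array, so two
# bindings behave like a cartesian product of the array with itself.
pairs = [(a["field1"], b["field2"]) for a, b in product(items, repeat=2)]

print(len(pairs))  # 3 ** 2 = 9 combinations from 3 items
print(1000 ** 4)   # four bindings over 1000 items: 1000000000000
```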


For those following along at home, `.arr[] | .field1 as $var1 | .field2 as $var2` is what I actually wanted, which ran in under a second and generated the 1.9MB summary that I wanted.
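
A rough Python analogy of the fixed version, where the array is iterated once and both variables come from the same element (made-up data again, just a sketch of the shape of the thing):

```python
# Hypothetical stand-in for the source data.
items = [{"field1": i, "field2": i * 10} for i in range(3)]

# `.arr[] | .field1 as $var1 | .field2 as $var2` binds both variables
# from the current element, so the output has one row per input item.
rows = [(item["field1"], item["field2"]) for item in items]

print(rows)  # [(0, 0), (1, 10), (2, 20)]
```

Output size now grows linearly with the input instead of as a power of it.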

It's still (just barely) bigger than the source data, but that's because, in retrospect, "summary" was a bad word choice; it's actually more of a denormalization.
