the scary thing is when a file format has stuff like:

file: refers to a file
some-special-string: does a special thing

and that just invites the problem of someone using some-special-string as a file name

pls, anyone who does
do not ever do anything like this
keep your formal grammars unambiguous

if you do not, Ada Lovelace will come down from Programmer Heaven and smite you


this is why ADTs are sooooooooooo nice
you always know what a value refers to

(unless you fuck things up on purpose, but you'd prooobably have to write very unidiomatic code to make things ambiguous)
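A minimal sketch of the ADT point, in Python for illustration. The names `FilePath`, `Special` and `describe` are hypothetical and not from any real format; the idea is only that the constructor, not the string contents, says what a value refers to.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class FilePath:
    """A reference to a file; any name is allowed, including odd ones."""
    name: str


@dataclass(frozen=True)
class Special:
    """The special action (hypothetical stand-in for 'some-special-string')."""


# The sum type: an entry is either a file reference or the special action.
Entry = Union[FilePath, Special]


def describe(entry: Entry) -> str:
    # The constructor decides the meaning, never the contents of the string.
    if isinstance(entry, FilePath):
        return f"file named {entry.name!r}"
    if isinstance(entry, Special):
        return "special action"
    raise TypeError(f"unknown entry: {entry!r}")
```

With this shape, a file that happens to be called `some-special-string` is still just `FilePath("some-special-string")` and cannot be confused with `Special()`.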

(might have been a protocol and not a file format, but the point stands. text is nice but only if you can fit a grammar onto it.)

@grainloom ok I'm doing it now because I want to meet Ada Lovelace

@grainloom See also: Field separators (If your value separator can under any circumstances appear in your values you fucked up (Looking at you, CSV))
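A small illustration of the separator problem, using Python's standard `csv` module. The sample row is made up. Naive joining and splitting loses the field boundary as soon as a value contains the separator; the quoting rules that `csv` applies are what make the round trip work.

```python
import csv
import io

row = ["Lovelace, Ada", "1815"]

# Naive joining: the comma inside the first value becomes a field boundary.
naive_line = ",".join(row)
naive_fields = naive_line.split(",")

# The csv module quotes values that contain the delimiter, so they round-trip.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
quoted_line = buffer.getvalue()
parsed_fields = next(csv.reader(io.StringIO(quoted_line)))
```

Here `naive_fields` ends up with three entries instead of two, while `parsed_fields` matches the original row.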

@elomatreb @grainloom look up hl7. Field separators defined per field and can be changed at any point in the file. Basically CSV that has eaten itself.

@grainloom This is my number-one complaint with calls to ‘just use plain text’, I don’t want to parse poorly-specified, ambiguous schemata.

@Shamar idk, read it in some manpage the other day, not sure which
but similar problems apply to a lot of the system, eg. how basically nothing can handle spaces in file names

(hot take, $ifs should only contain tab and newline, or better yet, there should be no $ifs)
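A rough model of the field-splitting problem in Python. This only imitates the idea of splitting on a separator set and is not how any particular shell implements `$ifs`; the sample listing is invented.

```python
import re

# One file name containing a space, one without, separated by a tab.
listing = "my song.mp3\tnotes.txt\n"

# Splitting on any whitespace (space, tab, newline) breaks the first name in two.
whitespace_fields = listing.split()

# Splitting on tab and newline only keeps the name with the space intact.
tab_newline_fields = [field for field in re.split(r"[\t\n]", listing) if field]
```

The first split yields three fields, the second yields the two actual names.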

@Shamar this isn't really a Plan 9 specific problem, but it's one of those things that mess up an otherwise nice system

@grainloom

To be honest I don't know why one would need spaces in file names.

I should probably forbid them in #Jehanne at kernel level

@xj9

I can.

Every programmer continuously does that to users: we restrict what you can do with computers so that you can actually do something.

For example you can't put a 0x0 byte in the middle of a file name. Or a "/" in a domain name.

You are so used to our power you can't see it anymore. And actually most programmers are unaware of it.

But the truth is that we CAN tell you what to do and what not. And that's nothing!

Software programmers dictate how people THINK!

And this was true way before Cambridge Analytica: you perceive and act on the world through the software we write.

Through software we shape your synapses.

@grainloom

@Shamar @xj9 You only can't use 0x0 because of C's limitations. :blobshrug:
IMHO, type Path = List String (maybe NonEmptyList or something) makes more sense than using char*.

It makes sense in historical context, sure, but it doesn't have to always be like this.

And not all limits make you think, some are just practical because an upper bound is often nice to have. Eg. 255 characters per file name is probably a relief for on-disk file system formats.
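A sketch of the "path as a non-empty list of components" idea, in Python. The `Path` class here is hypothetical (it is not `pathlib.Path`), and it uses byte strings so that a component may contain any byte at all, including `/` and 0x0, since nothing needs a separator or a terminator.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Path:
    """A path as a non-empty sequence of components, with no separator character."""
    components: Tuple[bytes, ...]

    def __post_init__(self):
        # Enforce the "NonEmptyList" part of the idea.
        if not self.components:
            raise ValueError("a path needs at least one component")

    def child(self, name: bytes) -> "Path":
        # Appending a component never requires escaping or scanning the name.
        return Path(self.components + (name,))
```

For example, `Path((b"music",)).child(b"AC/DC \x00 live")` has exactly two components, even though the second one contains both a slash and a zero byte.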

@Shamar @xj9 Software isn't special in this regard btw, every protocol works like this. They restrict things so that other (hopefully useful) things can happen more easily.

@grainloom That's what NFSv4 does. Every path-component is a counted string and every path is a counted vector of components. (2³²−1 max count for both)

'course then they kind of schmutzed it up by mandating STRINGPREP so not only could you not use non-Unicode filenames, but not even new versions of Unicode…but everyone ignored that and I think it got taken out in the bis.
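A simplified sketch of counted strings and counted vectors, loosely in the style of XDR encoding (4-byte big-endian counts, data padded to a multiple of 4 bytes). The function names are hypothetical and this is not an NFSv4 implementation, only an illustration of why no byte value has to be reserved as a separator.

```python
import struct
from typing import List


def encode_component(name: bytes) -> bytes:
    """4-byte big-endian length, then the bytes, zero-padded to a multiple of 4."""
    padding = (-len(name)) % 4
    return struct.pack(">I", len(name)) + name + b"\x00" * padding


def encode_path(components: List[bytes]) -> bytes:
    """4-byte big-endian component count, then each counted component."""
    body = b"".join(encode_component(c) for c in components)
    return struct.pack(">I", len(components)) + body


def decode_path(data: bytes) -> List[bytes]:
    (count,) = struct.unpack_from(">I", data, 0)
    offset = 4
    components = []
    for _ in range(count):
        (length,) = struct.unpack_from(">I", data, offset)
        offset += 4
        components.append(data[offset:offset + length])
        # Skip the data and its padding to reach the next length field.
        offset += length + (-length) % 4
    return components
```

Because the decoder is told how long each component is, names such as `b"my file"` or `b"a/b"` round-trip unchanged.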

@Shamar @xj9

@grainloom

Because of C's design.

It's not a limitation, but an explicit choice. There were alternatives, but they were explicitly discarded.

@xj9

@Shamar @xj9
So it's a chosen limitation. That doesn't change much.
Other choices might make more sense now.

@xj9

Maybe you will.

Because you might discover that such a design decision makes scripting (and programming in general) easier and faster, for example.

What makes you more free, spaces in filenames or easier programming?

@grainloom

@Shamar @xj9 Easier programming IMHO. I want users to be able to do shell programming without the usual pitfalls. If that means patching coreutils to work on s-expressions or whatever, so be it.

@xj9

That's a UI issue in the software you use to organise your music.
Use a higher level UI if you don't want to mess with such low level details.

@grainloom

@xj9

You don't need a special tool, you want one. Both using spaces in file names and not using them have trade-offs.

Without them, you have easier textual interactions.
With them, nice graphical file managers are easier to write.

Depending on what you care more about you move between the two.

@grainloom

@Shamar @xj9
I mean, you can use Lua in Acme and it's like... fine. Having a proper encoding just makes things easier.

eg. linear TSV is pretty nice and simple and you don't need to fully parse it to split fields, because tabs are always escaped
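A sketch of a TSV encoder/decoder with escaping, assuming the convention that a literal tab, newline, carriage return and backslash inside a value are written as `\t`, `\n`, `\r` and `\\`. The function names are made up for this example. Since raw tabs then only ever appear as separators, finding field boundaries is a plain split.

```python
def escape_field(value: str) -> str:
    # Escape the backslash first so the escapes added below are not re-escaped.
    return (value.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


_UNESCAPES = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}


def unescape_field(text: str) -> str:
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            # Unknown escapes are kept as written rather than rejected.
            out.append(_UNESCAPES.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


def encode_row(fields):
    return "\t".join(escape_field(f) for f in fields)


def split_row(line):
    # No unescaping is needed to find the field boundaries.
    return line.split("\t")


def decode_row(line):
    return [unescape_field(f) for f in split_row(line)]
```

A tool that only needs the third column can call `split_row` and never touch the escaping at all.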

@Shamar @xj9

Like, just having a standard lightweight container format would be enough, so you don't end up writing a parser for every tool.

TSV is nice because you can write an en/decoder in an hour (at most, if your C is rusty. more like 15 minutes otherwise) and it gets you enough structure without sacrificing readability

pure S-expressions are also super easy to parse, just need like... a way to do recursion??? i think? it's been a while since I wrote one (in Lua) but that too only took like 20 m.
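A sketch of the recursive parser, in Python rather than Lua. It handles only parentheses and whitespace-separated atoms (no string literals, no quoting), which is an assumption about what "pure" means here; the function names are hypothetical.

```python
def tokenize(text):
    # Pad parentheses with spaces so a plain split yields the tokens.
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression from the front of the token list (consumed in place)."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        # Recursion: each element of a list is itself an expression.
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # drop the closing parenthesis
        return items
    if token == ")":
        raise SyntaxError("unexpected ')'")
    return token


def parse_sexpr(text):
    tokens = tokenize(text)
    result = parse(tokens)
    if tokens:
        raise SyntaxError("trailing tokens after expression")
    return result
```

For example, `parse_sexpr("(ls (flags -l) (path /tmp))")` gives a nested list of atoms. The recursion in `parse` is the only non-trivial part.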

@xj9 @grainloom

Basically it's a matter of habits.

You are used to operating systems that care much more about looking simple than about being simple.

But sometimes a few consistent constraints can make a system more predictable and easy to compose.

@Shamar because they are useful

the question is, why would tools use space as a primary field separator

@Shamar like, if you wanna give files meaningful names, you're gonna use spaces

hyphens don't cut it (they have different semantic meaning in language) and underscores are irritating to type and look ugly

@xj9

Nice example. 😉

Use underscore.
If you like to SEE a space instead of an underscore in your file manager GUI, configure your file manager GUI that way.

But since files are acted upon through commands in a textual interface that use spaces, spaces should not be allowed.

@grainloom

@Shamar @xj9 This is the same argument people (used to) use to defend ASCII-only systems. :blobshrug:

@grainloom

Actually if you completely remove text from computers, you won't have such problems with spaces.

As an alternative one might consider replacing the space with a different character in the input of textual interfaces (aka shells).

@xj9

@Shamar @xj9
Or just.... force users to use string literals for strings? And let unquoted strings mean something else?

I mean, you can't seriously say "but that's harder to learn" when we have just demonstrated how many problems it leads to.

If kids can learn Python in a week, programmers can bear with a teeeensy bit more syntax in their shells.

@grainloom

String literals mean escape rules.

And sometimes they are hard to get right for programmers too.

Every single time programmers are too shy to impose a simple restriction on users' input (or customers' will), they add another source of complexity to the pile of crap.

@xj9

@grainloom $IFS the environment variable shouldn't exist.

Something like IFS should be specifiable per-command using Pike's structural regular expressions. Possibly multiple somethings if I need a combination of both fields and lines.

Shells should have some lexical construct that lets me redefine both lines and 'fields' per-command, so if I want to work with lots of files having spaces, I can separate arguments with \0.
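A small sketch of what consuming \0-separated arguments looks like on the receiving side, in Python. The helper name `split_nul` is hypothetical; the input format is the NUL-terminated one that tools like `find -print0` produce.

```python
def split_nul(data: bytes) -> list:
    """Split NUL-terminated records into a list of byte strings."""
    records = data.split(b"\0")
    # A trailing terminator leaves one empty record at the end; drop it.
    if records and records[-1] == b"":
        records.pop()
    return records
```

Spaces and even newlines inside a name survive untouched, since 0x0 is the one byte that cannot appear in the name itself.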

@Shamar