wanted: a data structure that holds a list of N objects and has the following properties

insert(X) - adds X to the list if there is room for it. If there is no room, it compares X to the maximum value currently in the list. If it is less, the maximum value is removed and X is put in its place.
get_list() - returns the list. the order of the elements does not matter

essentially, this data structure stores the smallest N values that were passed into insert(X)

now I've got some efficiency needs

insert(X) should take O(log N) time (or O(1) if amortized), and get_list shouldn't take longer than O(N^2) or something. basically insert(X) gets called a *lot* and get_list only gets called at the very end

also, the data structure shouldn't take more than O(N) space.

I'm thinking I can just use a max heap for this, but I was wondering if there were better options that aren't necessarily heap-based 🤔

@SuricrasiaOnline Would a modified binary search tree do this? especially if you store a pointer to the largest element .

@SuricrasiaOnline (not that you have to, since you can always find the largest element by going right until you can't any more)

@SuricrasiaOnline shouldn't a simple tuple (top,L) work?

insert(x) = compare top to x. If top < x, set x = top. Push x to L.

insert is O(1) and get_list is O(1)

@SuricrasiaOnline oh it would have to be a little bit more complicated. You'd need (top,top_index,L) and if top < x && L is full, remove top_index from L. Still O(1) if you use a random-access vector

@SuricrasiaOnline oh and you can even be sneaky and put x into top_index.

@zardoz I think that this would only put x into the list if it is greater than the maximum value? but I want to put it in if it is *less* than the maximum value. I want to hold the list of smallest objects passed into insert(X)

@SuricrasiaOnline ah. I misread your requirements.

I think a max-heap might be exactly what you need.

@SuricrasiaOnline i'd probably just use a max-heap for this tbh

i think this is basically just 'find the N largest elements in a list of arbitrary size', so you might want to look into online selection algorithms?

@SuricrasiaOnline This is pretty much what heaps are for - being a sorted set and handling chaotic inserts - but you could do it with a regular list, early-out by comparing against the last element in the list, then binary search to insert the new value and trim down to size, gives you O(log(n)) insert and o(n) memory footprint. Though that depends on you finding a list structure with o(1) inserts and indexes, and I don;t know that you're going to find one. Technically a list structure with o(log(n)) inserts would also work

But yeah anyway. I don't remember what heap performance characteristics actually are, but I do know they were good enough to nearly linearize my pathfinder that one time, so, seal of approval from me

Sign in to participate in the conversation

Cybrespace is an instance of Mastodon, a social network based on open web protocols and free, open-source software. It is decentralized like e-mail.