Opened 9 years ago

Closed 9 years ago

#10768 closed enhancement (duplicate)

Add dojo.distinct to the Array base functions.

Reported by: Tom Trenka Owned by: dante
Priority: high Milestone: future
Component: Core Version: 1.4.0
Keywords: Cc: Eugene Lazutkin
Blocked By: Blocking:

Description

Have a (somewhat inefficient) implementation of a function to return an array of unique values given an array, here's the code:

dojo.distinct = function(/* Array */arr){
           return dojo.filter(arr, function(item, idx){
               return dojo.indexOf(arr, item) == idx;
           });
};

(Figured it is easier than creating a full-on patch).

It's not the most efficient implementation but I'm finding this to be very useful, particularly when emulating something like SELECT DISTINCT with an array of dojo.data items.

Change History (8)

comment:1 Changed 9 years ago by Tom Trenka

Another version that allows you to pass a comparator:

dojo.distinct = function(/* Array */arr, /* Function? */comparator){
    return dojo.filter(arr, comparator || function(item, idx, a)){
        return dojo.indexOf(a, item) == idx;
    }
});

comment:2 Changed 9 years ago by James Burke

Cc: Eugene Lazutkin added

My first impulse is to not put this in Base, I feel like we should only do the array functions in base that bring people up to ES5, and leave it to other modules to do other array operations. Otherwise drawing the line on what we add in base will be harder. For instance, should reduce also be in base, that sort of thing.

I want to say Eugene might have something like this laying around in dojox? At any rate, it is a smallish function, seems easy enough to just tell people how to create it themselves.

comment:3 Changed 9 years ago by Tom Trenka

That's fine--the only reason I suggested Base is because that's where the rest of the Array functions are, and I'm not super picky about it.

I went digging in dojox.lang (dante mentioned the same thing) but didn't find anything like this in the trunk. Either way, we could leave this open as a candidate for Core or something similar. The main thing for me was to make sure the function was recorded in a ticket somewhere.

comment:4 Changed 9 years ago by Eugene Lazutkin

Sounds like a good candidate for the Core module.

My only beef is the complexity of the beast --- O(n2). Most implementations operate on sorted lists (e.g., uniq below). Even if sorting is required, it is less complex than the trivial solution: sorting is O(n log n), uniq is O(n) --- much less than O(n2). Obviously the common solution places more requirements on the data: it should be orderable, yet in most cases it is possible to invent some kind of natural or even artificial way to sort items making them suitable for uniq.

uniq = function(arr){
  return dojo.filter(arr, function(x, i, a){
    return !i || x != a[i - 1];
  });
};

The example of use:

var a = [...]; // unsorted array

// sort it
sort(a); // or sort(a, comp)
// the only goal is to group identical items

var distinct = uniq(a);

comment:5 Changed 9 years ago by Tom Trenka

I agree, and would be much happier if we knew that we had a sortable list to work with--or a way to know that the array in question is sortable, since I'm definitely not keen on the O(n2) solution. However, I think the O(n2) solution is a good fallback, since it does not rely on a list being sortable, nor does it rely on knowledge of type to work.

We could probably write it in such a way (as opposed to the comparator approach above) so that a user can pass in an optional sort function (typical a,b signature), and if that is there to take the sort + filter approach--otherwise fallback to the O(n2) approach.

comment:6 in reply to:  5 Changed 9 years ago by Eugene Lazutkin

Replying to ttrenka:

Agreed. It should be doable with a proper function signature. Something like that:

distinct = function(a, sorted){
  // sorted: if Function, sort the array using this function as a comparator (?)
  // sorted: if true, use the efficient algorithm
  // sorted: if falsy or omitted, use the brute force algorithm
};

Something like that. I am not so sure we need to support an automatic sorting as in the example above, but we can easily support a Boolean sorted flag.

comment:7 Changed 9 years ago by dante

Milestone: 1.5future

comment:8 Changed 9 years ago by dante

Resolution: duplicate
Status: newclosed

dup of #9205

Note: See TracTickets for help on using tickets.