rl | 7 Aug 16:33 2012

Re: Extracting a set of fields using distinct for large database...

I was mistaken.  In my case, I do NOT want an embedded document.  I need to have the zip data in a separate collection.  I would like to script this if I can and just run mongo on the script.  I think Mongo uses JavaScript, right?


Thanks in advance.

On Tuesday, August 7, 2012 10:27:05 AM UTC-4, Sammaye wrote:
Updating and manipulating either subdocuments or referenced documents depends on the language you are using. What driver do you intend to use?

For a subdocument, I would structure the document like this:

{
name
address
city
state
country
postalcode : {
      zip
      city
      state
      country
  }
}


--
You received this message because you are subscribed to the Google
Groups "mongodb-user" group.
To post to this group, send email to mongodb-user-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to
mongodb-user+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
See also the IRC channel -- freenode.net#mongodb
Sam Millman | 7 Aug 16:48 2012

Re: Extracting a set of fields using distinct for large database...

First things first:

4.  I want to add a new attribute "zipcode" to the first collection and have it point to the "id" in the second collection.

I don't see a field called zipcode, what exactly are you pushing back to the original collection?

Personally, I would iterate through all of the user docs and do an upsert to the "zip" collection on "postalcode".


rl | 7 Aug 16:50 2012

Re: Extracting a set of fields using distinct for large database...

I would have to dynamically add the new attribute during processing.


Sam Millman | 7 Aug 17:01 2012

Re: Extracting a set of fields using distinct for large database...

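Yeah, I would do something like:

var cursor = db.users.find();
while (cursor.hasNext()) {
    var user = cursor.next();
    // Look for an existing zip document for this postal code.
    var zipDoc = db.zip.findOne({postalcode: user.postalcode});

    var zipId;
    if (zipDoc) { // We could have done a fancy upsert or something, but meh...
        zipId = zipDoc._id;
    } else {
        // Generate the _id up front so it can be referenced right away.
        zipId = new ObjectId();
        db.zip.insert({
            _id: zipId,
            postalcode: user.postalcode
            // ...plus any other fields you wish to insert
        });
    }

    // Point the user document at its zip document.
    db.users.update({_id: user._id}, {$set: {zip: zipId}});
}

Something like that...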


rl | 7 Aug 17:01 2012

Re: Extracting a set of fields using distinct for large database...

I think my problem/question is complicated.  Let me ask a simpler question:


How can I do a distinct query on an attribute that returns not only the distinct attribute but a few related attributes?

i.e.

db.mycollection.distinct("postalcode")  will return a list of unique postal codes.

I want to return the list of unique postal codes along with the city and state.

Furthermore, I want to put that resulting list into a new collection.

How do I do it?

Thanks again!

Sam Millman | 7 Aug 17:02 2012

Re: Extracting a set of fields using distinct for large database...

Ah, you want all the fields in the new collection to be distinct?


rl | 7 Aug 17:05 2012

Re: Extracting a set of fields using distinct for large database...

YES.  I could do it for one attribute but I want to copy over some related attributes.  Plus I need a way of cross-referencing from the first collection to the new second collection.


Sam Millman | 7 Aug 17:06 2012

Re: Extracting a set of fields using distinct for large database...

OK, well, I would do the same as I did above, but instead of searching only by postalcode I would search by all attributes. I think that is what you're looking for.
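
To illustrate "search by all attributes", here is a minimal sketch in plain JavaScript. It builds the lookup query from every field that identifies a zip record, rather than from postalcode alone. The field list, the helper name buildZipQuery, and the sample user are assumptions made for illustration; the thread does not say exactly which fields should be distinct.

```javascript
// Fields that together identify one zip record (assumed from the thread).
var ZIP_FIELDS = ['postalcode', 'city', 'state', 'country'];

// Copy only the identifying fields from a user document into a query object.
function buildZipQuery(user) {
    var query = {};
    for (var i = 0; i < ZIP_FIELDS.length; i++) {
        var field = ZIP_FIELDS[i];
        query[field] = user[field];
    }
    return query;
}

// Hypothetical sample user document.
var user = {
    name: 'Ann',
    address: '1 Main St',
    city: 'Albany',
    state: 'NY',
    country: 'US',
    postalcode: '12207'
};

var query = buildZipQuery(user);
```

In the loop from the earlier message, this query object would take the place of the postalcode-only query passed to db.zip.findOne(), and the same fields would be copied into the new zip document on insert.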

On 7 August 2012 16:05, rl <robert-+UiTub9Vmd0S+FvcfC7Uqw@public.gmane.org> wrote:
YES.  I could do it for one attribute but I want to copy over some related attributes.  Plus I need a way of cross-referencing from the first collection to the new second collection.


On Tuesday, August 7, 2012 11:02:30 AM UTC-4, Sammaye wrote:
Ah you want all the fields in the new table to be distinct?


rl | 7 Aug 18:29 2012

Re: Extracting a set of fields using distinct for large database...

My concern is that for a 25-million-row collection, this would take forever. It could be something like O(n log n) or worse.


craiggwilson | 7 Aug 18:38 2012

Re: Extracting a set of fields using distinct for large database...

This is relatively easy.  Take all the attributes that you want to be distinct and hash them (MD5, SHA1, etc...).  Then store this hash along with your documents.  All the ones with the same hash are the same.  Then you can use distinct on the hash field to pull back unique documents.  That answers your simpler question. 


Your more difficult question can be answered by taking these distinct values and saving them to the new collection.  If you set the _id value to be the hash, then you already have your link back to the original records.
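
The hashing idea can be sketched as follows. This is an in-memory illustration in plain JavaScript for Node.js, not a script against a real collection: the sample documents, the choice of MD5, the field list, and the names zipHash and extractZips are all assumptions. Each document gets a hash of the fields that should be distinct, and one zip document is kept per distinct hash, with the hash used as its _id.

```javascript
const crypto = require('crypto');

// Fields whose combination should be unique (assumed from the thread's example).
const KEY_FIELDS = ['postalcode', 'city', 'state'];

// Hash the chosen fields. Serializing them as a JSON array keeps field
// boundaries unambiguous, so ("ab", "c") and ("a", "bc") hash differently.
function zipHash(doc) {
  const values = KEY_FIELDS.map((field) => doc[field]);
  return crypto.createHash('md5').update(JSON.stringify(values)).digest('hex');
}

// Single pass: tag each user with its hash and collect one zip document
// per distinct hash, using the hash as the zip document's _id.
function extractZips(users) {
  const zips = new Map();
  for (const user of users) {
    const hash = zipHash(user);
    user.zip = hash; // link from the user back to its zip document
    if (!zips.has(hash)) {
      zips.set(hash, {
        _id: hash,
        postalcode: user.postalcode,
        city: user.city,
        state: user.state,
      });
    }
  }
  return Array.from(zips.values());
}

// Hypothetical sample data.
const users = [
  { name: 'Ann', postalcode: '12207', city: 'Albany', state: 'NY' },
  { name: 'Bob', postalcode: '12207', city: 'Albany', state: 'NY' },
  { name: 'Cy', postalcode: '94105', city: 'San Francisco', state: 'CA' },
];
const zips = extractZips(users);
```

Because the hash doubles as the _id of the zip document, the value stored on each user is already the cross-reference, and no separate lookup is needed to find the matching zip record.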




Sam Millman | 7 Aug 18:58 2012

Re: Extracting a set of fields using distinct for large database...

Yeah, that's a good approach, though answering the full question will still take "forever".

