Groot for Large Data Sets #52

Open
piercefreeman opened this issue Sep 28, 2015 · 3 comments

@piercefreeman

I'm using Groot with a fairly large data set (~100,000 objects with relationships). Some entities have only ~100 objects, but the larger ones have around 30,000. Parsing currently takes a long time, and the time appears to be spent in -[NSManagedObject(Groot) grt_setRelationship:fromJSONDictionary:mergeChanges:error:], specifically in the executeFetchRequest call made by the existingObjectsWithJSONArray method. Does anyone have suggestions for speeding this up, perhaps at the Core Data level?

@aspcartman

For large data sets it's always recommended to do things the hard way: by hand. General-purpose tools come at a price. You should also consider using "background" contexts.

@o15a3d4l11s2

I am also interested in possible techniques for speeding up the persistence process. I tried resetting the context before persisting entities, but this did not affect the speed.

@gonzalezreal
Owner

I think the performance problem lies in the structure of the data rather than in the amount of data. Of course, it becomes more evident with large data sets.

One thing that affects performance when serializing from JSON is object uniquing, as it requires fetching data from the database before inserting.

If you take a look at how Groot is implemented, there are three serialization strategies:

  • Insert
  • Uniquing
  • Composite Uniquing

As you may guess, the first one is the most performant as it does not fetch from the database. If you know that there is no duplicate data in your data set, DO NOT set identityAttributes in your entity. This will make Groot use the Insert strategy.

Groot will pick the Uniquing strategy if the identityAttributes annotation has a single attribute, otherwise it will pick the Composite Uniquing strategy.
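
The selection rule described above can be summarized in a few lines. This is a language-neutral sketch in Python, not Groot's actual code: the function name `pick_strategy` and the returned labels are hypothetical and only mirror the wording used in this comment.

```python
def pick_strategy(identity_attributes):
    """Return the strategy implied by the number of identity attributes.

    Hypothetical helper that models the rule described in this thread.
    It is not part of Groot's API.
    """
    count = len(identity_attributes)
    if count == 0:
        return "insert"              # nothing to match on, so no fetch is needed
    if count == 1:
        return "uniquing"            # a single identity attribute
    return "composite uniquing"      # two or more identity attributes


print(pick_strategy([]))                # insert
print(pick_strategy(["id"]))            # uniquing
print(pick_strategy(["id", "group"]))   # composite uniquing
```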

The Uniquing strategy requires one fetch for every array of JSON objects, whereas the Composite Uniquing strategy requires one fetch for every single JSON object (it is potentially the slowest of the three strategies).
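
To make the difference in fetch counts concrete, here is a small model in Python. It is only a sketch under stated assumptions: `CountingStore` is an in-memory stand-in for the persistent store, and `import_uniquing` and `import_composite` are hypothetical names that follow the description above rather than Groot's real implementation. The only thing it measures is how many round trips each approach makes.

```python
class CountingStore:
    """In-memory stand-in for a persistent store that counts fetches."""

    def __init__(self):
        self.rows = {}      # identity key -> stored object (a dict)
        self.fetches = 0    # number of simulated round trips

    def fetch(self, keys):
        """One round trip: return the stored objects whose key is in `keys`."""
        self.fetches += 1
        return {k: self.rows[k] for k in keys if k in self.rows}


def import_uniquing(store, items, attr):
    """Single identity attribute: one fetch for the whole array."""
    existing = store.fetch({item[attr] for item in items})
    for item in items:
        key = item[attr]
        if key in existing:
            existing[key].update(item)      # merge into the existing object
        else:
            # Insert, and remember it so duplicates later in the array merge.
            store.rows[key] = existing[key] = dict(item)


def import_composite(store, items, attrs):
    """Several identity attributes: one fetch per JSON object."""
    for item in items:
        key = tuple(item[a] for a in attrs)
        existing = store.fetch({key})
        if key in existing:
            existing[key].update(item)
        else:
            store.rows[key] = dict(item)


items = [{"id": i, "group": i % 3, "name": f"item {i}"} for i in range(1000)]

single = CountingStore()
import_uniquing(single, items, "id")

composite = CountingStore()
import_composite(composite, items, ("id", "group"))

print(single.fetches, composite.fetches)   # 1 1000
```

In this model the Insert strategy would make no fetches at all, which is why leaving out identityAttributes helps when the data is known to contain no duplicates.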

I hope this sheds some light on the subject.
