These are chat archives for hibernate/hibernate-orm

22nd
Jan 2014
Sanne Grinovero
@Sanne
Jan 22 2014 16:17
ok thanks that helps clarifying the high level concepts.
If you scroll up to my idea to refactor the data structure which makes a PersistenceContext, and organize entries in two levels of branching: persistenceId[ persistorSequenceId ] -> HashMap< 'id', state >
then it turns out you never need EntityKey instances ever to be allocated, nor cached.. we'd consistently stick around the pair {persisterId, id} . I totally understand that some code will be slightly less readable, but I hope the impact won't be too harsh?
BTW so far we have been inspecting the memory costs, but also to consider that from a CPU profiling point of view, a set of hot methods are those around looking up stuff from various maps by "entityName"; if we had these persisterIds assigned as unique ordinals at bootstrap, various of those Maps would be refactored in simple array lookups. Can hardly be more efficient..
Steve Ebersole
@sebersole
Jan 22 2014 16:21
At some point you'd still need entityName lookup
to resolve the "persisterid"
or "persisterordinal"
i understand your point about this, but I think readability is an important distinction here too
Sanne Grinovero
@Sanne
Jan 22 2014 16:22
sure, some helper method would exist to provide the lookup entityName -> persisterOrdinal, but if we consistently use the ordinal in most places these lookups should be limited; and keeping an int around is also going to be better than pointers to strings
Steve Ebersole
@sebersole
Jan 22 2014 16:23
a multi-level lookup on more than one value is less readable then a "tuple composite" (EntityKey)
Sanne Grinovero
@Sanne
Jan 22 2014 16:23
I can't deny that.
Steve Ebersole
@sebersole
Jan 22 2014 16:23
If we can solve this using EntityKey, I'd like to
Sanne Grinovero
@Sanne
Jan 22 2014 16:23
I just hope that we'll hide the complexity in the implementation details, and not much would leak out to the clients of the PersistenceContext
Steve Ebersole
@sebersole
Jan 22 2014 16:23
reason being that the values in a EntityKey have evolved
and it would suck to have to redesign internal structures everytime that happens
well it would simply in the forms of the method args
Sanne Grinovero
@Sanne
Jan 22 2014 16:25
so for example you'd invoke PersistenceContext.contains( id, persisterOrdinal) rather than PersistenceContext.contains( makeEntityKey( id, persister) ) .. no big deal? and less allocations.
Steve Ebersole
@sebersole
Jan 22 2014 16:25
PersistenceContext.contains( id, persister ) is valid too
probably persister would have a getOrdinal() type method
and that bring up another point
non-directive apis
contains( Object, int )
versus
contains( EntityKey )
which is more readable? which directs you better abotu usage?
this is not an api, sure
but it is an spi
Sanne Grinovero
@Sanne
Jan 22 2014 16:28
yea I can't beat that, contains( EntityKey ) is obviously superior. Still, you might need to figure out "where do I get the EntityKey from?"
Steve Ebersole
@sebersole
Jan 22 2014 16:28
definitely which is the original discussion
construct it and bear the allocation cost?
Sanne Grinovero
@Sanne
Jan 22 2014 16:29
I won't try to defend the API as it's a lost battle for sure. The question is how much are we really losing, and how much do we gain in exchange.
Steve Ebersole
@sebersole
Jan 22 2014 16:29
or, try to look it up based on another piece of the trinity?
Sanne Grinovero
@Sanne
Jan 22 2014 16:45
actually if the API allows me to look up the same stuff from various different entry points, I think it would be more confusing than providing a simple method with two arguments rather than one..
Steve Ebersole
@sebersole
Jan 22 2014 16:46
I dont understand this ^^
the problem is maybe that you are not seeing the bi-directionality here...
inherently the EntityKey is the main way to look stuff up
so, given an EntityKey (aka, processing a ResultSet, processing a Session.load call, etc) find me the corresponding entity instance and/or entry
but
Sanne Grinovero
@Sanne
Jan 22 2014 16:48
It was my understanding you where proposing to have multiple flavours for contains, like a contains( EntityKey ) and also a contains( EntityEntry ) and contains( id, persister ) .. if I understand you correctly, I'd consider that API less nice than having just one of them.
Steve Ebersole
@sebersole
Jan 22 2014 16:48
no
that was not what i was suggesting
Sanne Grinovero
@Sanne
Jan 22 2014 16:49
ah ok :)
Steve Ebersole
@sebersole
Jan 22 2014 16:49
i am suggesting a way to go from all 3 points in the trinity to the other 2
today it is not fully reflexive
Sanne Grinovero
@Sanne
Jan 22 2014 16:49
but that's hardly going to remove allocations?
Steve Ebersole
@sebersole
Jan 22 2014 16:50
it could
Sanne Grinovero
@Sanne
Jan 22 2014 16:50
unless you think of some jolly object which implements all three interfaces :)
Steve Ebersole
@sebersole
Jan 22 2014 16:50
bear in mind that the original discussion here is to avoid allocating EntityEntry
the entry for "uninitialized" entities anyway
Sanne Grinovero
@Sanne
Jan 22 2014 16:51
yes EntityEntry is the main goal, but EntityKey is right after that..
Steve Ebersole
@sebersole
Jan 22 2014 16:51
i am not too concerned with EntityKey
you are
and maybe you are right
but i also prefer readable apis ;)
Sanne Grinovero
@Sanne
Jan 22 2014 16:51
sure
Steve Ebersole
@sebersole
Jan 22 2014 16:51
persister is a great example as a matter of fact of your suggestion taken to the extreme imho
every little detail broken down into 50-gazillion arrays
Sanne Grinovero
@Sanne
Jan 22 2014 16:52
right, because the contains operation is actually one of the bottlenecks ;-)
Steve Ebersole
@sebersole
Jan 22 2014 16:53
you do realize that the object creations happen outside contains right ;)
meaning those are different discussions
bbiab, need to kick start this release
Sanne Grinovero
@Sanne
Jan 22 2014 16:56
sure, and yes I realize that. Which is exactly how I got to this -admittedly daring/crazy- idea of changing the internal structure: to be able to satisfy a contains request without actually allocating the key object first.
Steve Ebersole
@sebersole
Jan 22 2014 17:58
what exactly is it about the contains call that takes so long?
EntityKey as defined is very well suited as a Map key, and contains uses the EntityKey for just that
Sanne Grinovero
@Sanne
Jan 22 2014 18:00
it doesn't take very long, but it's responsible for a large % of the EntityKey instances still being allocated after our first round of cleanups
Steve Ebersole
@sebersole
Jan 22 2014 18:02
ok, so it is strictly an allocation fix
Not sure what to say tbh. I understand the desire to remove the allocations, but to me I just keep coming back to the question of the readability of contains(Object,int) versus contains(EntityKey)
Object, int is about as non-directive as you can get
( technically it would be Serializable, int )
Sanne Grinovero
@Sanne
Jan 22 2014 18:06
yea I'm with you on that. I just couldn't think of something better than to overhaul the whole internal container .. :worried:
Steve Ebersole
@sebersole
Jan 22 2014 18:07
how about a compromise : contains(Serializable, EntityPersister)
Sanne Grinovero
@Sanne
Jan 22 2014 18:07
we'd probably just need to focus on your other fix, the one you identified yesterday, and then get new figures to see where we stand
Steve Ebersole
@sebersole
Jan 22 2014 18:07
in terms of API its little more directive; still avoids the need for allocating a key object; and can still use the type of structures you suggest interanlly
my "other fix"... you mean a singleton entry for uninitialized entities?
Sanne Grinovero
@Sanne
Jan 22 2014 18:08
that compromise doesn't look too bad. I'm just a bit afraid that it's going to navigate in two hashmaps, so while we relax the allocation cost it might fire back on CPU cost..
yes, the HHH issue you shared with stale yesterday
Steve Ebersole
@sebersole
Jan 22 2014 18:09
not necessarily wrt 2 map lookups
like I said above, if EntityPersister knows about its ordinal...
Sanne Grinovero
@Sanne
Jan 22 2014 18:10
oooooh right love that
Steve Ebersole
@sebersole
Jan 22 2014 18:10
wrt HHH and EntityEntry, there is a rub
which is why i started explaining the trinity
today EntityEntry is used to hold what is conceptually (and physically) the EntityKey
which precludes it from being used in a singleton manner
we wold need to devise a way to remove the {persister,id} info from EntityEntry
Sanne Grinovero
@Sanne
Jan 22 2014 18:26
you need those values even for the "placeholder" use case?
Steve Ebersole
@sebersole
Jan 22 2014 18:27
again @Sanne this goes back to the fact that I cannot resolve these things completely reflexively
given an EntityEntry (aside from the values stored on EntityEntry) there is no way for me to know the entity instance nor the key to which it corresponds
thats what needs to be addressed
Sanne Grinovero
@Sanne
Jan 22 2014 18:48
right, but for the placeholder case I thought you would replace the EntityEntry instance - which is very large - with some very small object, which only has essentially the usual couple to identify what exactly is being placeholded for
interestingly it's another situation in which the placeholder could essentially be the {id} without any wrapping, if you happen to store these in separate containers for each type
Steve Ebersole
@sebersole
Jan 22 2014 19:06

The idea was to use a single(ton) EntityEntry instance for all the uninitialized cases. But the end-game is the same no matter what, we need to be able to resolve the "entity key" info given a EntityEntry.

I don't think forcing some other type of object into the EntityEntry maps is a good choice. You are just asking for problems imho.