In parts 1-7 of this article series, I completed coding the various types of associations using NHibernate. This is the last of the eight-part series. In this article we answer the question left open at the end of the first article: "How do we manage a persistent object across sessions?". The background section below is important, since it introduces the concepts required to follow the discussion here.
Background
All ORM applications use a persistence manager for functionality like database writes, reads and queries. In NHibernate this is handled through interfaces such as ISession, ITransaction and ICriteria. Each session first opens a transaction, and all activity on objects happens within that transaction boundary. (Calling it a "transaction" is a simplification: in ORM terms it is generally referred to as a unit of work and is not necessarily a database transaction, but the word "transaction" gives you the right idea and saves me from explaining what a unit of work is. That is an exhaustive and interesting topic in its own right, vital for developing an ORM library like NHibernate, but not necessary for consuming one in application development.)
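To make the session and transaction boundary concrete, here is a minimal sketch of the typical usage pattern. The Item entity and its mapping are assumptions made for the example; the rest uses the standard NHibernate interfaces.

using NHibernate;
using NHibernate.Cfg;

// Build the session factory once per application, from hibernate.cfg.xml
Configuration cfg = new Configuration().Configure();
ISessionFactory sessionFactory = cfg.BuildSessionFactory();

// One unit of work: open a session, begin a transaction, work, commit
using (ISession session = sessionFactory.OpenSession())
using (ITransaction tx = session.BeginTransaction())
{
    Item item = session.Get<Item>(42);   // Item is an assumed mapped entity
    item.Name = "Renamed";               // no explicit save call is needed
    tx.Commit();                         // changes are flushed to the db here
}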
Each session is associated with a persistence context. The persistence context provides functionality like maintaining unique object identity (called identity scoping), repeatable reads, avoiding recursive loads of objects in a graph, automatic transactional writes, dirty checking and so on. People who have worked on ORM internals will recognise what this persistence context is just from the list of functionality it offers: yes, it is the good old Identity Map. For everyone else, I explain the idea here because it is essential to understand. Note that the explanation is based on our previous experience developing an ORM library rather than on NHibernate specifically, but the idea is much the same for most ORM libraries.
As said earlier, the first step for ORM software is fixing the identity of a persistent object so that it maps uniquely to a database row. The next step is ensuring that this identity is unique, i.e. that only one persistent object exists for a particular database row. This avoids ambiguity. The usual way to ensure this uniqueness is an in-memory hashtable with the database primary key as the "key" and the persistent object as the "value". Whenever the ORM software looks up a persistent object, either while executing a query or when getting it by id (the primary key value), it first looks in this hashtable using the primary key. If the object is there, it is simply returned from the hashtable. Otherwise the ORM software creates the object from the values returned by the database query, adds it to the hashtable and finally returns it to the client code. Every persistent object used by the client code is stored in this hashtable; it is the first-level cache of the ORM software. When you ask for the same object repeatedly, only the first request hits the database; subsequent requests are served from the cache. This is what is meant by a repeatable read.

The next question is where to keep this hashtable. If you keep it at the process level, holding all objects of the process, you need to synchronise access to it from multiple threads. The general approach is therefore to keep the hashtable at the thread level: all persistent objects touched in a session and transaction within a thread are maintained in a per-thread hashtable. This is the persistence context cache (and it gives a rough picture of how an ORM library is built internally).

The other piece of functionality offered by the persistence context is dirty checking. The name is self explanatory: it means finding out whether an object has been modified within the scope of one transaction. ORM software does this in one of two ways: 1. by storing a snapshot of each persistent object at the start of the transaction and comparing it at the end of the transaction, or 2. by keeping an IsDirty flag that is set when an object is modified. NHibernate must use the snapshot method, because we do not set any flags when modifying persistent objects, nor do we inherit our objects from an NHibernate-defined interface, which the second method would require. So what is dirty checking used for? At the end of a successful transaction, NHibernate synchronises the changes of all modified (dirty) persistent objects to the database. You do not have to do anything; persistent objects are managed by NHibernate. The snapshot method is also preferred by a sophisticated library like NHibernate because it allows the generated UPDATE SQL to include only the columns that were actually modified, instead of every column of the object.

So now we are familiar with the persistence context. This persistence context and its cache are associated with each session separately. The diagram in Figure 1 gives a rough idea of the thread-level persistence context cache.
Figure 1 - Persistence context cache. Objects already in the cache are retrieved directly from the hashtable in two steps. For objects not in the cache, three steps are required, because the object first has to be brought into the cache and only then returned to the client. The call session.getbyid(identifier) is shown to illustrate retrieval by identifier, i.e. the primary key.
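As a rough illustration in code of the identity map idea described above (this is a simplified sketch, not NHibernate's actual internals; the class and method names are invented for the example):

using System;
using System.Collections.Generic;

// A deliberately simplified first-level cache: one per session, keyed by primary key.
public class PersistenceContextCache<TEntity> where TEntity : class
{
    private readonly Dictionary<object, TEntity> byId = new Dictionary<object, TEntity>();

    // Returns the cached instance if present (repeatable read); otherwise loads it
    // once from the database, caches it and returns it (identity scoping).
    public TEntity GetById(object id, Func<object, TEntity> loadFromDatabase)
    {
        TEntity cached;
        if (byId.TryGetValue(id, out cached))
            return cached;                      // later reads never hit the database

        TEntity loaded = loadFromDatabase(id);  // only the first read hits the database
        if (loaded != null)
            byId[id] = loaded;                  // one in-memory instance per database row
        return loaded;
    }
}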
Please note that I have simplified things considerably to keep them easy to understand. The aim is only to give a clear picture of the concepts needed for the rest of the article.
Code Example
A persistent object in NHibernate exists in one of four states: Transient, Persistent, Removed and Detached. Objects instantiated with the "new" operator are called transient objects; they do not yet have a database id. When a transient object is saved to the database it is assigned a database id and becomes a persistent object. Objects loaded from the database also have a database id and are likewise called persistent objects. These are the two main object states in any ORM. NHibernate introduces a third important state, the Detached state. It is vital to understand this detached state, and it is precisely to explain it that the persistence context cache was introduced in the background of this article.
When the session closes, the persistence context cache associated with it closes too. However, a reference to a persistent object may still exist, outliving the session in which it was created. Such persistent objects, now outside the scope of any persistence context cache, are said to be detached.
The final, less commonly used state is the Removed state. It is the intermediate state of an object that has been deleted from the database in a session but whose deleting transaction has not yet completed. Until the transaction completes the object is in the Removed state, after which it transitions back to the transient state.
Figure 2 below gives a rough idea of the states of the object:
Figure 2 - Various states of objects
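To make the transitions concrete, here is a small sketch of an object's life cycle. The Item entity and sessionFactory are the same assumptions as in the earlier snippet.

Item item = new Item { Name = "New item" };    // TRANSIENT: no database id yet

using (ISession session = sessionFactory.OpenSession())
using (ITransaction tx = session.BeginTransaction())
{
    session.Save(item);                        // PERSISTENT: id assigned, tracked by the session
    tx.Commit();
}                                              // session closed: item is now DETACHED

using (ISession session = sessionFactory.OpenSession())
using (ITransaction tx = session.BeginTransaction())
{
    session.Delete(item);                      // REMOVED until the transaction completes
    tx.Commit();                               // after commit the instance is transient again
}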
Now the problem with the fetch at the end of the first article should be easy to understand. We retrieved an object using a helper method on the DBRepository class. The DBRepository instance opens a session in its getItemById method, retrieves the object from the database and closes the session at the end of the method. When we later tried to use the object we got an exception. The reason is that NHibernate defaults to lazy="true", meaning the associations of a retrieved object are not loaded along with it; only a proxy is kept in their place. This is NHibernate's lazy loading feature. When such a proxy is accessed after the session has been closed, an exception is thrown. By setting lazy="false" we got the full object instead of proxies for the associations. That is one solution, but it cannot be applied everywhere.

With the background given here and our knowledge of the different object states, we can now see the real problem. When DBRepository closes the session after returning the object we asked for, the persistence context cache associated with that session is closed as well. What we are holding is a detached object. A detached object cannot simply be used in another session, because it does not exist in that session's persistence context cache. It has to be brought into the persistence context of the other session by a process called reattaching. The method call for it is:
newSession.Update(detachedObject);
This reattaches the object to the new session. Reattachment of unmodified persistent objects can also be done using the Lock() method.
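Putting it together, a sketch of the reattachment flow might look like this. The dbRepository variable and its getItemById method stand in for the helper from the first article; the exact signatures there may differ.

// The session used inside the repository is already closed,
// so the returned object is detached here.
// dbRepository is an assumed instance of the DBRepository helper.
Item item = dbRepository.getItemById(42);
item.Name = "Changed while detached";

using (ISession newSession = sessionFactory.OpenSession())
using (ITransaction tx = newSession.BeginTransaction())
{
    newSession.Update(item);                   // reattach; an UPDATE is issued at commit

    // Alternative for a detached object that has NOT been modified:
    // newSession.Lock(item, LockMode.None);

    tx.Commit();
}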
In scenarios where the new session has already loaded a different instance of the object, one with the same database identifier as the detached object you want to bring into the new session's persistence context cache, the method to use is:
newObject = newSession.Merge(oldObject);
This "merge" method will copy the state of the old_object inside the new_object. After this call, the client is supposed to use the new_object only.
The solution to the problem we had in the first article is to use these reattachment techniques. With reattachment, we now know how to work with objects across sessions.
Conclusion
This concludes the eight-part article series. I will write more articles on querying with NHibernate, fetching strategies in NHibernate and complete samples of object conversations using reattachment in the months to come, when I find time for them. Until then, enjoy NHibernate.