2/17/2010

 

Avoiding N+1 Problem

In the previous post, I talked about how to use SQLServer to store objects in a nosql way. But that leaves a opening question, "How can we avoid N+1 problem?". What is N+1 problem? Let's do a quick recap.

N+1 problem is also called 1+N problem and ripple loading. Loading "1" object, we also need to load "N" objects it references to for "N" times, one by one. So, that's why it is called 1+N. Why it is a problem? The problem is the overhead of network and sql execution.

Database <---Overhead--- Application
Database <---Overhead--- Application
Database <---Overhead--- Application
Database <---Overhead--- Application
The more sql we issued, the more overhead it would be. So, a natural solution is to batch the operations. If for the "N" reference, we just need "1" sql to load them all, then the problem is no longer a problem.

A quick solution would be loading all the objects inside a collection with one sql. For example, a object "User" has a field "messages" with collection of "Message". Then assuming the user kayla has a messages referencing message with id 2, 3, 4, 5. Then we just need one sql to load all her messages.

SELECT * FROM EntityState WHERE id IN (2, 3, 4, 5)

However, this solution only works for the case of loading collection reference. But if the "User" class also reference "Department", "Manager", "Calendar"... For each reference, we still need a separate SQL to get them because they are not in a collection.

Moving further, we can try to iterate all the fields of a object, get all the references. Also, for field with collection value, find out all the members. Then combining them together, we can load them together with one SQL again.

For example, kayla has Department with id 6, Manager with id 7, Calendar with id 8. And the messages referencing 2, 3, 4, 5. Then we just need

SELECT * FROM EntityState WHERE id IN (2, 3, 4, 5, 6, 7, 8)

And for the result set returned. Assigning 2, 3, 4, 5 back to the field messages. Assigning 6 back to field department. Assigning 7 back to field manager. Assigning 8 back to field calendar.

Does this solve all of the problems? Not yet... How about manager also has reference to several other objects 9, 10, department reference several other objects, 11, 12. Should we load 9, 10, 11, 12 in one sql? How can we do that?


This page is powered by Blogger. Isn't yours?

Subscribe to Posts [Atom]