SisoDb – Simple tip on how to write more performant queries

Time for a really quick tip when it comes to queries in SisoDb that will give you better performance. When firing of a query with lots of parameters in your expression, move the logic into your model instead. That way the expression only has to lookup one index-value for your structure, and it will probably also state what you are doing. Example

session
    .Query<WorkOrder>()
    .Where(w => w.Status == WoStatus.Done || w.Status == WoStatus.Trashed);

should be

session
    .Query<WorkOrder>()
    .Where(w => w.IsClosed);

This was just a super simple example, but really. Try and get it down to one param. You will gain performance.

//Daniel

What about tweaking in SisoDb?

This could have been the shortest post I ever have written, since the question could be answered with a simple one liner:

There are no tweaks

The whole point with SisoDb is making it simple and performant out of the box. It will never be targetting multitudes of different scenarios craving needs for adoptions to perform in each scenario.

With that said, there are things you should take in to thoughts.

Effectively work with references between documents

A document/structur has no relations. Everything included in it will get serialized and stored. The root is indicated by adding this one member public Guid|int SisoId { get; set;} to it, saying this is a document and I want to be able to store it. You can of course let your document contain other complex types/classes both with or without the SisoId member. But there’s a huge difference.

Given an Order:

public class Order
{
	public Guid SisoId { get; set; }
}

If I add some simple attributes, it’s easy to understand that these will belong to the document.

public class Order
{
	public Guid SisoId { get; set; }
	
	public string OrderNo { get; set; }
	
	public DateTime CreatedAt { get; set; }
	
	public DateTime? ShippedAt { get; set; }
}

But what if I want a Customer linked to my Order? First, look at the Order as a pile of documents with the title “Order”. When writing a document and you want to reference someone elses writings, you put a reference to it by providing information where to find it, using e.g footnotes. The same thing goes here. Include CustomerId in the Order.

public class Order
{
	...
	public Guid CustomerId { get; set; }
	...	
}

public class Customer
{
	public Guid SisoId { get; set; }
	
	public string CustomerNo { get; set; }
	...
	...
}

Now you have made a connection saying: “My Order documents can point to Customer documents”. What If I want to have a Customer instance in the Order, since I don’t always might want to fire of new queries to fetch a Customer for a certain Order. Easy, just add a property for the Customer.

public class Order
{
	...
	public Guid CustomerId
	{
		get { return Customer.SisoId; }
		private set { Customer.CustomerId = value; }
	}
	
	public Customer Customer { get; set; }
	
	public Order()
	{
		Customer = new Customer();
	}
}

Now, when you store a Order document, Customer will not be stored, since SisoDb will know that Customer is a document living on it’s own, since it has the member public Guid SisoId { get; set; }. The CustomerId will get stored, and this can be used when you are querying orders. What you then can do is to say: “Hey, load Orders and include Customers.”

using(var uow = db.CreateUnitOfWork())
{
	var orders = uow.Query<Order>(q =>
				     q.Include<Customer>(order => order.CustomerId));
}

What SisoDb will do is to in the same query and JSON-result incorporate the JSON of the included documents/structures. Hence you will not get any extra roundtrips and you have gotten yourself a fully loaded Order. More information.

Control what to make Queryable

By default every simple property is extracted from your document/structures and made queryable in SisoDb. This is of course something that you can control and if you have a deep graph with lots of members but you only query on a few of them, you will get better performance by only making these fields queryable, which means that the Indexes table for that document type will get smaller. More information.

db.StructureSchemas.StructureTypeFactory.Configurations.NewForType<Customer>().OnlyIndexThis(c => c.CustomerNo);

That’s it for now. Now I’m going to tweak SisoDb for you, so you don’t have to 😉

//Daniel

Overview of the internal workings of SisoDb

I thought it’s time to give you an overview of how the internals of SisoDb works so that you get some insight into “performance” considerations.

How is data stored?

Before continuing, lets give a quick intro to SisoDb. SisoDb is a NoSql influenced provider giving you a document-oriented solution over Sql-server. It does this by seing your object graphs as structures (document in a NoSql document-oriented database) where public members of simple types (strings, numbers, dates etc) in the hierarchy are made queryable. As a default every property of the graph in the contract of the passed class or interface is flattened to fit one row in a special “Indexes-table”. This table is there for making queries against your structure. You can easily go in and place indexes on the columns you query a lot. All values are extracted using cached delegates generated using IL-Generator emits, hence I don’t relly on dirty, timeconsuming reflection calls.

Json-serialization

The structure is also stored as Json in the “Structure-table”. This is done to keep an intact schemaless representaion of your structure so that structures can be reindexed and to give an effective deserialization process when performing queries.

I’m using one of the fastest Json-serializer I know of in the .Net community: ServiceStack.Text you can read about a performance compare between the popular Json.Net library here: Json.Net vs ServiceStack.Text.

Not making everything queryable?

I’m currently implementing support for this, where you will be able to specify “hey don’t make eveything queryable, since I will only query on these properties”. That way you can boost performance making the “Indexes-table” much more slimmer.

This feature is coming really soon, perhaps it’s already implemented.

Separated entities & sharding

Since data is document-oriented one certain structure gets it’s own tables and they stand on their own legs not having relations to other tables. This is also a mindset you need to have when working with SisoDb, a mindset that it’s not an O/RM over a relational data model, it’s a document-database. You could take advantage of this and shard your model. I’m planning support for this in the future, but right now you will manually need to put up a proxy accessing different SisoDb instances depending on the type of structure being consumed.

Use replication for readmodels and writemodels

Since I’m targetting SQL-server you get some built in benefits where you could take advantage of the builtin support for replicating data between databases. This way you could easily have a write and a read store as well as put up a store which you then use some ETL tool to transform the data to a model more fit for reporting, warehousing etc.

How is data inserted?

When inserting entities there is a demand that you have a property named “SisoId”. That is the only demand SisoDb has on your model. That property could either return an Integer or an Guid.

Integer identities

In this scenario SisoDb looks how many entities you are inserting and reservers a range of identities and assign them to the model before performing the insert to the database. This way no ineffective insert + select for each row have to be made (as with Entity framework or traditional identities in NHibernate).

Sequential Guid identities

SisoDb doesn’t use traditional generated Guids but instead it uses sequential guids mimicking the algorithm used in SQL Server’s sequential guids.

Bulkcopy

I make use of custom datareader that reads over the structures and is consumed by the SQL bulkcopy. That way there are “NO custom generated ad-hoc batch SQL inserts” but effective inserts using the bulk copy.

Querying

When querying using uow.Where or uow.Query or uow.Get etc. your specified lambda expression are translated to parameterized SQL executed as a plain select via the ADO.Net command and NOT executed using ad-hoc SQL and the EXEC function in SQL server.

Well that was a short overview of how the internals works. Will be glad to try and answer any questions. There are more information about it here: http://sisodb.com/docs

//Daniel