Finally some criticism of SisoDb!

Update: this post has a follow-up: http://daniel.wertheim.se/2012/03/11/ranting-is-good-for-you/

Finally, some criticism of SisoDb (http://www.code972.com/blog/?p=201). It means that someone has looked at SisoDb and taken a stand on whether they like it or not. For me it’s a great source of input for making things better, and for you it’s a reason to rethink whether you can or should use it. First, let’s be clear: no, SisoDb is not a traditional NoSQL solution. As the title of its official page says, it’s “a NoSQL’ish .Net implementation for SQL Server”, and that should get you thinking right there. If you read the docs (http://sisodb.com/docs) you will find information about why it exists.

A while ago I started to fiddle with Microsoft’s CTP edition of Code First in Entity Framework 4. The product is great, but I wanted something else; I wanted something “more” schemaless. I turned to MongoDB and put together an open source driver targeting .Net 4.

Still, there were things that I didn’t like, so I built a document DB over Lucene. I relatively quickly discovered that I missed all the great infrastructure that SQL Server provides: security, replication, scheduling etc. So I prototyped a solution that uses JSON to create a document/structure provider over SQL Server, namely Simple Structure Oriented Db (SisoDb).

So, I simply missed SQL Server and thought it would be interesting to see if a document-oriented solution could be put together for it. Why SQL Server and not MySQL, Oracle, SQLite etc.? Well, I use SQL Server in my daily job and I had to start somewhere.

What it tries to solve, in that sense, is getting away from complex joins, high normalization etc. As of right now it’s missing things like sharding, hence you should look at it more as a “document oriented provider for SQL Server”.

The development of SisoDb has never been driven by a wish to replace any existing solution. If I wanted a purer, accepted document-oriented NoSQL solution, I would go with MongoDB or RavenDB. If I had a case leaning towards key-value, I would go with a key-value oriented data storage solution. If I needed a highly normalized, table-oriented environment with good convention-based mapping and complex joins in queries, I would go with Entity Framework 4.1 and never NHibernate. But this is my personal taste.

Schemaless in SQL Server?

Of course not. Again, it should get you thinking when you see words like schemaless or schema-free in any data storage solution, and even more so when they are used in conjunction with an RDBMS.

In SisoDb, the idea is that the impedance mismatch between your object model and your data model should not be your concern. This is done by treating object graphs as documents and by handling the adding and dropping of columns when properties in your code model are introduced or removed. For more complex changes you will have to call an UpdateStructureSet method, where you CAN provide transformation code going from an old model version to a new one. So yes, there is some effort there for you, but it’s not about updating existing mappings (whether XML-based or in C#) and then running synchronizing Transact-SQL. Read more here: http://sisodb.com/docs/doc13. And yes, it will need to reinsert your data, since it needs to touch the JSON.

By dealing with changes in this way, you can create separate assemblies holding the old model versions, showing in code how the model has been updated.
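The general idea of upgrading stored documents with transformation code can be sketched as follows. This is a language-neutral illustration in Python, not SisoDb’s API: the function names `upgrade_customer` and `upgrade_documents` and both model shapes are hypothetical. It only shows why the data has to be reinserted: every JSON document is read, transformed and written back.

```python
import json

# Hypothetical documents stored under the old model version (one JSON string per row).
old_rows = [
    json.dumps({"Id": 1, "Name": "Ada Lovelace"}),
    json.dumps({"Id": 2, "Name": "Alan Turing"}),
]


def upgrade_customer(old):
    """Transformation code: split the old Name property into two new ones."""
    first, _, last = old["Name"].partition(" ")
    return {"Id": old["Id"], "FirstName": first, "LastName": last}


def upgrade_documents(rows, transform):
    """Deserialize each stored document, transform it and serialize it again."""
    return [json.dumps(transform(json.loads(row))) for row in rows]


new_rows = upgrade_documents(old_rows, upgrade_customer)
print(new_rows[0])
```

Keeping the old model and the transformation side by side, as above, is what makes the history of the model readable in code.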

Id vs SisoId

OK, someone told me programmers are going to go nuts over this, so I guess I’ll have to explain the decision, especially since I first used “Id” as the name. I want to be able to take entities from some other application domain and still keep their identity. Since a lot of people use the name Id, that name would “be taken”, and since I don’t want users to provide mappings, SisoId it is. But if you really need it, I guess you could add a property named Id in your class, pointing at SisoId? Or you could send me a request for supporting it.

Int and Guids for SisoId

SisoId is limited to ints and GUIDs, rather than arbitrary data types, because I don’t want someone specifying a killer long string that is then used in a join or in the index pages stored in SQL Server. So it’s about performance, which is also why I use sequential GUIDs (http://sisodb.com/docs/doc1). You can still add a constraint via an attribute and state that: public string OrderNo {get;set;} should be enforced to be unique. That’s what the Uniques-table is for: keeping such values unique.
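To illustrate why sequential GUIDs help, here is a minimal sketch of a time-ordered (“COMB”-style) GUID in Python. It is not SisoDb’s generator; the function names and the byte layout (10 random bytes followed by a 6-byte millisecond timestamp) are assumptions made for the example. The point is only that later keys sort after earlier ones, so an index on the key mostly appends instead of inserting at random positions.

```python
import os
import time
import uuid


def comb_guid(now_ms=None):
    """Return a GUID whose last 6 bytes hold a millisecond timestamp (assumed layout)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    random_part = os.urandom(10)            # keeps values unique within one millisecond
    time_part = now_ms.to_bytes(6, "big")   # makes values sortable by creation time
    return uuid.UUID(bytes=random_part + time_part)


def sort_key(guid):
    """Compare the timestamp bytes first, then the random bytes."""
    return guid.bytes[10:], guid.bytes[:10]


earlier = comb_guid(now_ms=1_000)
later = comb_guid(now_ms=2_000)
print(sorted([later, earlier], key=sort_key) == [earlier, later])
```

Which bytes a database treats as most significant when ordering GUIDs depends on the engine, so a real generator has to place the timestamp accordingly.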

Joins

No, there are no foreign key constraints or relationships in the schema design of the tables. When querying using the Where or Query syntax in SisoDb, SQL queries are executed against the query table (the Indexes-table), which is joined to the Structure-table on the primary key of both tables, and the JSON stored in the Structure-table is returned. But no, there’s no physical relation in the database. That’s really easy to fix if you need it. You are not going to need it to uphold referential integrity, though, since SisoDb uses the StructureId in the related tables and every insert is done in a transaction, so you will not end up in an inconsistent state.
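The two-table layout described above can be sketched with an in-memory SQLite database in Python. The table and column names (`CustomerStructure`, `CustomerIndexes`, `Json`) are hypothetical and the real SisoDb schema on SQL Server differs; the sketch only shows the pattern of filtering on an indexes table, joining on the shared key without a foreign key, and writing both rows in one transaction.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CustomerStructure (StructureId INTEGER PRIMARY KEY, Json TEXT NOT NULL);
    CREATE TABLE CustomerIndexes (StructureId INTEGER PRIMARY KEY, CustomerNo INTEGER, ZipCode TEXT);
""")


def insert_customer(structure_id, customer):
    """Write the JSON document and its flattened index row in a single transaction."""
    with conn:  # commits on success, rolls back if anything below raises
        conn.execute(
            "INSERT INTO CustomerStructure VALUES (?, ?)",
            (structure_id, json.dumps(customer)),
        )
        conn.execute(
            "INSERT INTO CustomerIndexes VALUES (?, ?, ?)",
            (structure_id, customer["CustomerNo"], customer["DeliveryAddress"]["ZipCode"]),
        )


def find_by_zip(zip_code):
    """Filter on the indexes table, then return the JSON from the structure table."""
    rows = conn.execute(
        "SELECT s.Json FROM CustomerIndexes i "
        "JOIN CustomerStructure s ON s.StructureId = i.StructureId "
        "WHERE i.ZipCode = ? ORDER BY s.StructureId",
        (zip_code,),
    ).fetchall()
    return [json.loads(row[0]) for row in rows]


insert_customer(1, {"CustomerNo": 100, "DeliveryAddress": {"ZipCode": "12345"}})
insert_customer(2, {"CustomerNo": 200, "DeliveryAddress": {"ZipCode": "54321"}})
print(find_by_zip("12345"))
```

Because both inserts share one transaction, a failure while writing the index row also undoes the document row, which is what keeps the two tables consistent without a declared relation.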

Create databases

Well, of course it can help you run CREATE DATABASE, but you need to execute it under an account that has the rights to do so. I’m doing it in the integration tests all the time using Db.EnsureNewDatabase.

Deep hierarchies

It does support persisting them, and it does support querying nested items, at least.

Querying

The querying is done against the flattened querying table (the Indexes-table), and as of now it’s up to you to provide indexes on it to boost query performance. Of course, when an aggregate root contains collections of other classes/types, denormalization will happen, since the complete graph should be persisted in one row. Querying collections of contained complex types, e.g. using QxAny, will use LIKE queries to match e.g. a product number in a string looking like this: <1><2>
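How a collection flattened into a string like <1><2> can be matched with a LIKE query is sketched below in Python with SQLite. The `<value>` format is taken from the example above; the table name `OrderIndexes`, the column name and the helper functions are hypothetical and do not show SisoDb’s actual SQL.

```python
import sqlite3


def flatten(values):
    """Flatten a collection into one delimited string, e.g. [1, 2] -> '<1><2>'."""
    return "".join(f"<{value}>" for value in values)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderIndexes (StructureId INTEGER PRIMARY KEY, ProductNos TEXT)")

# Hypothetical orders: structure id -> product numbers of its order lines.
orders = {1: [1, 2], 2: [12, 3], 3: [2, 21]}
conn.executemany(
    "INSERT INTO OrderIndexes VALUES (?, ?)",
    [(structure_id, flatten(product_nos)) for structure_id, product_nos in orders.items()],
)


def orders_with_product(product_no):
    """Return ids of orders whose flattened string contains the delimited value."""
    pattern = f"%<{product_no}>%"
    rows = conn.execute(
        "SELECT StructureId FROM OrderIndexes WHERE ProductNos LIKE ? ORDER BY StructureId",
        (pattern,),
    )
    return [row[0] for row in rows]


print(orders_with_product(2))  # [1, 3]
```

The delimiters are what stop product number 2 from matching 12 or 21. The trade-off is that a pattern with a leading wildcard generally cannot use an ordinary index seek, so such queries tend to scan.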

When putting together an example for showing query performance, I found a bug (thank you for reporting it). Queries like Customer.Address.AddionalValues[].Value are affected by it, but it will be corrected soon. As of now you should be able to query Customer.Address.ZipCode as well as nested collections, e.g. Order.OrderLines[].ProductNo (http://sisodb.com/Docs/Doc11).

I just did a quick test, querying a database with the same model as in the insert tests. There were 100,000 customers and I made two queries:

  • on Customer.CustomerNo:
  • on Customer.DeliveryAddress.ZipCode:

100,000 customers: identifying by customer.CustomerNo (int)
#1 Total seconds: 0.3312
#2 Total seconds: 0.0337

The first execution takes longer since the query plan has to be created and cached.

100,000 customers: identifying by customer.DeliveryAddress.ZipCode (string)
#1 Total seconds: 0.3371
#2 Total seconds: 0.0519

The first execution takes longer here as well, for the same reason.
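For anyone who wants to repeat this kind of cold versus warm measurement, a minimal timing helper could look like the Python sketch below. It is not the code used for the numbers above; `time_runs` is a hypothetical helper and the timed action is a stand-in for a real query.

```python
import time


def time_runs(action, runs=2):
    """Run the same action several times and return the elapsed seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in workload; replace the lambda with the query you want to measure.
timings = time_runs(lambda: sum(range(100_000)))
for number, seconds in enumerate(timings, start=1):
    print(f"#{number} Total seconds: {seconds:.4f}")
```

Reporting each run separately, instead of an average, is what makes the cost of the first, uncached execution visible.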

I will fix the bug above and write a more detailed querying comparison that shows what’s going on when querying, as well as timings with an actual foreign key relation in the database.

So why SisoDb?

Again, it’s not trying to be a silver bullet or to replace existing technologies. And if you hate it, hate it with all your heart. If you find a use case where it works for you... great! Then it wasn’t just a fun project after all. If you need a pure NoSQL solution, have a look at e.g. MongoDB or RavenDB.

//Daniel

8 thoughts on “Finally some criticism of SisoDb!”

  1. Dan, we tried using your implementation but went in a different direction for a number of reasons. Your stuff is 1.0 and it’s a good base for 1.0. Hopefully, you can integrate some of the criticism to make it better.

    One question – would it be possible for you to post execution numbers of the same joins using traditional RDBMS architecture? I would be curious to see the difference. It would be good to do this in the future whenever you are throwing out numbers so that we can get a better idea of comparisons.

    • Hi,

      I don’t mean to sound harsh, but I did point out that this was just a quick querying test:

      “I will fix the bug above and write a more detailed querying comparison and show what’s going on when querying, as well as timing having an actual F.K relation in the database.”

      I will post more info in a coming post where I do these comparisons, much like I did with inserts: http://daniel.wertheim.se/2011/04/19/sisodb-vs-entity-framework-4-1-code-first-inserts/

      It was just a quick response to the post written by Synhersko.

      I know you had to go with another solution since I don’t yet have any support for dynamics. I haven’t given that up; I have just been focusing on getting better memory consumption etc. There are a lot of things in the pipeline, so we will see what comes next.

      //Daniel

      • Sorry, did not mean it to come across as a criticism. Just was thinking it would be good to run comparisons any time you throw up the numbers because I think SISO queries may actually be faster for embedded data since they don’t need to do JOINs.

      • Hi,

        No worries, and I absolutely agree with you that the details should be there. I just wanted you to know that I was aware that there was too little info. Yes, I do believe there can be benefits, especially when other tools have to build up an answer from joined tables after having to query them using the same joins.

        //Daniel

  2. I meant to ask in my last comment, are you planning to do anything with dynamics or injection of schemas or has that option died on the vine?

  3. Hi Daniel,

    I’m using SisoDb in a project and am quite pleased with the performance and the code you’ve created. I have one question regarding identities other than SisoId. Is there a function in the API that will issue me the next value in an identity? For example, say I have a class:

    public class Story
    {
        public Guid SisoId { get; set; }

        [Unique(UniqueModes.PerType)]
        public int StoryId { get; set; }
    }

    I would like to increment StoryId without having to perform a function like:

    private int GenerateNextIdentity(IUnitOfWork uow)
    {
        // Load all stories and take the highest StoryId used so far.
        int nextIdentity = uow.GetAll<Story>()
            .AsQueryable()
            .Max(x => x.StoryId);

        return nextIdentity + 1;
    }

    Am I missing something? I understand that the table Story_Uniques contains the identity values, but I can’t seem to find anything in the API that will increment the StoryId for me. Thanks.

    • Hi,

      There’s nothing in the API that will let you query the SisoDbIdentities table, which holds the identities per entity. SqlIdentityGenerator is used to access it.

      //Daniel

  4. Pingback: Ranting is good for you « Daniel Wertheim
