SisoDb – Using DbPipe to add compression

In v16.1.0 of SisoDb I’ve added a new interface, SisoDb.IDbPipe, which lets you hook into the process of writing and reading the string data representing a document (JSON by default) to and from the DB. You could use it for encryption, compression and so on. Here I will just show you how to add compression.
Continue reading

SisoDb – An early prototype

A while ago I started to fiddle with the CTP of code first in Entity Framework 4. The product is great but I wanted something else; I wanted something “more” schemaless. I turned to MongoDB and wrote an open source driver targeting .Net 4. Still, there were things that I didn’t like, so I built a document DB over Lucene. Ayende Rahien has also done this and he has a great solution called Raven DB. I relatively quickly discovered that I missed all the great infrastructure that SQL-server provides: security, tables for the DBAs, replication, scheduler etc. So I prototyped a solution that uses JSON to create a document/structure DB over SQL-server, namely: Simple Structure Oriented Db (SisoDb). Note the word prototype here. I will make changes to it and I still need to build a nice querying solution, preferably using Linq.

How does it work

SisoDb stores your POCO-graphs (plain old CLR objects) by serializing them to JSON, which gives us a persistable representation of each POCO. For each entity there will be two tables (tables are created on the fly). The first holds the Id of the entity and its JSON representation. The second holds a key-value representation of each indexed property of the entity. If an entity contains complex types (custom classes), the scalar properties of these custom classes will be stored and indexed too. All scalar properties are stored as strings in their invariant-culture representation. Remember that the complete graph is serialized and deserialized, so there’s no magical lazy loading; instead you have to think a bit before designing deep graphs.
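
To make the index table idea concrete, here is a minimal sketch of how an object graph could be flattened into key-value pairs with invariant-culture strings. It is not the code SisoDb uses; IndexFlattener, SampleCustomer and SampleAddress are hypothetical names made up for this illustration, and the sketch ignores collections and cyclic graphs.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Reflection;

// Hypothetical sample types, only here to keep the sketch self-contained.
public class SampleAddress
{
    public string City { get; set; }
    public int AreaCode { get; set; }
}

public class SampleCustomer
{
    public string Firstname { get; set; }
    public SampleAddress BillingAddress { get; set; }
}

// Hypothetical helper, not part of SisoDb.
public static class IndexFlattener
{
    // Returns one "property path -> invariant string" pair per scalar property in the graph.
    public static List<KeyValuePair<string, string>> Flatten(object instance, string prefix = "")
    {
        var result = new List<KeyValuePair<string, string>>();
        if (instance == null) return result;

        var properties = instance.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance);
        foreach (PropertyInfo property in properties)
        {
            if (property.GetIndexParameters().Length > 0) continue; // skip indexers

            object value = property.GetValue(instance, null);
            string path = prefix + property.Name;

            if (IsScalar(property.PropertyType))
                result.Add(new KeyValuePair<string, string>(path, ToInvariantString(value)));
            else
                result.AddRange(Flatten(value, path + ".")); // complex type: recurse with a dotted path
        }
        return result;
    }

    private static bool IsScalar(Type type)
    {
        Type t = Nullable.GetUnderlyingType(type) ?? type;
        return t.IsPrimitive || t.IsEnum || t == typeof(string)
            || t == typeof(Guid) || t == typeof(DateTime) || t == typeof(decimal);
    }

    private static string ToInvariantString(object value)
    {
        if (value == null) return null;
        var formattable = value as IFormattable;
        return formattable != null
            ? formattable.ToString(null, CultureInfo.InvariantCulture)
            : value.ToString();
    }
}
```

Calling IndexFlattener.Flatten(customer) on a SampleCustomer gives pairs keyed by Firstname, BillingAddress.City and BillingAddress.AreaCode, which is the same dotted naming you will see in the structure schema output further down.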

Hey, wait! It sounds like duplicate data is stored.

Yes, that’s correct. The complete graph of the entity is stored as JSON as well as in the second table (the index table) as key-values. This is done to get a more query friendly representation of the entity. You will be able to exclude properties from being indexed, but right now everything is indexed.

JSON, hmmm. How do I handle schema changes?

Well, if you stick to the open-closed principle and just add new properties, nothing is needed; it all works. You also don’t have to do anything if you drop properties; the JSON-serialization process will handle this for you. Since I use the well-known JSON framework by Newtonsoft (http://json.codeplex.com/), you can of course affect the serialization by injecting custom converters etc.
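
Here is a small sketch of why added and dropped properties are harmless. Note that it swaps in System.Text.Json, which ships with modern .NET, instead of the Newtonsoft library SisoDb uses, simply to keep the example free of external packages; both ignore unknown JSON members by default. CustomerV1, CustomerV2 and SchemaChangeDemo are hypothetical names for illustration only.

```csharp
using System;
using System.Text.Json;

// The "old" shape of the entity, as it was when the JSON was stored.
public class CustomerV1
{
    public Guid Id { get; set; }
    public string Firstname { get; set; }
    public string Nickname { get; set; }   // dropped in V2
}

// The "new" shape: Nickname has been removed, Lastname has been added.
public class CustomerV2
{
    public Guid Id { get; set; }
    public string Firstname { get; set; }
    public string Lastname { get; set; }   // added in V2
}

// Hypothetical helper, not part of SisoDb.
public static class SchemaChangeDemo
{
    // Serializes with the old shape and reads the same JSON back with the new shape.
    public static CustomerV2 Upgrade(CustomerV1 stored)
    {
        string json = JsonSerializer.Serialize(stored);
        // Nickname in the JSON has no matching member and is skipped;
        // Lastname is missing from the JSON and keeps its default (null).
        return JsonSerializer.Deserialize<CustomerV2>(json);
    }
}
```

Renaming a property or changing its type is a different story and would need some kind of migration or a custom converter; the sketch only covers the add and drop cases described above.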

Pure POCOs, really?

Yes. No proxies or base-classes or interfaces. The only thing that is currently needed is one property:

public Guid Id { get; set; }

This is used to keep track of the entity and that is all that is needed.

Show me some code

The first step is of course to have an entity to persist. I will use a Customer that has scalar properties as well as complex properties of type Address.

The model

[Serializable]
public class Customer
{
    public Guid Id { get; set; }
    public string Firstname { get; set; }
    public string Lastname { get; set; }
    public ShoppingIndexes ShoppingIndex { get; set; }
    public DateTime CustomerSince { get; set; }
    public Address BillingAddress { get; set; }
    public Address DeliveryAddress { get; set; }
    public Customer()
    {
        ShoppingIndex = ShoppingIndexes.Level0;
        BillingAddress = new Address();
        DeliveryAddress = new Address();
    }
}

[Serializable]
public class Address
{
    public string Street { get; set; }
    public string Zip { get; set; }
    public string City { get; set; }
    public string Country { get; set; }
    public int AreaCode { get; set; }
}

[Serializable]
public enum ShoppingIndexes
{
    Level0 = 0,
    Level1 = 10,
    Level2 = 20,
    Level3 = 30
}

Pay attention, there are no specific interfaces or base-classes. It’s a POCO!

Structure schemas

To get the hang of things I will first show you the structure schema that is generated. Note! This is nothing you need to control. The structure schema contains accessors for the key property and all the indexed properties. The accessors are used to get values from an instance of the entity.

var customerSchema = database.StructureSchemas.GetSchema<Customer>();

Console.Out.WriteLine("KeyAccessor.Name \t=\t {0}", customerSchema.KeyAccessor.Name);

foreach (var indexAccessor in customerSchema.IndexAccessors)
    Console.Out.WriteLine("IndexAccessor.Name \t=\t {0}", indexAccessor.Name);


KeyAccessor.Name = Id
IndexAccessor.Name = Firstname
IndexAccessor.Name = Lastname
IndexAccessor.Name = ShoppingIndex
IndexAccessor.Name = CustomerSince
IndexAccessor.Name = BillingAddress.Street
IndexAccessor.Name = BillingAddress.Zip
IndexAccessor.Name = BillingAddress.City
IndexAccessor.Name = BillingAddress.Country
IndexAccessor.Name = BillingAddress.AreaCode
IndexAccessor.Name = DeliveryAddress.Street
IndexAccessor.Name = DeliveryAddress.Zip
IndexAccessor.Name = DeliveryAddress.City
IndexAccessor.Name = DeliveryAddress.Country
IndexAccessor.Name = DeliveryAddress.AreaCode

Again, this is nothing you have to use, tweak or manage. It’s just there and works.

Insert some data

To get started you need a SisoConnectionInfo, which contains the connection string. Currently there’s also a StorageProviders enumeration, but it only has one provider (since I have stripped out the Lucene provider). You also need an instance of a SisoDatabase. This should be long-lived and is preferably managed by your IoC container. After that you do transactional work by letting the SisoDatabase create a UnitOfWork instance.

var cnInfo = new SisoConnectionInfo(
    StorageProviders.SqlProvider,
    @"Data source=localhost;Initial catalog=SisoDbLab;Integrated security=SSPI;");

// Long-lived instance; create it once and reuse it.
var database = new SisoDatabase(cnInfo);
database.EnsureNewDatabase();

//... create some new customer instances...

using (var unitOfWork = database.CreateUnitOfWork())
{
    unitOfWork.InsertMany(customers);
    unitOfWork.Commit();
}

The UnitOfWork implements IDisposable, so you should use it in conjunction with a using-block. If you don’t call Commit(), a rollback will be performed.
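
The commit-or-rollback behaviour can be illustrated with a tiny in-memory stand-in. This is only a sketch of the pattern, not how SisoDb’s UnitOfWork is implemented; InMemoryUnitOfWork and its string “documents” are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory unit of work: changes are buffered until Commit is called.
public sealed class InMemoryUnitOfWork : IDisposable
{
    private readonly List<string> _store;                       // the "database"
    private readonly List<string> _pending = new List<string>(); // uncommitted inserts

    public InMemoryUnitOfWork(List<string> store)
    {
        _store = store;
    }

    public void Insert(string document)
    {
        _pending.Add(document);
    }

    // Flushes the buffered inserts to the store.
    public void Commit()
    {
        _store.AddRange(_pending);
        _pending.Clear();
    }

    // Anything still pending at this point was never committed, so it is discarded.
    public void Dispose()
    {
        _pending.Clear();
    }
}
```

Wrapped in a using-block, an Insert without a following Commit leaves the store untouched once the block exits, while Insert followed by Commit makes the document visible in the store.
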
That’s it for today. Remember, it’s a prototype, and for the next post (hopefully this week) I will have some query capabilities to show, but not a complete Linq-provider.

The project is hosted here: http://code.google.com/p/sisodb/

Features in the pipe

– Querying
– Provide you with the unit-tests and integration tests.
– Attribute that enables you to say “Don’t index this property in the entity”.

Features in the future

– Navigation properties
– Support for other databases

//Daniel

SisoDb – First code released

Recently I started playing with Lucene.Net and Json.Net in order to create my own solution for persisting hierarchical structures of data. The goal is to make it schemaless and really simple. The result was: SisoDb. As of right now it supports:

  • Committable Unit of work
  • Insert
  • Update
  • DeleteByKey
  • GetByKey

Roadmap

Over the next few hours I spend on the project, my focus will be on:

  • Querying support both for Indexes as well as free text querying using Lucene
  • Version-control for concurrency detection
  • Some support for relations
  • Distributed proxy

Some basics

SisoDb persists Structures. The only demand that it puts on the structures being persisted is that they must have a property that contains a Guid which acts as the unique identifier for the Structure. By convention, this member should be named Id, but you can configure this via schemas. It can be System.Nullable<Guid> or System.Guid. You don’t have to provide a value, since SisoDb will generate this for you upon insert. Contained objects in the Structure should not have an Id-property. A Structure should be seen as an aggregate where contained objects and properties are leaves of data, represented either by simple value types (strings, ints, decimals, datetimes) or by complex types (your own types). The Structure will be represented in the database as Json, which is produced using Newtonsoft’s Json.Net library. This means you can use the attributes of this library to affect how the structure is serialized and deserialized. As of right now there is one underlying provider, which stores the Structures (the Json, Key and Indexes) using Lucene.Net. I’m thinking of building a provider that uses MsSql as the storage medium, but as of right now, it’s Lucene that is used.
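
The Id convention could be sketched roughly like this. It is an illustration under my own assumptions and not SisoDb’s actual implementation; KeyConvention and SampleStructure are hypothetical names. The helper looks up a property named Id of type Guid or Nullable<Guid> and assigns a new Guid only when no value has been provided.

```csharp
using System;
using System.Reflection;

// Hypothetical structure with a nullable key, only here to keep the sketch self-contained.
public class SampleStructure
{
    public Guid? Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper, not part of SisoDb.
public static class KeyConvention
{
    // Returns the existing key, or generates and assigns a new one if none was provided.
    public static Guid EnsureId(object structure)
    {
        if (structure == null) throw new ArgumentNullException(nameof(structure));

        PropertyInfo idProperty = structure.GetType().GetProperty("Id");
        bool isGuid = idProperty != null
            && (idProperty.PropertyType == typeof(Guid) || idProperty.PropertyType == typeof(Guid?));
        if (!isGuid)
            throw new InvalidOperationException("The structure needs an Id property of type Guid or Guid?.");

        // A Guid? with a value is boxed as a plain Guid; a Guid? without one is returned as null.
        object current = idProperty.GetValue(structure, null);
        if (current is Guid existing && existing != Guid.Empty)
            return existing;

        Guid generated = Guid.NewGuid();
        idProperty.SetValue(structure, generated, null);
        return generated;
    }
}
```

Calling KeyConvention.EnsureId twice on the same instance returns the same Guid both times, since the second call finds the value assigned by the first.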

An example

Create a Structure

The example model is really simple: a Customer entity containing some simple properties for naming as well as a billing address and a delivery address, which are represented by a custom complex type, Address.

[Serializable]
public class Customer
{
    public Guid Id { get; set; }
    public string Firstname { get; set; }
    public string Lastname { get; set; }
    public ShoppingIndexes ShoppingIndex { get; set; }
    public DateTime CustomerSince { get; set; }
    public Address BillingAddress { get; set; }
    public Address DeliveryAddress { get; set; }

    public Customer()
    {
        ShoppingIndex = ShoppingIndexes.Level0;
        BillingAddress = new Address();
        DeliveryAddress = new Address();
    }
}

[Serializable]
public class Address
{
    public string Street { get; set; }
    public string Zip { get; set; }
    public string City { get; set; }
    public string Country { get; set; }
}

[Serializable]
public enum ShoppingIndexes
{
    Level0 = 0,
    Level1 = 10,
    Level2 = 20,
    Level3 = 30
}

Create a Database

An instance of a SisoDatabase implementation is supposed to be long-lived. It holds meta-data/schema-information about the structures being persisted. The SisoDatabase takes an IConnectionInfo which lets you define which underlying database implementation to use as well as a connection string. When using the LuceneIo provider, the connection string should contain the path to the directory where the data (indexes, to use the correct Lucene terminology) should be stored.

var connectionInfo = new SisoConnectionInfo(StorageProviders.LuceneIo, DbPath);
var database = new SisoDatabase(connectionInfo);

This is something you would put in your IoC-container or at least in a factory.

Create a Unit-of-work & Insert a customer

The Unit-of-work is designed to be short-lived. It is used for performing CRUD-operations against the database. It implements IDisposable and it’s recommended to use it in conjunction with the using-statement. If you don’t call Commit, ongoing changes aren’t flushed to the database. If you don’t use the using-statement, make sure to call Dispose since it ensures that write-locks are released.

var customer = new Customer
{
    Firstname = "Daniel",
    Lastname = "Wertheim",
    ShoppingIndex = ShoppingIndexes.Level1,
    CustomerSince = DateTime.Now,
    BillingAddress =
        {
            Street = "The street",
            Zip = "12345",
            City = "The City",
            Country = "Sweden"
        }
};

using (var unitOfWork = database.CreateUnitOfWork())
{
    unitOfWork.Insert(customer);
    unitOfWork.Commit();
}

Refetch and Update Customer

As of right now the only query-support is GetByKey which lets you retrieve a single item by a key-value (Guid).

using (var unitOfWork = database.CreateUnitOfWork())
{
    // id is the Guid key of a previously inserted customer
    var customer = unitOfWork.GetByKey<Customer>(id);
    customer.ShoppingIndex = ShoppingIndexes.Level2;
    customer.Firstname = "Hans";

    unitOfWork.Update(customer);
    unitOfWork.Commit();
}

Delete Customer

As of right now you can only delete via the key-value.

using (var unitOfWork = database.CreateUnitOfWork())
{
    unitOfWork.DeleteByKey<Customer>(customer.Id);
    unitOfWork.Commit();
}

Custom key

As mentioned, by default you don’t have to provide any schema information as long as the root entity (structure) contains an Id:Guid or Id:Nullable<Guid> member. If you want to use another key member, this is how you do it.

database.Schemas.Register<Customer>(
    builder => builder
        .SetKey(c => c.UId)
        .AddIndex(c => c.Lastname));
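
If you wonder how a lambda like c => c.UId can be turned into a member name, here is a minimal sketch using expression trees. It only illustrates the general technique; I am not claiming this is how the schema builder does it internally. MemberNames and KeyedCustomer are hypothetical names.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity with a non-conventional key member.
public class KeyedCustomer
{
    public Guid UId { get; set; }
    public string Lastname { get; set; }
}

// Hypothetical helper, not part of SisoDb.
public static class MemberNames
{
    // Extracts the name of the member accessed by a simple lambda such as c => c.UId.
    public static string For<T, TMember>(Expression<Func<T, TMember>> accessor)
    {
        if (accessor == null) throw new ArgumentNullException(nameof(accessor));

        Expression body = accessor.Body;

        // When TMember is object, value types are wrapped in a Convert node; unwrap it.
        if (body is UnaryExpression unary && unary.NodeType == ExpressionType.Convert)
            body = unary.Operand;

        if (body is MemberExpression member)
            return member.Member.Name;

        throw new ArgumentException("Expected a simple member access like c => c.Member.", nameof(accessor));
    }
}
```

For example, MemberNames.For((KeyedCustomer c) => c.UId) yields the member name UId. For a nested access such as c => c.BillingAddress.City this sketch would only return the last member, so building a full dotted path would need a walk up the expression chain.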

That’s it for now. The current code can be found here.

//Daniel