109: (Default)
[personal profile] 109
In response to Vova Patryshev's article

The question I tried to answer here is "How to write efficient and generic in-memory index". It seems that the concept of Map, coming from functional languages, is unavoidable. Fortunately, in C# we have delegates, which allow us to avoid the hassle with interfaces. For the sake of keeping the dialog going, I used the same data class (hunter, datetime, mammothWeight), invented by Vova.


File index.cs contains an implementation of generic index structure. Please note that sorting issues are not addressed. Obviously, we can sort the index each time we do binary search or we can always maintain it in the sorted state by inserting new value in correct position. The choice is affected by the size if the index and the frequency of inserts-deletes versus the frequency of binary searches. This example does neither, it simply re-sorts the whole collection every time new object is added, which is of course very inefficient, but optimizing sorting strategy is out of scope for now.

It is also should be obvious how to extend this class to change the default sorting with Comparer.Default to any imaginable sort order, which would allow index to act as so-called "functional index". Finally, "Get" method can be overloaded to accept two parameters - Key and delegate Predicate<Key>, which will allow to run queries like "get data where key > something" or even "get data where SomeComplicatedFunction(key) is true".

Please enjoy the code:
using System.Collections.Generic;

delegate Y Map<X, Y>(X x);

class Index<Key, Data> : List<Data>
{
	private Map<Data, Key> getKey;
	private Comparer<Key> keyComparer = Comparer<Key>.Default;

	public Index(Map<Data, Key> getKey)
	{
		this.getKey = getKey;
	}

	private int Compare(int index, Key key)
	{
		return keyComparer.Compare(getKey(this[index]), key);
	}

	private int Compare(Data x, Data y)
	{
		return keyComparer.Compare(getKey(x), getKey(y));
	}

	public new void Sort()
	{
		base.Sort(Compare);
	}

	private int BinarySearch(Key key)
	{
		int first = 0;
		int last = Count - 1;
		while (first <= last)
		{
			int current = first + ((last - first) >> 1);
			int comparison = Compare(current, key);
			if (comparison == 0)
				return current;
			if (comparison < 0)
				first = current + 1;
			else
				last = current - 1;
		}
		return -1;
	}

	public List<Data> Get(Key key)
	{
		int mid = BinarySearch(key);
		if (mid == -1)
			return new List<Data>(); // empty list, arguable

		int first = mid - 1;
		while (first >= 0)
		{
			if (Compare(first, key) != 0)
				break;
			first--;
		}
		first++;

		int last = mid + 1;
		while (last < Count)
		{
			if (Compare(last, key) != 0)
				break;
			last++;
		}
		last--;

		return GetRange(first, last - first + 1);
	}
}


File hunt.cs is the example of the index usage. Please note that there are no constraints for the Hunt class. No interfaces to implement, nothing. The only "adapter" between data class and the index is the implementation of the Map delegate. This code doesn't have to be a part of data class, as well as the code that puts data in the storage or looks it up. Which means that the suggested Index class can be applied to any existing class without the need to change it.

using System;
using System.Collections.Generic;

class Hunt
{
	private string hunterName;
	private DateTime killedDate;
	private double mammothWeight;

	public Hunt(string hunterName, DateTime killedDate, double mammothWeight)
	{
		this.hunterName = hunterName;
		this.killedDate = killedDate;
		this.mammothWeight = mammothWeight;
	}

	public override string ToString()
	{
		return string.Format(
			"{0} scored {1} on {2}",
			hunterName,
			mammothWeight,
			killedDate);
	}

	private static string GetHunterName(Hunt hunt)
	{
		return hunt.hunterName;
	}

	private static Index<string, Hunt> hunterNameIndex = 
		new Index<string, Hunt>(new Map<Hunt, string>(GetHunterName));
	
	public static void Add(Hunt hunt)
	{
		hunterNameIndex.Add(hunt);
		hunterNameIndex.Sort();
	}

	public static List<Hunt> ByHunterName(string hunterName)
	{
		return hunterNameIndex.Get(hunterName);
	}
}

class TestHunt
{
	public static void Main()
	{
		Hunt.Add(new Hunt("Kosta", DateTime.Now, 42));
		Hunt.Add(new Hunt("Vlad", DateTime.Now, 142));
		Hunt.Add(new Hunt("Vlad", DateTime.Now, 143));

		Console.WriteLine("\nKosta scores:");

		List<Hunt> list = Hunt.ByHunterName("Kosta");
		foreach (Hunt hunt in list)
		{
			Console.WriteLine(hunt);
		}

		Console.WriteLine("\nJohn scores:");

		list = Hunt.ByHunterName("John");
		foreach (Hunt hunt in list)
		{
			Console.WriteLine(hunt);
		}

		Console.WriteLine("\nVlad scores:");

		list = Hunt.ByHunterName("Vlad");
		foreach (Hunt hunt in list)
		{
			Console.WriteLine(hunt);
		}
	}
}


In order to test the code in Visual Studio, create empty solution and add Index.cs and Hunt.cs with the content copied from above.

Profile

109: (Default)
109

March 2019

S M T W T F S
     12
3456789
101112131415 16
17181920212223
24252627282930
31      

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags