Monday, December 30, 2019

Create a Dictionary using LINQ

Dictionaries are handy to have in your code, and occasionally I need to look up a word I don't know!!  Oh wait ... that's a different kind of dictionary!  ;0)

Say that you have some data that you'd like to put into a Dictionary. One reason for using a Dictionary might be to make it easy to process one group of items at a time. Another reason might be to easily find data based on the dictionary key. The data could be in a file, or maybe it's been retrieved over TCP or from a database. It doesn't matter. The data might look something like this:

MyFirstGroup:
item1

MySecondGroup:
item2
item3
item4

Or maybe delimited data, like this:

|CARCOMPANY=GeeksRide|CARNUMBER=C121|CARNUMBER=C122|CARCOMPANY=UberGeek|CARNUMBER=C133|

The above string may look a bit familiar to regular readers of my blog. It's similar to data coming from a fictional ride-sharing system that I "invented" in order to write this blog post about parsing data: https://geek-goddess-bonnie.blogspot.com/2019/09/parsing-data-with-metadata.html

Oh, and how about XML? Here's some XML (in this case, representing a DataSet):

<RideShareOwnerDataSet>
  <CompanyInfo>
    <CarCompanyID>1</CarCompanyID>
    <CarCompany>GeeksRide</CarCompany>
  </CompanyInfo>
  <CompanyInfo>
    <CarCompanyID>2</CarCompanyID>
    <CarCompany>UberGeek</CarCompany>
  </CompanyInfo>
  <CarInfo>
    <CarNumberID>1</CarNumberID>
    <CarCompanyID>1</CarCompanyID>
    <CarNumber>C121</CarNumber>
  </CarInfo>
  <CarInfo>
    <CarNumberID>2</CarNumberID>
    <CarCompanyID>1</CarCompanyID>
    <CarNumber>C122</CarNumber>
  </CarInfo>
  <CarInfo>
    <CarNumberID>3</CarNumberID>
    <CarCompanyID>2</CarCompanyID>
    <CarNumber>C133</CarNumber>
  </CarInfo>
</RideShareOwnerDataSet>

Of course, depending on the data, the processing in your application could be quite different. So, let's take a look at how we would handle these three very different types of data.

In all of these examples, I am making use of LINQ syntax to create the Dictionary. In two of these examples, I make use of LINQ's .Aggregate() function. Here is a good link that explains how .Aggregate works:

https://stackoverflow.com/questions/7105505/linq-aggregate-algorithm-explained
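
If .Aggregate() is new to you, here is a minimal warm-up sketch (it is not one of the examples that follow) of the pattern used in this post: a seed value, in this case an empty Dictionary, is passed through every element and handed back at the end. The class and method names are made up for illustration, and the sketch assumes that every word is non-empty.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateWarmUp
{
    // Hypothetical helper: groups words by their first letter.
    // Assumes every word has at least one character.
    public static Dictionary<char, List<string>> GroupByFirstLetter(IEnumerable<string> words)
    {
        return words.Aggregate(
            new Dictionary<char, List<string>>(),     // seed: the empty Dictionary we will fill
            (acc, word) =>
            {
                // Create the list for this letter the first time we see it
                if (!acc.TryGetValue(word[0], out var list))
                    acc[word[0]] = list = new List<string>();

                list.Add(word);
                return acc;                           // hand the same Dictionary to the next step
            });
    }

    public static void Main()
    {
        var result = GroupByFirstLetter(new[] { "apple", "avocado", "banana" });
        foreach (var kvp in result)
            Console.WriteLine("{0}: {1}", kvp.Key, string.Join(", ", kvp.Value));
    }
}
```

The important detail is the "return acc;" at the end of the lambda. Whatever the lambda returns becomes the accumulator for the next element, so the same Dictionary keeps getting filled.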

Lines of Data

For this first example, I put the data into a text file that I called GroupAndItem.txt. We can use File.ReadLines() to read each line of data and the LINQ .Aggregate() function to make it pretty easy to dump this data into a Dictionary:

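string key = "";
var dcSectionItems = File.ReadLines("GroupAndItem.txt")
    .Where(s => !string.IsNullOrWhiteSpace(s)) // skip the blank lines between groups
    .Aggregate(new Dictionary<string, List<string>>(), (a, s) => {
        var i = s.IndexOf(':');
        if (i < 0) a[key].Add(s); // no colon: an item for the current group
        else a[(key = s.Substring(0, i))] = new List<string>(); // "GroupName:" starts a new group
        return a;
    });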


And then we can show the data by iterating the Dictionary utilizing Console.WriteLine() for display.

foreach (var kvp in dcSectionItems)
{
    Console.WriteLine("{0}:", kvp.Key); // section name
    foreach (var item in kvp.Value)
    {
        Console.WriteLine("\t{0}", item); // item name
    }
}
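
The Aggregate above depends on the outer variable key to remember which group it is currently in. As an alternative, here is a hedged sketch that carries the current key inside the accumulator itself, using a tuple as the seed. The GroupedLines class and its Parse method are hypothetical names, and the lines come from an in-memory array rather than a file so that the sketch is self-contained. It also skips blank lines and drops any item that shows up before the first group header, which are my assumptions about how such input should be handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupedLines
{
    // Hypothetical helper: the "current key" travels inside the accumulator tuple
    public static Dictionary<string, List<string>> Parse(IEnumerable<string> lines)
    {
        var seed = (Groups: new Dictionary<string, List<string>>(), Key: (string)null);

        return lines
            .Where(s => !string.IsNullOrWhiteSpace(s))          // ignore blank separator lines
            .Aggregate(seed, (acc, s) =>
            {
                var i = s.IndexOf(':');
                if (i >= 0)
                {
                    // A "GroupName:" line starts a new group and becomes the current key
                    var key = s.Substring(0, i);
                    acc.Groups[key] = new List<string>();
                    return (Groups: acc.Groups, Key: key);
                }

                // An item line: add it to the current group (dropped if no group yet)
                if (acc.Key != null) acc.Groups[acc.Key].Add(s);
                return acc;
            })
            .Groups;
    }

    public static void Main()
    {
        var lines = new[] { "MyFirstGroup:", "item1", "", "MySecondGroup:", "item2", "item3", "item4" };
        foreach (var kvp in Parse(lines))
        {
            Console.WriteLine("{0}:", kvp.Key);
            foreach (var item in kvp.Value) Console.WriteLine("\t{0}", item);
        }
    }
}
```

Keeping the state in the accumulator means the lambda no longer touches anything outside of itself, so the same method can safely be called for several inputs.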


Delimited Data

Rather than reading from a file this time, I simply used a string to hold the pipe-delimited data. In practice, this data could be coming in from TCP, a web service or a database. Notice that GeeksRide has 2 cars, while UberGeek has only one. Again, I make use of the LINQ .Aggregate() function to create the Dictionary:

string pipeCompanyCars = "|CARCOMPANY=GeeksRide|CARNUMBER=C121|CARNUMBER=C122|CARCOMPANY=UberGeek|CARNUMBER=C133|";

string[] entries = pipeCompanyCars.Split(new char[] { '|' }, StringSplitOptions.RemoveEmptyEntries);
key = "";
var dcCompanyCars = entries
    .Aggregate(new Dictionary<string, List<string>>(), (a, s) => {
        // "CARCOMPANY=" is 11 characters long, "CARNUMBER=" is 10
        if (s.StartsWith("CARCOMPANY=", StringComparison.Ordinal)) a[(key = s.Substring(11))] = new List<string>();
        else if (s.StartsWith("CARNUMBER=", StringComparison.Ordinal)) a[key].Add(s.Substring(10));
        return a;
    });


And, again, show the data in the Dictionary the same way:

foreach (var kvp in dcCompanyCars)
{
    Console.WriteLine("{0}:", kvp.Key); // car company name
    foreach (var item in kvp.Value)
    {
        Console.WriteLine("\t{0}", item); // car numbers
    }
}
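
The tag names and their lengths are hard-coded in the Aggregate above. If your delimited data uses other tags, one possible variation is to split each entry on the first '=' and pass the tag names in as parameters. This is only a sketch: the PipeData class, its Parse method and the groupTag and itemTag parameters are names I made up for illustration. It assumes every entry looks like TAG=value, and it ignores entries that don't, as well as items that appear before the first group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PipeData
{
    // Hypothetical helper: groupTag starts a new group, itemTag adds to the current one
    public static Dictionary<string, List<string>> Parse(string data, string groupTag, string itemTag)
    {
        string key = null; // current group; null until the first groupTag entry shows up

        return data.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(entry => entry.Split(new[] { '=' }, 2))    // "TAG=value" becomes { "TAG", "value" }
            .Where(parts => parts.Length == 2)                  // skip anything without an '='
            .Aggregate(new Dictionary<string, List<string>>(), (a, parts) =>
            {
                if (parts[0] == groupTag)
                {
                    key = parts[1];
                    // Keep the existing list if the same group shows up again
                    if (!a.ContainsKey(key)) a[key] = new List<string>();
                }
                else if (parts[0] == itemTag && key != null)
                {
                    a[key].Add(parts[1]);
                }
                return a;
            });
    }

    public static void Main()
    {
        string pipeCompanyCars = "|CARCOMPANY=GeeksRide|CARNUMBER=C121|CARNUMBER=C122|CARCOMPANY=UberGeek|CARNUMBER=C133|";
        foreach (var kvp in Parse(pipeCompanyCars, "CARCOMPANY", "CARNUMBER"))
            Console.WriteLine("{0}: {1}", kvp.Key, string.Join(", ", kvp.Value));
    }
}
```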



XML Data


I almost exclusively utilize XML with DataSets. Most of the time, the DataSets in my applications are Typed DataSets. But, every once in a while, I'll use a plain old DataSet for one-time uses. My first example with XML data here will simply use a plain old DataSet.

The XML I'm using could just as easily have been read from an .xml file to fill the DataSet, using ds.ReadXml("FileName.xml"), but here I am filling the DataSet from an XML string. Here's the string and the code to read it into a DataSet:

string xmlCompanyCars =
    "<RideShareOwnerDataSet>" +
      "<CompanyInfo>" +
        "<CarCompanyID>1</CarCompanyID>" +
        "<CarCompany>GeeksRide</CarCompany>" +
      "</CompanyInfo>" +
      "<CompanyInfo>" +
        "<CarCompanyID>2</CarCompanyID>" +
        "<CarCompany>UberGeek</CarCompany>" +
      "</CompanyInfo>" +
      "<CarInfo>" +
        "<CarNumberID>1</CarNumberID>" +
        "<CarCompanyID>1</CarCompanyID>" +
        "<CarNumber>C121</CarNumber>" +
      "</CarInfo>" +
      "<CarInfo>" +
        "<CarNumberID>2</CarNumberID>" +
        "<CarCompanyID>1</CarCompanyID>" +
        "<CarNumber>C122</CarNumber>" +
      "</CarInfo>" +
      "<CarInfo>" +
        "<CarNumberID>3</CarNumberID>" +
        "<CarCompanyID>2</CarCompanyID>" +
        "<CarNumber>C133</CarNumber>" +
      "</CarInfo>" +
    "</RideShareOwnerDataSet>";

DataSet ds = new DataSet();
StringReader sr = new StringReader(xmlCompanyCars);
ds.ReadXml(sr, XmlReadMode.InferSchema);

If you use XML strings to fill DataSets a lot, then instead of the above code, I recommend creating a handy Extension method, like the following:

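// Extension methods for DataSet (they must live in a static class; the class name is up to you):
public static class MyDataSetExtensions
{
    public static void FillWithXml(this DataSet ds, string XML)
    {
        StringReader sr = new StringReader(XML);
        ds.ReadXml(sr, XmlReadMode.InferSchema);
        ds.AcceptChanges();
    }
}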

Here, you'd use it like this:

ds.FillWithXml(xmlCompanyCars);

Here's the rest of the code for this XML example:

// Now let's get that DataSet into a Dictionary!
// We do need a "dummy" DataTable for this to work
DataTable dt = new DataTable();
dt.Columns.Add("CarCompany");
dt.Columns.Add("CarNumber");

var dcCompanyCarsFromDataSet = ds.Tables[0].AsEnumerable()
    .Join(ds.Tables[1].AsEnumerable(),
        rowCompany => rowCompany.Field<string>("CarCompanyID"), rowCar => rowCar.Field<string>("CarCompanyID"),
        (rowCompany, rowCar) =>
        {
            DataRow newRow = dt.NewRow();
            newRow["CarCompany"] = rowCompany.Field<string>("CarCompany");
            newRow["CarNumber"] = rowCar.Field<string>("CarNumber");
            return newRow;
        })
    .GroupBy(rowKey => rowKey.Field<string>("CarCompany"), rowValue => rowValue.Field<string>("CarNumber"))
    .ToDictionary(rowKey => rowKey.Key, rowValue => rowValue.ToList());

Notice the syntax for accessing the columns of data in LINQ with a regular DataSet. With .Field<T>(), you *must* specify the type of data, as in .Field<string>("ColumnName"). The only other option is the untyped indexer, row["ColumnName"], which hands back an object that you'd have to cast yourself (unless you use a Typed DataSet ... more on that in my final example). Also notice that I did *NOT* use the .Aggregate() function here. I could not figure out how to do it that way when I wrote this (if any of my faithful readers want to give it a try ... go for it! Leave me a comment if you figure it out). But the .GroupBy() and the .ToDictionary() work just fine!
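
For any reader who wants to take up the .Aggregate() challenge, here is one possible sketch. It is not the approach used in this post. It assumes the same XML as above (in compact form), looks the tables up by name, and has the Join return a small tuple instead of a row from a "dummy" DataTable. The DataSetAggregate class and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Linq;

public static class DataSetAggregate
{
    // Same data as the XML in this post, just written more compactly
    public const string SampleXml =
        "<RideShareOwnerDataSet>" +
          "<CompanyInfo><CarCompanyID>1</CarCompanyID><CarCompany>GeeksRide</CarCompany></CompanyInfo>" +
          "<CompanyInfo><CarCompanyID>2</CarCompanyID><CarCompany>UberGeek</CarCompany></CompanyInfo>" +
          "<CarInfo><CarNumberID>1</CarNumberID><CarCompanyID>1</CarCompanyID><CarNumber>C121</CarNumber></CarInfo>" +
          "<CarInfo><CarNumberID>2</CarNumberID><CarCompanyID>1</CarCompanyID><CarNumber>C122</CarNumber></CarInfo>" +
          "<CarInfo><CarNumberID>3</CarNumberID><CarCompanyID>2</CarCompanyID><CarNumber>C133</CarNumber></CarInfo>" +
        "</RideShareOwnerDataSet>";

    public static Dictionary<string, List<string>> CompanyCars(DataSet ds)
    {
        return ds.Tables["CompanyInfo"].AsEnumerable()
            .Join(ds.Tables["CarInfo"].AsEnumerable(),
                rowCompany => rowCompany.Field<string>("CarCompanyID"),
                rowCar => rowCar.Field<string>("CarCompanyID"),
                // Project each match to a (Company, Car) tuple; no dummy DataTable needed
                (rowCompany, rowCar) => (Company: rowCompany.Field<string>("CarCompany"),
                                         Car: rowCar.Field<string>("CarNumber")))
            .Aggregate(new Dictionary<string, List<string>>(), (a, pair) =>
            {
                // Create the company's list the first time the company shows up
                if (!a.TryGetValue(pair.Company, out var cars))
                    a[pair.Company] = cars = new List<string>();

                cars.Add(pair.Car);
                return a;
            });
    }

    public static void Main()
    {
        var ds = new DataSet();
        ds.ReadXml(new StringReader(SampleXml), XmlReadMode.InferSchema);

        foreach (var kvp in CompanyCars(ds))
            Console.WriteLine("{0}: {1}", kvp.Key, string.Join(", ", kvp.Value));
    }
}
```

Whether this reads better than .GroupBy() followed by .ToDictionary() is a matter of taste. The GroupBy version states the intent more directly, while the Aggregate version matches the first two examples.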

One more thing that you may notice is the CarCompanyID and the CarNumberID. In the XML, you can clearly see that they are numbers, most likely ints. But, because I read the XML into a regular DataSet (and even though I used XmlReadMode.InferSchema), it did not infer that those numbers were anything but strings. That's OK in this case, but it's something to be aware of. We could have added Tables and Columns to that DataSet before we read in the XML data, but why bother? If I wanted that, I'd use a Typed DataSet.
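
If you do want to bother, here is a rough sketch of what adding the Tables and Columns first might look like. It assumes that the DataSet name and the table names match the element names in the XML, and it reads with XmlReadMode.IgnoreSchema so that the data is loaded into the existing schema. The PredefinedSchema class and its members are hypothetical names, and the XML is the same data in compact form.

```csharp
using System;
using System.Data;
using System.IO;

public static class PredefinedSchema
{
    // Same data as the XML in this post, just written more compactly
    public const string SampleXml =
        "<RideShareOwnerDataSet>" +
          "<CompanyInfo><CarCompanyID>1</CarCompanyID><CarCompany>GeeksRide</CarCompany></CompanyInfo>" +
          "<CompanyInfo><CarCompanyID>2</CarCompanyID><CarCompany>UberGeek</CarCompany></CompanyInfo>" +
          "<CarInfo><CarNumberID>1</CarNumberID><CarCompanyID>1</CarCompanyID><CarNumber>C121</CarNumber></CarInfo>" +
          "<CarInfo><CarNumberID>2</CarNumberID><CarCompanyID>1</CarCompanyID><CarNumber>C122</CarNumber></CarInfo>" +
          "<CarInfo><CarNumberID>3</CarNumberID><CarCompanyID>2</CarCompanyID><CarNumber>C133</CarNumber></CarInfo>" +
        "</RideShareOwnerDataSet>";

    public static DataSet Load(string xml)
    {
        // Names must match the XML: root element, then one table per repeating element
        var ds = new DataSet("RideShareOwnerDataSet");

        DataTable company = ds.Tables.Add("CompanyInfo");
        company.Columns.Add("CarCompanyID", typeof(int));
        company.Columns.Add("CarCompany", typeof(string));

        DataTable car = ds.Tables.Add("CarInfo");
        car.Columns.Add("CarNumberID", typeof(int));
        car.Columns.Add("CarCompanyID", typeof(int));
        car.Columns.Add("CarNumber", typeof(string));

        using (var sr = new StringReader(xml))
        {
            // Load the data into the schema defined above instead of inferring one
            ds.ReadXml(sr, XmlReadMode.IgnoreSchema);
        }
        ds.AcceptChanges();
        return ds;
    }

    public static void Main()
    {
        DataSet ds = Load(SampleXml);
        Console.WriteLine(ds.Tables["CarInfo"].Columns["CarCompanyID"].DataType);
        Console.WriteLine(ds.Tables["CarInfo"].Rows[0].Field<int>("CarCompanyID"));
    }
}
```

With the IDs typed this way, the Join keys in the LINQ query would need .Field<int>("CarCompanyID") instead of .Field<string>("CarCompanyID").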

And use a similar foreach, as before, to iterate the Dictionary:

foreach (var kvp in dcCompanyCarsFromDataSet)
{
    Console.WriteLine("{0}:", kvp.Key); // car company name
    foreach (var item in kvp.Value)
    {
        Console.WriteLine("\t{0}", item); // car numbers
    }
}

XML Data and a Typed DataSet

If you've never used Typed DataSets and would like to experiment with them, we already have everything we need. Since the XML has been read into a DataSet, you can use the ds.WriteXmlSchema("RideShareOwnerDataSet.xsd") method to create an .xsd file. Then add the .xsd file to your solution and run the MSDataSetGenerator to create a Typed DataSet. Here's one of my blog posts about how to do that:

https://geek-goddess-bonnie.blogspot.com/2010/04/create-xsd.html
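
For the first step, getting an .xsd out of the XML, a minimal sketch might look like this. DataSet.WriteXmlSchema() is the framework method that writes the schema to a file. The SchemaWriter class, the WriteXsd method and the compact XML string are only for illustration.

```csharp
using System;
using System.Data;
using System.IO;

public static class SchemaWriter
{
    // Same data as the XML in this post, just written more compactly
    public const string SampleXml =
        "<RideShareOwnerDataSet>" +
          "<CompanyInfo><CarCompanyID>1</CarCompanyID><CarCompany>GeeksRide</CarCompany></CompanyInfo>" +
          "<CarInfo><CarNumberID>1</CarNumberID><CarCompanyID>1</CarCompanyID><CarNumber>C121</CarNumber></CarInfo>" +
        "</RideShareOwnerDataSet>";

    // Hypothetical helper: infer a schema from the XML and save it as an .xsd file
    public static void WriteXsd(string xml, string xsdPath)
    {
        var ds = new DataSet();
        using (var sr = new StringReader(xml))
        {
            ds.ReadXml(sr, XmlReadMode.InferSchema);
        }
        ds.WriteXmlSchema(xsdPath);
    }

    public static void Main()
    {
        WriteXsd(SampleXml, "RideShareOwnerDataSet.xsd");
        Console.WriteLine(File.ReadAllText("RideShareOwnerDataSet.xsd"));
    }
}
```

Because the schema was inferred, every column in the resulting .xsd is a string, which is why the IDs need a manual edit if you want them to be ints.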

The relevant (to this post) part of that blog post is this:

Simply add that .xsd to your DataSet project, right-click on the .xsd and choose "Run Custom Tool". This is what will generate the Typed DataSet for you. If that option doesn't show up in your context menu, choose Properties instead and type "MSDataSetGenerator" in the Custom Tool property. After that, any time you make a change to the .xsd in the XML Editor and save the change, the Typed DataSet gets regenerated.

I edited the .xsd to change the datatypes of the IDs to an int, instead of a string. Go ahead and do that if you'd like to.

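// dsRideShareOwner is an instance of the generated Typed DataSet, already filled with the XML data
// We still need a "dummy" table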
DataTable dt = new DataTable();
dt.Columns.Add("CarCompany");
dt.Columns.Add("CarNumber");

var dcCompanyCarsFromTypedDataSet = dsRideShareOwner.CompanyInfo
    .Join(dsRideShareOwner.CarInfo,
        // Notice that we can use the Typed syntax for the columns now
        rowCompany => rowCompany.CarCompanyID, rowCar => rowCar.CarCompanyID,
        (rowCompany, rowCar) =>
        {
            // Use the Typed syntax here too
            DataRow newRow = dt.NewRow();
            newRow["CarCompany"] = rowCompany.CarCompany;
            newRow["CarNumber"] = rowCar.CarNumber;
            return newRow;
        })
    // The GroupBy still needs to use the untyped syntax, because these Rows are from the untyped "dummy" DataTable
    .GroupBy(rowKey => rowKey.Field<string>("CarCompany"), rowValue => rowValue.Field<string>("CarNumber"))
    .ToDictionary(rowKey => rowKey.Key, rowValue => rowValue.ToList());

You'll notice that the LINQ syntax is quite different for Typed DataSet versus non-typed. We do *not* have to specify Field<string> for the columns we are using, since we can use the Typed property to access the column data: rowCompany.CarCompanyID, instead of rowCompany.Field<string>("CarCompanyID").  Much cleaner.

And, as before, iterate the Dictionary:

foreach (var kvp in dcCompanyCarsFromTypedDataSet)
{
    Console.WriteLine("{0}:", kvp.Key); // car company name
    foreach (var item in kvp.Value)
    {
        Console.WriteLine("\t{0}", item); // car numbers
    }
}

So, that's it. It's been a fun experiment for me as I wrote this blog post. I hope that you, Dear Reader, can put this to good use.

Happy Coding!!  =0)
