Monday, September 21, 2009

Fun With DataSets

Data in a DataRow has several different versions. First, there's the original version. Then, when it's being edited (either in a UI control, like a TextBox or programmatically), it has a Proposed version and once it's done being edited, that becomes the Current version. Sometimes when entering data in a UI control, the row is left in the Proposed state and the Edit needs to be ended programmatically. This doesn't always happen and I'm not entirely sure why it's seemingly random (maybe it's an ADO.NET bug).
Say, for example, that you have made only one change in your entire DataSet and, for that unexplained reason, the DataRow is still in its Proposed version. If you check the DataSet.HasChanges() method, it will return a false. If you do a DataSet.GetChanges() you will not get anything returned. That's not a good thing. UPDATE: See my most recent post, DataBinding DataSets: Directly vs BindingSource ,explaining why this may not apply if you use BindingSource to DataBind your Controls.
But there's a solution to this little gotcha: here's a method I *always* call before I attempt to check for .HasChanges() or do a .GetChanges() before saving data:

protected virtual void CommitProposedChanges(DataSet ds)
{
    if (ds == null)
        return;
 
    for (int nTable = 0; nTable < ds.Tables.Count; nTable++)
    {
        for (int nRow = 0; nRow < ds.Tables[nTable].Rows.Count; nRow++)
        {
            if (ds.Tables[nTable].Rows[nRow].HasVersion(DataRowVersion.Proposed))
            {
                ds.Tables[nTable].Rows[nRow].EndEdit();
            }
        }
    }
}

UPDATE: If you read the comments below, you’ll see that that there was some talk about the speed of for vs foreach. It turns out that I have had a misconception for many, many, *many* years that for was faster than foreach. Perhaps it used to be back in the .NET 1.1 days, but it certainly hasn’t been the case since them. See my recent blog post about this here: http://geek-goddess-bonnie.blogspot.com/2016/02/compare-speed-of-for-vs-foreach.html
Consequently, I’m going to add another method to this post, using foreach instead of for:

protected virtual void CommitProposedChanges(DataSet ds)
{
    if (ds == null)
        return;
 
    foreach (DataTable dt in this.dsGlobal.Tables)
        foreach (DataRow row in dt.Rows)
        {
            if (row.HasVersion(DataRowVersion.Proposed))
                row.EndEdit();
        }

}



 

35 comments:

  1. I have been trying to save a DataSet or SqlCeResultset for 3 weeks! This did the trick! A HUGE THANKS!

    ReplyDelete
  2. You're quite welcome! Glad I could help! Too bad it took you 3 weeks to find my blog post though ... ;0)

    ReplyDelete
  3. Yeah, too bad for sure ... I will be reading all your pages this weekend!

    Any idea why CommitProposedChanges does not do its thing when called from the FormClosing event?

    If I call it just prior to closing the form, then click the X .HasChanges() reports 'true' ... but if I call it first thin in the FormClosing event it reports 'false'?

    (FYI: My project uses SqlCeAdapter ... but the DataSet is the same ...)

    ReplyDelete
  4. Sorry PHT, I didn't notice that you asked a question. Have you figured it out?

    I suspect it may have to do with the possiblity that clicking the X to close the form may not cause the ActiveControl to lose focus, thus not forcing the control's value into it's databound object.

    Does that make sense? I have found ways around this, but my solution is rather complicated and generic. I would need to address it in a new blog post. Could you test your scenario and see if that is indeed what's happening?

    ReplyDelete
  5. hi,

    I load a dataset from datasource then i made the binding it with the textbox with defaultvalue and format..

    Me.txtSuppName.DataBindings.Add(New System.Windows.Forms.Binding("Text", Me.DataObject.Tables("Table"), "SupplierName", True)) ', DataSourceUpdateMode.OnValidation, "New Supplier Name"))

    after that i made a save method as follows

    Me.Validate()
    BindingContext(DataObject, "Table").EndCurrentEdit()

    but the dataset is not updating with the default value

    ReplyDelete
  6. Anonymous said...

    hi,

    I load a dataset from datasource then i made the binding it with the textbox with defaultvalue and format..

    Me.txtSuppName.DataBindings.Add(New System.Windows.Forms.Binding("Text", Me.DataObject.Tables("Table"), "SupplierName", True, DataSourceUpdateMode.OnValidation, "New Supplier Name"))

    after that i made a save method as follows

    Me.Validate()
    BindingContext(DataObject, "Table").EndCurrentEdit()

    but the dataset is not updating with the default value

    ReplyDelete
  7. The problem in this case is because you are programmatically using a default. This is what shows up in your TextBox, but as you have seen, it's not getting pushed to your DataTable. I have dealt with this issue in the past for different reasons (a control not losing focus, hence not firing it's OnValidate event) ... I'll need to look at a method I used to call ForceBind() and see if I can apply it to this particular situation. I'll get back to you later when I can work it out. It may take a few days because I'll be out of town today.

    ReplyDelete
  8. Well, first of all, you should be using .ValidateChildren(), not .Validate(). Once you switch to .ValidateChildren(), you'll see that the TextBox's various validate events fire.

    However, there seems to be a bug, IMHO. I've only tried this in VS2008 (I have VS2010, but don't have a handy test project at the moment). The buggy part is trying to set a default to use for DBNull in your binding. Sure, it puts it in the TextBox and the .Text property is set to your specified default value. But, no matter what I try, the value is never propagated to the datasource that the TextBox is bound to when the Validating or Validated events fire.

    We could jump through hoops to get it to work probably, but I think a better workaround is to put this default directly in the new DataRow rather than the Binding. So, add an event handler for the RowChanged event, and set your default there:

    private void MyData_RowChanged(object sender, DataRowChangeEventArgs e)
    {
    if (e.Action == DataRowAction.Add)
    {
    // put any defaults we need here
    e.Row["description"] = "bjb";
    }
    }

    ReplyDelete
  9. WTF! Why does this happen!?!?! I've been working on this for almost 2 weeks now! I'm so glad I found your post!

    THANK YOU!

    ReplyDelete
  10. You're welcome! I'm glad my blog helped you out!

    ReplyDelete
  11. struggled for a while with DataGridView changes this seems to work very nicely thanks
    (just a bit cautious because nothing else I have tried has worked)

    ReplyDelete
    Replies
    1. Hopefully, this should continue to work for you. If it does, I'm glad I could help! If it doesn't, let me know ...

      Delete
  12. well they(my users) seem happy enough with it, so thaks again

    ReplyDelete
  13. After a day looking and trying, this worked!!!!!
    The concept is simple, if only I had known.....godess is an apt description
    Many thaks
    Bruce C NZ

    ReplyDelete
    Replies
    1. You're welcome, Bruce! I'm glad I could help! =0)

      Delete
  14. In my case it *always* seems to happen i.e. HasVersion(Proposed) check is always true. I can't help thinking I am missing something that would cause the entire DataSet to change to the Current state. But for now, I am more than happy to use your excellent fix, thanks a million! Talking about proposals (no, not that one) can I suggest an improvement to your code? These loops are much simpler using 'foreach':

     foreach (DataTable table in ds.Tables)
     {
         foreach (DataRow row in table.Rows)
         {
             if (row.HasVersionDataRowVersion.Proposed))
             {
                 row.EndEdit();
             }
         }
     }

    (sorry about the non-formatting, I can't seem to get much HTML to work)

    Jon Pryce

    ReplyDelete
    Replies
    1. Hi Jon, Every row in your entire DataSet is in the Proposed version? Wow, I haven't encountered that before. The only thing I can think of is that perhaps you have some code somewhere that goes through each row and does something, like adds a default value.

      And you've got a good point about using foreach, from a readability standpoint. Early in the days of .NET, I'm pretty sure that a "for loop" was several orders of magnitude faster than a "foreach loop". I wrote this blog post in 2009, but the actual code goes back quite a few years before that. The compiler nowadays may generate code now where the performance is identical. I don't know, I haven't tested that lately. But maybe I'll give it a test sometime today.

      Delete
    2. I had a bit of time on my hands so I timed the following loops. 'reqTable' is a DataTable with 19 rows and 3 columns, so admittedly not a large sample:

      1.
      foreach (DataRow row in reqTable.Rows)
        foreach (object item in row.ItemArray)
          counter++;

      2.
      for (int rowIndex = 0; rowIndex < reqTable.Rows.Count; rowIndex++)
      {
        DataRow row = reqTable.Rows[rowIndex];
        for (int col = 0; col < row.ItemArray.Length; col++)
        {
          object item = row.ItemArray[col];
          counter++;
        }
      }

      Note that I fetch the object of interest manually in the for loop case because the foreach loop does that for me.

      Here are my timings for 4 consecutive runs, release build (so it's possible my fetch of 'item' was optimized out), measured in usec (using QueryPerformanceCounter()):

      foreach: 62.4, 3.4,   3.0, 3.4
              for: 17.1, 10.7, 9.8, 12.4

      Conclusion: foreach is considerably slower the first time but considerably faster all the other times! Go figure...

      Jon

      Delete
    3. I looked at the disassembly and no, the assignment to 'item' in the for loop was not optimized away.

      Jon

      Delete
    4. Hi Jon, I did a similar test, but did not set any object in the for loop, because the original loops did not do that either.
      It turns out that foreach is roughly 8 times faster than for. I did a test with a DataSet that had almost 2 million rows (1,943,936 to be exact).
      Each iteration I did was almost identical, time-wise, with foreach taking around 800 milliseconds and for taking around 100 milliseconds. I'm going to update this blog post with this and maybe even write another blog post just on this topic.
      I even changed the Target Framework for the assembly, to be able to test with 2.0. Even way back then, foreach was still faster. I couldn't set the Target Framework to 1.1 (there was no option for that), so I couldn't test that ... but, I started out back in 2002 using 1.1, so perhaps my observations that for was faster than foreach goes all the way back to then! Wow!

      Delete
    5. Bonnie, I think you have mangled a sentence: you say "It turns out that foreach is roughly 8 times faster than for" but I think you mean the other way around?

      IMHO I think you ought to make the for loop fetch the item of interest because you wouldn't be doing the loop unless you were going to access said item and you get it for free with foreach. Perhaps do the timings for both cases?

      Wish I had a DataSet with 2 million rows ... not :-)

      Merry Xmas

      Jon

      Delete
    6. Jon, nope nothing mangled there! The foreach is 8 times faster! Surprised the hell out of me too! I didn't have actual data with 2 million rows. I grabbed a few thousand from a database table, then copied and merged it together in a loop about 7 times. Since there weren't any PKs set, the merge simply appended rows, resulting in 2 million.

      DataTable dt;
      for (int i = 0; i < 7; i++)
      {
      dt = this.dsGlobal.Tables[0].Copy();
      this.dsGlobal.Tables[0].Merge(dt);
      }

      Here's the code I ran to test the timing. It runs from a button click, toggling between timing the for one time, the foreach the next time. Just keep clicking the button:

      Stopwatch oWatch = new Stopwatch();
      if (this.Toggle)
      {
      // Time the for loop
      this.Toggle = false;
      oWatch.Start();
      for (int nTable = 0; nTable < this.dsGlobal.Tables.Count; nTable++)
      {
      for (int nRow = 0; nRow < this.dsGlobal.Tables[nTable].Rows.Count; nRow++)
      {
      if (this.dsGlobal.Tables[nTable].Rows[nRow].HasVersion(DataRowVersion.Proposed))
      {
      this.dsGlobal.Tables[nTable].Rows[nRow].EndEdit();
      }
      }
      }
      oWatch.Stop();
      this.richTextBox1.Text += string.Format("For in {0} milliseconds\r\n", oWatch.ElapsedMilliseconds);
      }
      else
      {
      // Time the foreach loop
      this.Toggle = true;
      oWatch.Start();
      foreach (DataTable dt in this.dsGlobal.Tables)
      foreach (DataRow row in dt.Rows)
      if (row.HasVersion(DataRowVersion.Proposed))
      row.EndEdit();
      oWatch.Stop();
      this.richTextBox1.Text += string.Format("Foreach in {0} milliseconds\r\n", oWatch.ElapsedMilliseconds);
      }

      Oh, and here's the results:

      Time using 1,943,936 Rows
      For in 832 milliseconds
      Foreach in 101 milliseconds
      For in 768 milliseconds
      Foreach in 102 milliseconds
      For in 772 milliseconds
      Foreach in 102 milliseconds
      For in 858 milliseconds
      Foreach in 103 milliseconds
      For in 757 milliseconds
      Foreach in 132 milliseconds
      For in 750 milliseconds
      Foreach in 102 milliseconds

      Delete
    7. This comment has been removed by the author.

      Delete
    8. OK, so you must have mangled this sentence:
      > with foreach taking around 800 milliseconds and for taking around 100 milliseconds

      Great, you *are* fetching the item in the for loop so ignore my previous IMHO! I suspect the problem with the for loop is that the inner loop is having to evaluate a complicated expression to get the Rows collection 2 million times so I wonder how different it would be if you coded the for loop like this (changes in bold):

      for (int nTable = 0; nTable < this.dsGlobal.Tables.Count; nTable++)
      {
      DataRowCollection rows = this.dsGlobal.Tables[nTable].Rows;
      for (int nRow = 0; nRow < rows.Count; nRow++)
      {
      if (rows[nRow].HasVersion(DataRowVersion.Proposed))
      {
      rows[nRow].EndEdit();
      }
      }
      }

      Delete
    9. I bet it wouldn't matter, Jon ... but I'll try it later today (with my 2 million rows) and let you know!

      Delete
    10. We shall see ... but I would like to see the row count worked out outside the loop too since there are two calls involved in calculating it.

      Delete
    11. It didn't really make much difference. My original for loop (I posted the numbers), tended toward 8 times slower, and the new test that you suggested tends towards 7 times slower. Both sets of for loops (the original and the one using a row collection variable to iterate through), took between 700 - 800 milliseconds, with the first in the high 700's and the second in the low 700's.

      You said:
      >>I would like to see the row count worked out outside the loop too since there are two calls involved in calculating it.<<

      I don't know what you mean by that. There is no counting going on in the loops, there's nothing like that there in the code I posted. The data is retrieved once (and its row count displayed at the time, so only once), and that happens before the button is ever clicked to start timing the two loops.

      Delete
    12. Both the loops fetch Count every time round the loop, e.g.:

      for (int nRow = 0; nRow < this.dsGlobal.Tables[nTable].Rows.Count; nRow++)

      Now I haven't looked at what kind of collection 'Rows' is but when you access the Count member it calls a get() function, which might have to work out the count painfully every time e.g. if it is a linked list. So I was just suggesting something like:

      int count = this.dsGlobal.Tables[nTable].Rows.Count;
      for (int nRow = 0; nRow < count; nRow++)

      It may not make any difference but it removes any possibility of criticism from anyone who reads your new blog :-)

      Delete
    13. Oh, *that* Count ... sorry, I didn't catch that that's what you meant. Must've been because it was too early (before I had my coffee!!!) ... I'll play with it again later today and see what happens.

      Delete
  15. No, sorry, I meant to say that whenever a DataTable is modified using one of the controls bound to it (well, actually to a DataView mapped onto a single row of the table) the change is visible in the DataTable's DataRow but GetChanges() returns null unless I run your code.

    ReplyDelete
  16. Err, just a minor point Bonnie, maybe you have been spending too long on Mt Olympus, but the new version of your loop still uses 'for' not 'foreach'!

    Jon

    ReplyDelete
    Replies
    1. Mt. Olympus ... that's funny! ;0) But, you're right, Jon! I never did update that to a foreach. Guess I should do that! Thanks for catching that ...

      Delete
    2. I fixed it Jon. Thanks! I think that what had happened was that I must have copy/pasted from the wrong code snippet and didn't even notice it! I guess I must have been in a hurry that day. =0(

      Delete
  17. HELLO BONNIE I DON'T KNOW HOW TO POST NEW Q? HERE so sorry for everything
    can i change checkbox into textbox in datagridview ??? i have sql table having one column name as status & its data type is bit when i run the program it is showing textbox's i want to change it can u help me ..?? thanks in advance..

    ReplyDelete
    Replies
    1. That's not going to be an easy question to answer here because there's probably going to be too much code you'd have to post to show me what you've got and then I'd have to post some code back to you. As you can see, if you've looked at some of the comments here, there is no formatting, so the code is hard to read.

      And, your question really has nothing to do with this blog post about DataSets. So, I suggest that you post your question on the MSDN forums here:

      https://social.msdn.microsoft.com/Forums/windows/en-US/home?forum=winformsdatacontrols

      And then come back to these comments and post a link to your MSDN question so I can help you there (and you just might get help from other people too).

      Oh, wait a minute, I see that you *did* post on the forums (https://social.msdn.microsoft.com/Forums/windows/en-US/d66f7162-02b3-42e5-8135-784f5a373019/convert-checkbox-into-textboxmessage-in-datagridview-directly-from-sql-while-running-the-program?forum=winformsdatacontrols) ... I'll go take a look at it.

      Delete