Generic.KeyNotFoundException in ResponseManager.HandleResponse

description

I've been getting exceptions of type System.Collections.Generic.KeyNotFoundException from ResponseManager.HandleResponse. From my reading of the code, this means that a response with a specific sequence number was not found in the Responses collection. Have you seen anything like this - any idea how this could happen?

Also, while I was trying to understand this code, I traced one path through it that looks like this:
  1. Brick.PollSensorsAsync creates a new Command of DirectReply type. This also creates a Response object for the command which is stored in the Responses collection. The command is sent off using SendCommandAsyncInternal on which PollSensorsAsync blocks.
  2. Brick.SendCommandAsyncInternal sends the command and calls ResponseManager.WaitForResponseAsync to block for the response.
  3. If the response comes in (through the ICommunication object), ResponseManager.HandleResponse handles it and signals the response's Event so that ResponseManager.WaitForResponseAsync completes, and the response is removed from the Responses collection.
  4. If no response is received within a second, the reply type is set to error, but the response is not removed - in fact it does not look like it is ever removed.
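
To make the concern in step 4 concrete, here is a minimal sketch of a wait that removes the pending entry on both the success and the timeout path. It is not the library's code: PendingReply, PendingReplies, Create, Handle and WaitAsync are hypothetical stand-ins for Response, ResponseManager and their methods, and the timeout is a parameter rather than the one second mentioned above.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the library's Response type.
public class PendingReply
{
    public ushort Sequence;
    public bool TimedOut;
    public byte[] Data;
    public readonly ManualResetEventSlim Event = new ManualResetEventSlim(false);
}

// Hypothetical stand-in for ResponseManager.
public static class PendingReplies
{
    private static readonly object Gate = new object();
    private static readonly Dictionary<ushort, PendingReply> Map =
        new Dictionary<ushort, PendingReply>();

    public static int Count { get { lock (Gate) { return Map.Count; } } }

    public static PendingReply Create(ushort sequence)
    {
        var reply = new PendingReply { Sequence = sequence };
        lock (Gate) { Map[sequence] = reply; }
        return reply;
    }

    // Called from the receive loop. Returns false instead of throwing
    // when the sequence number is unknown (e.g. the wait already timed out).
    public static bool Handle(ushort sequence, byte[] data)
    {
        PendingReply reply;
        lock (Gate)
        {
            if (!Map.TryGetValue(sequence, out reply)) return false;
        }
        reply.Data = data;
        reply.Event.Set();
        return true;
    }

    public static Task WaitAsync(PendingReply reply, int timeoutMs)
    {
        return Task.Run(() =>
        {
            try
            {
                if (!reply.Event.Wait(timeoutMs)) reply.TimedOut = true;
            }
            finally
            {
                // Runs on success and on timeout, so entries never accumulate.
                lock (Gate) { Map.Remove(reply.Sequence); }
            }
        });
    }
}
```

With this shape, a reply that arrives after its wait has timed out is simply reported as unhandled by Handle rather than surfacing as an exception.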
The questions:
Is the behavior in 4 above normal in the error case, i.e. the response is not removed?
Is it possible for there to be a race condition between WaitForResponseAsync timing out on WaitOne and the response being handled?
I still can't explain the exception that I'm seeing, but is it possible that there is a race between the polling task and some other command (like one to apply power to the motors), such that both threads try to call ResponseManager.CreateResponse at the same time? Is incrementing the sequence number atomic or exclusive?
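
For the last question, this is roughly what I would expect an atomic sequence counter to look like. It is only a sketch under my own assumptions: SequenceNumbers and Next are hypothetical names, and I have not confirmed how the library actually generates its sequence numbers.

```csharp
using System.Threading;

// Hypothetical helper; not the library's actual sequence logic.
public static class SequenceNumbers
{
    private static int _last;

    public static ushort Next()
    {
        // Interlocked.Increment performs the read-modify-write atomically,
        // so two threads can never observe the same counter value.
        // The unchecked cast wraps the result into the 16-bit range.
        return unchecked((ushort)Interlocked.Increment(ref _last));
    }
}
```

A plain `_last++` on a shared field is not atomic, so two threads calling it at the same time could end up with the same sequence number and therefore the same key in the Responses collection.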

Thanks,
-Achal

comments

peekb wrote Sep 10, 2014 at 5:15 PM

Is this happening on a specific platform? (Phone, Desktop, WinRT?)

achalshah wrote Sep 11, 2014 at 4:39 AM

This is on Phone. Let me know if you want the stacks.

achalshah wrote Sep 13, 2014 at 4:46 AM

Here is a possible scenario resulting in non-sequential write access to the Responses collection (in ResponseManager):

PollSensorsAsync() in Brick.cs sends commands and waits for the response to be handled (eventually by calling WaitForResponseAsync on the ResponseManager). WaitForResponseAsync waits on the response event.

PollInput in BTCommunication runs in a separate thread and calls HandleResponse in ResponseManager when new data comes in.

HandleResponse gets the sequence number, looks up the matching response in the Responses collection, populates it and sets its event, which WaitForResponseAsync is waiting on. Once the event is signalled, WaitForResponseAsync removes the response from the Responses collection.

Even though these 2 threads are pretty much synchronized by virtue of the fact that each command has a response which is waited on, a 3rd thread could send a command (e.g. from the UI), and now we could have two tasks waiting for responses.

The two responses would come back to back, and HandleResponse could set both events before the two tasks waiting on them are able to run. Eventually both run and try to remove items from the collection simultaneously, which could corrupt it since access to the collection is not synchronized.
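
One way to rule out this scenario is sketched below, using a ConcurrentDictionary so that simultaneous removals are safe and a missing key is reported instead of thrown. ReplySlot and SafeReplies are hypothetical names, not types from the library, and I have not checked whether ConcurrentDictionary is available on the Phone target; a lock around every access to a plain Dictionary would give the same guarantee.

```csharp
using System.Collections.Concurrent;

// Hypothetical stand-in for the library's Response type.
public class ReplySlot
{
    public byte[] Data;
}

// Hypothetical stand-in for the Responses collection and its accessors.
public static class SafeReplies
{
    private static readonly ConcurrentDictionary<ushort, ReplySlot> Map =
        new ConcurrentDictionary<ushort, ReplySlot>();

    public static int Count { get { return Map.Count; } }

    public static bool Add(ushort sequence)
    {
        return Map.TryAdd(sequence, new ReplySlot());
    }

    // Receive-loop side: no KeyNotFoundException if the entry is gone.
    public static bool Handle(ushort sequence, byte[] data)
    {
        ReplySlot slot;
        if (!Map.TryGetValue(sequence, out slot)) return false;
        slot.Data = data;
        return true;
    }

    // Waiter side: safe to call from several tasks at once.
    public static bool Remove(ushort sequence)
    {
        ReplySlot ignored;
        return Map.TryRemove(sequence, out ignored);
    }
}
```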