Finding a clean way to get feedback to the UI

nservicebus

(Jeremy) #22

I think you lost me. I thought the objective was to only send the message if the command was successfully handled. Wouldn’t this approach just send a message for every incoming message simply because the message arrived?

The pipeline approach seemed good; it only lacked a context for me to send the message with. Right now, because of the order in which things happen, I have to pass in a factory to get hold of the session:

static async Task Main(string[] args)
{
    Console.Title = "Producer A";

    IMessageSession session = null;

    var config = new EndpointConfiguration("ProducerA");
    config.UsePersistence<LearningPersistence>();
    config.UseSerialization<NewtonsoftSerializer>();
    config.EnableInstallers();

    // session is null so delay the retrieval
    config.AllowReplyToHub(() => session);

    var transport = config.UseTransport<RabbitMQTransport>();
    transport.ConnectionString("host=rabbitmq;username=user;password=bitnami");
    transport.UseConventionalRoutingTopology();

    session = await Endpoint.Start(config);
    await Task.Delay(Timeout.Infinite);
}

This works, but feels wrong. I’m going to go wire this into the actual SignalR project to see if it’s any better when I have access to IServiceCollection.


(Andreas Öhlund) #23

The thing is that you can always send the message; due to batched dispatch (https://docs.particular.net/nservicebus/messaging/batched-dispatch) it would only ever be dispatched if the message pipeline completes successfully.
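To illustrate batched dispatch with a minimal sketch (the handler and message names here are hypothetical, not from this thread): a send through the handler context is held in a batch and only dispatched once the whole receive pipeline completes.

```csharp
// Hypothetical handler illustrating batched dispatch.
public class SomeCommandHandler : IHandleMessages<SomeCommand>
{
    public async Task Handle(SomeCommand message, IMessageHandlerContext context)
    {
        // This send is batched; nothing reaches the transport yet.
        await context.Send(new StatusMessage { Status = "Success" })
            .ConfigureAwait(false);

        // If anything later in this pipeline throws (this handler or
        // another handler for the same incoming message), the batched
        // send above is discarded and never dispatched.
    }
}
```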

Does that make any sense?


(Jeremy) #24

The concept makes sense to me in that any messages created by a single context won’t go out unless the handler completes successfully. But in this scenario, I feel like they are different things. For the IHandleMessages<object> implementation that we just put together, you said it had nothing to do with the pipeline, which tells me that this generic handler is simply handling all messages in a different context, a different instance of the pipeline. Is that not correct?

So if a receiving service has implementations of both IHandleMessages<Stage1Command> and this IHandleMessages<object>, then both of those handlers will receive the same message in two different contexts.

So then IHandleMessages<object> doesn’t care about what happens in IHandleMessages<Stage1Command> because as far as it’s concerned, it’s a completely different handler.

Is that not right?


(Jeremy) #25

I guess I could see that working if maybe I base-classed the handler and then checked those headers in the base class. Is that what you were implying?

e.g.

public abstract class HandlerBase<T> : IHandleMessages<T>
{
    public async Task Handle(T message, IMessageHandlerContext context)
    {
        await HandleMessage(message, context).ConfigureAwait(false);

        if (!context.MessageHeaders.TryGetValue(CustomHeaders.TaskId, out var taskId) ||
            !context.MessageHeaders.TryGetValue(CustomHeaders.UserId, out var userId) ||
            !context.MessageHeaders.TryGetValue(CustomHeaders.ResponseQueue, out var respondTo)
        ) return;

        var options = new SendOptions();
        options.SetDestination(respondTo);

        CopyHeaders(options, context.MessageHeaders, "DAS");
        
        await context
            .Send<DirectedReply>(c =>
            {
                c.Status = "Success";
                c.TaskId = taskId;
                c.UserId = userId;
                c.Message = JsonConvert.SerializeObject(message);
            }, options)
            .ConfigureAwait(false);
    }

    protected abstract Task HandleMessage(T message, IMessageHandlerContext context);
}

Update: Tried this ^ and it does work as expected. So now to tackle the error scenario. I guess I could just catch the error in the base class the same way, but I really only want to forward the error message if the message completely fails (all retries attempted). The ServiceControl approach could work because it looks like it would only raise the event if the message was forwarded to the error queue. Currently looking for how to achieve what is in the example in .NET Core, as there is no app.config.

What I don’t like about the ServiceControl approach is that it seems to complicate local development testing. Is there another way? I see that I can plug into endpointConfiguration.Notifications, but I’m back to not having a reference to a pipeline context through which I could send a message.


(Andreas Öhlund) #26

The context for the purpose of this discussion is the processing of the incoming message (see https://docs.particular.net/nservicebus/pipeline/steps-stages-connectors#stages-incoming-pipeline-stages for more details).

This means that all handlers that match the given message type will execute in the same context/transaction. So in this case the object handler will be invoked for all messages, just like we want. Should something go wrong when processing the message, like a handler for the specific message type throwing, no outgoing messages will be emitted and the incoming message will be rolled back or moved to the error queue.

If we put this together we get the behaviour you want:

A UIFeedback message will be emitted if any message with the mentioned header is processed successfully.
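Putting the pieces together, a minimal sketch of that generic handler might look like this (assuming the UIFeedback message and CustomHeaders from the earlier snippets; treat it as an outline, not a tested implementation):

```csharp
// Runs in the same pipeline (and transaction) as the specific handlers.
public class UIFeedbackHandler : IHandleMessages<object>
{
    public Task Handle(object message, IMessageHandlerContext context)
    {
        // Ignore messages that were not tagged by the UI.
        if (!context.MessageHeaders.TryGetValue(CustomHeaders.TaskId, out var taskId) ||
            !context.MessageHeaders.TryGetValue(CustomHeaders.ResponseQueue, out var respondTo))
        {
            return Task.CompletedTask;
        }

        var options = new SendOptions();
        options.SetDestination(respondTo);

        // Batched with the rest of the pipeline: only dispatched if every
        // handler for this incoming message completes successfully.
        return context.Send(new UIFeedback { TaskId = taskId, Status = "Success" }, options);
    }
}
```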

Does this clear things up? (perhaps test it to see that it works)


(Andreas Öhlund) #27

No need for a subclass, see below


(Andreas Öhlund) #28

Yes, you would need to have ServiceControl installed to do end-to-end testing. If that’s not acceptable, then yes, using the notifications would be your other option.


(Jeremy) #29

Indeed it does, thank you.

Ok so I’m kind of trying to average out all the possible solutions and decide on the most consistent approach to both the success and error scenarios. While exploring a solution for error messages, I seem to be back where I was a few steps ago, trying to subscribe to ReceivePipelineCompleted, which you did point out earlier. Creating a handler for that works, but that approach doesn’t seem to have a matching counterpart for errors without employing the ServiceControl method. Using endpointConfiguration.Notifications.Errors.MessageSentToErrorQueue += ... certainly works, but it leaves me with the same problem I had before: I have no pipeline context through which I can send a message.
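For comparison, the success-side hook looks roughly like this (a sketch assuming the OnReceivePipelineCompleted API from the NServiceBus docs; note it hands you the processed message but no pipeline context to send through):

```csharp
var config = new EndpointConfiguration("ProducerA");

// Fires only after the entire receive pipeline completed successfully.
config.Pipeline.OnReceivePipelineCompleted(completed =>
{
    var headers = completed.ProcessedMessage.Headers;
    // Headers can be inspected here, but there is no IMessageHandlerContext:
    // any outgoing send needs an IMessageSession obtained elsewhere.
    return Task.CompletedTask;
});
```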

I decided to assume for now that I want to make this approach work if possible, since the solutions for both success and failed messages are similar and the only problem to solve is getting access to a pipeline context. Getting one into DI is tricky because of the order of events when starting up NSB. Essentially, there’s a chicken-and-egg situation: I need a configuration item to have access to IMessageSession, but I can’t register that without starting the endpoint, and I can’t start the endpoint without the configuration.

As an experiment to solve this, I pass IServiceCollection into the delegate definition so that it can build a provider and request IMessageSession on demand, which will be long after the endpoint has started and the service has been registered. Normally, I would prefer to pass IServiceProvider, but that is not available during the registration phase, so the drawback here is that every time these delegates get called, the service provider has to be rebuilt before the service can be requested. I don’t even really know if that’s a big deal or not; it’s weird that MS separated the two at all. Sorry, that’s hard to illustrate in words, so here’s an example of an extension method I wrote to enable the behavior. It would be the same approach for success messages.

public static void ForwardFailedMessages(
    this EndpointConfiguration endpointConfiguration, 
    IServiceCollection services, string customHeadersStartWith = "DAS")
{
    endpointConfiguration.Notifications.Errors.MessageSentToErrorQueue += async (sender, message) =>
    {
        ... // short circuits if custom headers not present

        var reply = ... // build reply
        var options = ... // build from headers (e.g. destination)

        var provider = services.BuildServiceProvider();
        var bus = provider.GetService<IMessageSession>();

        await bus.Send(reply, options).ConfigureAwait(false);
    };
}

And then installation looks like this:

static async Task Main(string[] args)
{
    Console.Title = "Producer A";
    
    var services = new ServiceCollection();
    var config = new EndpointConfiguration("ProducerA");
    config.UsePersistence<LearningPersistence>();
    config.UseSerialization<NewtonsoftSerializer>();
    config.EnableInstallers();

    // my installation extension
    config.ForwardFailedMessages(services);

    var transport = config.UseTransport<RabbitMQTransport>();
    transport.ConnectionString("host=rabbitmq;username=user;password=bitnami");
    transport.UseConventionalRoutingTopology();
    
    var session = await Endpoint.Start(config);
    services.AddSingleton<IMessageSession>(session);
    
    await Task.Delay(Timeout.Infinite);
}

Of course, I’m assembling it like this because, for simplicity, I want to create a NuGet package that dozens of our services can use to introduce this behavior. This appears to be behaving as I would want it to. Do you see any problem with this approach to success and/or failed messages? Do you perhaps have another idea for how to get a reference to a pipeline context, instead of rebuilding the service provider every time like I have here?
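As an aside (my suggestion, not something from the thread): one way to avoid calling BuildServiceProvider on every callback is a small mutable holder that is registered before startup and filled in once the endpoint has started. The SessionHolder type and the ForwardFailedMessages overload taking it are hypothetical:

```csharp
// Hypothetical holder: exists before the endpoint starts, populated after.
public class SessionHolder
{
    public IMessageSession Session { get; set; }
}

// Registration phase: the extension captures the holder, not the collection.
var holder = new SessionHolder();
config.ForwardFailedMessages(holder); // reads holder.Session lazily at event time

// Startup phase: complete the wiring exactly once.
holder.Session = await Endpoint.Start(config);
```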

If this seems like it might be a decent approach, I’ll post the complete solution for feedback and posterity.


(Jeremy) #30

For what it’s worth, this document seems to be quite outdated. Some of these extensions and methods don’t exist in these packages :confused:


(Andreas Öhlund) #31

Yes, this is indeed tricky; we never intended for those APIs to be used to emit messages.

You should be able to pull it off by adding a behavior that stores the context in an (evil, I know) AsyncLocal and then accesses that from the event handlers. You can take a look at our UniformSession for inspiration on how to do this: https://docs.particular.net/nservicebus/messaging/uniformsession
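A rough sketch of the AsyncLocal behavior Andreas describes, assuming the NServiceBus pipeline behavior API; this is an outline for inspiration, not vetted production code:

```csharp
// Exposes the current pipeline context to code running outside the pipeline.
public static class CurrentPipelineContext
{
    static readonly AsyncLocal<IIncomingPhysicalMessageContext> current =
        new AsyncLocal<IIncomingPhysicalMessageContext>();

    public static IIncomingPhysicalMessageContext Current => current.Value;
    internal static void Set(IIncomingPhysicalMessageContext context) => current.Value = context;
}

// Captures the context for the duration of each incoming message.
public class CaptureContextBehavior : Behavior<IIncomingPhysicalMessageContext>
{
    public override async Task Invoke(IIncomingPhysicalMessageContext context, Func<Task> next)
    {
        CurrentPipelineContext.Set(context);
        try
        {
            await next().ConfigureAwait(false);
        }
        finally
        {
            // Clear the slot so stale contexts can't leak to later callbacks.
            CurrentPipelineContext.Set(null);
        }
    }
}

// Registered during configuration:
// config.Pipeline.Register(new CaptureContextBehavior(), "Captures the pipeline context");
```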

Note that if you don’t do this and just use the IMessageSession, the message operation won’t participate in the receive transaction (unless the TransactionScope transaction mode is used). This will cause messages to “go straight out” and potentially cause false positives should something go wrong when the transport completes the receive operation.

In short:

You will get “immediate dispatch” semantics (https://docs.particular.net/nservicebus/messaging/send-a-message#dispatching-a-message-immediately) if you just use the IMessageSession to emit the status messages.

This won’t affect you, since RabbitMQ only supports the ReceiveOnly mode, so you are exposed to this for all your message operations anyway.

More on transaction modes:

https://docs.particular.net/transports/sql/transactions

More on sending inside vs outside of the pipeline context:

https://docs.particular.net/nservicebus/messaging/send-a-message#inside-the-incoming-message-processing-pipeline


(Andreas Öhlund) #32

I have some more philosophical :slight_smile: concerns about this UI feedback idea in general, but I think that would be better suited for a call if you would like to discuss. (Shoot me an email if that sounds interesting.)


(Jeremy) #33

A call may be useful at some point, but judging by our message cadence, you’re on the other side of the world :stuck_out_tongue:, and I do need to move this forward even if temporarily while additional conversations are taking place.

I’m curious as to your more general thoughts on the topic (aside from this current solution). Just generally speaking, how would you solve this problem of getting updates back to the UI asynchronously, particularly when the UI is not a .Net solution?


(Andreas Öhlund) #34

Regarding success notifications:

What if you told the UI that the request was successful as soon as you got the message onto the queuing system? (I.e., as soon as the web request returns; no need for SignalR.)

Regarding failure notifications:

Do you need this at all? I assume that you would have some back office process to deal with the failures anyway that would let the user know, via email or some other way, more exactly what they need to do if they need to get involved at all. A generic solution to this doesn’t seem to help much? How common do you expect these failures to be?


(Jeremy) #35

Technically, we already do this. At the moment, commands come in through a REST API, to which we always respond (provided validation succeeds) with a 202 Accepted. There are some cases where a task needs to be distributed, however, and those can take a little while; we don’t want to hold users up in the meantime. For example, provisioning a new account with SMS capabilities and so on. We’ve gathered everything we need from the user, we’re processing, and we want to let them move on. It’s worth noting that the “user” in this particular scenario is an employee of our company. They have many tasks to complete, so we’re building this system to allow them to move on as soon as their part is complete. So, if something fails, we need to notify them out of band. Email is disruptive, so we’d rather pop up a toast notification for certain kinds of success events. The use case for that is related to what we do. For example, there are manual interactions with customers that need to take place as soon as something completes. We also have a requirement to send emails and/or SMS when the user is offline. That part can happen on the backend and is already taken care of.

Failures are unfortunately quite common because a lot of what we do involves interacting with 3rd parties (close to 100), and their services don’t always succeed, for various reasons. At least half the time, the failure resolves on its own, so NSB’s retry logic is enough. In other cases, we need to know right away so that we can contact the vendor. Due to the nature of this business, we’re building the system to be as communicative as possible, to help mitigate things that are out of our control by allowing our agents to respond quicker. So the general theme of this part of the system is “Tasks” and “Real-time Feedback”.

At first pass, it seems that a lot of this could be accomplished by explicitly sending certain messages straight to the hub, but the services are shared and not always processing a message that came from a UI, hence the need for the custom headers. Getting feedback on individual messages is a nice-to-have feature request that also has some troubleshooting benefits, I think.


(Andreas Öhlund) #36

What actions do the users take once they get the confirmation that the account was provisioned?

Failures are unfortunately quite common because a lot of what we do involves interacting with 3rd parties (close to 100), and their services don’t always succeed, for various reasons. At least half the time, the failure resolves on its own, so NSB’s retry logic is enough. In other cases, we need to know right away so that we can contact the vendor

Sounds like this might be better built explicitly into your system as some kind of “Automated account provisioning failed” flow? Don’t you want specific details like the account ID, which service didn’t get set up properly, etc.?

In short: a generic “this request failed” doesn’t seem to give the user what they need to figure out what to do next?


(Jeremy) #37

Contacting the customer to give them their information would be an example. It seems logical to simply email them any pertinent information, but that’s not how we operate. We have a “white glove” managed-services element to our business. For most success cases, it’s really just going to be the UI letting the user know, asynchronously, that the task was completed.

If you look at the Azure portal, it does exactly this. You start a task and go about your business. It adds a message to your alert center with a progress meter that you can check. When everything completes, you get another update showing that everything completed successfully, and that same notification changes from a progress meter to a green check mark.

It’s not always a generic failure notification, but even if it were, that is still very useful for our business because it allows us to be proactive about failures rather than the customer noticing first and calling. In the meanwhile, yes, of course we would have a failure flow on the system side for things the system could do on its own. But there’s a strong agency component to our business (managed services) which does require us to deliberately have more human interaction.

The system captures all of the information, but the idea is really to immediately alert the UI with a notification of sorts (e.g. toast, or an inbox). The user can then click on the alert and, depending on the scenario, it would take them to the original setup page for them to modify, in case the agent is actually able to correct something, for example. A scenario would be doing a test run of a customer’s inventory feed. It could take a while for the process to run, and the user shouldn’t have to sit around and wait for it. At some point, something may fail, and we want the user to know immediately so they can tend to it. For example, the system received an authentication error when trying to pull data, or 75% of the way through parsing, we determined that the data being given to us is inconsistent with the configuration. Rather than only send an email to one of our agents, if they’re online, we can alert the agent that started the task in real time and, by clicking the alert, take them straight to the setup page with customer info already shown and the configuration ready to edit. That way they can either get on the phone with the customer and make the fix right there, or click a button to schedule follow-ups.

To make this happen, we simply need to give back enough information in the message to track that error. So for instance, when we catch the error, we assign it an ID and store the larger details, then notify the SignalR hub with some information that will allow the UI to direct the user to the correct location, if any.

We have these agents called “Client Advocates” who speak directly to our customers frequently. Our target customers are not tech savvy and demand a lot of feedback, and we want to provide our agents with as much information as proactively as possible. So a lot of this particular system’s functionality isn’t just automation, but auditing and creating conversation points; we do a lot of balancing between automation and manual interaction. This messaging component is really just an alternative to constantly checking a task queue. It may seem chatty and like overkill, but it’s not me dreaming this stuff up :rofl:


(Andreas Öhlund) #38

The system captures all of the information, but the idea is really to immediately alert the UI with a notification of sorts (e.g. toast, or an inbox). The user can then click on the alert and depending on the scenario, it would take them to the original setup page for them to modify,

I think I get it now: you’re saying that the UI uses the “ID” that comes with the failure notification to take the user “back” to the screen the failed message was “sent” from?

To make this happen, we simply need to give back enough information in the message to track that error. So for instance, when we catch the error, we assign it an ID and store the larger details, then notify the SignalR hub with some information that will allow the UI to direct the user to the correct location, if any.

Ah, so when a failure occurs you store failure details correlated by the “UI Request ID”? The UI then goes and looks that up when it gets the error notification? (or does it get included in the UIFeedback message somehow?)


(Jeremy) #39

Bingo! :slight_smile:

It depends on the situation, but in short: either/or/both. In some cases, nothing can be done; it’s just “hey, that thing with ID 123 failed. Contact admin.” In other cases, where it may be recoverable, the message would hopefully contain enough information to look it up, move the user to a form and pre-populate it, and flag anything that needs to be fixed.


(Andreas Öhlund) #40

Can you share some details on how this works with the generic feedback mechanism discussed earlier in this thread? Have you considered adding something to the context that allows your business code to attach the additional info to the failure message, to save yourself the lookup? Or do you just embed it in the exception message string and include that when reporting the failure to the UI?