Real-time FileWatcher System Monitor using TPL Dataflow, ASP.NET Web API, SignalR, ASP.NET MVC and AngularJS





Consider a system that must process multiple files simultaneously.
We want to improve system performance, and we also want to monitor the processing in real time.

To achieve this goal, we suggest building a distributed architecture consisting of a REST web server (ASP.NET Web API, SignalR), a web client (ASP.NET MVC and AngularJS) and a web service that processes files (WCF or other).

For this tutorial, however, we will use a single project for easier reading.

To follow this tutorial, you must understand ASP.NET Web API, SignalR and TPL Dataflow.

TECHNOLOGY ARCHITECTURE

  • Hub Server: ASP.NET Web API and SignalR
  • Monitoring Client: ASP.NET MVC and AngularJS
  • Processing Server: TPL Dataflow, FileWatcher System

1. HUB SERVER

To build our hub server, we will use ASP.NET Web API, because clients must connect to the hub by posting JSON data.

key.png

PostData.

We will also use SignalR, as it allows bi-directional communication between server and client. The server can push content to connected clients instantly as it becomes available, and SignalR supports WebSockets.

Hub.Clients.All.LoadBalance(item) ==> sends the message to all connected clients.

Hub.Clients.Client(id).LoadBalance(item) ==> sends the message to a specific client.

Hub.Clients.Group(groupId).LoadBalance(item) ==> sends the message to all clients connected to a specific group.

For more information about SignalR, please take a look at http://www.asp.net/signalr

So, let's create an ASP.NET Web API project and add an ApiController as follows:

  • Install Microsoft ASP.NET SignalR, AngularJS and TPL Dataflow

SignalRPackage

AngularNuget

tplNuget

Let's create an API controller (MonitorController). SignalRBase implements IHub and allows us to access our hub inside the ApiController.

SignalRbase

Hub.Clients.All.LoadBalance(item): notifies all connected clients by calling the LoadBalance function that each of them has registered on the hub.

PostProcessor

Monitors is the name of the hub, and clients connect to it as follows:

var connection = $.hubConnection();
this.proxy = connection.createHubProxy('Monitors');
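The two lines above only create the proxy. As a minimal sketch of the remaining steps, assuming the SignalR 2 jQuery client without generated proxies, a client could subscribe to LoadBalance and then start the connection. The function name connectToMonitors and the onItem callback are hypothetical, and jQuery is passed in as a parameter only so that the sketch is easy to exercise in isolation.

```javascript
// Sketch only: connectToMonitors and onItem are illustrative names,
// not part of the sample project.
// `$` is jQuery with the SignalR 2 client plugin loaded.
function connectToMonitors($, onItem) {
  var connection = $.hubConnection();
  var proxy = connection.createHubProxy('Monitors');

  // Register the handler before start(): without generated proxies,
  // a hub with no handlers is not subscribed when the connection opens.
  proxy.on('LoadBalance', onItem);

  connection.start()
    .done(function () { console.log('Connected to the Monitors hub'); })
    .fail(function (err) { console.error('Connection failed', err); });

  return proxy;
}
```

Every call to Hub.Clients.All.LoadBalance(item) on the server would then invoke onItem with the pushed item.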

Hub

Processor

When the server calls Hub.Clients.All.LoadBalance(item) (where item is a Processor), the data is pushed to the hub and becomes available to clients as follows:

The monitor controller, MonitorCtrl, uses MonitorSvc.

PushData

ConnectHub

2. MONITORING CLIENTS

Clients subscribe to the LoadBalance function of the hub as follows:

The client uses MonitorCtrl and iterates through Processor to display items in real time. This is possible because MonitorCtrl pushes each item into an array named Processor:

$scope.Processor = new Array();

var addProcessor = function (data) {
$scope.Processor.push(data);
};
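One detail is worth illustrating here: SignalR callbacks fire outside AngularJS's digest cycle, so the view only refreshes if the update is wrapped in $scope.$apply. The following is a minimal sketch under that assumption. The helper name bindProcessor is hypothetical, and the MonitorSvc in the sample project may already take care of this in its own way.

```javascript
// Sketch only: bindProcessor is an illustrative helper.
// `$scope` is the controller scope, `proxy` the hub proxy for 'Monitors'.
function bindProcessor($scope, proxy) {
  $scope.Processor = [];

  proxy.on('LoadBalance', function (data) {
    // The callback runs outside Angular's digest cycle, so wrap the
    // change in $apply to make the bound view pick up the new item.
    $scope.$apply(function () {
      $scope.Processor.push(data);
    });
  });
}
```

With this in place, a view that iterates over Processor shows each file as soon as the server pushes it.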

Receiver

3. PROCESSING SERVER

We can avoid performance bottlenecks and improve the overall responsiveness of our application by using asynchronous programming. However, traditional techniques for writing asynchronous applications can be complex, which makes such applications difficult to write, debug and maintain.

There are different techniques for building asynchronous systems:

  • THREAD

Thread

We can start, stop, abort and coordinate threads (Join).

  • TASK

A task represents an asynchronous operation that can return a value.

Task

  • ASYNC and AWAIT

Await

  • PARALLEL PROGRAMMING

Parallel

  • TPL DATAFLOW 

We want to just write the code and have the way we structure it result in no synchronization issues, so that we don't have to think about synchronization at all. In this world, each object has its own private thread of execution and only ever manipulates its own internal state.

Instead of one single thread executing through many objects by calling object methods, objects send asynchronous messages to each other.

If the object is busy processing a previous message, the new message is queued. When the object is no longer busy, it processes the next message.
Fundamentally, if each object has only one thread of execution, then updating its own internal state is perfectly safe.
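To make the queueing idea concrete, here is a small sketch of an object with a private mailbox that handles one message at a time. It is a synchronous, single-threaded JavaScript model of the concept only, not TPL Dataflow itself, and createAgent is a hypothetical name.

```javascript
// Sketch only: a mailbox that processes one message at a time.
function createAgent(handler) {
  var queue = [];
  var busy = false;

  return {
    post: function (message) {
      queue.push(message);
      if (busy) return;            // already processing: message stays queued
      busy = true;
      try {
        while (queue.length > 0) {
          handler(queue.shift());  // only this loop ever runs the handler
        }
      } finally {
        busy = false;
      }
    }
  };
}

// Usage: the handler posts to its own agent while it is busy.
var log = [];
var agent = createAgent(function (msg) {
  log.push('start ' + msg);
  if (msg === 'a') agent.post('b'); // queued, not run in the middle of 'a'
  log.push('end ' + msg);
});
agent.post('a');
// log is now: ['start a', 'end a', 'start b', 'end b']
```

Because the handler is never re-entered, any state it owns is only touched by one message at a time, which is the property the paragraphs above describe.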

TPL Dataflow enables us to achieve this goal by building blocks. Blocks are essentially a message source, target, or both. In addition to receiving and sending messages, a block represents an element of concurrency for processing the messages it receives.

Multiple blocks are linked together to produce networks of blocks. Messages are then posted asynchronously into the network for processing.

Consider the following use case. Several files are sent to a server (into a directory). The data contained in each file needs to be transformed into a data object ready to be sent to the web service. For network efficiency, the web service receives multiple data objects as part of a single request, up to a defined maximum.

This process can be broken down into a series of blocks, where each block is responsible for one part of the overall processing.

_bufferBlock is responsible for fetching files from the directory as they arrive.

BufferBlock

BufferBlock1

_receptorBlockOne, _receptorBlockTwo and _receptorBlockThree are responsible for load balancing the fetched files.

For better performance, we want to load balance our process, so our next step is to create three load-balanced receptors: if _receptorBlockOne is busy, _receptorBlockTwo or _receptorBlockThree will process the item. ReceptorBlockOne, ReceptorBlockTwo and ReceptorBlockThree are blocks, so if a message is refused by one block, the next linked block is offered the message. If all blocks refuse the message, the first block to become available will process it. To achieve this, we have to make each block non-greedy, which simply means setting its queue length (the BoundedCapacity option) to 1.

Receptors

Receptorsbis

FindFiles
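The offer-and-refuse behaviour described above can be modelled in a few lines. This is a simplified, synchronous JavaScript simulation of the idea, not the TPL Dataflow implementation: createReceptor, createSource and the explicit retry() call are hypothetical, whereas real Dataflow blocks pull postponed messages on their own once they have room.

```javascript
// Sketch only: a receptor that holds at most `capacity` messages.
function createReceptor(name, capacity) {
  var held = [];
  return {
    name: name,
    offer: function (message) {
      if (held.length >= capacity) return false; // busy: refuse the message
      held.push(message);
      return true;
    },
    complete: function () { return held.shift(); }, // finish one message
    current: function () { return held.slice(); }
  };
}

// The source offers each message to its targets in link order and keeps
// the message queued for as long as every target refuses it.
function createSource(targets) {
  var pending = [];
  function pump() {
    while (pending.length > 0) {
      var accepted = targets.some(function (t) { return t.offer(pending[0]); });
      if (!accepted) return;
      pending.shift();
    }
  }
  return {
    post: function (message) { pending.push(message); pump(); },
    retry: pump,
    pendingCount: function () { return pending.length; }
  };
}

// Usage: three receptors with a queue length of 1.
var one = createReceptor('one', 1);
var two = createReceptor('two', 1);
var three = createReceptor('three', 1);
var source = createSource([one, two, three]);
['f1', 'f2', 'f3', 'f4'].forEach(function (f) { source.post(f); });
// f1, f2 and f3 are spread over the three receptors; f4 waits in the source.
two.complete();  // receptor two finishes f2...
source.retry();  // ...and is the first one able to take f4
```

With a larger capacity, the first receptor would accept several messages before the others saw any, which is why the queue length is set to 1.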

_transformBlockToManyFiles is responsible for transforming a FileOrderEntity into a List<FileOrderEntity>: large files must be split into many small files.

TransformMany

TransformMany1.png
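The one-to-many step can be pictured as a flat-map: each input produces a list, and the lists are flattened into a single stream for the next block. Below is a minimal JavaScript sketch of that idea. The { name, lines } shape, the maxLines parameter and the part naming are assumptions made for illustration, not the FileOrderEntity from the sample project.

```javascript
// Sketch only: split one file order into parts of at most maxLines lines.
// The { name, lines } shape is hypothetical.
function splitFileOrder(order, maxLines) {
  var parts = [];
  for (var i = 0; i < order.lines.length; i += maxLines) {
    parts.push({
      name: order.name + '.part' + (parts.length + 1),
      lines: order.lines.slice(i, i + maxLines)
    });
  }
  return parts;
}

// One input can yield many outputs; flatMap flattens them into one list.
function transformMany(orders, maxLines) {
  return orders.flatMap(function (order) {
    return splitFileOrder(order, maxLines);
  });
}

var result = transformMany(
  [{ name: 'big.csv', lines: ['l1', 'l2', 'l3', 'l4', 'l5'] }],
  2
);
// result holds three parts with 2, 2 and 1 lines respectively
```

Note that in this sketch a file with no lines produces no parts at all, so it simply disappears from the stream.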

_printingBlock is responsible for printing the outputs.

print

print1

Now we are going to build our Dataflow network by linking blocks.

LinkBlogs

To visualize the TPL Dataflow network, launch the debugger and then click the search icon.

debug

The schema below represents our Dataflow network, that is, the workflow that will execute at runtime.

Workflow

Finally, let us use FileSystemWatcher to listen for file system change notifications and raise events when the directory receives files.

For more information about FileSystemWatcher, please take a look at https://msdn.microsoft.com/en-us/library/System.IO.FileSystemWatcher(v=vs.110).aspx

FileWatcher

To run the sample code, proceed as follows:

  • Create a directory <<D:Samplesdumpdir>>, or change this line of code to point to an appropriate directory:

Dir1.png

dir2.png

  • Copy some CSV files to the directory <<D:Samplesdumpdir>>
  • Csvfiles.png
  • You will see that the Receiver client displays the files in real time.

NB: replace “D:Samplesdumpdir” with an existing folder on your local machine, or create it.

If you do not have CSV files, change the line watcher.Filter = "*.csv"; to match another extension (for example watcher.Filter = "*.xlsx";).

Result

I hope this post will help you.

Sample code is available here: DataflowSignalrAngularDemo

Regards




Gora LEYE

I'm a Microsoft Most Valuable Professional (MVP), .NET architect and technical expert based in Paris (France). The purpose of this blog is mainly to post general .NET tips and tricks. www.masterconduite.com
