Data Flow Programming in the Real World
Data flow programming has been attracting a lot of attention in the tech industry recently. It allows us to execute a set of instructions (an algorithm) without really writing code. That said, I think visual programming will never be able to provide a compelling, head-to-head alternative to traditional programming languages, with a few exceptions such as Scratch.
Scratch allows a person to create fun animations, simple games and much more by taking a visual approach to programming. Another example is chigraph, a visual programming language with an excellent graph-based editor, written by a high school student. The project is worth checking out.
Chigraph GUI (https://github.com/chigraph/chigraph)
Visual programming is still not practically useful in a general-purpose programming environment, primarily for the following reasons.
Every programmer is unique in some way
Each one has a different way of approaching a problem
Preferences vary between structured, functional, object-oriented or even hybrid approaches
A visual tool limits a programmer's imagination
It is impossible to satisfy a large audience outside the educational realm
This is not entirely true if you take the domain-specific route, though. Let's say you wanted to develop a domain-specific language just for dealing with problems in signal processing or some similarly specific area. You could write your own grammar, lexer and parser and develop a whole domain-specific language, or you could take the data flow programming route, where you keep your existing code as it is and create a few nodes and data types to get the job done. In domain-specific problems, there may not be many things a scientist or engineer wants to achieve, so data flow programming makes perfect sense in a setting like that. Now that we have established where data flow programming might be useful, let us take a look at some of its critical components.
The following are the base items in a Data Flow programming environment.
Node
Connection
Data types
Graph (graphics scene)
A Node is generally a block or a model that holds the logic. Think of it as a function in a structured program. It should be able to take different inputs, perform some computation on them and then pass the output on. There are several ways in which it differs from a function, though.
A function requires all the compulsory arguments to be passed at the time of the call, whereas a Node waits until all the required arguments have arrived and only then performs its computation.
A function is validated only when it has all the required inputs.
A Node validates itself every time an input is passed or changed.
There are quite a few more differences; I haven't listed them all.
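The difference between a function and a Node can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Node` class, its `set_input` and `validate` methods and the `required` port list are hypothetical names, not taken from any particular framework.

```python
class Node:
    """A hypothetical node: wraps a function and waits for its inputs."""

    def __init__(self, func, required):
        self.func = func                # the logic the node holds
        self.required = list(required)  # names of the input ports
        self.inputs = {}                # values received so far
        self.output = None
        self.valid = False

    def set_input(self, name, value):
        """Store one input, then re-validate the node."""
        if name not in self.required:
            raise KeyError(f"unknown input port: {name}")
        self.inputs[name] = value
        self.validate()

    def validate(self):
        """Runs on every input change; computes only when all inputs are present."""
        self.valid = all(port in self.inputs for port in self.required)
        self.output = self.func(**self.inputs) if self.valid else None


def add(a, b):
    return a + b


# A plain function fails if an argument is missing at call time...
try:
    add(1)
    call_failed = False
except TypeError:
    call_failed = True

# ...whereas a node simply waits until every port has a value.
node = Node(add, required=["a", "b"])
node.set_input("a", 1)    # not enough inputs yet: no output
first = (node.valid, node.output)
node.set_input("b", 2)    # all inputs present: the node computes
second = (node.valid, node.output)
node.set_input("a", 10)   # a changed input triggers re-validation
third = (node.valid, node.output)
```

The point of the sketch is that validation is a state of the node that is refreshed on every change, rather than something that happens once at call time.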
A Connection propagates the data from one block to another. In general, it should be able to pass any data around the graph.
Data types are generally an abstract way to represent the data structures being passed around. This is the most crucial part of any data flow programming software. It has to be very well thought out and abstracted in such a way that it can be reused efficiently across multiple models in the graph. It should also have all the necessary getters for obtaining information from a data structure.
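As a rough sketch of what such an abstraction could look like, here is a small Python example. Everything in it is an assumption made for illustration: the `NodeData` base class, the `SignalData` type, the getter names and the `can_connect` helper are hypothetical, and a real tool would need a much richer type system.

```python
from abc import ABC, abstractmethod


class NodeData(ABC):
    """Hypothetical base class for every value that travels through the graph."""

    @abstractmethod
    def type_name(self):
        """Identifier used to check whether two ports are compatible."""

    @abstractmethod
    def summary(self):
        """Short description a GUI could display on a connection."""


class SignalData(NodeData):
    """Example domain type: a sampled signal."""

    def __init__(self, samples, sample_rate_hz):
        self._samples = list(samples)
        self._sample_rate_hz = sample_rate_hz

    def type_name(self):
        return "signal"

    def summary(self):
        return f"{len(self._samples)} samples @ {self._sample_rate_hz} Hz"

    # Getters: models read the data through these, never through the internals.
    def samples(self):
        return list(self._samples)

    def sample_rate_hz(self):
        return self._sample_rate_hz

    def duration_s(self):
        return len(self._samples) / self._sample_rate_hz


def can_connect(output_data, input_type_name):
    """Two ports are treated as compatible when their type names match."""
    return output_data.type_name() == input_type_name


sig = SignalData([0.0, 0.5, 1.0, 0.5], sample_rate_hz=4)
```

Because every model only talks to the getters, the same `SignalData` object can be reused by a filter block, a plotting block and an export block without any of them knowing how the samples are stored.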
A Graph comprises all the above items. It does all the computations across different blocks. It validates the graph as soon as an input of a model gets updated, or a model gets removed.
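To make the moving parts concrete, here is a minimal Python sketch of a graph that re-validates whenever an input changes or a model is removed. The `Graph` class and all of its method names are hypothetical, and the sketch assumes the graph is acyclic (a cycle would recurse forever).

```python
class Graph:
    """Hypothetical graph: owns nodes and connections, re-evaluates on change."""

    def __init__(self):
        self.funcs = {}        # node name -> function holding the logic
        self.ports = {}        # node name -> list of input port names
        self.inputs = {}       # node name -> {port: value}
        self.outputs = {}      # node name -> computed output (absent if invalid)
        self.connections = []  # (source node, target node, target port)

    def add_node(self, name, func, ports):
        self.funcs[name] = func
        self.ports[name] = list(ports)
        self.inputs[name] = {}
        self._evaluate(name)

    def connect(self, source, target, port):
        self.connections.append((source, target, port))
        if source in self.outputs:
            self.set_input(target, port, self.outputs[source])

    def set_input(self, name, port, value):
        self.inputs[name][port] = value
        self._evaluate(name)

    def remove_node(self, name):
        # Downstream ports fed by this node lose their value and are re-validated.
        affected = [(t, p) for s, t, p in self.connections if s == name]
        self.connections = [c for c in self.connections if name not in c[:2]]
        for table in (self.funcs, self.ports, self.inputs, self.outputs):
            table.pop(name, None)
        for target, port in affected:
            self.inputs[target].pop(port, None)
            self._evaluate(target)

    def _evaluate(self, name):
        """Validate one node and push its result along outgoing connections."""
        ready = all(p in self.inputs[name] for p in self.ports[name])
        if ready:
            self.outputs[name] = self.funcs[name](**self.inputs[name])
        else:
            self.outputs.pop(name, None)
        for source, target, port in list(self.connections):
            if source != name:
                continue
            if ready:
                self.inputs[target][port] = self.outputs[name]
            else:
                self.inputs[target].pop(port, None)
            self._evaluate(target)


g = Graph()
g.add_node("gain", lambda x, k: x * k, ports=["x", "k"])
g.add_node("offset", lambda x: x + 1, ports=["x"])
g.connect("gain", "offset", "x")
g.set_input("gain", "x", 3)
g.set_input("gain", "k", 2)       # gain becomes valid and feeds offset
result = g.outputs["offset"]
g.set_input("gain", "k", 5)       # the update propagates downstream
updated = g.outputs["offset"]
g.remove_node("gain")             # offset loses its input and becomes invalid
offset_still_valid = "offset" in g.outputs
```

Even this toy version shows why the graph is the hard part: every change has to ripple through all downstream models, and removing a single block can invalidate a whole chain.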
There are quite a lot of issues that come up in this otherwise cool way of building automated systems.
Abstraction of a data type: This is always a bottleneck with data flow programming. It works perfectly in domain-specific languages, where you only have 10 or 20 data types to deal with. In a general-purpose scenario, the user sometimes wants to use a much more complex data type or even invent their own. This limits the number of tasks that can be accomplished.
Volatility of the graph: Most of the time, graphs are too volatile, and models need baby-proofing against multiple edge cases. If this is not done correctly, things can turn ugly. A user working on a 1000-block model might be frustrated if an edge case creeps up and crashes the whole scene.
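One common way to baby-proof models is to make sure a failing model reports an error instead of taking the whole scene down with it. The following Python sketch shows the idea under my own assumptions; `safe_compute`, `evaluate_scene` and the example models are hypothetical names invented for this illustration.

```python
import math


def safe_compute(func, inputs):
    """Run a model's logic and report failures instead of raising.

    Returns (output, error); error is None when the model succeeded.
    """
    try:
        return func(**inputs), None
    except Exception as exc:  # deliberately broad: a model must never crash the scene
        return None, f"{type(exc).__name__}: {exc}"


def evaluate_scene(models):
    """Evaluate every model independently and collect a per-model status."""
    status = {}
    for name, (func, inputs) in models.items():
        output, error = safe_compute(func, inputs)
        status[name] = {"output": output, "error": error}
    return status


def divide(a, b):
    return a / b


def root(x):
    return math.sqrt(x)


scene = {
    "divide_ok": (divide, {"a": 6, "b": 3}),
    "divide_by_zero": (divide, {"a": 1, "b": 0}),   # edge case: bad value
    "negative_root": (root, {"x": -1.0}),           # edge case: out of domain
    "missing_input": (divide, {"a": 1}),            # edge case: unconnected port
}
status = evaluate_scene(scene)
```

A GUI could then paint the failing blocks red and show the error text, while the rest of the 1000-block model keeps working.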
Complexity: While data flow programming techniques make life easier for the end user, these tools are a programmer's worst nightmare. The entire tool could crash at any point in time. In some scenarios, the data structure is so complex that it cannot even be abstracted to a model data type.
Code generation: In almost all cases, a programmer is required to implement code generation for a graph. This process is not pretty in any way. You may even be asked to do reverse code generation, where you convert a working script from some other programming language into a graph.
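In its simplest form, forward code generation boils down to sorting the graph topologically and emitting one statement per node. Here is a minimal Python sketch using the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The graph description format, the `generate_code` function and the node names are assumptions made for this example; it also assumes that node names are valid Python identifiers.

```python
import operator
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical graph description: node name -> function name and arguments.
# A string argument that matches a node name is a connection; anything else
# is treated as a literal value.
graph = {
    "total": {"func": "add", "args": ["scaled", 1]},
    "scaled": {"func": "mul", "args": ["source", 3]},
    "source": {"func": "int", "args": ["2"]},
}


def is_reference(arg, graph):
    """True when the argument names another node in the graph."""
    return isinstance(arg, str) and arg in graph


def generate_code(graph):
    """Emit one assignment per node, in dependency order."""
    deps = {
        name: [arg for arg in spec["args"] if is_reference(arg, graph)]
        for name, spec in graph.items()
    }
    lines = []
    for name in TopologicalSorter(deps).static_order():
        spec = graph[name]
        rendered = [
            arg if is_reference(arg, graph) else repr(arg)
            for arg in spec["args"]
        ]
        lines.append(f"{name} = {spec['func']}({', '.join(rendered)})")
    return "\n".join(lines)


code = generate_code(graph)

# Run the generated script to check that it behaves like the graph would.
namespace = {"add": operator.add, "mul": operator.mul}
exec(code, namespace)
```

The real thing gets ugly quickly once you add branching, loops, multiple outputs per node and data types that have no literal form, which is exactly why this step is so painful in practice.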
Styling and design: You have to create an awful lot of widgets and apply makeup to them to make sure they look pretty enough for the target end users. It's never easy to please everyone, so we end up creating multiple style sheets to address users' needs. Things like blending colours with the background scene and styling connections are quite dull from a programmer's point of view.
There are a lot more disadvantages, but I have only jotted down the ones that I have faced while working on these types of tools. Although it may seem like I am very harsh on this style of programming, that comes from the frustration I developed while implementing a tool like this. I do see this approach performing as an excellent tool in domain-specific areas where well-posed problems are solved with an almost unified workflow. I am simply against the idea of creating a general-purpose programming language out of it. Let me know if any of you have experience developing tools like this in your professional or personal life (as a hobby).