Architecting a software application from the ground up

Hi all! It's been a hot minute. Settling into work post-college has been a larger task than expected, and getting into a rhythm where I have time to write up my experiences has taken longer than I would have hoped.

As of three weeks ago, I am the sole software engineer at a medical device startup based in Atlanta, GA, USA. In this post, I will document the current state of the software at this company, the immediate tasks ahead of me, and my progress to date in developing the architecture of the new software.

The State of Things

As of right now, the existing platform, which supports hardware product #1, is written in C# on a .NET Framework platform. There are two repositories, one for our hardware schematics/firmware and the other for our C# application, which encompasses everything from the UI (implemented in WPF) to the software which runs on the Zynq chip on the hardware device itself.

The Path Ahead

Ahead of me is the task of rebuilding the existing application into a brand new one which: has all the same features/functionality, adds support for a new piece of hardware we are currently developing, is tailored specifically to supporting our company's hardware products (rather than the products of the company whose code we inherited), and is built from tools which will be supported from now into the far-flung future.

Rather than attempting the monumental task of rebuilding everything from the ground up right from the get go, we'll be starting first with implementing a new application for hardware product #2. Starting this way will have the benefit of giving us a smaller scope of features, and will give us an idea of how difficult it will be to transition the features over from hardware product #1's application. It will also give us the chance to try out some modern software tools with minimal risk and see if they will be good to use for a large-scale application.

Architecting a New System

Where does one even start? As a baby software architect, I found the idea of planning out a new system somewhat overwhelming. It's been drilled into my head throughout my education and during my first year of post-college work that "[p]remature optimization is the root of all evil" and that it's far more important to build the smallest possible functional prototype and iterate on it than it is to have an idea of the overall structure of a system from the get go. At the same time, my experiences from various internships and projects had shown me that neglecting to properly structure a codebase from the get go will inevitably lead to an unmaintainable mess of spaghetti code.

Thus, I decided that it would be good to come up with a basic idea of pieces of the application, and the basic file structure of those pieces. While it is true that "the best laid plans of mice and men oft go astray", I hope that at least having some idea of the scope and direction of the project from the beginning will help prevent me from getting horribly lost in the weeds at a later date.

Coming up with a list of priorities

Because deciding where to start on a large project like this is so overwhelming, I decided to come up with a list of priorities, a rubric of sorts, against which to grade any architecture I came up with. I hoped that writing this list might also give me inspiration for the architecture itself.

First, I need the application to meet the following requirements:
  1. Has a functional and polished UI which is intuitive to use
  2. Has glue between the hardware controls and the UI which can perform complex mathematical calculations to process the data generated by the hardware
  3. Can issue commands to and receive information from our multiple hardware devices
  4. Is easily extensible and maintainable by multiple people who may be onboarded at different points during the development process
  5. Relies primarily on free or cheap tools, given the budget constraints of a small startup
  6. Can display data in soft real-time to end users

With a current software team of ~1, the first requirement of a functional and polished UI means that I need to find a platform which allows me to leverage the capabilities of existing libraries. Frankly, I do not have the time between now and our first release date to build every component from the ground up, so finding something that allows me to do a decent amount of plug and play with open-source, free, and supported components is crucial to meeting our deliverables in a timely fashion.

The second requirement is one which took my head for a bit of a spin. Again, with a current development team of ~1, I simply don't have the time to code up data processing algorithms from scratch. While our team does and will continue to have custom-built algorithms geared toward efficient analysis of our data, it's important to pick a tool which will allow us to leverage existing algorithms, and which will run quickly enough to meet the final requirement of displaying processed data swiftly to end users.

The third requirement makes it clear that I need to have an API for each of the hardware devices that could be connected to the system. This API needs to be exposed to the data processing layer so that the data layer can in turn update the UI depending on which devices are or aren't present.
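
To make that concrete, here is a minimal sketch of what "an API for each device, exposed to the data layer" could look like. Everything in it is an assumption for illustration: the names `Device`, `FakeDevice`, `DataLayer`, `ui_state`, and `latest_samples` are hypothetical, and `FakeDevice` stands in for a real driver binding so the example runs without hardware.

```python
from typing import Protocol


class Device(Protocol):
    """What the data layer expects from any hardware device (hypothetical API)."""

    name: str

    def is_connected(self) -> bool: ...
    def read_sample(self) -> float: ...


class FakeDevice:
    """Stand-in for a real driver binding so the sketch runs without hardware."""

    def __init__(self, name: str, connected: bool, sample: float = 0.0) -> None:
        self.name = name
        self._connected = connected
        self._sample = sample

    def is_connected(self) -> bool:
        return self._connected

    def read_sample(self) -> float:
        return self._sample


class DataLayer:
    """Sits between the device APIs and the UI."""

    def __init__(self, devices: list[Device]) -> None:
        self._devices = devices

    def ui_state(self) -> dict[str, bool]:
        # The UI can enable or hide a panel based on which devices are present.
        return {d.name: d.is_connected() for d in self._devices}

    def latest_samples(self) -> dict[str, float]:
        # Only poll devices that are actually connected.
        return {d.name: d.read_sample() for d in self._devices if d.is_connected()}


layer = DataLayer([FakeDevice("product_1", True, 1.5), FakeDevice("product_2", False)])
print(layer.ui_state())
print(layer.latest_samples())
```

The point of the `Protocol` is that the data layer only depends on the shape of a device, not on any particular driver, so a new hardware product only has to satisfy the same small interface.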

The fourth requirement of extensibility and maintainability by multiple different people means that I should pick languages and development models which are familiar to people from different backgrounds. No up-and-coming languages for us: we need something established, with a vibrant online community that makes it easy to search for solutions to common issues.

The UI Piece

Shockingly, the frontend options were pretty easy to narrow down. Taking a look at the desktop applications written by the major tech companies of the day, a large share of end user applications have their frontends implemented using React.js or Angular. Thanks to the big push to open source technological developments, companies such as Facebook, Amazon, Google, and Spotify rely on these platforms and in turn contribute heavily to their maintenance and to the development of new features. Even Microsoft, home to WPF, WinForms, and UWP, uses React.js on Electron for their consumer-facing desktop applications, and uses plain old React.js on their websites.

Given that many of the libraries available for use on a .NET platform are licensed, and that not even Microsoft is using their own frontend tools for their modern desktop applications, I'm inclined to pick between Angular and React.js, which are both free, well supported, and commonly used. Of the two, I'm already familiar with React.js, so that'll be the one we go with!

Because I want to ensure issues are caught sooner rather than later, and enforce cleaner coding practices, we will use TypeScript.

The Brain

Hands down, this was probably the easiest choice on my plate. There are three main languages used in data science nowadays: Python, R, and MATLAB. Of those, I can really only see one option for a language capable of handling our complex algorithms.

MATLAB's expensive licensing fees and slow performance take it off the table as a potential option for our use case.

R and Python are both highly respected and have extensive libraries offering support for various kinds of data analysis, modeling, and visualization. However, Python has a much larger online community, and it is far easier to find competent developers for. Over the past three years, support for data analytics, modeling, and visualization has also been growing in the Python community, whereas it has remained fairly stagnant in R. Finally, the current development and data science teams at our company have prior experience with Python, and not with R. Given all that, picking Python as our brains seems like the only logical conclusion.

Hardware APIs

Here's a trickier one. In the existing software system, everything was done in the same language, C#. However, as our use case is graduating from a closed loop requirement to a soft real-time requirement, and potentially to a hard real-time requirement in the far off future, tying ourselves to a language which doesn't allow for real-time calculations seems like the wrong call. Ideally, any processing needed to communicate with our hardware will be done on the device itself, avoiding the latency of performing calculations on the connected computer. Each hardware device will have a certain number of shared commands (is the device on, is it experiencing any errors, etc.) as well as a certain number that will be specific to the device itself. Following the principle of DRY code, these shared commands should only be implemented once.

Given this, the hardware controlling code should be implemented in a single repository. Common commands should be shared between the devices, and specific commands should be shared between devices with the same capabilities. On connection of the controller to the data brain of the application, the controller should indicate what devices are hooked up to the computer so the brain can know what capabilities of the system should be exposed to the user.
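
The structure described above can be sketched roughly as follows. This is only an illustration of the layout, written in Python for brevity rather than in whatever language the drivers end up using, and every name in it (`DeviceBase`, `Streaming`, `Calibratable`, `ProductOne`, `ProductTwo`, `handshake`) is hypothetical. The capabilities themselves are made up for the example.

```python
class DeviceBase:
    """Commands every device supports, implemented exactly once."""

    def __init__(self, device_id: str) -> None:
        self.device_id = device_id
        self._powered = False
        self._errors: list[str] = []

    def power_on(self) -> None:
        self._powered = True

    def is_on(self) -> bool:
        return self._powered

    def errors(self) -> list[str]:
        return list(self._errors)


class Streaming:
    """Capability shared only by devices that can stream data (made up)."""

    def start_stream(self) -> str:
        return "streaming"


class Calibratable:
    """Capability shared only by devices that can be calibrated (made up)."""

    def calibrate(self) -> str:
        return "calibrated"


class ProductOne(DeviceBase, Streaming, Calibratable):
    pass


class ProductTwo(DeviceBase, Streaming):
    pass


KNOWN_CAPABILITIES = (Streaming, Calibratable)


def handshake(devices: list[DeviceBase]) -> dict[str, list[str]]:
    """What the controller reports to the brain when it connects."""
    return {
        d.device_id: sorted(c.__name__ for c in KNOWN_CAPABILITIES if isinstance(d, c))
        for d in devices
    }


report = handshake([ProductOne("p1"), ProductTwo("p2")])
print(report)
```

With a report like this in hand, the brain can decide which features to expose to the user without knowing anything about a specific device beyond its listed capabilities.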

My inclination, given that we want fast performance and that we need to manipulate memory registers on a physical device, is to use either C or C++. C# and Java are all well and good, but won't work for a device which could have real-time requirements. One newer option is Rust, which is as lightweight as C++ and potentially offers some benefits over C, including more robust error catching and a lower probability of security vulnerabilities. However, making the jump to a newer programming language has its downsides. It may be harder to recruit developers that already know the language, leading to a longer onboarding period. Additionally, while there are decades of help and examples out there for C development, there's a far more limited set of examples out there for Rust, though that set is growing daily.

Summary

As of right now, I'm leaning towards a pretty microkernel-y approach to this application. Keeping the UI, Data Brain, and Hardware Drivers in separate little pods that communicate with each other via established APIs will make it easier to swap out any given component for a different one if we realize down the line that we picked the wrong tool for the job.
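
As a rough illustration of why a fixed API between pods makes swapping easier, here is a small sketch. It is an assumption-heavy toy: the JSON message shape, the `Router` class, and both "brain" functions are hypothetical, and everything runs in one process, whereas real pods would more likely be separate processes talking over some transport.

```python
import json
from statistics import fmean
from typing import Callable

Handler = Callable[[dict], dict]


def encode(target: str, command: str, payload: dict) -> str:
    """Serialize a message in the (hypothetical) shared format."""
    return json.dumps({"target": target, "command": command, "payload": payload})


class Router:
    """Delivers messages to whichever component is registered for a target."""

    def __init__(self) -> None:
        self._handlers: dict[str, Handler] = {}

    def register(self, target: str, handler: Handler) -> None:
        # Registering the same target again swaps the implementation behind it.
        self._handlers[target] = handler

    def send(self, raw: str) -> dict:
        message = json.loads(raw)
        return self._handlers[message["target"]](message)


def first_brain(message: dict) -> dict:
    values = message["payload"]["values"]
    return {"mean": sum(values) / len(values)}


def replacement_brain(message: dict) -> dict:
    # Different implementation, same message contract.
    return {"mean": fmean(message["payload"]["values"])}


router = Router()
request = encode("brain", "summarize", {"values": [1.0, 2.0, 3.0]})

router.register("brain", first_brain)
before = router.send(request)

router.register("brain", replacement_brain)
after = router.send(request)

print(before, after)
```

The caller never changes: as long as the replacement honors the same message format, the UI side has no idea the brain behind it was swapped.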

For the UI, I'm currently planning on pushing forward using Electron with React.js. React.js has been among the most commonly used frontend libraries of the past several years, and there is an abundance of React developers out there. Additionally, the React project is free, entirely open source, and yet still supported by a major tech company (Facebook), meaning it's not likely to disappear anytime soon. Major tech companies are also building their desktop applications on Electron (see Slack, Skype, VS Code, and others), and Microsoft in particular is doing so rather than using its own in-house tools like WPF or WinForms. This leads me to believe that the developer experience of using Electron and React.js will be preferable to that of the tools available in the proprietary Microsoft ecosystem.

For the Data Brains, I'm planning to use Python. While the tools of R and Python are comparably robust, our team already has prior experience with Python, and Python has a larger, more active online community than R.

For the drivers, I plan on using C. While I could still be convinced to use Rust, there is a strong benefit to using a language that is widely taught and used across industry sectors.

Final Notes

Thanks for getting this far! I hope this post has provided you some insight into how at least a beginner software architect went about making some of those crucial starting decisions for a new system.

What are your thoughts on the tools I've decided to use? How have you gone about deciding how different parts of a new application of yours would communicate? If you have any thoughts or feedback, please leave a note in the comments below!
