Most complex software applications need some form of configuration, often through a configuration file. The more complicated the app, the harder it becomes to choose between offering flexibility and improving usability. Configuration is the way to get the best of both worlds. However, getting it right takes knowledge, experience and discipline, and often requires having made all the usual mistakes yourself and learned from them. Now, our team of developers and data analysts has published its solution to this problem as an open-source Python library: PyConfig.
We will show you how it works and how it helps solve common issues, and provide some insight into our approach to software development.
No idea what we are talking about? Do not worry, this post isn’t aimed only at software developers. Are you an experienced programmer who wants to skip this introduction and get straight to business? Check out the package documentation for a deep dive.
Let us start with a very simple example: a timer on a smartphone. As a user, you expect the experience to go as follows: you type in an amount of time, say 5 minutes, and then press “start”. After 5 minutes, the app tells you that the time is up. Could not be simpler. Now you might ask: what exactly does the app do at the end of the 5 minutes? Does it beep? Does it play my favorite song? Should my phone vibrate?
One way to solve this is for the developer to make a choice for you in advance and write it directly in the code. In this example, we define a Python function that takes only the amount of time as input, waits exactly that long and then plays the mp3 file “beep_beep” and vibrates the phone:
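A minimal sketch of such a hard-coded function might look like the following. The helpers `play_sound` and `vibrate_phone` are hypothetical stand-ins for whatever the phone's platform offers, and the `sleep` argument is an assumption added only so the example can be tried without really waiting.

```python
import time


def play_sound(filename):
    # Hypothetical stand-in for the phone's audio API.
    return f"playing {filename}"


def vibrate_phone():
    # Hypothetical stand-in for the phone's vibration API.
    return "vibrating"


def run_timer(seconds, sleep=time.sleep):
    """Wait `seconds`, then always beep and vibrate."""
    sleep(seconds)
    # Both choices are hard-coded: the user cannot change them.
    return [play_sound("beep_beep.mp3"), vibrate_phone()]
```

Calling `run_timer(300)` would wait five minutes and then report both actions; there is no way for the caller to ask for anything else.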
This is the simplest solution but the user loses all flexibility and control over the app for no good reason! The user can never change the app’s behavior! Let’s try again but this time ask the user exactly what they want. Every time you want to use your timer you have to tell it the amount of time, which sound to play and whether to vibrate the phone. EVERY SINGLE TIME. Notice in the following code how the mp3 file and the choice whether to vibrate or not are now passed to “run_timer” as input parameters (they are no longer “hard-coded”):
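As a rough sketch under the same assumptions (hypothetical `play_sound` and `vibrate_phone` helpers, and an injectable `sleep` for experimenting), the fully parameterized version could look like this:

```python
import time


def play_sound(filename):
    # Hypothetical stand-in for the phone's audio API.
    return f"playing {filename}"


def vibrate_phone():
    # Hypothetical stand-in for the phone's vibration API.
    return "vibrating"


def run_timer(seconds, sound_file, vibrate, sleep=time.sleep):
    """Wait `seconds`, then do exactly what the caller asked for."""
    sleep(seconds)
    actions = [play_sound(sound_file)]
    if vibrate:
        actions.append(vibrate_phone())
    return actions
```

Every call now has to spell out all three values, for example `run_timer(300, "la_cucaracha.mp3", True)`.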
This gives the users the most flexibility and control, of course, but at the cost of making the app nearly unusable because it takes so long and so much effort just to set a simple timer!
“Configuration” is the way out of this trade-off so you do not have to compromise. Basically, somewhere in your phone there’s a file storing these “input” parameters that you rarely want to change. It might look like this:
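A minimal sketch of such a file, assuming the INI format and made-up section and parameter names, is embedded below in a Python string so that it can be read with the standard library's configparser:

```python
import configparser

# Hypothetical contents of the timer's configuration file.
TIMER_INI = """
[timer]
sound_file = la_cucaracha.mp3
vibrate = yes
"""

parser = configparser.ConfigParser()
parser.read_string(TIMER_INI)

# Values are stored as text; getboolean understands yes/no, on/off, true/false.
sound_file = parser["timer"]["sound_file"]
vibrate = parser["timer"].getboolean("vibrate")
```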
When you use the timer, all you have to do is enter the amount of time and press “start”. The additional “input” will be read from this configuration file automatically, so you do not have to enter it every time.
On the other hand, if you grow tired of “la cucaracha”, the app will usually have a settings menu where you can pick a different tune (and the app will save this new choice by editing the file above). The best of both worlds: Excellent usability but with lots of flexibility and control.
Configuration comes in all shapes and sizes. Sometimes configuration parameters are too technical for ordinary users but interesting for very advanced users or even for programmers during the development of the app. Other times configuration parameters are secrets (such as passwords) that you do not want to type in all the time. In general, configuration parameters are things that might differ from user to user but rarely change after they are chosen.
Python has a standard configuration library called “configparser”, which is very convenient and powerful, but does not address many common problems. So we set out to write our own library to make it easier to follow the best practices in app configuration. We will present a few of these common issues here and explain how PyConfig solves them.
Disclaimer: There are many interesting packages for app configuration out there and we value every contribution to the open-source community. Some people might prefer other packages to ours depending on their programming style, their needs and mindset. Of course, PyConfig is our baby and we might be biased, but we will try to stay as objective as possible.
If you really know what you are doing, if you are disciplined and careful, you can do a good job just using configparser. But let your guard down for a minute, and there are several mistakes that anyone could make in this area. PyConfig is meant to make the “right way” the “only way”. Where that is not possible, we try to make the “right way” at least the “natural way”. Let us start the breakdown!
Verifying the configuration early
The apps on your phone usually have a menu screen dedicated to changing “settings”. However, in apps made by programmers for other programmers or for data-analysts or as internal tools within a company, we often do not invest the time to make such a screen, and in fact these apps often don’t have any screens at all but instead just run on a text-based terminal. Therefore, configuration files are just edited by the users directly as text documents, and as developers, we have to be ready for anything!
In our simple example with the timer app, the “vibrate” parameter should be either “yes” or “no”. But since users are just writing their own configuration files, nothing is stopping them from typing the word “maybe” in there!
In case the configuration is invalid, you always want to know about it as soon as possible. If you’re deploying an important, long-running web service and there’s a password missing in the configuration file, it’s usually okay to fail immediately, with a helpful and clear error message saying which password is missing, and then fix the configuration and launch the service again, within minutes. The danger is when the service runs despite having an invalid configuration. It could run for days without needing that missing password and no one would notice, until it finally needs the password for the first time and crashes horribly. True to Murphy's Law, this always happens in the middle of the night or when everyone who could solve the problem is on vacation. The next morning you wake up to an inbox full of angry e-mails from your clients, and a few from your boss.
If you use our “timer” above to bake a cake and the “sound file” parameter is missing, the app will first wait the 25 minutes until your cake is ready and only then try to access the missing parameter and crash. You’ll instead be notified much later by the smoke detector. At Next Kraftwerke, on the other hand, this kind of undetected crash in a long-running service could make us miss a lot of important data, prevent us from playing our role in stabilizing the electricity grid, or just cost us a lot of money on the trading floor.
PyConfig avoids this scenario by automatically validating the app’s configuration as soon as it is loaded. All of it. It will check that every parameter has a value (or at least a default value) and that every value is of the right “data type” (that is, a number, a date, text, a URL, yes/no etc.). Additionally, developers can define more complex checks to be performed automatically. If there is a problem, it will be noticed as soon as the app starts up.
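To illustrate the idea of failing at start-up, here is a small sketch written with the standard library only. It is not PyConfig's API: the names `ConfigError` and `load_timer_config`, as well as the section and parameter names, are assumptions made up for this example.

```python
import configparser


class ConfigError(ValueError):
    """Raised at start-up when the configuration is invalid."""


def load_timer_config(text):
    """Parse and validate all timer settings before the app does any work."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section("timer"):
        raise ConfigError("missing section [timer]")
    section = parser["timer"]

    errors = []
    sound_file = section.get("sound_file")
    if not sound_file:
        errors.append("sound_file: required parameter is missing")

    vibrate = None
    try:
        # A missing value falls back to the default; "maybe" raises ValueError.
        vibrate = section.getboolean("vibrate", fallback=False)
    except ValueError:
        errors.append(f"vibrate: expected yes/no, got {section['vibrate']!r}")

    if errors:
        # Report every problem at once instead of crashing on the first use.
        raise ConfigError("; ".join(errors))
    return {"sound_file": sound_file, "vibrate": vibrate}
```

With this in place, a file that lacks “sound_file” or contains “vibrate = maybe” is rejected when the app starts, not 25 minutes later when the cake is already burning.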
Another common mistake is unintentionally leaking sensitive configuration parameters (such as passwords). During development, it is common to include configuration parameters in error messages, so we can better understand the context of an error. One developer logs an innocent error message that includes the app’s configuration, another developer separately adds a new configuration parameter that’s supposed to be secret... and suddenly you have a serious security issue!
PyConfig offers a custom data type called “SecretString” made especially for this situation. Every sensitive parameter should be marked with this special type. When you print the configuration in an error message or any other log, PyConfig automatically masks secrets as “*****” instead. It not only saves you the trouble of checking whether there are any secrets that need masking, but also ensures that new secrets added later will also be masked appropriately.
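The masking idea can be sketched in a few lines of plain Python. The class below is a hypothetical illustration named `MaskedSecret`; it is not PyConfig's “SecretString” and makes no claim about how the library implements it.

```python
class MaskedSecret:
    """Wraps a sensitive value so that printing it never shows the content."""

    def __init__(self, value):
        self._value = value

    def reveal(self):
        # The only way to get the real value: an explicit, searchable call.
        return self._value

    def __repr__(self):
        return "*****"

    __str__ = __repr__


# Hypothetical configuration that ends up in an error message.
config = {"user": "timer_app", "password": MaskedSecret("hunter2")}
log_line = f"connection failed with config {config}"
```

Because the masking lives in the type itself, any secret added later is masked as well, as long as it is declared with that type.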
The library will also forbid you from giving secret parameters a default value (a trick that is unfortunately still common for convenience during development). It just will not run until you remove that “future embarrassment” from your code.
All the information about the configuration options for an app can be found somewhere in the code, either where the configuration is loaded, where it is validated or where it is actually used. But users must also get this information (without knowing the source code) in order to write their own configuration files: What parameters are there? What are their data types? How do they change the behaviour of the app? What are the restrictions on the values that each parameter can have? This documentation is commonly provided separately as a PDF or a wiki page. Finally there are also “template” configuration files, where all the parameters are already listed but the values are missing (or are filled using defaults). These templates are convenient because users can simply copy them and quickly set up the app.
In total, we now have three separate places where some or all of the details about the configuration are included, and they need to be kept synchronized! If you change the name of a parameter in the code and forget to update the documentation on the wiki page, you get angry users. Change the default value of a parameter and forget to update the template file? Angry users. You get the idea.
This happens all the time. So much so that experienced programmers often prefer to check the details by looking directly in the source code, because they know the documentation cannot be trusted.
PyConfig solves this problem by automatically generating a fully detailed documentation of the configuration options, as well as a template file, directly from the source code. The users can generate either one whenever they want, and the result is by definition up-to-date.
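The principle of a single source of truth can be sketched as follows. Everything here (`Param`, `TIMER_PARAMS`, `render_template`, `render_docs`) is a hypothetical illustration and not PyConfig's API: the parameters are declared once in code, and both the template file and the documentation are rendered from that one declaration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    name: str
    type_name: str
    description: str
    default: str = ""  # empty string means "no default, value is required"


# The single place where the timer's parameters are declared.
TIMER_PARAMS = [
    Param("sound_file", "text", "Audio file played when the time is up."),
    Param("vibrate", "yes/no", "Whether the phone vibrates as well.", default="no"),
]


def render_template(section, params):
    """Build an INI template with one commented entry per parameter."""
    lines = [f"[{section}]"]
    for p in params:
        lines.append(f"# {p.description} (type: {p.type_name})")
        lines.append(f"{p.name} = {p.default}")
    return "\n".join(lines) + "\n"


def render_docs(params):
    """Build one line of plain-text documentation per parameter."""
    rows = []
    for p in params:
        default = f"default: {p.default}" if p.default else "required"
        rows.append(f"{p.name} ({p.type_name}, {default}): {p.description}")
    return "\n".join(rows)
```

Renaming a parameter or changing a default in `TIMER_PARAMS` changes both outputs at the next run, so they cannot drift apart.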
And as a developer, you don’t have to worry about synchronizing all these sources of information and can focus on what really matters, which is the source code.
There are several different formats commonly used for app configuration (for example .INI, .YAML, .JSON or .ENV) and sometimes it is convenient to use “environment variables” instead of files for this purpose (but do not worry if you don’t know what those are). Usually, developers have to make a choice about what format they want to support. Different formats require different code to read the files and transform their contents into Python variables, so it is a lot of work to support more than one option. In fact, developers do not want to deal with this issue at all because configuration files, formats and environment variables have nothing to do with the actual problem they’re trying to solve with their software. It is boring and repetitive, and it distracts us from the real challenges, which in our case are those of the Virtual Power Plant: managing the energy production of thousands of power plants and securing the smooth operation of the power system!
Our library separates the application logic from configuration loading mechanisms and allows developers to write their code independently from any configuration formats. App users, on the other hand, have the flexibility to choose whatever format they like, or environment variables, or both.
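One way to picture this separation, using only the standard library and hypothetical names (`from_ini`, `from_env`, `load_settings` and the `TIMER_` prefix are assumptions for this sketch, not PyConfig's API): each source is turned into a plain dictionary, the dictionaries are merged, and the application only ever sees the merged result.

```python
import configparser


def from_ini(text, section):
    """Read one section of INI text into a plain dict of strings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser[section]) if parser.has_section(section) else {}


def from_env(environ, prefix):
    """Pick variables like TIMER_VIBRATE and turn them into 'vibrate'."""
    return {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }


def load_settings(*sources):
    """Merge the sources in order; later sources override earlier ones."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged


# In a real app the second argument would be os.environ.
settings = load_settings(
    from_ini("[timer]\nsound_file = beep_beep.mp3\nvibrate = no\n", "timer"),
    from_env({"TIMER_VIBRATE": "yes", "HOME": "/root"}, "TIMER_"),
)
```

The code that runs the timer reads `settings["vibrate"]` and never needs to know whether the value came from a file or from an environment variable.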
And even more: Format support evolves together with the library. In its early days, PyConfig supports only .INI and .YAML for files, plus environment variables. In future versions we intend to add more formats (for example JSON). App developers simply have to switch to the newest library version (but don’t have to change anything in their source code) in order to offer their users this additional flexibility.
At Next Kraftwerke we’ve enjoyed the benefits of the rich Python open-source community for a long time and it was about time we gave something back. PyConfig is our first open project and hopefully soon only one of many. We invite you to check out our code on GitHub, try it out, submit bug reports, feature requests and pull requests, or even just get in touch. And to the Python community: Thanks for everything. Thanks for letting us in. We’re just glad to play our part.