Tuesday, 19 July 2016

Effective Typesafe Config

Typesafe Config is a library that aims to do one thing well: reading configuration files in your applications and libraries. It has a number of features that makes it a good choice for the job: it's pure Java so works with any JVM based language, it's self contained so pulls in no external dependencies, and offers a flexible feature sets for reading and accessing configuration files. No wonder then that it's widely used in the JVM world, especially in Scala projects (natural given its origin at Typesafe the company, now Lightbend).

Having come across this library at a number of companies, I often find that it's not used in the most efficient way. So here's my attempt at pulling together some recommendations for how to use it effectively, to get the most out of it. The following advice is more opinionated than the official docs but this is what I find works well in practice.

DO put configuration in the right file

Typesafe Config lets you put configs in several files. The two files that are typically used are *reference.conf* and *application.conf*. It's important to understand the distinction between these and the purpose of each. In short:

  • Put defaults in reference.conf
  • Put environment specific configuration only in application.conf

The docs suggest libraries provide defaults in reference.conf and applications provide configuration in application.conf. I find this to be insufficient, because it means that if you have multiple application.conf files, one for each environment, you will end up with a lot of duplication between these files if they contain configs that apply across environments.

DON'T include an application.conf file in your source code

This perhaps somewhat surprising suggestion derives from the previous point: If an application.conf file contains environment specific configuration, which environment is the file in your source tree referring to? There are various options for how to provide the application configuration instead:

  • Include several files, e.g. application.conf.dev, application.conf.qa etc in your source tree. Then specify the one to use for example by using a command line argument to your application, or a custom build step for your environment.
  • Don't provide application configs in source code. This means providing configs from an external system at runtime. Do note that in this case, it's still worth providing a dev config file in the source code, to make it easy to run your application locally while developing it. Just make sure this file doesn't accidentally gets included in the build and picked up in real deployments!

Which ones of these approaches you choose depends more on what fits best in with your operational infrastructure.

DO namespace your configurations

Some developers when first providing configs for their application use simple configuration keys like "eventlistener.port". The problem with this however is that you risk your config parameters clashing with those of libraries you depend on. Remember that when resolving configurations, the library will merge _all_ reference.conf files it seems in its classpath, from all libraries you include. You really want to make sure that your configs don't clash with those of others.

Fortunately, it's simple to do this: simply put your configs in a namespace that's unique to your application or component. In the HOCON files supported by Typesafe Config, that's as simple as wrapping your configs in a block:

com.mycompany.myapp {
  <... rest of configs here ...>

I suggest you think of your configuration as a global tree of settings that could be provided as a single logical config file - even if you may not choose to do this in practice. I.e. even if you're working on a microservice that's part of a much larger system, think of your configuration as part of a global, logical configuration tree for the system as a whole.

Doing this means you are free to move code between libraries and services at a later date without clashes. And, it means you can easily provide the configuration for your application from a central configuration service, should you choose to.

DO take advantage of the HOCON format

Typesafe Config supports multiple formats including Java properties files and JSON. But, there's no reason to go for these formats when it also supports HOCON, a much richer format. Take advantage of this to make your config files as simple, duplication- and noise-free as possible. For example:

  • Include comments to explain the purpose of settings
  • Use units for things like durations, to avoid mistakes about milliseconds vs seconds etc.
  • Use block syntax to avoid repetition
  • Avoid unnecessary quotes and commas

See the full description of the HOCON format to fully familiarise yourself with what you can do with it.

DON'T use "." as a word separator

In HOCON, a full stop (".") has a special meaning: it's the equivalent of a block separator. In other words, this configuration:


is the equivalent of:

service {
  incoming {
    queue {
      reconnection {
        timeout {
          ms = 5000

...which is probably not what you intended! Instead using snake case for compound words, only using the block structure where you actually want a configuration block or object, for example:

service.incoming-queue.reconnection-timeout=5000 ms

You'll want a block where you may want additional parameters that logically belongs to the same entity, so that these parameters can be accessed as a single object. In the above example we may add another parameter as follows:

service.incoming-queue.ack-timeout=2000 ms

This means that the 'service.incoming-queue' object now has two properties.

DON'T mix spaces and tabs

Need I say more?

DO limit where you access configurations in your code

The Typesafe Config API makes it easy - maybe too easy! - to get a configuration object for the application. You can do just:

  Config config = ConfigFactory.load()

This makes it all too tempting to load a configuration object every time you need to get a setting. There are some problems with this though. First of all, there are different ways to produce the Config object for an application. You can for example read it from a non-standard file, from a URL, or compose it by merging several sources. If you use the singleton-style way of getting at the Config object, you will have to potentially change a lot of code when you change the specifics of how you load the configuration.

Instead, I suggest this rule:

Only load the configuration once in any given application.

DON'T depend directly on configuration in application components

On a similar note, I also strongly recommend not passing around Config objects in your application. Instead, extract the parameter that each component needs and pass it in as regular parameters, grouping them into objects if necessary. This has several advantages:

  • Only your main application has a dependency on the Typesafe Config API. This means it's easier to change the config API at some point should you wish to, and most of your code can be used in a different context where a different config mechanism is used.
  • It makes testing easier, in that you don't have to construct Config objects to pass in to objects under test.
  • The API of your components can be more type safe and descriptive.

In other words, instead of doing this:

class MessageHandler(config: Config) {
  val hostname = config.getString("hostname")
  val portNumber = config.getInt("portnumber")
  val connectionTimeout = 
config.getDuration("connection-timeout", TimeUnit.MILLISECONDS).millis

Do this:

case class ConnectionParameters(hostname: String, portNumber: Int, connectionTimeout: FiniteDuration)

class MessageHandler(brokerParameters: ConnectionParameters) {
  // ...

You'll probably want to define a factory method that reads objects like ConnectionParameters from Config, calling these when creating and wiring up your application components.

DON'T refer to configs using absolute paths

I sometimes come across code that reads configuration as follows:

def getMessageHandlerConfig(config: Config): ConnectionParameters = {
  val hostname = config.getString("com.mycompany.message-handler.hostname")
  val portNumber = config.getInt("com.mycompany.message-handler.portnumber")
  val connectionTimeout = config.getDuration("com.mycompany.message-handler.connection-timeout", TimeUnit.MILLISECONDS).millis

  ConnectionParameters(hostname, portNumber, connectionTimeout)

Here we are converting parameters from a Config object into a nicely typed object that we can pass around our application. So what's wrong with this?

The problem is that this code contains knowledge of the overall shape of the configuration tree, using the full parameter path of a configuration item (e.g. "com.mycompany.message-service.hostname"). It only works if you pass in the root Config object, and assumes that these parameters will always live at this point and this point only in the config tree. This means that:

  • We can never use this method for getting parameters of a component that lives in a different part of the tree.
  • If we reorganise our config tree, we have to change a lot of code.

Much better instead then, to have this method to only read parameters directly from the Config object it's given:

def getMessageHandlerConfig(config: Config): ConnectionParameters = {
  val hostname = config.getString("hostname")
  val portNumber = config.getInt("portnumber")
  val connectionTimeout = config.getDuration("connection-timeout", TimeUnit.MILLISECONDS).millis

    ConnectionParameters(hostname, portNumber, connectionTimeout)

Now this can be called with a sub object of the overall configuration, for example:

val incomingMessageHandlerParams = getMessageHandlerConfig(config.getConfig("com.mycompany.incoming-message-handler"))

And you can now use the same method for other parts of your configuration tree:

val outgoingMessageHandlerParams = getMessageHandlerConfig(config.getConfig("com.mycompany.outgoing-message-handler"))

In summary

The Typesafe Config API has a lot going for it and is well worth a look if you're writing JVM based applications and services. If you do use it, I suggest having a closer look at the documentation, as well as consider the points I mentioned above, to make the most of it.

If you have further suggestions, or perhaps disagree with my recommendations I would love to hear about it. Get in touch, either in the comments below or ping me on Twitter.

1 comment:

Real Time Web Analytics