Wednesday, July 12, 2017

The Patterns of the Antipatterns: Architecture

We have always been looking for the best ways to architect our applications and platforms so that they are maintainable, extensible, observable, adaptable and easy to evolve (this is by no means a complete list of the properties we would like to have, but the foundational ones). Layered architecture, component-based architecture, hexagonal architecture, microservice architecture (and many others) advocate great design principles and guidelines to achieve these goals.

For example, layered architecture suggests having a clear separation between different layers (persistence, services, caching, presentation, ...) with strict rules imposed on how the layers may talk to each other. Component-based architecture, quite widespread in the world of UI development but not limited to it, introduces components (which internally may be built on top of layered architecture or another architectural style) as the key building blocks of the application or system.

In turn, hexagonal architecture goes even further by introducing the concepts of ports and adapters around the application core. Last but not least, microservice architecture, the hottest architectural style these days, advocates structuring the application or system as a set of loosely coupled, ideally small, self-sufficient services which collaborate with each other.

Each of the aforementioned architectural styles (as well as others we haven't mentioned) has its own strengths and weaknesses and the context where it shines the most. But there are two kinds of architectures which beat them all, and those are the ones we are going to discuss right away.

Proximity Architecture

There is only one principle this architectural style offers: use anything you want anywhere you need it. Do you need access to a service from a data transfer object? No problem, just do it! Maybe your RESTful resource needs access to persistence services? Not an issue, go for it, because by and large, why bother? It is just a pile of classes (usually within the same application); use static fields or methods, dependency injection or any other technique your language / frameworks provide. In the end, if the compiler is happy, what is the problem? Right?

No, it is not right ... For sure, you have heard funny terms such as "big ball of mud" or "spaghetti code". Those are the typical outcomes when proximity architecture drives the show.

There are many reasons why proximity architecture emerges. They range from the constant urge to push features out, to a lack of knowledge and/or experience in software craftsmanship, to, in the worst case, an unethical and careless approach to the software engineering discipline. If you happen to have startup experience, or have worked for one of those large companies that still claim to be a startup to justify their chaos, I’m sure you have seen this form of architecture or one of its variations. Are they all doomed?

Well, I would argue, no, not really, but fleeing the claws of proximity architecture depends solely on the trade-offs you are going to make. For developers, it is hard or even impossible to push back on the timelines set for the delivery of new features (unless you are exceptionally lucky and your manager understands what technical debt is and how to keep it under control). You have to and you will take shortcuts. However, it is enormously important to strive for the right balance between pouring in hacky code (on top of a bunch of other hacks) and going with the quick, not-the-best option while keeping in mind how much harm it could do to the overall application architecture (and how hard it would be to refactor / improve / reverse later on). Undoubtedly, seasoned software developers are able to cope with that.

I know, it sounds too vague to be actionable advice, but every company is unique; there is no magic universal rule or recipe to stop proximity architecture from poisoning your applications. One thing is clear though: if you let it in and do nothing, you are going to end up with an unmanageable codebase, and the only route left to take would be to rewrite everything from scratch.

Micromonolith Architecture

Microservices, microservices, microservices ... everyone talks about them ... You hear from left and right that if you don't do microservices, you are stuck in the stone age. And many actually truly believe that all their problems and issues could be solved with microservices.

In many regards, microservice architecture could breathe life into stagnating projects, assuming you are ready to embrace it across the whole organization, culturally and technically. This is a serious long-term commitment, and as such the changes won't happen overnight. But so many organizations fail to see the new beast rising here: micromonolith architecture in microservice clothing.

Here is what you could be sold as microservice architecture. Many organizations ended up with a monolith and have no choice but to deal with it. Quality is terrible, productivity is going down, releasing new features takes months (or even years) ... and throwing more people at it does not help at all. So everyone tries to escape the monolith, whatever it takes, building standalone applications or services on the side; luckily, the terrific Spring Boot makes it as easy as falling off a log.

At first glance, it doesn’t look bad; after all, they are calling it a microservice, and a bunch of things are calling each other, right? Except that this is not right at all. Applications and services that thrive in this kind of unhealthy ecosystem often do not own any data. They do not have sufficient knowledge to function independently and they do not belong to a concrete domain. These kinds of services need to talk to the monolith to perform any useful function and, in the worst case scenario, they share a database with the monolith and become mere chatty proxies.

Why is that? Well, because in most cases it is very hard to migrate a large monolithic application to microservice architecture (the true one). And along with the migration, the organization should transform itself to become a fertile ground for cultivating microservices, naturally. Not everyone is brave enough to pull it off, but the good news is that it is feasible. And here is why.

While building the monolith, the organization (hopefully) becomes an expert in its domain. There are tons of knowledge developed around the product, there are well-defined, well-understood use cases and flows, and domain-driven design might be of great help to formalize and capture all that. Once there is a solid understanding of what should be built, it is time to start cutting pieces off the monolith. It is going to be a bumpy ride for sure, and you probably won't be able to get from point A to point B directly (refactorings? splitting the monolith into modules?), not to forget about evolving your testing practices (consumer contract tests? component tests? e2e tests?) along the way. But the rewards are worth the effort.

Microservice architecture is a great one: it opens up an unlimited number of opportunities and, when done right, could bring back the joy of development and serve as a solid foundation for an organization's software stacks, current and future ones. But neglecting its core principles will pave the road to frustration and chaos.

Monday, June 19, 2017

The Patterns of the Antipatterns: Design / Testing

It's been a while since I started my professional career as a software developer. Over the years I have worked for many companies, on a whole bunch of different projects, trying my best to deliver them and be proud of the job. However, drifting from company to company, from project to project, you see the same bad design decisions made over and over again, no matter how old the codebase is or how many people have touched it. They are so widely present that they truly become patterns themselves, hence the title of this post: the patterns of the antipatterns.

I am definitely not the first one to identify them and certainly not the last one either (so let all the credit go to the original sources, if any exist). They all come from real projects, primarily Java ones, though other details should stay undisclosed. With that ...

Code first, no time to think

This hardly sounds like the name of an antipattern, but I believe it is. We are all developers and, unsurprisingly, we love to code. Whenever we get some feature or idea to implement, we often just start coding it, without thinking much about the design or even looking around to see whether we could borrow / reuse some functionality which is already there. As such, we keep reinventing the wheel all the time, hopefully enjoying at least the coding part ...

I am absolutely convinced that spending some time just thinking about the high-level design, incarnating it in terms of interfaces, traits or protocols (whatever your language of choice offers), is a must-do exercise. It won't take much more time, really, but the results are rewarding: a beautiful, well-thought-out solution at the end.

TDD is a set of terrific practices to help and guide you through this. While you are developing test cases, you iterate over and over, making changes and with each step getting closer to the right implementation. Please, adopt it in each and every project you are part of.

Occasional Doer

Good, enough philosophy, now we are moving on to real antipatterns. Undoubtedly, many of you have witnessed such pieces of code:

public void setSomething(Object something) {
    if (!heavensAreFalling) {
        this.something = something;
    }
}

Essentially, you call the method to set some value, and surely you expect the value to be set once the invocation is completed. However, due to the presence of some logic inside the method implementation, it may actually do nothing, silently ...

This is an example of really bad design. Instead, such interactions could be modeled using, for example, the State pattern, or by at least throwing an exception saying that the operation cannot be completed due to internal constraints. Yes, it may require a bit more code to be written, but at least the expectations will be met.
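
For illustration, here is a minimal sketch of the fail-fast flavor of the setter shown above (same hypothetical heavensAreFalling flag), which surfaces the violated constraint instead of swallowing the call:

public void setSomething(Object something) {
    if (heavensAreFalling) {
        // Make the violated constraint visible to the caller instead of silently ignoring the call
        throw new IllegalStateException("Cannot set the value: internal constraints are violated");
    }
    this.something = something;
}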

The Universe

Often there is a class within the application which references (or is referenced by) nearly every other class of the application. For example, all classes in the project must inherit from some base class.

public class MyService extends MyBase {
  ...
}
public class MyDao extends MyBase {
  ...
}

In most cases, this is not a good idea. Not only does it create a hard dependency on this single class, but also on every dependency this class is using internally. You may argue that in Java every single class implicitly inherits from Object. Indeed, but this is part of the language specification and has nothing to do with application logic, so there is no need to mimic that.

Another extreme is to have a class which serves as the central brain of the application. It knows about every service / dao / ... and provides accessors (most of the time, static) so anyone can just turn to it and ask for anything, for example:

public class Services {
    public static MyService1 getMyService1() {...}
    public static MyService2 getMyService2() {...}
    public static MyService3 getMyService3() {...}
}

Those are universes; eventually they leak into every class of the application (it is easy to call a static method, right?) and couple everything together. In a more or less large codebase, getting rid of such universes is exceptionally hard.

To prevent such things from poisoning your projects, use the dependency injection pattern, implemented by Spring Framework, Dagger, Guice, CDI or HK2, or pass the necessary dependencies through constructors or method arguments. It will actually help you to see the smells early on and address them even before they become a problem.
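
As a minimal sketch of the constructor-based flavor (MyResource is a hypothetical consumer; MyService1 is borrowed from the snippet above and reduced to an interface for illustration):

interface MyService1 {
    String process(String input);
}

public class MyResource {
    private final MyService1 service;

    // The dependency arrives through the constructor (wired by hand or by a DI container),
    // so MyResource never has to reach out to a static Services universe
    public MyResource(MyService1 service) {
        this.service = service;
    }

    public String handle(String input) {
        return service.process(input);
    }
}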

The Onionated Inheritance

This is a really fun and scary one, very often present in projects which expose and implement REST(ful) web services. Let us suppose you have a Customer class with quite a few different properties, for example:

public class Customer {
    private String id;
    private String firstName;
    private String lastName;
    private Address address;
    private Company company;
    ...
}

Now, you have been asked to create an API (or expose it through other means) which finds the customer but returns only his id, firstName and lastName. Sounds straightforward, doesn't it? Let us do it:

public class CustomerWithIdAndNames {
    private String id;
    private String firstName;
    private String lastName;
}

Looks pretty neat, right? Everyone is happy, but the next day you get to work on another feature where you need to design an API which returns id, firstName, lastName and also company. Sounds pretty easy: we already have CustomerWithIdAndNames and only need to enrich it with a company. A perfect job for inheritance, right?

public class CustomerWithIdAndNamesAndCompany extends CustomerWithIdAndNames {
    private Company company;
}

It works! But after a week another request comes in, where a new API also needs to expose the address property; anyway, you get the idea. In the end you find yourself with dozens of classes which extend each other, adding properties here and there (like onions, hence the antipattern name) to fulfill the API contracts.

Yet another road to hell ... There are quite a few options here; the simplest one is JSON Views (here are some examples using Jackson), where the underlying class stays the same but different views of it can be returned. In case you really care about not fetching the data you don't need, another option is GraphQL, which we covered last time. Essentially, the message here is: don't create such onionated hierarchies; use a single representation and different techniques to assemble it and fetch the necessary pieces efficiently.
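
To make the idea concrete, here is a minimal sketch of the JSON Views approach with Jackson (the Summary and Detailed view markers are illustrative names, not part of the original example; Company is the class from the Customer snippet above):

import com.fasterxml.jackson.annotation.JsonView;

public class Customer {
    // View markers: Summary exposes the bare minimum, Detailed adds more on top of it
    public interface Summary {}
    public interface Detailed extends Summary {}

    @JsonView(Summary.class) private String id;
    @JsonView(Summary.class) private String firstName;
    @JsonView(Summary.class) private String lastName;
    @JsonView(Detailed.class) private Company company;

    // getters and setters omitted for brevity
}

Serializing the very same class with different views then yields different response shapes, for example new ObjectMapper().writerWithView(Customer.Summary.class).writeValueAsString(customer) returns only the id and the names.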

IDD: if-driven development

Most real-world projects are internally built on a pretty complex set of application and business rules, cemented by applying different layers of extensibility and customization (if needed). In most cases, the implementation choice is pretty unpretentious and the logic is driven by conditional statements, for example:

int price = ...;
if (calculateTaxes) {
    price += taxes;
}

With time, the business rules evolve, and so do the conditional expressions that embody them, becoming real monsters in the end:

int price = ...;
if (calculateTaxes && ("US".equals(country) || "GB".equals(country)) &&
    discount == null || (discount != null && discount.getType() == TAXES)) {
    price += taxes;
}

You see where I am going with this: the project is built on top of IDD practices, if-driven development. Not only does the code become fragile, prone to condition errors and very hard to follow, it is very scary to change as well! To be fair, you also need to test all possible combinations of such if branches to ensure they make sense and the right branch is picked (I bet no one is actually doing that, because it is a gigantic amount of tests and effort)!

In many cases feature toggles could be an answer. But if you are used to writing such code, please take some time and read the books about design patterns, test-driven development and coding practices. There are so many of them; below is the list I would definitely recommend:

There are so many great ones to add here, but those are a very good starting point and a highly rewarding investment. Much better alternatives to if statements are going to fill your mind.
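
As one illustration of what such an alternative might look like (a minimal sketch with hypothetical names, not taken from any real project), the tax calculation above could hide its country-specific rules behind a small Strategy interface, so adding a rule means adding a class rather than growing the if statement:

import java.util.HashMap;
import java.util.Map;

// Each pricing rule lives in its own strategy instead of an ever-growing if statement
interface TaxPolicy {
    int apply(int price, int taxes);
}

class NoTaxPolicy implements TaxPolicy {
    public int apply(int price, int taxes) { return price; }
}

class AddTaxPolicy implements TaxPolicy {
    public int apply(int price, int taxes) { return price + taxes; }
}

class PriceCalculator {
    // Country-specific policies are registered once; the calculation itself never branches on country
    private final Map<String, TaxPolicy> policies = new HashMap<>();

    PriceCalculator() {
        policies.put("US", new AddTaxPolicy());
        policies.put("GB", new AddTaxPolicy());
    }

    int priceFor(String country, int price, int taxes) {
        return policies.getOrDefault(country, new NoTaxPolicy()).apply(price, taxes);
    }
}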

Test Lottery

When you hear something like "All projects in our organization are agile and use TDD practices", idealistic pictures come to mind and you hope everything is done right, by the book. In reality, most projects are very far from that. Hopefully, at least there are some test suites you could trust, or could you?

Please welcome the test lottery: the kingdom of flaky tests. Have you ever seen builds with random test failures? Where on every consecutive build (without any changes introduced) some tests suddenly pass but others start to fail? And maybe one build out of ten turns green (jackpot!) as the stars finally align properly? If no one cares, it becomes the new normal, and everyone who joins the team is told to ignore those failures: "they have been failing for ages, screw them".

Tests are as important as the mainline code you push into production. They need to be maintained, refactored and kept clean. Keep all your builds green all the time; if you see some flakiness or random failures, address them right away. It is not easy, but in many cases you may actually run into real bugs! You have to trust your test suites, otherwise why do you need them at all?

Test Framework Factory

This one is truly outstanding. It happens so often, it feels like every organization is just obligated to create its own test framework, usually on top of existing ones. At first it sounds like a good idea and, arguably, it really is. Unfortunately, in 99% of cases the outcome is yet another monstrous framework no one wants to use but is forced to, because "it took 15 man-years to develop and we cannot throw it away after that". Everyone struggles, productivity goes down at the speed of light, and quality does not improve at all.

Think twice before taking this route. The goal of simplifying the testing of the complex applications the organization is working on is certainly a good one. But don't fall into the trap of throwing people at it, who are going to spend a few months or even years working on the "perfect & mighty test framework" in isolation, maybe doing a great job overall, but not solving any real problems and instead just creating new ones. If you decide to embark on this train anyway, try hard to stay focused, address the pain points at their core, prototype, always ask for feedback, listen and iterate ... And keep it stable, easy to use, helpful, trustworthy and as lightweight as possible.

Conclusions

You cannot build everything right from the beginning. Sometimes our decisions are impacted by pressure and deadlines, or by factors outside of our control. But that does not mean we should let things stay like that forever, getting worse and worse every single day; please, don't ...

This post is a mix of ideas from many people, terrific former teammates and awesome friends; credit goes to all of them. If you happen to like it, you may be interested in checking out the upcoming post, where we are going to talk about architectural antipatterns.

Wednesday, April 26, 2017

When following REST(ful) principles might look impractical, GraphQL could come to the rescue

I am certainly late jumping on the trendy train, but today we are going to talk about GraphQL, a very interesting approach to building web services and APIs. In my opinion, it would be fair to restate that REST(ful) architecture is built on top of quite reasonable principles and constraints (although the debates over that never end in the industry).

So ... what is GraphQL? By and large, it is yet another kind of query language. But what makes it interesting is that it is designed to give the clients (e.g., the frontends) the ability to express their needs (e.g., to the backends) in terms of the data they are expecting. Frankly speaking, GraphQL goes much further than that, but this is one of its most compelling features.

GraphQL is just a specification, without any particular requirements on the programming language or technology stack, but, not surprisingly, it has gained widespread acceptance in modern web development, both on the client side (Apollo, Relay) and the server side (the class of APIs often called BFFs these days). In today's post we are going to give GraphQL a ride, discuss where it could be a good fit and why you may consider adopting it. Although there are quite a few options on the table, Sangria, a terrific Scala-based implementation of the GraphQL specification, is the foundation we are going to build atop.

Essentially, our goal is to develop a simple user management web API. The data model is far from being complete but good enough to serve its purpose. Here is the User class.

case class User(id: Long, email: String, created: Option[LocalDateTime], 
  firstName: Option[String], lastName: Option[String], roles: Option[Seq[Role.Value]], 
    active: Option[Boolean], address: Option[Address])

The Role is a hardcoded enumeration:

object Role extends Enumeration {
  val USER, ADMIN = Value
}

While the Address is another small class in the data model:

case class Address(street: String, city: String, state: String, 
  zip: String, country: String)

At its core, GraphQL is strongly typed. That means the application-specific models should somehow be represented in GraphQL; in other words, we need to define a schema. In Sangria, schema definitions are pretty straightforward and consist of three main categories, borrowed from the GraphQL specification: object types, query types and mutation types. We are going to touch upon all of these, but the object type definitions sound like a logical point to start with.

val UserType = ObjectType(
  "User",
  "User Type",
  interfaces[Unit, User](),
  fields[Unit, User](
    Field("id", LongType, Some("User id"), tags = ProjectionName("id") :: Nil, 
      resolve = _.value.id),
    Field("email", StringType, Some("User email address"), 
      resolve = _.value.email),
    Field("created", OptionType(LocalDateTimeType), Some("User creation date"), 
      resolve = _.value.created),
    Field("address", OptionType(AddressType), Some("User address"), 
      resolve = _.value.address),
    Field("active", OptionType(BooleanType), Some("User is active or not"), 
      resolve = _.value.active),
    Field("firstName", OptionType(StringType), Some("User first name"), 
      resolve = _.value.firstName),
    Field("lastName", OptionType(StringType), Some("User last name"), 
      resolve = _.value.lastName),
    Field("roles", OptionType(ListType(RoleEnumType)), Some("User roles"), 
      resolve = _.value.roles)
  ))

In many respects, it is just a direct mapping from User to UserType. Sangria can ease you off from doing that by providing macro support, so you may get the schema generated at compile time. The AddressType definition is very similar, so let us skip over it and look at how to deal with enumerations, like Role.

val RoleEnumType = EnumType(
  "Role",
  Some("List of roles"),
  List(
    EnumValue("USER", value = Role.USER, description = Some("User")),
    EnumValue("ADMIN", value = Role.ADMIN, description = Some("Administrator"))
  ))

Easy, simple and compact ... In traditional REST(ful) web services the metadata about the resources is generally not available out of the box. However, several complementary specifications, like JSON Schema, could fill this gap with a bit of work.

Good, so the types are there, but what are these queries and mutations? Query is a special type within the GraphQL specification which basically describes how you would like to fetch the data and the shape of it. For example, there is often a need to get user details by identifier, which could be expressed by the following GraphQL query:

query {
  user(id: 1) {
    email
    firstName
    lastName
    address {
      country
    }
  }
}

You can literally read it as-is: look up the user with identifier 1 and return only his email, first and last names, and address with the country only. Awesome: not only are GraphQL queries exceptionally powerful, they also give the interested parties control over what is returned. A priceless feature if you have to support a diversity of different clients without exploding the number of API endpoints. Defining the query types in Sangria is also a no-brainer, for example:

val Query = ObjectType(
  "Query", fields[Repository, Unit](
    Field("user", OptionType(UserType), arguments = ID :: Nil, 
      resolve = (ctx) => ctx.ctx.findById(ctx.arg(ID))),
    Field("users", ListType(UserType), 
      resolve = Projector((ctx, f) => ctx.ctx.findAll(f.map(_.name))))
  ))

The code snippet above in fact describes two queries: the one we have seen before, fetching a user by identifier, and another one, fetching all users. Here is a quick example of the latter:

query {
  users {
    id 
    email 
  }
}

Hopefully you would agree that no explanations are needed: the intent is clear. Queries are arguably the strongest argument in favor of adopting GraphQL; the value proposition is really tremendous. With Sangria you do have access to the fields which clients want back, so the data store could be told to return only these subsets, using projections, selects, or similar concepts. To be closer to reality, our sample application stores data in MongoDB, so we could ask it to return only the fields the client is interested in.

def findAll(fields: Seq[String]): Future[Seq[User]] = collection.flatMap(
   _.find(document())
    .projection(
      fields
       .foldLeft(document())((doc, field) => doc.merge(field -> BSONInteger(1)))
    )
    .cursor[User]()
    .collect(Int.MaxValue, Cursor.FailOnError[Seq[User]]())
  )

If we get back to typical REST(ful) web APIs, the approach most widely used these days to outline the shape of the desired response is to pass a query string parameter, for example /api/users?fields=email,firstName,lastName, .... However, from the implementation perspective, not many frameworks support such features natively, so everyone has to come up with their own way. Regarding the querying capabilities, in case you happen to be a user of the terrific Apache CXF framework, you may benefit from its quite powerful search extension, which we talked about some time ago.

While queries usually just fetch data, mutations serve the purpose of data modification. Syntactically they are very similar to queries, but their interpretation is different. For example, here is one of the ways we could add a new user to the application.

mutation {
  add(email: "a@b.com", firstName: "John", lastName: "Smith", roles: [ADMIN]) {
    id
    active
    email
    firstName
    lastName
    address {
      street
      country
    }
  }
}

In this mutation a new user, John Smith, with email a@b.com and the ADMIN role assigned is going to be added to the system. As with queries, the client is always in control of which data shape it needs from the server when the mutation completes. Mutations can be thought of as calls to action and closely resemble method invocations; for example, the activation of a user may be done like this:

mutation {
  activate(id: 1) {
    active
  }
}

In Sangria, mutations are described exactly like queries; for example, the ones we have looked at before have the following type definition:

val Mutation = ObjectType(
  "Mutation", fields[Repository, Unit](
    Field("activate", OptionType(UserType), 
      arguments = ID :: Nil,
      resolve = (ctx) => ctx.ctx.activateById(ctx.arg(ID))),
    Field("add", OptionType(UserType), 
      arguments = EmailArg :: FirstNameArg :: LastNameArg :: RolesArg :: Nil,
      resolve = (ctx) => ctx.ctx.addUser(ctx.arg(EmailArg), ctx.arg(FirstNameArg), 
        ctx.arg(LastNameArg), ctx.arg(RolesArg)))
  ))

With that, our GraphQL schema is complete:

val UserSchema = Schema(Query, Some(Mutation))

That's great, however ... what can we do with it? A timely question: please welcome the GraphQL server. As we remember, there is no attachment to a particular technology or stack, but in the universe of web APIs you can think of a GraphQL server as a single endpoint bound to the POST HTTP verb. And, once we have started to talk about HTTP and Scala, who could do a better job than the amazing Akka HTTP; luckily, Sangria has seamless integration with it.

val route: Route = path("users") {
  post {
    entity(as[String]) { document =>
      QueryParser.parse(document) match {
        case Success(queryAst) =>
          complete(Executor.execute(SchemaDefinition.UserSchema, queryAst, repository)
            .map(OK -> _)
            .recover {
              case error: QueryAnalysisError => BadRequest -> error.resolveError
              case error: ErrorWithResolver => InternalServerError -> error.resolveError
            })

        case Failure(error) => complete(BadRequest -> Error(error.getMessage))
      }
    }
  } ~ get {
    complete(SchemaRenderer.renderSchema(SchemaDefinition.UserSchema))
  }
}

You may notice that we also expose our schema under a GET endpoint; what is it here for? Well, if you are familiar with Swagger, which we have talked about a lot here, it is a very similar concept. The schema contains all the necessary pieces for external tools to automatically discover the respective GraphQL queries and mutations, along with the types they are referencing. GraphiQL, an in-browser IDE for exploring GraphQL, is one of those (think of Swagger UI in the REST(ful) services world).

We are mostly there: our GraphQL server is ready, so let us run it and send off a couple of queries and mutations to get a feel for it:

sbt run

[info] Running com.example.graphql.Boot
INFO  akka.event.slf4j.Slf4jLogger  - Slf4jLogger started
INFO  reactivemongo.api.MongoDriver  - No mongo-async-driver configuration found
INFO  reactivemongo.api.MongoDriver  - [Supervisor-1] Creating connection: Connection-2
INFO  r.core.actors.MongoDBSystem  - [Supervisor-1/Connection-2] Starting the MongoDBSystem akka://reactivemongo/user/Connection-2

Very likely our data store (we are using MongoDB run as a Docker container) has no users at the moment, so it sounds like a good idea to add one right away:

$ curl -vi -X POST http://localhost:48080/users -H "Content-Type: application/json" -d " \
   mutation { \
     add(email: \"a@b.com\", firstName: \"John\", lastName: \"Smith\", roles: [ADMIN]) { \
       id \
       active \
       email \
       firstName \
       lastName \
       address { \
         street \
         country \
       } \
     } \
   }"

HTTP/1.1 200 OK
Server: akka-http/10.0.5
Date: Tue, 25 Apr 2017 01:01:25 GMT
Content-Type: application/json
Content-Length: 123

{
  "data":{
    "add":{
      "email":"a@b.com",
      "lastName":"Smith",
      "firstName":"John",
      "id":1493082085000,
      "address":null,
      "active":false
    }
  }
}

It seems to work perfectly fine. The response details will always be wrapped into a data envelope, no matter what kind of query or mutation you are running, for example:

$ curl -vi -X POST http://localhost:48080/users -H "Content-Type: application/json" -d " \
   query { \
     users { \
       id \
       email \
     } \
   }"

HTTP/1.1 200 OK
Server: akka-http/10.0.5
Date: Tue, 25 Apr 2017 01:09:21 GMT
Content-Type: application/json
Content-Length: 98

{
  "data":{
    "users":[
      {
        "id":1493082085000,
        "email":"a@b.com"
      }
    ]
  }
}

Exactly as we ordered... Honestly, working with GraphQL feels natural, specifically when data querying is involved. And we didn't even talk about fragments, variables, directives, and a lot of other things.

Now comes the question: should we abandon all our practices, JAX-RS, Spring MVC, ... and switch to GraphQL? I honestly believe that this is not the case: GraphQL is a good fit to address certain kinds of problems, but by and large, traditional REST(ful) web services, combined with Swagger or any other established API specification framework, are here to stay.

And please be warned: along with the benefits, GraphQL comes at a price. For example, HTTP caching and cache control won't apply anymore, HATEOAS does not make much sense either, responses look uniform no matter what you are calling, and reliability suffers as everything sits behind a single facade ... With that in mind, GraphQL is indeed a great tool, please use it wisely!

The complete project source is available on Github.

Monday, January 9, 2017

Overcast and Docker: test your apps as if you are releasing them to production

I am sure everyone would agree that rare is the application architecture these days that could survive without relying on some kind of data store (either relational or NoSQL), or/and messaging middleware, or/and external caches, ... just to name a few. Testing such applications becomes a real challenge.

Luckily, if you are a JVM developer, things are not as bad. Most of the time you have the option to fall back on an embedded version of the data store or message broker in your integration or component test scenarios. But what if the solution you are using is not JVM-based? Great examples of those are RabbitMQ, Redis, Memcached, MySQL and PostgreSQL, which are extremely popular choices these days, and for very good reasons. Even better, what if your integration testing strategy is set to exercise the component (read: microservice) in an environment as close to production as possible? Should we give up here? Or should we write a bunch of flaky shell scripts to orchestrate the test runs and scare most of the developers to death? Let us see what we can do here ...

Many of you are already screaming at this point: just use Docker, or CoreOS! And this is exactly what we are going to talk about in this post; more precisely, how to use Docker to back integration / component testing. I think Docker does not need an introduction anymore. Even those of us who spent the last couple of years in a cave on a deserted island in the middle of the ocean have heard about it.

Our sample application is going to be built on top of the Spring projects portfolio, heavily relying on Spring Boot magic to wire all the pieces together (who doesn't, right? it works pretty well indeed). It will implement a very simple workflow: publish a message to the RabbitMQ exchange app.exchange (using the app.queue routing key), consume the message from the RabbitMQ queue app.queue and store it in a Redis list data structure under the key messages. The three self-explanatory code snippets below demonstrate how each functional piece is implemented:

@Component
public class AppQueueMessageSender {
    @Autowired private RabbitTemplate rabbitTemplate;
 
    public void send(final String message) {
        rabbitTemplate.convertAndSend("app.exchange", "app.queue", message);
    }
}

@Component
public class AppQueueMessageListener {
    @Autowired private AppMessageRepository repository;
 
    @RabbitListener(
        queues = "app.queue", 
        containerFactory = "rabbitListenerContainerFactory", 
        admin = "amqpAdmin"
    )
    public void onMessage(final String message) {
        repository.persist(message);
    }
}

@Repository
public class AppMessageRepository {
    @Autowired private StringRedisTemplate redisTemplate;
 
    public void persist(final String message) {
        redisTemplate.opsForList().rightPush("messages", message);
    }
 
    public Collection<String> readAll() {
        return redisTemplate.opsForList().range("messages", 0, -1);
    }
 
    public long size() {
        return redisTemplate.opsForList().size("messages");
    }
}

As you can see, the implementation deliberately does the bare minimum; we are more interested in the fact that quite a few interactions with RabbitMQ and Redis are happening here. The configuration class includes only the necessary beans; everything else is figured out by Spring Boot's automatic discovery from the classpath dependencies.

@Configuration
@EnableAutoConfiguration
@EnableRabbit
@ComponentScan(basePackageClasses = AppConfiguration.class)
public class AppConfiguration {
    @Bean
    Queue queue() {
        return new Queue("app.queue", false);
    }

    @Bean
    TopicExchange exchange() {
        return new TopicExchange("app.exchange");
    }

    @Bean
    Binding binding(Queue queue, TopicExchange exchange) {
        return BindingBuilder.bind(queue).to(exchange).with(queue.getName());
    }
 
    @Bean
    StringRedisTemplate template(RedisConnectionFactory connectionFactory) {
        return new StringRedisTemplate(connectionFactory);
    }
}

At the very end comes application.yml. Essentially it contains the default connection parameters for RabbitMQ and Redis, plus a bit of logging level tuning.

logging:
  level:
    root: INFO
    
spring:
  rabbitmq:
    host: localhost
    username: guest
    password: guest
    port: 5672
    
  redis:
    host: localhost
    port: 6379

With that, our application is ready to be run. For convenience, the project repository contains a docker-compose.yml with the official RabbitMQ and Redis images from Docker Hub.
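
The docker-compose.yml itself is not shown in the post, but a minimal sketch matching the images and default ports used here (rabbitmq:3.6.6 and redis:3.2.6) might look like this:

version: '2'

services:
  rabbitmq:
    image: rabbitmq:3.6.6
    ports:
      - "5672:5672"

  redis:
    image: redis:3.2.6
    ports:
      - "6379:6379"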

Being TDD believers and practitioners, we make sure no application leaves the gate without a thorough set of test suites and test cases. Keeping unit tests and integration tests out of the scope of our discussion, let us jump right into component testing with this simple scenario.

@RunWith(SpringJUnit4ClassRunner.class)
@SpringBootTest(
    classes = { AppConfiguration.class }, 
    webEnvironment = WebEnvironment.NONE
)
public class AppComponentTest {
    @Autowired private AppQueueMessageSender sender;
    @Autowired private AppMessageRepository repository;
 
    @Test
    public void testThatMessageHasBeenPersisted() {
        sender.send("Test Message!");
        await().atMost(1, SECONDS).until(() -> repository.size() > 0);
        assertThat(repository.readAll()).containsExactly("Test Message!");
    }
}

It is a really basic test case exercising the main flow, but it makes an important point: no mocks / stubs / ... allowed, we expect the real things. The line is somewhat blurry, but this is what makes component tests different from, let's say, integration or e2e tests: we test a single full-fledged application (component) with real dependencies (when it makes sense).

This is the right time for the excellent Overcast project to appear on the stage and help us out. Overcast brings the power of Docker to enrich the test harness of JVM applications. Among many other things, it allows defining and managing the lifecycle of Docker containers from within Java code (or, more precisely, any programming language based on the JVM).

Unfortunately, the last released version of Overcast, 2.5.1, is pretty old and does not include a lot of features and enhancements. However, building it from source is a no-brainer (hopefully a new release is going to be available soon).

git clone https://github.com/xebialabs/overcast
cd overcast
./gradlew install

Essentially, the only prerequisite is to provide the configuration file overcast.conf with the list of named containers to run. In our case, we need RabbitMQ and Redis.

rabbitmq {
    dockerImage="rabbitmq:3.6.6"
    exposeAllPorts=true
    remove=true
    removeVolume=true
}

redis {
    dockerImage="redis:3.2.6"
    exposeAllPorts=true
    remove=true
    removeVolume=true
}

Great! The syntax is not as powerful as what Docker Compose supports, but it is simple, straightforward and, to be fair, quite sufficient. Once the configuration file is placed into the src/test/resources folder, we can move on and use the Overcast Java API to manage these containers programmatically. It is natural to introduce a dedicated configuration class in this case, as we are using the Spring Framework.

@Configuration
public class OvercastConfiguration {
    @Autowired private ConfigurableEnvironment env;

    @Bean(initMethod = "setup", destroyMethod = "teardown")
    @Qualifier("rabbitmq")
    public CloudHost rabbitmq() {
        return CloudHostFactory.getCloudHost("rabbitmq");
    }

    @Bean(initMethod = "setup", destroyMethod = "teardown")
    @Qualifier("redis")
    public CloudHost redis() {
        return CloudHostFactory.getCloudHost("redis");
    }

    @PostConstruct
    public void init() throws TimeoutException {
        final CloudHost redis = redis();
        final CloudHost rabbitmq = rabbitmq();

        final Map<String, Object> properties = new HashMap<>();
        properties.put("spring.rabbitmq.host", rabbitmq.getHostName());
        properties.put("spring.rabbitmq.port", rabbitmq.getPort(5672));

        properties.put("spring.redis.host", redis.getHostName());
        properties.put("spring.redis.port", redis.getPort(6379));

        final PropertySource<?> source = new MapPropertySource("overcast", properties);
        env.getPropertySources().addFirst(source);
    }
}

And that is literally all we need! Just a couple of important notes here. Docker is going to expose random ports for each container, so we can run many test cases in parallel on the same box without any port conflicts. On most operating systems it is safe to use localhost to access the running containers, but for the ones without native Docker support, workarounds with Docker Machine or boot2docker exist. That is why we override the connection settings for both host and port for RabbitMQ and Redis respectively, asking for the bindings at runtime:

properties.put("spring.rabbitmq.host", rabbitmq.getHostName());
properties.put("spring.rabbitmq.port", rabbitmq.getPort(5672));

properties.put("spring.redis.host", redis.getHostName());
properties.put("spring.redis.port", redis.getPort(6379));

Lastly, more advanced Docker users may wonder how Overcast is able to figure out where the Docker daemon is running, which port it is bound to, and whether it uses TLS or not. Under the hood Overcast uses the terrific Spotify Docker Client, which is able to retrieve all the relevant details from the environment variables; this works in the majority of use cases (though you can always provide your own settings).

To finish up, let us include this configuration into the test case:

@SpringBootTest(
    classes = { OvercastConfiguration.class, AppConfiguration.class }, 
    webEnvironment = WebEnvironment.NONE
)

Easy, isn't it? If we go ahead and run mvn test for our project, all test cases should pass (please notice that the first run may take some time, as Docker has to pull the container images from the remote repository).

> mvn test
...
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
...

No doubt, Docker raises testing techniques to a new level. With the help of such awesome libraries as Overcast, seasoned JVM developers have even more options to come up with realistic test scenarios and run them against the components in a "mostly" production environment (on the wave of hype, it fits perfectly into microservices testing strategies). There are many areas where Overcast could and will improve, but it brings a lot of value even now and is definitely worth checking out.

Probably the most annoying issue you may encounter when working with Docker containers is waiting for the moment when the container is fully started and ready to accept requests (which heavily depends on what kind of underlying service the container is running). Although work on that has been started, Overcast does not help with this particular problem yet, though simple, old-style sleeps may be good enough (versus slightly more complex port polling, for example).
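
For reference, here is a minimal, hypothetical sketch of the port-polling alternative using plain JDK sockets (not an Overcast API): it keeps retrying a TCP connection until the container answers or the timeout expires.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.TimeoutException;

public final class Ports {
    // Polls the given host/port until a TCP connection succeeds or the timeout is reached
    public static void waitForPort(String host, int port, long timeoutMillis)
            throws InterruptedException, TimeoutException {
        final long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 1000);
                return; // the port is accepting connections
            } catch (IOException ex) {
                Thread.sleep(500); // not ready yet, retry shortly
            }
        }
        throw new TimeoutException("Port " + port + " on " + host + " did not open in time");
    }
}

In the configuration above it could be invoked right after the containers are created, for example Ports.waitForPort(rabbitmq.getHostName(), rabbitmq.getPort(5672), 30000).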

But, but, but ... always remember the testing pyramid and strive for the right balance. Create as many test cases as you need to cover the most critical and important flows, but no more. Unit and integration tests should be your main weapon.

The complete project is available on Github.

Sunday, November 27, 2016

Keep your promises: contract-based testing for JAX-RS APIs

It's been a while since we talked about testing and applying effective TDD practices, particularly for REST(ful) web services and APIs. But this topic should never be forgotten, especially in a world where everyone is doing microservices, whatever that means, implies or takes.

To be fair, there are quite a lot of areas where microservice-based architecture shines and allows organizations to move and innovate much faster. But without proper discipline, it also makes our systems fragile, as they become very loosely coupled. In today's post we are going to talk about contract-based testing and consumer-driven contracts as practical and reliable techniques to ensure that our microservices fulfill their promises.

So, how does contract-based testing work? In a nutshell, it is a surprisingly simple technique, guided by the following steps:

  • the provider (let's say Service A) publishes its contract (or specification); the implementation may not even be available at this stage
  • the consumer (let's say Service B) follows this contract (or specification) to implement conversations with Service A
  • additionally, the consumer introduces a test suite to verify its expectations regarding Service A's contract fulfillment

In the case of SOAP web services and APIs, things are obvious as there is an explicit contract in the form of a WSDL file. But in the case of REST(ful) APIs, there are a lot of different options around the corner (WADL, RAML, Swagger, ...) and still no agreement on a single one. It may sound complicated, but please don't get upset, because Pact is coming to the rescue!

Pact is a family of frameworks supporting consumer-driven contract testing. There are many language bindings and implementations available, including the JVM ones, JVM Pact and Scala-Pact. To evolve such a polyglot ecosystem, Pact also includes a dedicated specification so as to provide interoperability between the different implementations.

Great, Pact is there, the stage is set and we are ready to take off with some real code snippets. Let us assume we are developing a REST(ful) web API for managing people, using the terrific Apache CXF and the JAX-RS 2.0 specification. To keep things simple, we are going to introduce only two endpoints:

  • POST /people/v1 to create a new person
  • GET /people/v1?email=<email> to find a person by email address

Essentially, we could not bother and just communicate these minimal pieces of our service contract to everyone, letting the consumers deal with it themselves (and indeed, Pact supports such a scenario). But surely we are not like that: we do care and would like to document our APIs comprehensively, and likely we are already familiar with Swagger. With that, here is our PeopleRestService.

@Api(value = "Manage people")
@Path("/people/v1")
@Consumes(MediaType.APPLICATION_JSON)
@Produces(MediaType.APPLICATION_JSON)
public class PeopleRestService {
    @GET
    @ApiOperation(value = "Find person by e-mail", 
        notes = "Find person by e-mail", response = Person.class)
    @ApiResponses({
        @ApiResponse(code = 404, 
            message = "Person with such e-mail doesn't exists", 
            response = GenericError.class)
    })
    public Response findPerson(
        @ApiParam(value = "E-Mail address to lookup for", required = true) 
        @QueryParam("email") final String email) {
        // implementation here
    }

    @POST
    @ApiOperation(value = "Create new person", 
        notes = "Create new person", response = Person.class)
    @ApiResponses({
        @ApiResponse(code = 201, 
            message = "Person created successfully", 
            response = Person.class),
        @ApiResponse(code = 409, 
            message = "Person with such e-mail already exists", 
            response = GenericError.class)
    })
    public Response addPerson(@Context UriInfo uriInfo, 
        @ApiParam(required = true) PersonUpdate person) {
        // implementation here
    }
}

The implementation details are not important at the moment; however, let us take a look at the GenericError, PersonUpdate and Person classes, as they are an integral part of our service contract.

@ApiModel(description = "Generic error representation")
public class GenericError {
    @ApiModelProperty(value = "Error message", required = true)
    private String message;
}

@ApiModel(description = "Person resource representation")
public class PersonUpdate {
    @ApiModelProperty(value = "Person's first name", required = true) 
    private String email;
    @ApiModelProperty(value = "Person's e-mail address", required = true) 
    private String firstName;
    @ApiModelProperty(value = "Person's last name", required = true) 
    private String lastName;
    @ApiModelProperty(value = "Person's age", required = true) 
    private int age;
}

@ApiModel(description = "Person resource representation")
public class Person extends PersonUpdate {
    @ApiModelProperty(value = "Person's identifier", required = true) 
    private String id;
}

Excellent! Once we have the Swagger annotations in place and the Apache CXF Swagger integration turned on, we can generate the swagger.json specification file, bring it to life in Swagger UI and distribute it to every partner or interested consumer.
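
The post does not show how that integration is wired; as a sketch (assuming Apache CXF 3.1+, which ships Swagger2Feature, and a hypothetical wrapper class), it is typically just a matter of registering the feature on the JAX-RS endpoint:

import java.util.Arrays;

import org.apache.cxf.endpoint.Server;
import org.apache.cxf.jaxrs.JAXRSServerFactoryBean;
import org.apache.cxf.jaxrs.swagger.Swagger2Feature;

public class SwaggerEnabledServer {
    public Server create(PeopleRestService peopleRestService) {
        // Swagger2Feature scans the annotated JAX-RS resources and exposes swagger.json
        final Swagger2Feature swagger = new Swagger2Feature();
        swagger.setPrettyPrint(true);

        final JAXRSServerFactoryBean factory = new JAXRSServerFactoryBean();
        factory.setServiceBean(peopleRestService);
        factory.setFeatures(Arrays.asList(swagger));
        factory.setAddress("/services");
        return factory.create();
    }
}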

It would be great if we could use this Swagger specification along with a Pact framework implementation to serve as the service contract. Thanks to Atlassian, we are certainly able to do that using swagger-request-validator, a library for validating HTTP requests/responses against a Swagger/OpenAPI specification, which nicely integrates with Pact JVM as well.

Cool, now let us switch sides from provider to consumer and try to figure out what we can do with such a Swagger specification in our hands. It turns out we can do a lot of things. For example, let us take a look at the POST action, which creates a new person. As a client (or consumer), we could express our expectations as follows: given a valid payload submitted along with the request, we expect HTTP status code 201 to be returned by the provider, and the response payload should contain the new person with an identifier assigned. In fact, translating this statement into Pact JVM assertions is pretty straightforward.

@Pact(provider = PROVIDER_ID, consumer = CONSUMER_ID)
public PactFragment addPerson(PactDslWithProvider builder) {
    return builder
        .uponReceiving("POST new person")
        .method("POST")
        .path("/services/people/v1")
        .body(
            new PactDslJsonBody()
                .stringType("email")
                .stringType("firstName")
                .stringType("lastName")
                .numberType("age")
        )
        .willRespondWith()
        .status(201)
        .matchHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON)
        .body(
            new PactDslJsonBody()
                .uuid("id")
                .stringType("email")
                .stringType("firstName")
                .stringType("lastName")
                .numberType("age")
        )
       .toFragment();
}

To trigger the contract verification process, we are going to use the awesome JUnit and the very popular REST Assured framework. But before that, let us clarify what PROVIDER_ID and CONSUMER_ID from the code snippet above are. As you may expect, PROVIDER_ID is the reference to the contract specification. For simplicity, we will fetch the Swagger specification from the running PeopleRestService endpoint; luckily, the Spring Boot testing improvements make this task a no-brainer.

@RunWith(SpringJUnit4ClassRunner.class)
@SpringBootTest(webEnvironment = WebEnvironment.RANDOM_PORT, 
    classes = PeopleRestConfiguration.class)
public class PeopleRestContractTest {
    private static final String PROVIDER_ID = "People Rest Service";
    private static final String CONSUMER_ID = "People Rest Service Consumer";

    private ValidatedPactProviderRule provider;
    @Value("${local.server.port}")
    private int port;

    @Rule
    public ValidatedPactProviderRule getValidatedPactProviderRule() {
        if (provider == null) {
            provider = new ValidatedPactProviderRule("http://localhost:" + port + 
                "/services/swagger.json", null, PROVIDER_ID, this);
        }

        return provider;
    }
}

The CONSUMER_ID is just a way to identify the consumer, not much to say about it. With that, we are ready to finish up our first test case:

@Test
@PactVerification(value = PROVIDER_ID, fragment = "addPerson")
public void testAddPerson() {
    given()
        .contentType(ContentType.JSON)
        .body(new PersonUpdate("tom@smith.com", "Tom", "Smith", 60))
        .post(provider.getConfig().url() + "/services/people/v1");
}

Awesome! As simple as that. Just please notice the presence of the @PactVerification annotation, where we are referencing the appropriate verification fragment by name; in this case it points to the addPerson method we introduced before.

Great, but ... what is the point? Glad you asked, because from now on any change in the contract which is not backward compatible will break our test case. For example, if the provider decides to remove the id property from the response payload, the test case will fail. Renaming the request payload properties? Big no-no; again, the test case will fail. Adding new path parameters? No luck, the test case won't let it pass. You may go even further than that and fail on every contract change, even a backward-compatible one (using swagger-validator.properties for fine-tuning).

validation.response=ERROR
validation.response.body.missing=ERROR

Not a very good idea, but still, if you need it, it is there. Similarly, let us add a couple more test cases for the GET endpoint, starting with the successful scenario, where the person we are looking for exists, for example:

@Pact(provider = PROVIDER_ID, consumer = CONSUMER_ID)
public PactFragment findPerson(PactDslWithProvider builder) {
    return builder
        .uponReceiving("GET find person")
        .method("GET")
        .path("/services/people/v1")
        .query("email=tom@smith.com")
        .willRespondWith()
        .status(200)
        .matchHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON)
        .body(
            new PactDslJsonBody()
                .uuid("id")
                .stringType("email")
                .stringType("firstName")
                .stringType("lastName")
                .numberType("age")
        )
        .toFragment();
}

@Test
@PactVerification(value = PROVIDER_ID, fragment = "findPerson")
public void testFindPerson() {
    given()
        .contentType(ContentType.JSON)
        .queryParam("email", "tom@smith.com")
        .get(provider.getConfig().url() + "/services/people/v1");
}

Please note that here we introduced query string verification using the query("email=tom@smith.com") assertion. Following the possible outcomes, let us also cover the unsuccessful scenario, where the person does not exist and we expect some error to be returned, along with a 404 status code, for example:

@Pact(provider = PROVIDER_ID, consumer = CONSUMER_ID)
public PactFragment findNonExistingPerson(PactDslWithProvider builder) {
    return builder
        .uponReceiving("GET find non-existing person")
        .method("GET")
        .path("/services/people/v1")
        .query("email=tom@smith.com")
        .willRespondWith()
        .status(404)
        .matchHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON)
        .body(new PactDslJsonBody().stringType("message"))
        .toFragment();
}

@Test
@PactVerification(value = PROVIDER_ID, fragment = "findNonExistingPerson")
public void testFindPersonWhichDoesNotExist() {
    given()
        .contentType(ContentType.JSON)
        .queryParam("email", "tom@smith.com")
        .get(provider.getConfig().url() + "/services/people/v1");
}

A really brilliant, maintainable, understandable and non-intrusive approach to address such complex and important problems as contract-based testing and consumer-driven contracts. Hopefully, this somewhat new testing technique will help you catch more issues during the development phase, way before they have a chance to leak into production.

Thanks to Swagger we were able to take a few shortcuts, but in case you don't have such a luxury, Pact has a quite rich specification which you are very welcome to learn and use. In any case, Pact JVM does a really great job of helping you write small and concise test cases.

The complete project sources are available on Github.

Monday, September 19, 2016

Stop being clueless, just instrument and measure: using metrics to gain insights about your JAX-RS APIs

How often do we, developers, build these shiny REST(ful) APIs (or microservices, joining the hype here) hoping they are going to just work in production? There is an enormous number of frameworks and toolkits out there which make us very productive during development; however, once things are deployed in production, most of them keep us clueless about what is going on.

Spring Boot is certainly an exception to this rule, and in today's post we are going to talk about using Spring Boot Actuator along with the terrific Dropwizard Metrics library to collect and expose metrics about Apache CXF-based JAX-RS APIs. To keep things even more interesting, we are going to feed the metrics into the amazing Prometheus collector and visualize them using beautiful Grafana dashboards.

With that, let us get started by defining a simple JAX-RS service to manage people, PeopleRestService. We are not going to plug it into any external storage or whatnot, but instead just cheat a bit by relying on the Spring Reactor project and introducing random delays while returning a predefined response.

@Path("/people")
@Component
public class PeopleRestService {
    private final Random random = new Random();
 
    @GET
    @Produces({MediaType.APPLICATION_JSON})
    public Collection<Person> getPeople() {
        return Flux
            .just(
                new Person("a@b.com", "John", "Smith"), 
                new Person("c@b.com", "Bob", "Bobinec")
            )
            .delayMillis(random.nextInt(1000))
            .toStream()
            .collect(Collectors.toList());
    }
}

Because we are going to use Spring Boot and its automagic discovery capabilities, the configuration is going to look rather trivial. We have already talked about using Spring Boot along with Apache CXF, but now even this part has been improved, thanks to the Apache CXF Spring Boot Starter being available (it becomes just a matter of adding one more dependency to your project, shown right after the configuration below).

@Configuration
@EnableAutoConfiguration
public class AppConfig {
    @Bean(destroyMethod = "destroy")
    public Server jaxRsServer(final Bus bus) {
        final JAXRSServerFactoryBean factory = new JAXRSServerFactoryBean();

        factory.setServiceBean(peopleRestService());
        factory.setProvider(new JacksonJsonProvider());
        factory.setBus(bus);
        factory.setAddress("/");

        return factory.create();
    }
 
    @Bean
    public PeopleRestService peopleRestService() {
        return new PeopleRestService();
    }
}
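For reference, the starter dependency mentioned above would be declared along these lines (the coordinates are the official ones of the Apache CXF Spring Boot Starter, while the version should simply match the Apache CXF release used by your project):

<dependency>
    <groupId>org.apache.cxf</groupId>
    <artifactId>cxf-spring-boot-starter-jaxrs</artifactId>
    <!-- use the version of the Apache CXF release your project is built against -->
    <version>${cxf.version}</version>
</dependency>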

At this point we did everything we needed to have a bare-bones Spring Boot application hosting an Apache CXF-based JAX-RS service. Please take note that by default, with the Apache CXF Spring Boot Starter, all APIs are served under the /services/* mapping.
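In case this prefix does not suit you, the starter also exposes a configuration property to change it (the /api value below is purely illustrative):

cxf:
  path: /api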

Till now we have said nothing about metrics, and that is what we are going to talk about next. Dropwizard Metrics is a de-facto standard for JVM applications and has a rich set of different kinds of metrics (meters, gauges, counters, histograms, ...) and reporters (console, JMX, HTTP, ...). The MetricRegistry is the central place to manage all the metrics. And surely, the typical way to expose metrics for a JVM-based application is JMX, so let us include the respective beans into the configuration.

@Bean(initMethod = "start", destroyMethod = "stop")
public JmxReporter jmxReporter() {
    return JmxReporter.forRegistry(metricRegistry()).build();
}
 
@Bean
public MetricRegistry metricRegistry() {
    return new MetricRegistry();
}
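Just to give a feeling of what manual instrumentation looks like, here is a minimal sketch (a variation of the PeopleRestService shown earlier, not part of the final example, with an arbitrarily chosen metric name) of timing the getPeople() method using the MetricRegistry directly:

@Path("/people")
@Component
public class PeopleRestService {
    @Autowired private MetricRegistry registry;

    @GET
    @Produces({MediaType.APPLICATION_JSON})
    public Collection<Person> getPeople() {
        // "services.people.list" is an arbitrary metric name picked for illustration
        final Timer.Context context = registry.timer("services.people.list").time();
        try {
            return Collections.singletonList(new Person("a@b.com", "John", "Smith"));
        } finally {
            context.stop();
        }
    }
}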

You are free to create as many metrics as you need, and we could have instrumented our PeopleRestService by hand along the lines of the sketch above. But luckily, Apache CXF has a dedicated MetricsFeature to integrate with Dropwizard Metrics and collect all the relevant metrics with zero effort. A minor update of the JAXRSServerFactoryBean initialization is enough.

@Bean(destroyMethod = "destroy")
public Server jaxRsServer(final Bus bus) {
    final JAXRSServerFactoryBean factory = new JAXRSServerFactoryBean();

    factory.setServiceBean(peopleRestService());
    factory.setProvider(new JacksonJsonProvider());
    factory.setBus(bus);
    factory.setAddress("/");
    factory.setFeatures(Collections.singletonList(
        new MetricsFeature(new CodahaleMetricsProvider(bus))
    ));
    factory.setProperties(
        Collections.singletonMap(
            "org.apache.cxf.management.service.counter.name", 
            "cxf-services."
        )
    );

    return factory.create();
}

Just a quick note about org.apache.cxf.management.service.counter.name. By default, Apache CXF is going to name metrics quite verbosely, including the unique bus identifier as part of the name. It is not very readable, so we just override the default behaviour using the static 'cxf-services.' prefix. This is how those metrics are going to look in the JMX console.

It looks terrific, but JMX is not a very pleasant piece of technology to deal with; could we do better? Here is where Spring Boot Actuator comes into play. Along with many other endpoints, it is able to expose all the metrics over the HTTP protocol by adding a couple of properties to the application.yml file:

endpoints:
  jmx:
    enabled: true
    unique-names: true

management:
  security:
    enabled: true 

It is important to mention here that metrics, along with other Spring Boot Actuator endpoints, may expose sensitive details about your application, so it is always a good idea to protect them, for example, using Spring Security and HTTP Basic Authentication. Again, a few configuration properties in application.yml will do all the work:

security:
  ignored:
    - /services/**
  user:
    name: guest
    password: guest

Brilliant! If we run our application and access the /metrics endpoint (providing guest/guest as credentials), we should see quite an extensive list of metrics, like these:

> curl -u guest:guest http://localhost:19090/metrics

{
    "classes": 8673,
    "classes.loaded": 8673,
    "classes.unloaded": 0,
    "counter.status.200.metrics": 5,
    "counter.status.200.services.people": 1,
    "counter.status.401.error": 2,
    "cxf-services.Attribute=Checked Application Faults.count": 0,
    "cxf-services.Attribute=Checked Application Faults.fifteenMinuteRate": 0.0,
    "cxf-services.Attribute=Checked Application Faults.fiveMinuteRate": 0.0,
    "cxf-services.Attribute=Checked Application Faults.meanRate": 0.0,
    ...
}

It would be great to have a dedicated monitoring solution which could understand these metrics, store them somewhere and give us useful insights and aggregations in real time. Prometheus is exactly the tool we are looking for, but there is bad news and good news. On the not-so-good side, Prometheus does not understand the format which Spring Boot Actuator uses to expose metrics. But on the bright side, Prometheus has a dedicated Spring Boot integration, so the same metrics can be exposed in a Prometheus-compatible format; we are just a few beans (and a couple of dependencies, listed after the configuration below) away from that.

@Configuration
public class PrometheusConfig {
    @Bean
    public CollectorRegistry collectorRegistry() {
        return new CollectorRegistry();
    }

    @Bean
    public SpringBootMetricsCollector metricsCollector(
            final Collection<PublicMetrics> metrics, final CollectorRegistry registry) {
        return new SpringBootMetricsCollector(metrics).register(registry);
    }

    @Bean
    public ServletRegistrationBean exporterServlet(final CollectorRegistry registry) {
        return new ServletRegistrationBean(new MetricsServlet(registry), "/prometheus");
    }
}
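For completeness, the SpringBootMetricsCollector and MetricsServlet used above come from the Prometheus Java client modules; assuming Maven, the dependencies would look something like this (versions are left up to your setup):

<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_spring_boot</artifactId>
    <version>${prometheus.client.version}</version>
</dependency>
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_servlet</artifactId>
    <version>${prometheus.client.version}</version>
</dependency>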

With this configuration in place, the metrics are additionally going to be exposed under the /prometheus endpoint; let us check this out.

> curl -u guest:guest http://localhost:19090/prometheus

# HELP cxf_services_Attribute_Data_Read_fifteenMinuteRate cxf_services_Attribute_Data_Read_fifteenMinuteRate
# TYPE cxf_services_Attribute_Data_Read_fifteenMinuteRate gauge
cxf_services_Attribute_Data_Read_fifteenMinuteRate 0.0
# HELP cxf_services_Attribute_Runtime_Faults_count cxf_services_Attribute_Runtime_Faults_count
# TYPE cxf_services_Attribute_Runtime_Faults_count gauge
cxf_services_Attribute_Runtime_Faults_count 0.0
# HELP cxf_services_Attribute_Totals_snapshot_stdDev cxf_services_Attribute_Totals_snapshot_stdDev
# TYPE cxf_services_Attribute_Totals_snapshot_stdDev gauge
cxf_services_Attribute_Totals_snapshot_stdDev 0.0
...

All the necessary pieces are covered and the fun is about to begin. Prometheus has very simple and straightforward installation steps, but Docker is certainly the easiest option. The project repository includes a docker-compose.yml file in the docker folder to get you started quickly. But before that, let us build the Docker image of our Spring Boot application using Apache Maven:

> mvn clean install

Upon a successful build, we are ready to use the Docker Compose tool to start all the containers and wire them together, for example:

> cd docker
> docker-compose up

Recreating docker_cxf_1
Recreating docker_prometheus_1
Recreating docker_grafana_1
Attaching to docker_cxf_1, docker_prometheus_1, docker_grafana_1
...

If you are using native Docker packages, just open your browser at http://localhost:9090/targets, where you can see that Prometheus has successfully connected to our application and is consuming its metrics (for older Docker installations, please use the address of your Docker Machine).

The cxf target comes preconfigured from the Prometheus configuration file, located at docker/prometheus.yml and used to build the respective container in docker-compose.yml (please notice the presence of the credentials to access the /prometheus endpoint):

# my global config
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # By default, evaluate rules every 15 seconds.

scrape_configs:
  - job_name: 'cxf'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    basic_auth:
      username: guest
      password: guest

    metrics_path: '/prometheus'

    # Default scheme is http
    static_configs:
      - targets: ['cxf:19090']

Prometheus supports graph visualizations, but Grafana is the unquestionable leader in crafting beautiful dashboards. It needs a bit of configuration though, which could be done over the web UI or, even better, through the API. The data source is the most important piece of it and, in our case, should point to the running Prometheus instance.

> curl 'http://admin:admin@localhost:3000/api/datasources' -X POST -H 'Content-Type: application/json;charset=UTF-8' --data-binary '{"name": "prometheus", "type": "prometheus", "url":"http://prometheus:9090", "access":"proxy", "isDefault":true}'

Done! Adding a sample dashboard is the next thing to do and again, the API is the best way to accomplish that (assuming you are still in the docker folder):

> curl 'http://admin:admin@localhost:3000/api/dashboards/db' -X POST -H 'Content-Type: application/json;charset=UTF-8' --data-binary @cxf-dashboard.json

The same rule applies here: if you are still using Docker Machine, please replace localhost with the appropriate virtual machine address. Also, please notice that you have to do this only once, when the containers are created for the first time. The configuration will be kept for existing containers.

To finish up, let us open our custom Grafana dashboard by navigating to http://localhost:3000/dashboard/db/cxf-services, using admin/admin as the default credentials. Surely, you are going to see no data at first, but by generating some load (for example, using siege), we could get interesting graphs to analyze.
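For instance, a quick way to generate such load (assuming siege is installed, and reusing the port and path from the examples above) could be:

> siege -c 5 -t 1M http://localhost:19090/services/people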

Those graphs were made simple (and not very useful, to be honest) on purpose, just to demonstrate how easy it is to collect and visualize metrics from your Apache CXF-based JAX-RS APIs in real time. There are so many useful metrics our applications could expose that no shortage of ideas is expected here. Plus, Grafana allows you to define quite sophisticated graphs and queries, worthy of another article, but the official documentation is a good place to start.

Hopefully this post will encourage everyone to think seriously about monitoring their JAX-RS APIs by exposing, collecting and visualizing important metrics. This is just the beginning ...

The complete project sources are available on Github.

Saturday, July 23, 2016

When things may get out of control: circuit breakers in practice. Apache Zest and Akka.

In the previous post we started the discussion about circuit breakers and why this pattern has gained so much importance these days. We learned about Netflix Hystrix, the most advanced circuit breaker implementation for the JVM platform, and its typical integration scenarios. In this post we are going to continue exploring the other options available, starting with the Apache Zest library.

Surprisingly, Apache Zest, while certainly a gem, is neither well-known nor widely used. It is a framework for domain-centric application development which aims to explore the composite-oriented programming paradigm. Its roots go back to 2007, when it was born under another name, Qi4j (it became Apache Zest in 2015). It would require a complete book just to go through Apache Zest's features and concepts, but what we are interested in is the fact that Apache Zest has a simple circuit breaker implementation.

Let us use the same example of consuming the https://freegeoip.net/ REST(ful) web API and wrap the communication with this external service using the CircuitBreaker from Apache Zest:

public class GeoIpService {
    private static final String URL = "http://freegeoip.net/";
    private final ObjectMapper mapper = new ObjectMapper();
    private final CircuitBreaker breaker = new CircuitBreaker(5, 1000 * 120);
 
    public GeoIpDetails getDetails(final String host) {
        try {
            if (breaker.isOn()) {
                GeoIpDetails details = mapper.readValue(get(host), GeoIpDetails.class);
                breaker.success();
                return details;
            } else {
                // Fallback to empty response
                return new GeoIpDetails();
            }
        } catch (final IOException ex) {
            breaker.throwable(ex);
            throw new RuntimeException("Communication with '" + URL + "' failed", ex);
        } catch (final URISyntaxException ex) {
            // Should never happen, but just trip circuit breaker immediately
            breaker.trip(); 
            throw new RuntimeException("Invalid service endpoint: " + URL, ex);
        }
    }

    private String get(final String host) throws IOException, URISyntaxException {
        return Request
            .Get(new URIBuilder(URL).setPath("/json/" + host).build())
            .connectTimeout(1000)
            .socketTimeout(3000)
            .execute()
            .returnContent()
            .asString();
    }
}

Essentially, this is as basic a CircuitBreaker usage as it could possibly get. We configured it to have a threshold of 5 failures (which in our case means failing requests) and a sleeping window of 2 minutes (120 * 1000 milliseconds). It becomes the responsibility of the application developer to report successes and failures using the success() and throwable(...) methods respectively, with the option to open the circuit breaker immediately using the trip() method call. Please take note that the CircuitBreaker relies on Java synchronization mechanisms and is thread-safe.

Interestingly, the CircuitBreaker from Apache Zest uses slightly different conventions: instead of operating on closed / open states, it treats them as on / off ones, which are arguably more familiar to most of us. And to finish up, basic JMX instrumentation is also available out of the box.

It requires a couple of lines to be added into the GeoIpService initialization (the constructor, for example) to register and expose the managed beans:

public GeoIpService() throws Exception {
    final ObjectName name = new ObjectName("circuit-breakers", 
        "zest-circuit-breaker", "freegeoip.net");

    ManagementFactory
        .getPlatformMBeanServer()
        .registerMBean(new CircuitBreakerJMX(breaker, name), name);
}

Please do not hesitate to glance through the official Apache Zest Circuit Breaker documentation; there are quite a few use cases you may find useful for your projects. The complete example is available on Github.

In case you are developing on the JVM using the Scala programming language, you are certainly a lucky one, as there is a native circuit breaker implementation available as part of the Akka toolkit. For example, let us redesign our Geo IP service consumer as a typical Akka actor which is going to make an HTTP call to https://freegeoip.net/:

case class GeoIp(host: String)
case class GeoIpDetails(ip: String = "", countryCode: String = "", 
  countryName: String = "", latitude: Double = 0, longitude: Double = 0)

class GeoIpActor extends Actor with ActorLogging {
  import spray.json._
  import spray.json.DefaultJsonProtocol._
  
  implicit val materializer: ActorMaterializer = ActorMaterializer()
  
  import context.dispatcher
  import context.system
  
  val breaker = new CircuitBreaker(
    context.system.scheduler,
    maxFailures = 5,
    callTimeout = 15 seconds,
    resetTimeout = 2 minutes)
  
  def receive = {
    case GeoIp(host) => breaker
      .withCircuitBreaker(Http()
        .singleRequest(HttpRequest(uri = s"http://freegeoip.net/json/$host"))
        .flatMap {
          case HttpResponse(OK, _, entity, _) => Unmarshal(entity).to[GeoIpDetails]
          case _ => Future.successful(GeoIpDetails())
        }
      ) pipeTo sender()
  }
}

At this moment, the pattern undoubtedly looks familiar to all of us. The only new option which Akka's CircuitBreaker brings to the table is the overall call timeout: the execution will be considered failed when it is not completed within this time period (certainly a very handy addition to the circuit breaker capabilities). The withCircuitBreaker function takes care of managing the circuit breaker state around the wrapped block of code.

The interactions with the GeoIpActor are no different from those with any other Akka actor:

implicit val system = ActorSystem("circuit-breakers")
implicit val timeout: Timeout = 5 seconds

val geoIp = system.actorOf(Props[GeoIpActor], "geo-ip-actor")
val result = (geoIp ? GeoIp("8.8.8.8")).mapTo[GeoIpDetails]
  
result andThen { 
  case Success(details) => log.info("GEO IP: {}", details)
  case Failure(ex) => log.error("Communication error", ex)
} 

By looking a bit deeper into the CircuitBreaker documentation, we can get some insights into its internals. There are actually three states the CircuitBreaker could be in: closed, open and half-open. The half-open state exists to allow just a single attempt to find out whether the invocation is back to normal operations or not.

The code snippet looks perfect, but one thing to keep in mind is how the Actor Model and the absence of shared mutable state affect circuit breaker state synchronization. To facilitate that, Akka's CircuitBreaker has a rich set of callback notifications (like onOpen, onHalfOpen and onClose) so state changes can be broadcast between actors, as sketched below.
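Just to illustrate the idea (this wiring is a sketch rather than part of the sample project), the callbacks can be chained right where the breaker is constructed, for example to log every state transition:

val breaker = new CircuitBreaker(
    context.system.scheduler,
    maxFailures = 5,
    callTimeout = 15 seconds,
    resetTimeout = 2 minutes)
  .onOpen(log.warning("Circuit breaker is open, failing fast for the next 2 minutes"))
  .onHalfOpen(log.info("Circuit breaker is half-open, the next call is a probe"))
  .onClose(log.info("Circuit breaker is closed, back to normal operations"))

The complete project sources are available on Github.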

For the curious readers, just a few closing notes about the adoption of the circuit breaker implementations. Netflix Hystrix is the number one choice at the moment, particularly (but not only) because of superior support from the Spring community. Obviously, Akka's CircuitBreaker is a natural choice for Scala developers who build their applications on top of the excellent Akka toolkit. Concerning the Apache Zest Circuit Breaker, it could be used as-is (if you want to fully control the behavior) or be easily integrated as a useful extension into existing general-purpose clients. For example, Apache CXF allows you to configure JAX-RS / JAX-WS clients with a failover feature, including a circuit breaker-based implementation: CircuitBreakerFailoverFeature.

Hopefully this series of posts has extended your awareness of the circuit breaker pattern and the state of its available implementations on the JVM platform a little bit. The repository with the complete project samples is available on Github.

Stay resilient!