Cloud-native microservices come with a new set of opportunities and challenges. Many of the challenges are ignored until it is too late, which not only forfeits benefits that would otherwise be gained, but can also cause major trouble when the right level of manageability and governance is not in place.
We know that microservices in general are becoming more and more popular. In recent studies, we have learned that the majority of enterprise organizations are already gaining value from using microservices architectures. We have also learned that, at an enterprise level, the same challenges are consistently raised with microservices, especially when there are many of them.
Interestingly, the challenges that we observe with microservices in enterprise organizations are not about developing those microservices, but about governing them properly.
In this blog, I am going to cover the top considerations to properly manage and govern microservices in production.
In particular, I am going to cover top considerations across the following aspects:
- Discovery and Cataloging
- Communication and Service Connectivity
- Traffic control (rate limiting, retries, circuit breakers, canary deployments, etc.)
- Security (north/south, east/west)
- API governance (policy enforcement and API quality assurance)
- API Telemetry (metrics/logs/traceability)
Discovery and Cataloging
Perhaps the #1 problem that most enterprise organizations face when it comes to microservices is not knowing what they have. Creating new business logic with polyglot microservices is so simple that, in most cases, it is easier to build new logic than to follow a process that identifies whether that logic has already been developed, even though reusing it would be less costly over time than rebuilding it.
The main reason this occurs is that it can be very hard to discover and catalog all enterprise APIs from all existing microservices, and to see from a single place what those APIs are and how to subscribe to them if needed.
Discovering and cataloging APIs is hard enough as a single exercise, and it is not a one-off task. To be effective, it has to be a process tightly aligned with the full CI/CD automation used to build and evolve microservices, so that as new functionality gets designed and developed, the API catalog is refreshed as a natural consequence of feature branches being merged into main branches.
MuleSoft Universal API Management provides a mechanism for discovering and cataloging APIs through an automated toolkit. The toolkit can be scripted as part of a CI/CD pipeline and incorporated into ongoing microservices development, and it can also support the arduous one-off effort of cataloging all existing APIs in an enterprise organization. For more information, refer to the official statement from MuleSoft: see here.
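To make the idea of a catalog that refreshes on every merge more concrete, here is a minimal sketch. It assumes a hypothetical in-memory `ApiCatalog` and a hypothetical `sync_on_merge` step called by the pipeline; neither is a MuleSoft API, and a real pipeline would call whatever cataloging tool the organization uses.

```python
from dataclasses import dataclass, field


@dataclass
class ApiCatalog:
    """Hypothetical in-memory stand-in for an enterprise API catalog."""
    entries: dict = field(default_factory=dict)

    def upsert(self, spec: dict) -> str:
        """Add or refresh one API, keyed by its title and version."""
        info = spec["info"]
        key = (info["title"], info["version"])
        status = "updated" if key in self.entries else "created"
        self.entries[key] = spec
        return status


def sync_on_merge(catalog: ApiCatalog, specs: list) -> dict:
    """Assumed pipeline step: run after a feature branch merges into main."""
    return {spec["info"]["title"]: catalog.upsert(spec) for spec in specs}


catalog = ApiCatalog()
orders_v1 = {"info": {"title": "orders", "version": "1.0.0"}, "paths": {"/orders": {}}}
print(sync_on_merge(catalog, [orders_v1]))  # first merge registers the API
print(sync_on_merge(catalog, [orders_v1]))  # later merges refresh the same entry
```

Keying entries by title and version is only one possible choice; the point is that the catalog update is an idempotent pipeline step rather than a manual task.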
Communication and Service Connectivity
For decades, different architectural patterns have emerged and evolved to help map business domains into reusable assets that can help enterprise organizations to accelerate the speed at which they can build new products and better services for their own customers.
Depending on the methodology or architecture pattern, there will be a way in which we can break down business domains, subdomains and their boundaries into the right granular level at which we can promote the best reusability and agility. The result is the conceptual image of microservices that will represent each of the identified business domains.
That is without question a very effective way to manage and understand business complexity. However, these mapped domains and subdomains will be of no use unless we can easily and quickly make them talk to each other, including in ways that were not in the original scope but that represent new products and services the business envisions as it continuously changes direction.
Orchestration and Choreography
Today, there are new ways in which we can simplify the orchestration of services. For example, MuleSoft DataGraph uses GraphQL behind the scenes to consolidate the consumption of multiple upstream APIs into a single request. This reduces the overhead of connecting and maintaining APIs for new business domain development, not to mention the efficiencies gained from a network perspective.
On the other hand, we are seeing more examples of business interactions moving from purely synchronous orchestrations to asynchronous event streaming interactions.
Interestingly, nowadays I am observing that such event streaming interactions are no longer driven only by non-functional requirements, as we used to see in the past. That is, we adopt them not just because we need high scalability or message reliability, but also, and more importantly, because we are building business pipes that move important messages from event sources to event targets, which happen to be business domains themselves. If you want more information on how to design and develop modern event-driven architectures, have a look at a previous blog that I wrote about it using MuleSoft and Solace technologies: see here.
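As a rough illustration of an event source publishing to event targets that are business domains, here is a toy in-process sketch. The `EventBus` class and the topic name are hypothetical; a real deployment would use an event broker rather than an in-memory dictionary.

```python
from collections import defaultdict


class EventBus:
    """Toy in-process broker, used only to illustrate the interaction style."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register an event target (a callable) for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        """Deliver the event to every subscriber and return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = EventBus()
shipping_inbox = []  # stands in for the shipping domain
bus.subscribe("orders.created", shipping_inbox.append)
bus.publish("orders.created", {"order_id": 42})  # the orders domain is the event source
```

The publisher never references the shipping domain directly, which is what lets new event targets be added later without changing the source.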
Service Connectivity
The aspect of Service Connectivity is paramount in microservices. This is where we can truly enable business agility, as services need to talk to each other in order to build the actual business outcomes. The question is, how can we enable such interactions in a way that can remain flexible?
Over the decades, we have learned in software development that it is a terrible idea to hardcode interactions among services, so we externalized those interactions using a combination of system and environment properties. Platforms like Kubernetes further help us define microservice interactions through configuration settings. However, this is not enough: we can also control service interactions dynamically and at a fine-grained level through routing policies.
Fine-grained traffic control among microservices allows us to modify virtual service routes dynamically, based on destination rules, in order to achieve behaviors such as:
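As a small sketch of externalizing a service interaction instead of hardcoding it, the helper below resolves a downstream URL from the environment. The `<NAME>_SERVICE_URL` naming convention and the `service_url` function are assumptions made for illustration.

```python
import os


def service_url(name: str, default: str) -> str:
    """Resolve a downstream service URL from an environment property.

    The <NAME>_SERVICE_URL convention is an assumption, not a standard.
    """
    return os.environ.get(f"{name.upper()}_SERVICE_URL", default)


# Falls back to the default unless INVENTORY_SERVICE_URL is set for this process.
inventory_url = service_url("inventory", "http://localhost:8080")
```

On a platform such as Kubernetes, the variable would typically be supplied by the deployment configuration, so the same build can run unchanged in every environment.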
- Canary deployments
- Blue-Green deployments
- Rolling upgrades
- Retries
- Timeouts
- Etc.
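To illustrate one of these behaviors, here is a minimal retry sketch with exponential backoff. It is written as application code purely for illustration; `call_with_retries` and its parameters are hypothetical, and a routing policy would normally apply the same behavior outside the service.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError as error:  # retry only failures assumed transient
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # exponential backoff between tries
    raise last_error
```

The `sleep` parameter is injectable only so the backoff can be observed in a test without actually waiting.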
One of the most notable strategies to make this possible is a service mesh, which uses container sidecars (the data plane) to intercept all inbound and outbound traffic from microservices and inject new routing behaviors, e.g. based on traffic split or target versions. The data plane is managed by a control plane. Istio is one of the most common implementations, relying on Envoy as the sidecar proxy.
MuleSoft Anypoint Service Mesh builds on top of Istio, using custom resource definitions to help discover microservices running as part of a mesh and push them to Exchange. Once the microservices are cataloged, policies can be enforced via Anypoint Platform.
Although Istio provides a very rich set of capabilities and controls, it also increases maintenance costs, especially if the customer has not already adopted Istio as part of their microservices strategy.
However, there is no need to adopt Istio or another service mesh implementation in order to gain the benefits that I explained earlier. As part of MuleSoft Universal API Management, the goal is for MuleSoft Anypoint Flex Gateway to provide fine-grained traffic control capabilities without requiring Istio. If you want to learn more about MuleSoft Anypoint Flex Gateway, refer to this site: see here.
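A traffic split of the kind used for canary deployments can be sketched as a weighted random choice. This is not how Envoy or Istio implement it internally; `pick_version` and the 90/10 weights are assumptions used only to show the idea.

```python
import random


def pick_version(weights, rng=random):
    """Pick a target version with probability proportional to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


# Hypothetical canary: about 90% of requests go to v1 and about 10% to v2.
canary_split = {"v1": 90, "v2": 10}
target = pick_version(canary_split)
```

Promoting the canary then becomes a matter of changing the weights, with no change to the services themselves.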
Security
Especially with microservices, “security” should not be a developer’s responsibility. Developers should be able to apply their skills to writing business logic without worrying about security at all. This is not because they can’t (they surely can), but because dev and test cycles are expensive, and so are maintenance tasks, while “security” is a never-ending practice that keeps evolving over time.
We want to apply consistent security and governance changes in production safely, without opening expensive dev/test cycles and without risking divergent approaches to implementing security measures. Even worse is the risk of failing to apply an important security policy to one microservice and compromising confidential information, which can seriously damage an entire business regardless of its size, tenure or history.
In terms of security with microservices, there are two typical ways to secure them:
- North/South: This is probably the most common and well-known security aspect of traditional API Management. It involves measures such as network protection, WAF, firewalls, white-listing, masking, tokenization, etc., to provide the right level of Authentication and Authorisation to API clients.
For this, we normally deploy a group of API Gateway instances or Ingress Controllers that, via an API Management control plane, enforce the right level of policy on the upstream APIs.
- East/West: This type of security is getting more and more common, especially with the growing adoption of container orchestrators, like Kubernetes. This type of security assumes a zero trust model, in which microservices inter-communication (even inside a private cluster) needs to comply with certain regulatory requirements, such as enforcing policies for:
- Mutual TLS encryption and certificate management
- Gateway/Virtual Service based white-listing
- Rate limiting
- Circuit breakers
- Fine-grained policies to allow/deny traffic, e.g. by path, role, namespace, token, etc.
- Protection against SQL / XML / LDAP injection
- Network policies. For example, allowing/blocking certain TCP protocols or network ports.
- Hardened container images (e.g. distroless images).
Once again, to provide this level of security you may need to operate within the boundaries of a cluster, and you can apply these types of security controls using Istio, provided you are already an Istio user. Otherwise, to avoid the added cost and complexity of ramping up on a new technology, MuleSoft Anypoint Flex Gateway provides fine-grained security policy enforcement without requiring Istio. If you want to learn more about MuleSoft Anypoint Flex Gateway, refer to this site: see here.
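Of the policies listed above, the circuit breaker is perhaps the least self-explanatory, so here is a minimal sketch of the pattern. The `CircuitBreaker` class, its thresholds and the injectable clock are assumptions for illustration; in practice a gateway or mesh policy would enforce this outside the service code.

```python
import time


class CircuitBreaker:
    """Reject calls for a cool-down period after repeated failures."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call rejected")
            self.opened_at = None  # half-open: allow one trial call through
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0  # a success closes the circuit again
        return result
```

While the circuit is open, callers fail fast instead of piling more requests onto a service that is already struggling.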
API Governance
When it comes to microservices, it is common to find a polyglot and multi-platform approach. For example, microservices are built across a variety of technologies, such as Spring Boot, Python, Node.js, Kotlin, Ruby, AWS Lambdas, Azure Functions, etc. We also know that, depending on the technology being used, there will be a preferred method to provide API Management. For example, AWS Lambdas tend to be fronted by AWS gateways.
From an API Governance perspective, this plethora of APIs makes it very hard to see from a single place the full picture of compliance with regulatory policies, standards and quality assurance.
I find the position of MuleSoft API Governance extremely innovative, as it provides a way to define API standards, API quality assurance and API policy enforcement across any API, regardless of its type or location.
That is, it provides a consolidated view of API standards and of the policies being enforced on APIs fronted by 3rd party gateways, without requiring any extra gateway layer. It also offers a toolkit to script and automate API Governance validation as part of the microservices lifecycle, for example as a decisive step in a CI/CD pipeline that must meet certain compliance thresholds or validation rules before a deployment can progress into a UAT or Production environment. If you want to have a look at MuleSoft API Governance, follow this link: here.
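A governance gate in a pipeline can be sketched as a set of rules evaluated against each API specification, with a threshold that decides whether the deployment may progress. The rules, the `check_spec` and `gate` functions and the threshold below are hypothetical examples, not MuleSoft API Governance rulesets.

```python
def check_spec(spec):
    """Return the rule violations found in one OpenAPI-style document.

    Assumes every key under a path is an operation (get, post, ...).
    """
    violations = []
    if not spec.get("info", {}).get("version"):
        violations.append("info.version is missing")
    for path, operations in spec.get("paths", {}).items():
        for method, operation in operations.items():
            if not operation.get("description"):
                violations.append(f"{method.upper()} {path} has no description")
            if not operation.get("security"):
                violations.append(f"{method.upper()} {path} declares no security")
    return violations


def gate(specs, max_violations=0):
    """Return True when the deployment may progress to the next environment."""
    total = sum(len(check_spec(spec)) for spec in specs)
    return total <= max_violations
```

A pipeline step could exit with a non-zero status when `gate` returns False, which blocks promotion until the violations are fixed.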
API Telemetry
As with API Governance, enterprise microservices involve a plethora of APIs running on different technologies and across multiple platforms. Some organizations may have already invested in 3rd party platforms that give them insight into their APIs.
At the bare minimum, organizations need to ensure that the following levels of API observability are met:
- API metrics
- API log aggregation
- API traceability
The problem is that the majority of enterprises don’t have a consistent approach to obtaining full observability across their most critical APIs, especially when those APIs run across multiple platforms.
Not knowing how APIs are performing or interacting with each other tends to lead to unexpected behaviors in production, with the risk of poor customer satisfaction.
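The three observability levels can be sketched with a small decorator that counts outcomes (metrics), writes one structured log line per call (log aggregation) and carries a trace identifier (traceability). The `observed` decorator, the in-memory `metrics` counter and the log format are assumptions; a real setup would export to whatever telemetry platform is in place.

```python
import functools
import logging
import time
import uuid
from collections import Counter

log = logging.getLogger("api")
metrics = Counter()  # hypothetical in-memory metrics store


def observed(name):
    """Wrap an API handler with basic metrics, logging and a trace id."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(*args, trace_id=None, **kwargs):
            trace_id = trace_id or uuid.uuid4().hex  # reuse the caller's id if given
            started = time.perf_counter()
            outcome = "error"
            try:
                result = handler(*args, **kwargs)
                outcome = "ok"
                return result
            finally:
                elapsed_ms = (time.perf_counter() - started) * 1000
                metrics[f"{name}.{outcome}"] += 1
                log.info("api=%s trace_id=%s outcome=%s elapsed_ms=%.1f",
                         name, trace_id, outcome, elapsed_ms)
        return wrapper
    return decorator


@observed("orders.get")
def get_order(order_id):
    return {"order_id": order_id}
```

Passing the same `trace_id` from one service to the next is what allows a single request to be followed across several APIs.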
As part of MuleSoft Universal API Management, Anypoint Flex Gateway provides two modes of operation:
- Local/Disconnected mode: In this mode, it is possible to leverage 3rd party platforms to provide the full telemetry (API metrics, log aggregation and API traceability).
- Connected mode: In this mode, all telemetry is pushed into Anypoint Platform, bringing the same level of enterprise maturity that for many years MuleSoft has provided to Mule APIs, now to any API regardless of its underlying technology or running platform.
The last piece of the puzzle concerns all those APIs that are already governed by 3rd party gateways, e.g. AWS gateways fronting a number of AWS Lambdas. The question is: how can we obtain a view of all API telemetry in a single dashboard, regardless of the nature of each API? Luckily, once again, MuleSoft Universal API Management provides this capability by leveraging its agnostic control plane to pull API telemetry from all APIs, that is, Mule APIs, non-Mule APIs fronted by a MuleSoft Anypoint Flex Gateway, and any other non-Mule APIs fronted by 3rd party API Gateways.
How cool is that?
I hope you found this blog useful. If you have any questions or comments, feel free to contact me directly at https://www.linkedin.com/in/citurria/
Thanks for your time.