Documentation

There are three levels of configuration. Each level is distinct and isolated, refers to the level before it via a unique naming scheme, and is typically owned by a different role within an organization:

1. Artifacts: defined by standard library or plugin developers per distinct reliability artifact.

2. Blueprints: reusable templates defined by observability experts that build upon artifacts to codify common reliability patterns.

3. Expectations: defined by service owners per service, referencing blueprints.

Figure: An example of how the file types interact across a typical engineering organization.

Thus, for the SLO artifact, we might see something like this:

Figure: The same diagram as above, made up of examples for the SLO reliability artifact.

Artifacts are released as part of the standard library. Within an organization, observability experts create common abstractions (blueprints), working alongside service owners to establish expectations for their services.
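The reference chain described above (expectation to blueprint to artifact) can be sketched in a few lines of Python. This is only an illustration, not how the Caffeine compiler is implemented: the `resolve_chain` helper and the in-memory dictionaries are hypothetical, and it assumes that a `blueprint_ref` or `artifact_ref` matches the `name` field of the item it points to. The names are borrowed from the examples later in this document.

```python
# Hypothetical in-memory stand-ins for artifacts.json and blueprints.json,
# keyed by name. Assumption: a *_ref value equals the target's "name" field.
ARTIFACTS = {"SLO": {"name": "SLO", "version": "0.0.1"}}
BLUEPRINTS = {"LCP_Latency": {"name": "LCP_Latency", "artifact_ref": "SLO"}}


def resolve_chain(expectation, blueprints, artifacts):
    """Follow expectation -> blueprint -> artifact; raise KeyError on a dangling reference."""
    blueprint = blueprints[expectation["blueprint_ref"]]
    artifact = artifacts[blueprint["artifact_ref"]]
    return blueprint, artifact


expectation = {
    "name": "Admin Portal Home Page LCP Latency",
    "blueprint_ref": "LCP_Latency",
}
blueprint, artifact = resolve_chain(expectation, BLUEPRINTS, ARTIFACTS)
print(expectation["name"], "->", blueprint["name"], "->", artifact["name"])
```

Each expectation resolves to exactly one blueprint, and each blueprint to exactly one artifact, which keeps the lookup a simple two-step walk.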

The documentation is split by persona, with enough detail for each persona to operate within their domain.


Standard Library Developers

Artifacts Explanation

An artifact surfaces the interface of the standard library or of a plugin: it defines the expected values on which the logic operates. For example, an SLO artifact would require thresholds and queries, while a systems modelling artifact might require dependency relationships.

Artifacts are defined in a .json file. Each artifact has the following required fields.

Attribute | Expected Type | Description
name | String | Unique within a single artifacts.json file.
version | String | Semantic version of the artifact (e.g., "0.0.1").
base_params | Dict(String, AcceptedTypes) | Parameters that flow through blueprints to service owners. Blueprints may leave these for service owners to specify, or provide defaults in blueprint inputs.
params | Dict(String, AcceptedTypes) | Parameters that must be provided by blueprints using this artifact. These cannot be deferred to service owners.

So, an example would look like:

{
  "artifacts": [
    {
      "name": "SLO",
      "version": "0.0.1",
      "base_params": {
        "threshold": "Float",
        "window_in_days": "Integer"
      },
      "params": {
        "queries": "Dict(String, String)",
        "value": "String",
        "vendor": "String"
      }
    }
  ]
}
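As a rough illustration of the required fields listed above, the following Python sketch checks a parsed artifacts document for missing fields and duplicate names. The `check_artifacts` function and its message format are hypothetical and are not part of Caffeine; the sample input is deliberately broken to show both checks.

```python
import json

# Required fields per artifact, taken from the table above.
REQUIRED_ARTIFACT_FIELDS = ("name", "version", "base_params", "params")


def check_artifacts(document):
    """Return human-readable problems found in a parsed artifacts document."""
    problems = []
    seen_names = set()
    for index, artifact in enumerate(document.get("artifacts", [])):
        for field in REQUIRED_ARTIFACT_FIELDS:
            if field not in artifact:
                problems.append(f"artifact #{index}: missing required field '{field}'")
        name = artifact.get("name")
        # Names must be unique within a single artifacts.json file.
        if name is not None and name in seen_names:
            problems.append(f"artifact #{index}: duplicate name '{name}'")
        seen_names.add(name)
    return problems


# Deliberately invalid sample: the first entry lacks "params", the second reuses a name.
raw = """
{
  "artifacts": [
    {"name": "SLO", "version": "0.0.1", "base_params": {"threshold": "Float"}},
    {"name": "SLO", "version": "0.0.2", "base_params": {}, "params": {}}
  ]
}
"""
for problem in check_artifacts(json.loads(raw)):
    print(problem)
```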

Documentation and Distribution

Artifacts are distributed with the core Caffeine binary. Documentation is available on the Standard Library Reference page. Organizations can extend Caffeine with custom artifacts via plugins.


Observability Experts

Blueprints Explained

Blueprints represent common abstractions over reliability artifacts, typically used to codify well-known, common patterns either completely or partially (i.e., as templates). Each blueprint takes the following form:

Attribute | Expected Type | Description
name | String | Unique within a single blueprints.json file.
artifact_ref | String | An identifier referencing an artifact.
params | Dict(String, AcceptedTypes) | A collection of input names mapped to the types of the values that service expectations invoking this blueprint must define.
inputs | Dict(String, Any) | A collection of input names mapped to input values satisfying the params required by the referenced artifact.

So, an example would look like:

{
  "blueprints": [
    {
      "name": "LCP_Latency",
      "artifact_ref": "SLO",
      "params": {
        "view": "String",
        "p95_latency_in_seconds": "Integer"
      },
      "inputs": {
        "vendor": "datadog",
        "queries": {
          "numerator": "___________________",
          "denominator": "___________________"
        },
        "value": "numerator / denominator"
      }
    }
  ]
}

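In the example above, the numerator and denominator queries are left as placeholders (indicated by underscores). In a complete blueprint the queries are written out in full, with template variables standing in for the values that each service's expectation provides. The queries section of a complete blueprint might look like: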

{
  "queries": {
    "numerator": "sum:lcp.duration{view:$$view->view$$,duration<$$p95_latency_in_seconds->threshold$$}",
    "denominator": "sum:lcp.duration{view:$$view->view$$}"
  }
}
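Because an artifact's params cannot be deferred to service owners, a blueprint's inputs need to cover every one of them. A minimal sketch of that check, assuming a hypothetical helper named `missing_artifact_params` (not a Caffeine API) and using trimmed copies of the SLO artifact and LCP_Latency blueprint from this document, with the query strings elided:

```python
def missing_artifact_params(artifact, blueprint):
    """List artifact params that the blueprint's inputs leave unset."""
    return sorted(set(artifact["params"]) - set(blueprint["inputs"]))


slo_artifact = {
    "name": "SLO",
    "params": {"queries": "Dict(String, String)", "value": "String", "vendor": "String"},
}
lcp_blueprint = {
    "name": "LCP_Latency",
    "artifact_ref": "SLO",
    "inputs": {
        "vendor": "datadog",
        "queries": {"numerator": "...", "denominator": "..."},  # query strings elided
        "value": "numerator / denominator",
    },
}

# An empty list means the blueprint satisfies every artifact param.
print(missing_artifact_params(slo_artifact, lcp_blueprint))
```

The check only compares names; verifying that each value matches its declared type would be a separate step.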

File Organization

Blueprints are typically defined in one or more blueprints.json files owned by the observability team and referenced by service teams in their expectation files. At this time we do not expect to distribute any common blueprints, because their content is typically organization-specific.


Service Owners

Service Expectation Explained

Definition: a service expectation is a single blueprint invocation for a single service within the domain of an owning team.

The organization name, team name, and service name are baked into the filepath:

ORGANIZATION_NAME/TEAM_NAME/SERVICE_NAME.json
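To make the path convention concrete, here is a small Python sketch that recovers the three names from an expectation filepath. The function name and the sample names (`acme`, `platform`, `admin_portal`) are made up for illustration, and the sketch assumes that only the last three path components matter, so any leading directories are ignored.

```python
from pathlib import PurePosixPath


def parse_expectation_path(path):
    """Split a path ending in ORGANIZATION_NAME/TEAM_NAME/SERVICE_NAME.json into its parts."""
    pure = PurePosixPath(path)
    # Drop the root marker so absolute paths are handled like relative ones.
    parts = [part for part in pure.parts if part != "/"]
    if len(parts) < 3 or pure.suffix != ".json":
        raise ValueError(
            f"expected ORGANIZATION_NAME/TEAM_NAME/SERVICE_NAME.json, got {path!r}"
        )
    organization, team, _filename = parts[-3:]
    return {"organization": organization, "team": team, "service": pure.stem}


print(parse_expectation_path("configs/acme/platform/admin_portal.json"))
```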

Like all other configurations, expectations are defined in a .json file. Each expectation has the following required fields.

Attribute | Expected Type | Description
name | String | Unique within a single SERVICE_NAME.json file.
blueprint_ref | String | An identifier referencing a blueprint.
inputs | Dict(String, Any) | A collection of input names mapped to input values satisfying the params and base_params required by the referenced blueprint.

So, an example would look like:

{
  "expectations": [
    {
      "name": "Admin Portal Home Page LCP Latency",
      "blueprint_ref": "LCP_Latency",
      "inputs": {
        "view": "/admin/home",
        "p95_latency_in_seconds": 5,
        "threshold": 99.5
      }
    }
  ]
}

Note that window_in_days is not specified above. If the blueprint provides a default value for this parameter, the compiler will use that value automatically.
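The fallback to blueprint defaults can be pictured with the Python sketch below. It is not the compiler's actual algorithm: `resolve_inputs` is a hypothetical helper, the default of 30 days is invented for the example, and the precedence shown (an expectation value wins over a blueprint value) is an assumption this document does not confirm.

```python
def resolve_inputs(artifact, blueprint, expectation):
    """Merge expectation inputs with blueprint-level values; report names left unset."""
    # Names the service owner is responsible for: blueprint params plus artifact base_params.
    required = set(blueprint["params"]) | set(artifact["base_params"])
    resolved = {}
    for name in sorted(required):
        if name in expectation["inputs"]:
            resolved[name] = expectation["inputs"][name]  # assumed to take precedence
        elif name in blueprint["inputs"]:
            resolved[name] = blueprint["inputs"][name]  # blueprint-provided default
    missing = sorted(required - set(resolved))
    return resolved, missing


artifact = {"base_params": {"threshold": "Float", "window_in_days": "Integer"}}
blueprint = {
    "params": {"view": "String", "p95_latency_in_seconds": "Integer"},
    "inputs": {"vendor": "datadog", "window_in_days": 30},  # 30 is a made-up default
}
expectation = {
    "inputs": {"view": "/admin/home", "p95_latency_in_seconds": 5, "threshold": 99.5}
}

resolved, missing = resolve_inputs(artifact, blueprint, expectation)
print(resolved)
print(missing)
```

If the blueprint did not supply `window_in_days`, the name would show up in `missing` instead of being resolved.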

File Organization

Caffeine has no opinion on where expectation files live, only that the filepath follows the pattern described above.


Appendix

Accepted Types

Accepted Types are type names leveraged throughout Caffeine configuration files as type annotations for the compiler to enforce. Today we support the following:

Type | Description
Boolean | A true or false value.
Float | A floating-point number (e.g., 99.5).
Integer | A whole number (e.g., 30).
String | A text value (e.g., "my-service").
List(T) | A list of elements of type T. Supported inner types: String, Integer, Boolean, Float.
Optional(T) | A value that may or may not be present. Supported inner types: String, Integer, Boolean, Float, List(String), List(Integer), List(Boolean), List(Float), Dict(String, String), Dict(String, Integer), Dict(String, Boolean), Dict(String, Float).
Dict(String, T) | A key-value mapping where String is the key type and T is the value type (e.g., Dict(String, Integer)). Supported inner types: String, Integer, Boolean, Float.
Defaulted(T, value) | A value with a default. The first argument is the type, the second is the default value as a string (e.g., Defaulted(Integer, 10) means an optional integer with default 10). Supported inner types: String, Integer, Boolean, Float, List(String), List(Integer), List(Boolean), List(Float).
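The inner-type restrictions in the table above can be expressed as a small checker. The Python sketch below is a simplified illustration and not Caffeine's parser: `is_accepted` is a hypothetical name, the regular expressions are an assumption about how annotations are written, and default values containing commas or parentheses are not handled.

```python
import re

PRIMITIVES = {"Boolean", "Float", "Integer", "String"}


def is_list(annotation):
    """True for List(T) where T is a primitive type."""
    match = re.fullmatch(r"List\((\w+)\)", annotation)
    return bool(match) and match.group(1) in PRIMITIVES


def is_dict(annotation):
    """True for Dict(String, T) where T is a primitive type."""
    match = re.fullmatch(r"Dict\(String,\s*(\w+)\)", annotation)
    return bool(match) and match.group(1) in PRIMITIVES


def is_accepted(annotation):
    """Check an annotation string against the restrictions listed in the table."""
    annotation = annotation.strip()
    if annotation in PRIMITIVES or is_list(annotation) or is_dict(annotation):
        return True
    match = re.fullmatch(r"Optional\((.+)\)", annotation)
    if match:
        inner = match.group(1).strip()
        return inner in PRIMITIVES or is_list(inner) or is_dict(inner)
    # Simplification: the default value may not contain commas or parentheses.
    match = re.fullmatch(r"Defaulted\((.+),\s*([^,()]+)\)", annotation)
    if match:
        inner = match.group(1).strip()
        return inner in PRIMITIVES or is_list(inner)
    return False


for candidate in ("Dict(String, String)", "Optional(List(Integer))",
                  "Defaulted(Integer, 10)", "List(List(String))"):
    print(candidate, is_accepted(candidate))
```

Nested collections such as `List(List(String))` are rejected here, mirroring the collection-type restrictions described in the table.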

Technical Details

As currently defined, the type definition technically allows infinite recursion:

pub type AcceptedTypes {
  Boolean
  Float
  Integer
  String
  Dict(AcceptedTypes, AcceptedTypes)
  List(AcceptedTypes)
  Optional(AcceptedTypes)
  Defaulted(AcceptedTypes, String)
}

We intentionally limit this within the Caffeine compiler at this time, through the restrictions on collection types noted in the table above.

Multi-Inheritance

There is no support for multi-inheritance within the existing design of Caffeine: we do not support a single service expectation referencing multiple blueprints or a single blueprint referencing multiple artifacts. This 1:1 relationship seems sensible and we do not want to think hard about the diamond problem...