Documentation

There are three levels of configuration. Each level is distinct and isolated, is referenced by the level below it via a unique naming scheme, and is typically owned by a different role within an organization:

1. Artifacts: defined by standard library or plugin developers per distinct reliability artifact.

2. Blueprints: reusable templates defined by observability experts that build upon artifacts to codify common reliability patterns.

3. Expectations: defined by service owners per service, referencing blueprints.

Figure: An example of how the file types interact across a typical engineering organization.

Thus, for the SLO artifact, we might see something like this:

Figure: The same diagram from above, made up of examples for the SLO reliability artifact.

Artifacts are released as part of the standard library. Within an organization, observability experts create common abstractions (blueprints), working alongside service owners to establish expectations for their services.

The documentation is split by persona, with sufficient detail for each persona to operate within its own domain.


Standard Library Developers

Artifacts Explanation

An artifact surfaces the interface of the standard library or of a plugin: it defines the expected values on which the logic operates. An SLO artifact would require thresholds and queries, while a systems modelling artifact might require dependency relationships.

Artifacts are defined in a .yaml file, with the following required fields per artifact.

| Attribute | Expected Type | Description |
| --- | --- | --- |
| name | String | Unique within a single artifacts.yaml file. |
| version | String | Semantic version of the artifact (e.g., "0.0.1"). |
| base_params | Dict(String, AcceptedTypes) | Parameters that flow through blueprints to service owners. Blueprints may leave these for service owners to specify, or provide defaults using the -> operator. |
| params | Dict(String, AcceptedTypes) | Parameters that must be provided by blueprints using this artifact. These cannot be deferred to service owners. |

So, an example would look like:

artifacts:
  - name: SLO
    version: 0.0.1
    base_params:
      threshold: Float
      window_in_days: Optional(Integer) -> 30
    params:
      queries: Dict(String, String)
      value: String
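To make the `->` default syntax concrete, here is a minimal Python sketch of how a declaration such as `Optional(Integer) -> 30` could be split into a type name and a default. The helper `split_default` is hypothetical and is not part of Caffeine; the compiler's actual parsing may differ.

```python
# Hypothetical helper (not Caffeine's API): split "Type -> default" declarations.
def split_default(declaration: str):
    """Return (type_name, default_text), where default_text is None if absent."""
    type_part, arrow, default_part = declaration.partition("->")
    type_name = type_part.strip()
    default_text = default_part.strip() if arrow else None
    return type_name, default_text


# The base_params of the SLO artifact above, as plain strings.
base_params = {
    "threshold": "Float",
    "window_in_days": "Optional(Integer) -> 30",
}

parsed = {name: split_default(decl) for name, decl in base_params.items()}
```

In this sketch the default is kept as raw text; converting it to the declared type is left out for brevity.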

Documentation and Distribution

Artifacts are distributed with the core Caffeine binary. Documentation is available on the Standard Library Reference page. Organizations can extend Caffeine with custom artifacts via plugins.


Observability Experts

Blueprints Explained

Blueprints are common abstractions over reliability artifacts, typically used to codify well-known, common patterns, either completely or partially (i.e., as templates). Each blueprint takes the following form:

| Attribute | Expected Type | Description |
| --- | --- | --- |
| name | String | Unique within a single blueprints.yaml file. |
| artifact | String | An identifier referencing an artifact. |
| params | Dict(String, AcceptedTypes) | A collection of input names mapped to input types of attributes required to be defined by service expectations that invoke it. |
| inputs | Dict(String, Any) | A collection of input names mapped to input values satisfying the params required by the referenced artifact. |

So, an example would look like:

blueprints:
  - name: LCP_Latency
    artifact: SLO
    params:
      view: String
      p95_latency_in_seconds: Integer
    inputs:
      queries:
        numerator:   ___________________
        denominator: ___________________
      value: "numerator / denominator"

In the example above, numerator and denominator are left as placeholders (indicated by underscores). The actual query strings will be provided in each service's expectation. A complete blueprint might look like:

...
      queries:
        numerator: "SELECT COUNT(*) FROM events WHERE lcp < {p95_latency_in_seconds} AND view = '{view}'"
        denominator: "SELECT COUNT(*) FROM events WHERE view = '{view}'"
...
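The `{view}` and `{p95_latency_in_seconds}` placeholders in the queries above line up with the blueprint's params. As a rough illustration only, the following Python sketch fills them in with Python's built-in `str.format`; this is an assumption about the templating behaviour, not a description of how the Caffeine compiler actually performs substitution. The function name `render_queries` is hypothetical.

```python
# Query templates copied from the complete blueprint example.
blueprint_queries = {
    "numerator": "SELECT COUNT(*) FROM events WHERE lcp < {p95_latency_in_seconds} AND view = '{view}'",
    "denominator": "SELECT COUNT(*) FROM events WHERE view = '{view}'",
}

# Values a service expectation might supply for the blueprint's params.
expectation_inputs = {"view": "/admin/home", "p95_latency_in_seconds": 5}


def render_queries(queries, inputs):
    """Substitute {name} placeholders in each query with the given inputs."""
    return {name: template.format(**inputs) for name, template in queries.items()}


rendered = render_queries(blueprint_queries, expectation_inputs)
```

A missing input raises `KeyError` here, which mirrors the idea that every blueprint param must be supplied by the expectation.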

File Organization

Blueprints are typically defined in one or more blueprints.yaml files owned by the observability team and referenced by service teams in their expectation files. At this time we don't expect to distribute any common blueprints, since their content is typically organization-specific.


Service Owners

Service Expectation Explained

Definition: a service expectation is a single blueprint invocation for a single service within the domain of an owning team.

The organization name, team name, and service name are baked into the filepath:

ORGANIZATION_NAME/TEAM_NAME/SERVICE_NAME.yaml
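As an illustration of the path convention, the Python sketch below recovers the three names from a filepath. The function `parse_expectation_path` and the example path are hypothetical, and reading the names from the last three path components is an assumption made for this sketch.

```python
from pathlib import PurePosixPath


def parse_expectation_path(path: str) -> dict:
    """Extract organization, team, and service from .../ORG/TEAM/SERVICE.yaml."""
    posix_path = PurePosixPath(path)
    parts = posix_path.parts
    if len(parts) < 3 or posix_path.suffix != ".yaml":
        raise ValueError(f"expected ORGANIZATION_NAME/TEAM_NAME/SERVICE_NAME.yaml, got: {path}")
    organization, team, _filename = parts[-3:]
    return {
        "organization": organization,
        "team": team,
        "service": posix_path.stem,  # filename without the .yaml suffix
    }


# Hypothetical example path.
info = parse_expectation_path("acme/platform/admin_portal.yaml")
```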

Like all other configurations, an expectation is defined in a .yaml file, with the following required fields.

| Attribute | Expected Type | Description |
| --- | --- | --- |
| name | String | Unique within a single SERVICE_NAME.yaml file. |
| blueprint | String | An identifier referencing a blueprint. |
| inputs | Dict(String, Any) | A collection of input names mapped to input values satisfying the params required by the referenced blueprint, plus any base_params of its artifact that the blueprint leaves to service owners. |

So, an example would look like:

expectations:
  - name: "Admin Portal Home Page LCP Latency"
    blueprint: LCP_Latency
    inputs:
      view: /admin/home
      p95_latency_in_seconds: 5
      threshold: 99.5

Note that window_in_days is not specified above. Since the SLO artifact defines a default of 30 for this base parameter, the compiler will use that value automatically.
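The fallback to a default can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the compiler's implementation: `resolve_base_params` is a hypothetical name, base params are represented as `(type_name, default)` pairs, and an `Optional` parameter with no default is assumed to resolve to `None`.

```python
# Base params of the SLO artifact, modelled as (type_name, default) pairs.
artifact_base_params = {
    "threshold": ("Float", None),
    "window_in_days": ("Optional(Integer)", 30),
}


def resolve_base_params(base_params, inputs):
    """Pick each base param from the inputs, falling back to its default."""
    resolved = {}
    for name, (type_name, default) in base_params.items():
        if name in inputs:
            resolved[name] = inputs[name]
        elif default is not None:
            resolved[name] = default
        elif type_name.startswith("Optional("):
            resolved[name] = None  # assumption: optional and absent
        else:
            raise ValueError(f"missing required input: {name}")
    return resolved


# Inputs from the expectation above; window_in_days is deliberately omitted.
expectation_inputs = {
    "view": "/admin/home",
    "p95_latency_in_seconds": 5,
    "threshold": 99.5,
}

resolved = resolve_base_params(artifact_base_params, expectation_inputs)
```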

File Organization

Caffeine has no opinion on where expectation files live; it only requires that the filepath follows the pattern mentioned above.


Appendix

Accepted Types

Accepted Types are type names leveraged throughout Caffeine configuration files as type annotations for the compiler to enforce. Today we support the following:

| Type | Description |
| --- | --- |
| Boolean | A true or false value. |
| Float | A floating-point number (e.g., 99.5). |
| Integer | A whole number (e.g., 30). |
| String | A text value (e.g., "my-service"). |
| NonEmptyList(T) | A list containing at least one element of type T. Supported inner types: String, Integer, Boolean, Float. |
| Optional(T) | A value that may or may not be present. Supported inner types: String, Integer, Boolean, Float, NonEmptyList(String), NonEmptyList(Integer), NonEmptyList(Boolean), NonEmptyList(Float). |
| Dict(String, T) | A key-value mapping where String is the key type and T is the value type (e.g., Dict(String, Integer)). Supported inner types: String, Integer, Boolean, Float, NonEmptyList(String), NonEmptyList(Integer), NonEmptyList(Boolean), NonEmptyList(Float). |

Technical Details

As it is currently defined, the type definition technically allows infinite recursion:

pub type AcceptedTypes {
  Boolean
  Float
  Integer
  String
  NonEmptyList(AcceptedTypes)
  Optional(AcceptedTypes)
}

At this time, the Caffeine compiler intentionally limits this nesting through the restrictions on inner types noted in the table above.
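The nesting restrictions from the Accepted Types table can be expressed as a small check. The Python sketch below is illustrative only: `is_accepted` is a hypothetical function that works on annotation strings, whereas the compiler itself operates on the `AcceptedTypes` type shown above.

```python
PRIMITIVES = {"Boolean", "Float", "Integer", "String"}
# NonEmptyList may only wrap a primitive type.
LISTS = {f"NonEmptyList({primitive})" for primitive in PRIMITIVES}


def is_accepted(annotation: str) -> bool:
    """Return True if the annotation respects the table's nesting limits."""
    text = annotation.replace(" ", "")
    if text in PRIMITIVES or text in LISTS:
        return True
    # Optional(T) and Dict(String, T) accept a primitive or a list of primitives.
    for prefix in ("Optional(", "Dict(String,"):
        if text.startswith(prefix) and text.endswith(")"):
            inner = text[len(prefix):-1]
            return inner in PRIMITIVES or inner in LISTS
    return False


allowed = is_accepted("Dict(String, NonEmptyList(Float))")
rejected = is_accepted("Optional(Optional(Integer))")
```

Annotations such as `Optional(Optional(Integer))` are expressible in the recursive type definition but are rejected by this check, which is the distinction the section describes.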

Multi-Inheritance

There is no support for multi-inheritance within the existing design of Caffeine: we do not support a single service expectation referencing multiple blueprints or a single blueprint referencing multiple artifacts. This 1:1 relationship seems sensible and we do not want to think hard about the diamond problem...
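One consequence of the 1:1 design is that resolving an expectation is a plain chain of two lookups by name. The Python sketch below models this with dictionaries; `resolve_chain` and the dictionary layout are assumptions made for illustration, not Caffeine's internals.

```python
# Minimal stand-ins for the parsed configuration files, keyed by name.
artifacts = {"SLO": {"version": "0.0.1"}}
blueprints = {"LCP_Latency": {"artifact": "SLO"}}
expectation = {
    "name": "Admin Portal Home Page LCP Latency",
    "blueprint": "LCP_Latency",
}


def resolve_chain(expectation, blueprints, artifacts):
    """Follow expectation -> blueprint -> artifact, one name at each step."""
    blueprint_name = expectation["blueprint"]
    if blueprint_name not in blueprints:
        raise KeyError(f"unknown blueprint: {blueprint_name}")
    artifact_name = blueprints[blueprint_name]["artifact"]
    if artifact_name not in artifacts:
        raise KeyError(f"unknown artifact: {artifact_name}")
    return blueprint_name, artifact_name


chain = resolve_chain(expectation, blueprints, artifacts)
```

Because each level names exactly one parent, there is never more than one path to an artifact, so no merge or precedence rules are needed.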