Precomputation: refining sources

In this section, we will explain some of the aspects of the precomputation performed by Tamarin. This is relevant for users that model complex protocols since they may at some point run into so-called remaining partial deconstructions, which can be problematic for verification.

To illustrate the concepts, consider the example of the Needham-Schroeder-Lowe Public Key Protocol, given here in Alice&Bob notation:

  protocol NSLPK3 {
    1. I -> R: {'1',ni,I}pk(R)
    2. I <- R: {'2',ni,nr,R}pk(I)
    3. I -> R: {'3',nr}pk(R)

  }

It is specified in Tamarin by the following rules:

rule Register_pk:
  [ Fr(~ltkA) ]
  -->
  [ !Ltk($A, ~ltkA), !Pk($A, pk(~ltkA)), Out(pk(~ltkA)) ]

rule Reveal_ltk:
  [ !Ltk(A, ltkA) ] --[ RevLtk(A)    ]-> [ Out(ltkA) ]

rule I_1:
  let m1 = aenc{'1', ~ni, $I}pkR
  in
    [ Fr(~ni), !Pk($R, pkR) ]
  --[ OUT_I_1(m1)]->
    [ Out( m1 ), St_I_1($I, $R, ~ni)]

rule R_1:
  let m1 = aenc{'1', ni, I}pk(ltkR)
      m2 = aenc{'2', ni, ~nr, $R}pkI
  in
    [ !Ltk($R, ltkR), In( m1 ), !Pk(I, pkI), Fr(~nr)]
  --[ IN_R_1_ni( ni, m1 ), OUT_R_1( m2 ), Running(I, $R, <'init',ni,~nr>)]->
    [ Out( m2 ), St_R_1($R, I, ni, ~nr) ]

rule I_2:
  let m2 = aenc{'2', ni, nr, R}pk(ltkI)
      m3 = aenc{'3', nr}pkR
  in
    [ St_I_1(I, R, ni), !Ltk(I, ltkI), In( m2 ), !Pk(R, pkR) ]
  --[ IN_I_2_nr( nr, m2), Commit(I, R, <'init',ni,nr>), Running(R, I, <'resp',ni,nr>) ]->
    [ Out( m3 ), Secret(I,R,nr), Secret(I,R,ni) ]

rule R_2:
    [ St_R_1(R, I, ni, nr), !Ltk(R, ltkR), In( aenc{'3', nr}pk(ltkR) ) ]
  --[ Commit(R, I, <'resp',ni,nr>)]->
    [ Secret(R,I,nr), Secret(R,I,ni)    ]

rule Secrecy_claim:
  [ Secret(A, B, m) ] --[ Secret(A, B, m) ]-> []

We now want to prove the following lemma:


lemma nonce_secrecy:
  " /* It cannot be that */
    not(
        Ex A B s #i.
          /* somebody claims to have setup a shared secret, */
          Secret(A, B, s) @ i
          /* but the adversary knows it */
        & (Ex #j. K(s) @ j)
          /* without having performed a long-term key reveal. */
        & not (Ex #r. RevLtk(A) @ r)
        & not (Ex #r. RevLtk(B) @ r)
       )"

This proof attempt will not terminate due to there being 12 partial deconstructions left when looking at this example in the GUI as described in detail below.

Partial deconstructions left

In the precomputation phase, Tamarin goes through all rules and inspects their premises. For each of these facts, Tamarin will precompute a set of possible sources. Each such source represents combinations of rules from which the fact could be obtained. For each fact, this leads to a set of possible sources and we refer to these sets as the raw sources, respectively refined sources.

However, for some rules Tamarin cannot resolve where a fact must have come from. We say that a partial deconstruction is left in the raw sources, and we will explain them in more detail below.

The existence of such partial deconstructions complicates automated proof generation and often (but not always) means that no proof will be found automatically. For this reason, it is useful for users to be able to find these and examine if it is possible to remove them.

In the interactive mode you can find such partial deconstructions as follows. On the top left, under “Raw sources”, one can find the precomputed sources by Tamarin.

Tamarin GUI

Cases with partial deconstructions will be listed with the text (partial deconstructions) after the case name. The partial deconstructions can be identified by light green arrows in the graph, as in the following example:

Partial deconstruction visible in green

The green arrow indicates that Tamarin cannot exclude the possibility that the adversary can derive any fresh term ~t.1 with this rule I_2. As we are using an untyped protocol model, the tool cannot determine that nr.7 should be a fresh nonce, but that it could be any message. For this reason Tamarin concludes that it can derive any message with this rule.

Why partial deconstructions complicate proofs

To get a better understanding of the problem, consider what happens if we try to prove the lemma nonce_secrecy. If we manually always choose the first case for the proof, we can see that Tamarin derives the secret key to decrypt the output of rule I_2 by repeatedly using this rule I_2. More specifically, in a) the output of rule I_2 is decrypted by the adversary. To get the relevant key for this, in part b) again the output from rule I_2 is decrypted by the adversary. This is done with a key coming from part c) where the same will happen repeatedly.

Secret derived by using I_2

As Tamarin is unable to conclude that the secret key could not have come from the rule I_2, the algorithm derives the secret key that is needed. The proof uses the same strategy recursively but will not terminate.

Using Sources Lemmas to Mitigate Partial Deconstructions

Once we identified the rules and cases in which partial deconstructions occur, we can try to avoid them. A good mechanism to get rid of partial deconstructions is the use of so-called sources lemmas.

Sources lemmas are a special case of lemmas, and are applied during Tamarin’s pre-computation. Roughly, verification in Tamarin involves the following steps:

Tamarin first determines the possible sources of all premises. We call these the raw sources.
Next, automatic proof mode is used to discharge any sources lemmas using induction.
The sources lemmas are applied to the raw sources, yielding a new set of sources, which we call the refined sources.
Depending on the mode, the other (non-sources) lemmas are now considered manually or automatically using the refined sources.

For full technical details, we refer the reader to (Meier 2012), where these are called type assertions.

In our example, we can add the following lemma:

lemma types [sources]:
  " (All ni m1 #i.
       IN_R_1_ni( ni, m1) @ i
       ==>
       ( (Ex #j. KU(ni) @ j & j < i)
       | (Ex #j. OUT_I_1( m1 ) @ j)
       )
    )
  & (All nr m2 #i.
       IN_I_2_nr( nr, m2) @ i
       ==>
       ( (Ex #j. KU(nr) @ j & j < i)
       | (Ex #j. OUT_R_1( m2 ) @ j)
       )
    )
  "

This sources lemma is applied to the raw sources to compute the refined sources. All non-sources lemmas are proven with the resulting refined sources, while sources lemmas must be proved with the raw sources.

This lemma relates the point of instantiation to the point of sending by either the adversary or the communicating partner. In other words, it says that whenever the responder receives the first nonce, either the nonce was known to the adversary or the initiator sent the first message prior to that moment. Similarly, the second part states that whenever the initiator receives the second message, either the adversary knew the corresponding nonce or the responder has sent the second message before. Generally, in a protocol with partial deconstructions left it is advisable to try if the problem can be solved by a sources lemma that considers where a term could be coming from. As in the above example, one idea to do so is by stating that a used term must either have occurred in one of a list of rules before, or it must have come from the adversary.

The above sources lemma can be automatically proven by Tamarin. With the sources lemma, Tamarin can then automatically prove the lemma nonce_secrecy.

Another possibility is that the partial deconstructions only occur in an undesired application of a rule that we do not wish to consider in our model. In such a case, we can explicitly exclude this application of the rule with a restriction. But, we should ensure that the resulting model is the one we want; so use this with care.

Modelling tricks to Mitigate Partial Deconstructions

Sometimes partial deconstructions can be removed by applying some modelling tricks:

If the deconstruction reveals a term t that, intuitively, can be made public anyway, you can add In(t) to the lhs of the rule. If you are not sure if this transformation is sound, you may write a lemma to ensure that the rule can still fire.

Example: Hashes of a public value are public knowledge, so adding In(p) to the second rule helps here:
```
[] --> [HashChain('hi')]

[HashChain(p)] --> [HashChain(h(p)), Out(h(p))]
```
Give fresh or public type if you know some values are atomic, but you see that pre-computation tries to deduce non-atomic terms from them. This works only under the assumption that the implementation can enforce the correct assignment, e.g., by appropriate tagging.
Using pattern matching instead of destructor functions can help distill the main argument of a proof in the design phase or in first stages of modelling. It is valid, and often successful strategy to start with a simplistic modelling and formulate provable lemmas first, and then proceed to refine the model step by step.

Auto-Sources

Tamarin can also try to automatically generate sources lemmas (Cortier, Delaune, and Dreier 2020). To enable this feature, Tamarin needs to be started using the command line parameter --auto-sources.

When Tamarin is called using --auto-sources, it will check, for each theory it loads, whether the theory contains partial deconstructions, and whether there is a sources lemma. If there are partial deconstructions and there is no sources lemma, it will try to automatically generate a suitable lemma, called AUTO_typing, and added to the theory’s list of lemmas.

This works in many cases, note however that there is no guarantee that the generated lemma is (i) sufficient to remove all partial deconstructions and (ii) correct - so you still need to check whether all partial deconstructions are resolved, and to prove the lemma’s correctness in Tamarin, as usual.

Cases where Tamarin may fail to generate a sufficient or correct sources lemma include in particular theories using non subterm convergent equations or AC symbols, or cases where partial deconstruction stem from state facts rather than inputs and outputs.

To be able to add the sources lemma, Tamarin needs to modify the protocol rules of the loaded theory in two ways:

By adding the necessary annotations which will be used in the lemma to the protocol rules. All added annotations start with AUTO_IN_ or AUTO_OUT_, and can be seen, e.g., by clicking on Multiset rewriting rules in interactive mode. Note that these annotations are by default hidden in the graphs in interactive mode, except during the proof of the sources lemma, to reduce the size of the graphs. One can manually make them visible or invisible using the Options button on the top right of the page.
By splitting protocol rules into their variants w.r.t. the equational theory, if these variants exists. This is necessary to be able to place the annotations. When exporting such a theory from Tamarin using, e.g., the Download button in the interactive mode, Tamarin will export the rule(s) together with their (annotated) variants, which can be re-imported as usual.

Limiting Precomputations

Sometimes Tamarin’s precomputations can take a long time, in particular if there are many open chains or the saturation of sources grows too quickly.

In such a case two command line flags can be used to limit the precomputations:

--open-chains=X or -c=X, where X is a positive integer, limits the number of chain goals Tamarin will solve during precomputations. In particular, this value stops Tamarin from solving any deconstruction chains that are longer than the given value X. This is useful as some equational theories can cause loops when solving deconstruction chains. At the same time, some equational theories may need larger values (without looping), in which case it can be necessary to increase this value. However, a too small value can lead to sources that contain open deconstruction chains which would be easy to solve, rendering the precomputations inefficient. Tamarin shows a warning on the command line when this limit is reached. Default value: 10
--saturation=X or --s=X, where X is a positive integer, limits the number of saturation steps Tamarin will do during precomputations. In a nutshell, Tamarin first computes sources independently, and then saturates them (i.e., applies each source to all other sources if possible) to increase overall efficiency. However, this can sometimes grow very quickly, in which case it might be necessary to fix a smaller value. Tamarin shows a warning on the command line when this limit is reached. Default value: 5

In case Tamarin’s precomputations take too long, try fixing smaller values for both parameters, and analyze the sources shown in interactive mode to understand what exactly caused the problem.

Loop breakers

During the precomputation phase, Tamarin determines a minimal set of ‘loop-breakers’, which are premises that can be excluded from the general precomputation of all premise goals to prevent looping. Specifically, the set of loop breakers is a minimal set of premises that can be excluded to make the directed graph of rules, when connected from conclusions to premises, acyclic.

It is important to note that there is often no unique minimal set of loop-breakers. The loop-breaker computation is deterministic for a given set of rules, but a change to the set of rules may result in different premises being considered loop-breakers. As such, you may find that a small change or addition of a rule to your model can result in changes to how some seemingly unrelated properties are solved.

It is possible to manually break loops in particular places by annotating the relevant premise with the no_precomp annotation. These premises will then be excluded when computing loop-breakers over the rule set, and will not have their sources precomputed. For more on fact annotations, see Fact Annotations.

Cortier, Véronique, Stéphanie Delaune, and Jannik Dreier. 2020. “Automatic generation of sources lemmas in Tamarin: towards automatic proofs of security protocols.” In ESORICS 2020 - 25th European Symposium on Research in Computer Security. Guilford, United Kingdom. https://hal.archives-ouvertes.fr/hal-02903620.

Meier, Simon. 2012. “Advancing Automated Security Protocol Verification.” {PhD} dissertation, ETH Zurich. http://dx.doi.org/10.3929/ethz-a-009790675.