[ad_1]

The `internal_add`

operate tries to effectively insert a brand new vary of integers into an current record of sorted and disjoint integer ranges. For instance, if we began with `[101..=102, 400..=402, 404..=405]`

and added `402..=404`

, we count on a results of `[101..=102, 400..=405]`

.

Ideally, I’d formally confirm this algorithm with Rust-specific instruments [1,2]. These instruments, nevertheless, appear arduous to make use of. As an alternative, I selected Dafny. Dafny is a language and verification system. It’s taught to undergraduates at universities world wide. It’s utilized in business. I discover it to be addictively interactive and programmer pleasant.

Apart: Dafny creator, Dr. Rustan Leino has a connection to Rust past the coincidence of his first title. He helped create Spec#, the primary language to make use of a sort system to keep away from null pointers. Rust, after all, adopted this concept to nice success.

This text covers guidelines 1 to six. Half 2 will cowl guidelines 7 to 9.

Earlier than attempting to show the mathematical correctness of your algorithm, resolve if the hassle is well worth the profit.

Dafny is just not Rust. Utilizing Dafny requires porting algorithms of curiosity from Rust to Dafny. This port can miss particulars and introduce errors. Given this danger, ought to *you *use Dafny to confirm Rust algorithms? I boldly declare that “it relies upon”.

- How vital is your algorithm’s correctness? In case you are printing a report and it appears proper, it in all probability is true. The
`internal_add`

algorithm relates to an information construction that I’d like others to make use of with confidence, giving me further motivation to confirm it. - Perhaps all formal verification, with present instruments, is just too arduous. I imagine, nevertheless, that Dafny makes formal verification as straightforward as at the moment attainable. You’ll find formally verifying code simpler if you’re already accustomed to varieties (for instance, from Rust) and recursion/induction (used sometimes in Rust). You’ll be able to learn this text and resolve for your self if/when formal verification is simple sufficient to be worthwhile to you.
- Perhaps fuzzing (equivalent to
`cargo-fuzz`

) and property-based testing (equivalent to`QuickCheck`

) are adequate. Though these strategies don’t present mathematical certainty, they’re intelligent, helpful, and simple to make use of. (The`range-set-blaze`

crate already makes use of`QuickCheck. See`

Rule 9.5 in a previous article for particulars). - Perhaps formal verification is and can all the time be doomed as a result of writing a specification is as arduous as writing code. I disagree. Take into consideration refactoring. I typically begin coding by writing one thing easy. I then refactor this easy code for effectivity. For
`internal_add`

, I discovered the specification to be easier than any code. (You’ll be able to decide this for your self in Rule 4.)

Apart: Verification then turns into a computer-checked refactoring from a easy specification to a remaining, environment friendly algorithm.

- Perhaps formal verification is and can all the time be doomed as a result of the halting problem tells us formally that formality isn’t usually attainable. The halting downside doesn’t doom us. Whereas we are able to’t all the time perceive
*arbitrary*code, we don’t have to. We solely want to know our personal code, which we (hopefully) wrote to be comprehensible. Beginning in Rule 2, we’ll see how Dafny simply verifies that*particular*loops and recursions halt. - Perhaps porting to Dafny is just too arduous. This has not been my expertise. Like Rust, Dafny mixes and matches crucial and useful programming. I discovered porting my algorithm to be easy.

Assuming you continue to need to confirm your algorithm with Dafny, the next step is to be taught Dafny.

Dafny is each a programming language and an interactive verification system. I like to recommend you install it as a VS Code extension.

To be taught it, begin at https://dafny.org/. Of particular curiosity is the Online Tutorial and the Reference Manual. I additionally discovered the Verification Corner videos on YouTube useful. (Of attainable curiosity is the faculty textbook, *Program Proofs, *$49 for the Kindle Version). I discovered the programming language a part of Dafny simpler to be taught than Rust, maybe comparable in issue to C#.

Dafny, like Rust, is absolutely typed. Dafny, like Python, is rubbish collected. Right here is a “Hello World”:

`// hi there.dfy`

methodology Important()

{

var s := "Hi there World";

print s, "n";

}

Dafny, additionally like Python, gives integers of arbitrary dimension. Right here is a program that *provably *provides two pure numbers by repeatedly incrementing.

Some factors of curiosity:

- Dafny coding tips observe C#, not Rust. So, we title the operate
`SlowAdd`

not`slow_add`

(though both will run). - Dafny helps subtypes. For instance, any
`int`

that may be proven to be non-negative can also be a`nat`

. - Task is
`:=`

and equality is`==`

. (There is no such thing as a`=`

.) - Operate parameters, for instance,
`x`

and`y`

above, are immutable. - Dafny makes use of
`ensures`

and`invariant`

statements to confirm the code at compile-type. It then removes these statements to complete compiling. - The inexperienced test mark exhibits that this code verifies. Dafny’s VS Code extension will, by default, repeatedly attempt to validate every methodology. This provides an virtually gambling-like pleasure to working with Dafny. Within the instance above, if I make
`y`

an`int`

relatively than a`nat`

, then validation ought to and can fail. (Can you determine why?) Dafny will mark my operate with a pink X and inform me “`This postcondition won't maintain: r == x + y`

”. - Dafny is aware of among the arithmetic of integers, arrays, units, maps, sequences, and so on. This typically permits it to complete the final particulars of validation by itself.

Now that you understand about Dafny, you need to let it learn about your algorithm.

The `range-set-blaze`

crate represents units of integers as sorted, disjoint ranges. For instance, this record of three ranges:

`100..=2_393,`

20_303..=30_239_000,

501_000_013..=501_000_016

represents a set of 30,220,996 integers.

In Rust, the `RangeSetBlaze`

struct represents this knowledge construction internally with a typical `BTreeMap.`

Recall {that a} `BTreeMap`

represents a listing of key/worth pairs, sorted by key. Right here, our keys are the ranges’ begins (for instance, `100`

, `20_303`

, `501_000_013`

) and the values are the ranges’ inclusive ends (for instance, `2_393`

, `30_239_000, 501_000_016`

. `RangeSetBlaze`

shops the record with a `BTreeMap`

relatively than a `vec`

to make key search for extra cache pleasant.

`RangeSetBlaze`

depends upon `BTreeMap`

, so should we implement `BTreeMap`

in Dafny? Fortunately, no. We will, as a substitute, use Dafny’s `vec`

-like `seq`

knowledge kind. This substitution works as a result of `BTreeMap`

, `vec`

, and `seq`

can all signify sorted lists — simply with totally different efficiencies. For the aim of formal verification, we solely care about correctness and might ignore effectivity.

`RangeSetBlaze`

requires the record of ranges be sorted and disjoint. How do we are saying “sorted and disjoint” in Dafny? We will say it through this *ghost predicate* (and related code):

`ghost predicate ValidSeq(sequence: seq<NeIntRange>) :: !Contact(sequence[i], sequence[j]))`

kind IntRange = (int, int)

kind NeIntRange = x: IntRange | !IsEmpty(x) witness (0,0)

operate IsEmpty(r: IntRange): bool

{

r.0 > r.1

}

A *predicate* is one other title for a technique that returns `bool`

. A *ghost* methodology (or predicate) is one that may solely be used for validation, not for operating the ultimate code.

At a excessive stage, the `ValidSeq`

predicate takes as enter a sequence of non-empty integer ranges. It then exams that the beginning values are sorted and that the ranges don’t contact. Particularly,

- An
`IntRange`

is a tuple of two`int`

values. - An
`IntRange`

`IsEmpty`

precisely when its begin is bigger than its finish. (This follows Rust’s convention.) - A
`NeIntRange`

(non-empty integer vary) is an`IntRange`

that’s not empty, for instance,`(0,0)`

. [All our ranges are end inclusive.] - This expression exams that the beginning values are sorted:

`forall i:nat, j:nat | i < j < |sequence| :: sequence[i].0 < sequence[j].0`

It may be learn as “for all pure numbers *i* and *j* — such that *i* is lower than *j *and *j* is lower than the size of the sequence — check that the beginning worth at index *i* is lower than the beginning worth as index *j*”.

Apart: Word {that a} Rust

`BTreeMap`

doesn’t help (random-access) indexing however right here we’re utilizing such indexing. That is OK as a result of`is a ghost predicate and so will solely be used for validation.`

ValidSeq

- This expression exams that the ranges are disjoint:

`forall i:nat, j:nat | i < j < |sequence| :: !Contact(sequence[i], sequence[j])`

It may be learn as “for all pure numbers *i* and *j* — such that *i* is lower than *j *and *j* is lower than the size of the sequence — check that the vary at index *i* doesn’t contact the vary at index *j*. However what’s `Contact`

?

We’ll outline `Contact`

on two-levels. On a mathematical stage, a spread *i* is claimed to the touch a spread *j* if there exists an integer *i0* in vary *i* and an integer *j0* in vary *j* such that *i0* and *j0* are inside a distance of certainly one of one another. On an environment friendly programming stage, we need to keep away from definitions relying on “there exists”. Here is a Dafny predicate that’s each mathematical and environment friendly:

`predicate Contact(i: NeIntRange, j: NeIntRange)`

ensures Contact(i, j) == exists i0, j0 ::

Accommodates(i, i0) && Accommodates(j, j0) && -1 <= i0 - j0 <= 1

{

assert Accommodates(i, i.0) && Accommodates(i, i.1) && Accommodates(j, j.0) && Accommodates(j, j.1);

if i.1 < j.0 then

assert (-1 <= i.1 - j.0 <= 1) == (i.1+1 == j.0);

i.1+1 == j.0

else if j.1 < i.0 then

assert (-1 <= j.1 - i.0 <= 1) == (j.1+1 == i.0);

j.1+1 == i.0

else

var k0 := Max(i.0, j.0);

assert Accommodates(i, k0) && Accommodates(j, k0);

true

}operate Accommodates(r: IntRange, i: int): bool

{

r.0 <= i && i <= r.1

}

operate Max(a: int, b: int): int

{

if a < b then b else a

}

Some factors of curiosity:

`Contact`

is just not a ghost. In different phrases, we are able to use it in each common code and validation code.- The
`assert`

statements assist Dafny show that the common code meets the mathematical`ensures`

assertion. - For effectivity, the Dafny prover validates the within of a
`methodology`

individually from its outdoors. Solely the`ensures`

(and the yet-to-be-seen,`requires`

) statements cross this border. In distinction to a`methodology`

, a Dafny`operate`

is clear to the validator. (I consider it as inlining code with respect to validation.)

With ideas equivalent to `ValidSeq`

and `Contact`

outlined, we subsequent transfer onto specifying what our algorithm is meant to do.

Finally, I need to show that my particular Rust algorithm for inserting a brand new vary right into a `RangeSetBlaze`

is appropriate. Earlier than we try this, nevertheless, let’s outline what “correct” range insertion is.

`methodology InternalAdd(xs: seq<NeIntRange>, a: IntRange) returns (rs: seq<NeIntRange>)`

requires ValidSeq(xs)

ensures ValidSeq(rs)

ensures SeqToSet(rs) == SeqToSet(xs) + RangeToSet(a)

{

if IsEmpty(a)

{

rs := xs;

}

else

{

assume false; // cheat for now

}

}

This says that `InternalAdd`

is a technique that takes `xs`

, a sequence of non-empty integer ranges, and `a`

, an integer vary (that might be empty). The strategy outputs `rs`

, a brand new sequence of non-empty integer ranges.

We have to say that `xs`

and `rs`

have to be sorted and disjoint. That’s simply executed with the `ValidSeq`

’s within the `requires`

and first `ensures`

.

We additionally have to say that `rs`

comprises the suitable stuff. Is this tough? It’s not. We simply say that the set of integers in `rs`

should equal the set of integers in `xs`

unioned with the integers in `a`

.

Apart: In Dafny, “+” when utilized to units is “union”.

The set of integers in a spread is:

`ghost operate RangeToSet(pair: IntRange): set<int>`

{

set i {:autotriggers false} | pair.0 <= i <= pair.1 :: i

}

And the set of integers in a sequence of non-empty ranges might be outline inductively (that’s, recursively):

`ghost operate SeqToSet(sequence: seq<NeIntRange>): set<int>`

decreases |sequence|

requires ValidSeq(sequence)

{

if |sequence| == 0 then {}

else if |sequence| == 1 then RangeToSet(sequence[0])

else RangeToSet(sequence[0]) + SeqToSet(sequence[1..])

}

Some factors of curiosity:

- The road:
`assume false; // cheat for now`

makes validation work even if it really shouldn’t. We use it as a brief placeholder. - We make
`RangeToSet`

and`SeqToSet`

*ghosts*to cease us from utilizing them in common code. We make them*features*(as a substitute of*strategies*) - As a result of Dafny is aware of loads about creating and manipulating units and sequences, we frequently revenue by utilizing units and sequences in our specification.
- Even when our common code makes use of loops as a substitute of recursion, our validation code will typically use recursive-like induction.
- The
`{:autotriggers false}`

pertains to avoiding a warning message. For extra info see this Stack Overflow answer by Prof. James Wilcox.

We now have a proper specification of `InternalAdd`

. I discover this specification quick and intuitive. However what in case you need assistance determining a specification or different Dafny code?

The principle discussion board for Dafny questions is Stack Overflow. To my shock, I truly acquired a lot helpful assist there.

I like to recommend beginning your query’s title with “Dafny:”. Additionally, be sure you tag your query with `dafny`

and, maybe, `formal-verification`

.

Apart: On the location, you’ll be able to see my 11 questions and Divyanshu Ranjan’s 48 Dafny-related answers.

As an open-source undertaking on GitHub, Dafny additionally hosts GitHub Discussions and Points.

The Dafny group is small however appears smitten by serving to customers and enhancing the undertaking.

With assist at hand, we should subsequent discover an algorithm that meets the specification.

As a novice to formal verification, I made a decision to postpone work on the actual `internal_add`

utilized in my Rust code. As an alternative, I began work on an `InternalAdd`

algorithm that I hoped can be simpler to validate. I ended up with this:

`methodology InternalAdd(xs: seq<NeIntRange>, a: IntRange) returns (rs: seq<NeIntRange>)`

requires ValidSeq(xs)

ensures ValidSeq(rs)

ensures SeqToSet(rs) == SeqToSet(xs) + RangeToSet(a)

{

if IsEmpty(a)

{

rs := xs;

}

else

{

var notTouching, merged := PartitionAndMerge(xs, a);

var indexAfter := NoTouchIndexAfter(notTouching, merged);

rs := InsertAt(notTouching, [merged], indexAfter);

}

}

The concept is that if vary `a`

is empty, we return the enter sequence unchanged. In any other case, we divide the work into three steps, which we are able to validate independently. Step one, `PartitionAndMerge,`

returns:

`notTouching`

, a sequence of ranges that don’t contact vary`a`

, and`merged`

, a single vary created from`a`

and every little thing it touches.

Right here is an instance enter and output:

`InternalAdd`

subsequent finds the place to insert `merged`

and, lastly, inserts it.

Right here is the code for `PartitionAndMerge:`

`methodology PartitionAndMerge(xs: seq<NeIntRange>, a: NeIntRange) returns (notTouching: seq<NeIntRange>, merged: NeIntRange)`

requires ValidSeq(xs)ensures ValidSeq(notTouching)

ensures RangeToSet(merged) >= RangeToSet(a)

ensures forall vary | vary in notTouching :: !Contact(vary, merged)

ensures SeqToSet(xs) + RangeToSet(a) == SeqToSet(notTouching) + RangeToSet(merged)

{

// Cut up into touching and never touching seqs

var touching: seq<NeIntRange>;

touching, notTouching := Partition(a, xs);

// Merge the touching seq into one vary with our authentic vary

merged := UnionSeq(a, touching);

}

This says that `PartitionAndMerge`

`requires`

that `xs`

be a legitimate sequence of non-empty integer ranges and that `a`

be a non-empty integer vary. It ensures that `nonTouching`

is one other legitimate sequence of non-empty integer ranges. It ensures that the integers in vary `merged`

are a superset of these in vary `a`

. It ensures that no vary in `notTouching`

touches vary `merged`

. And eventually, it ensures that the integers in `xs`

and `a`

are precisely the identical because the integers in `notTouching`

and `merged`

.

`PartitionAndMerge`

additionally divides the work, this time into two steps (`Partition `

and `UnionSeq`

) that may be validated independently. These steps proceed to subdivide their work. The place does it finish? Let’s take a look at one instance.

The strategy `UnionSeq`

calls `UnionRange`

which merges two ranges:

`operate UnionRange(x: IntRange, y: IntRange): IntRange`

requires IsEmpty(x) || IsEmpty(y) || Contact(x, y)

ensures RangeToSet(x) + RangeToSet(y) == RangeToSet(UnionRange(x,y))

{

if IsEmpty(x) then y

else if IsEmpty(y) then x

else (Min(x.0, y.0), Max(x.1, y.1))

}

The `UnionRange`

code handles the empty circumstances after which returns the minimal bounding vary. (The minimal bounding vary is the vary from the smaller of the 2 begins to the bigger of the 2 ends.) However how can this be appropriate? Generally, a minimal bounding vary of two ranges would possibly embody further integers. We’d get one thing greater than the union of the inputs, like so:

The code is appropriate as a result of it `requires`

that the 2 enter ranges contact or are empty. This `ensures`

that the union of the integers in vary `x`

with the integers in vary `y`

are precisely the integers within the output vary.

At compile time, Dafny proves this operate appropriate. Past that, it proves that every little thing that calls this operate supplies inputs which might be empty or touching.

I consider this as a generalization of Rust’s borrow checker. At compile-time Rust checks that we’re protected from many reminiscence errors. At compile time, verification methods, equivalent to Dafny, can show virtually arbitrary properties. In fact, as we’re seeing, this means comes at the price of complexity.

The full code for this verified algorithm is about 200 strains, organized into a few dozen strategies and features.

This rule exhibits that we are able to confirm *an* algorithm for `InternalAdd`

, however it’s not the algorithm I utilized in Rust. We are going to flip to that subsequent.

[ad_2]