Luciogi

Compare Two Versions (Zig)

Table of Contents



Published on

Background

When I wrote VBR build script in bash, it was working awesome, after sometime maintaining VBR build script, I felt that bash code not readable, as I have to visit that after 1-2 month.

It was also painful to debug that code, due extensive use of array. Its code confuses when array is passed to function and function that return array due to that I have added print statements which are commented by default. Whenever I need to debug, I un-comment those lines and bang my head in script to debug issues.

Between this time, I got interest in low level programming, I chose Zig lang, because it is being developed, and it claims to be simple and fast like C, but with safety guardrails. I spend sometime learning syntax and basic of manual memory management, now I wanted to put that knowledge into some project. So I decided to rewrite VBR build script in Zig.

Why I need to compare versions?

In short I wanted to learn, how versions are compared. In Bash script, I also wrote version comparsion logic from scratch.

How I compared versions in Bash?

Let say we have two strings of versions v1 and v2. v1 = “1.0” v2 = “1.1” I split both strings by delimiter ("."), which gives integer in string representations then loop over both string and compare character by character.

Difficult case is when both versions have unequal length e.g. v1 = “2.1” v2 = “2.1.15”

After removing delimiter,

v1 = “21” v2 = “2115”

then which ever string length is less, append zero in that string. How many zeros to append? v2.length - v1.length i.e 4 - 2 = 2 v1 = “2100” v2 = “2115”

Now we can loop over both strings of equal length, and compare character by character to make decision

This approch was working welling, but it was slow due to multiple loop.

Optimizing the approach

I started looking for version comparison solutions on internet, I found that, this problem statement is part of “Two Pointer” in Data Structure and Algorithms.

After reviewing my code(bash script), appending zeros was wasting resources.

I was complicating problem, by not converting number stored in string to actual integer type "2" -> 2, which I remember may took me to appending zero crappy solution.

and that simple thing I could not figure out in past. because now seeing my code, it also have similar solution in code which I find on internet, but I could not translate that correctly into bash implementation

Solution I found on internet

Note: Following code should be treated as psuedo code

v1 = "2.11"
v2 = "2.12"

Create two counters which will go through each version string

idx1 = 0
idx2 = 0

Loop over strings until we reach end

while ( idx1 < length of v1 OR idx2 < length of v2 ) {
    ...
}

Do following inside loop, Create variables to string integer representation of revision (part which is separated by “.”)

rev1 = 0
rev2 = 0

loop again until we reach “.” and importantly end of string, this is way to get number out of string

while ( idx1 < length of v1 AND v1[idx1] != '.' ) {
    rev1 = (rev1 * 10) + int( v1[idx1] )
    idx1 += 1 # we have consumed 1 character of v1 string
}

Decoding statement rev1 = (rev1 * 10) + int( v1[idx1] )

v1 has substring “11” above statement converts that at "11" into integer/number i.e 11.

See string as array representation

array --> ["1", "1"]
index -->   0    1

rev1 is 0 go inside loop

rev1    = (0 * 10) + 1
        = 0 + 1
        = 1

again going into loop now rev1 is 1

rev1    = (1 * 10) + 1 
        = 10 + 1
        = 11

Here 10 seems like magic. Indeed it is.

Similarily, add another loop for extracting rev2 from v2.

while ( idx2 < length of v2 AND v2[idx2] != '.' ) {
    rev2 = (rev2 * 10) + int( v2[idx2] )
    idx2 += 1 # we have consumed 1 character of v2 string
}

At this point you have two integers for comparison, your fast processor can help, in comparing these.

if (rev1 < rev2) {
    print("v1 is less than v2")
    break # get out of loop
}
else if (rev1 > rev2) {
    print("v1 is greater than v2")
    break # get out of loop
}

Last thing dont forget to increment indicies at end point outer loop

while(...) {
    ...
    while(...) {...}
    while(...) {...}
    ...
    idx1 += 1
    idx2 += 1
}

Zig implementation

Source code: https://codeberg.org/Luciogi/version-comp-zig/src/branch/main/src/root.zig

const std = @import("std");

pub const VersionResult = enum(u2) {
    GREATER,
    LESS,
    EQUAL,
};

const Delimiter = enum(u8) {
    DOT = '.',
    UNDERSCORE = '_',
    HYPHEN = '-',
};

fn isDelimiter(char: u8) bool {
    if (char == @intFromEnum(Delimiter.DOT) or
        char == @intFromEnum(Delimiter.UNDERSCORE) or
        char == @intFromEnum(Delimiter.HYPHEN)
    ) {
        return true;
    }
    return false;
}

fn charToDigit(char: u8) u8 {
    return switch(char) {
        '0'...'9' => char - '0',
        else => unreachable,
    };
}

pub fn compare(v1: []const u8, v2: []const u8) VersionResult {
    std.debug.print("v1: {s}", .{v1});
    std.debug.print("\n", .{});
    std.debug.print("v2: {s}", .{v2});
    std.debug.print("\n", .{});

    const len1 = v1.len;
    const len2 = v2.len;

    var idx1: u32 = 0;
    var idx2: u32 = 0;

    while( idx1 < len1 or idx2 < len2 ) : ({
        idx1 += 1;
        idx2 += 1;
    }) {
        var rev1: u32 = 0;
        var rev2: u32 = 0;
        while( idx1 < len1 and ! isDelimiter(v1[idx1]) ) {
            rev1 = rev1 * 10 + charToDigit(v1[idx1]);
            idx1 += 1;
        }
        while( idx2 < len2 and ! isDelimiter(v2[idx2]) ) {
            rev2 = rev2 * 10 + charToDigit(v2[idx2]);
            idx2 += 1;
        }

        if (rev1 < rev2) return VersionResult.LESS
        else if (rev1 > rev2) return VersionResult.GREATER;
    }

    return VersionResult.EQUAL;
}



Related Posts

Tags: