Puc Tutorial 1: Basic Programming Model and Concepts

1. Introduction

Puc is syntactically similar to C language. Different from sequential execution in C programs, Puc has a unique programming abstraction - unifying myriads of compute nodes to form a big virtual machine and presenting programmers the view of a single computer where thousands of tasks executing their corresponding code concurrently in a large, unified, and snapshotted memory space.

2. Programming Model

This section illustrates the construct and basic concepts of a Puc program.

2.1. Main function

Similar to C, the entrance for execution of a Puc program is inside a main function. However, a Puc program should have a main function with no parameters or return value.

void main()
{
    runner add1()
           using a11, a12, s1, ec1;

    runner add2()
           using a21, a22, s2, ec2;

    runner check_add()
           using s, s1, s2
           watching ec1, ec2;

    commit;
}

2.2. Task

The basic execution unit in Puc is defined as a task. To achieve massive parallelism, many task instances are deployed inside one or across a few physical machines to execute in parallel. A task's execution is defined inside a function with a void return type and parameters of variables or array segments defined in Puc's mannual. Another major difference from C's function definition is that, in defining a function for a task, you need to explicitly specify either of two key words commit or abort at the end, in order to commit or abort the change of memory locations specified inside the using statement when creating a task.

void add1()
{
    ec1 = 0;
    a11 = 1;
    a12 = 2;
    s1 = a11 + a12;
    ec1 = ec1 + 1;
    commit;
}

void add2()
{
    ec2 = 0;
    a21 = 3;
    a22 = 4;
    s2 = a21 + a22;
    ec2 = ec2 + 1;
    commit;
}

The syntax of creating a task is much like that for invoking a function in C, but you need to add a keyword runner at the front.

    runner add1()
        using a11, a12, s1, ec1;

    runner add2()
        using a21, a22, s2, ec2;

main() is a simple task without any memory range in its heap. It can be considered and created as a task without the using statement. This means that main() cannot use any global variable.

In C, a function can access any global variables. However, due to the unique organization of memory in Puc, a task cannot access a global variable acquiescently, unless a using statement is declared with it to permit its access to the global variable. In the above example, s1 can only be accessed by task executing add1(). Similarly, s2 can only be accessed by task executing add2().

2.3. Depending task

Different from the execution order followed by the syntatic order in sequential programs, add1() and add2() in the above program are both executed in parallel by various task instances. As a consequence, the execution order between both tasks is undefined. In other words, add1() may be excuted earlier than add2(), following the same order defined in the source code. Or counter-intuitively, add1() can be excuted later than add2(). However, we cannot proceed with summing until we get the result from both add1() and add2(). In other words, it is necessary to enforce that the execution of a Puc program is in an order following the dependency among all of its tasks.

In Puc, depending task instances are created to enforce the execution order between one task and its depending tasks. A depending task itself is a special task. A depending task's execution will be triggered only after its depended tasks committed successfully and some specified memory location has been modified during a commit. These specified memory locations are denoted as watched variables in a depending task. That is to say, a depending task will be triggered when at least one of its watched variables are modified. To create a depending task, we also use the runner keyword, but we need to add one additional watching statement on the set of memory locations to be watched. The parameter to be watched can be either a simple type, a pointer to a simple type or an array segment of a simple type.

    runner check_add()
           using s, s1, s2
           watching ec1, ec2;
![Figure 2-1](./tutorial1_pic1.png)

As is illustrated in the figure above, check_add() is a task depending on tasks add1() and add2(). With watched variables specified, a depending task instance for check_add() will be started when either ec1 or ec2 are modified and committed.

Inside the definition of a depending task, different from a normal task commit and abort only commit or abort the current instance of the depending task. That is, the depending task can be triggered again if the watched ranges are changed and committed by other task. Programmers need to explicitly specify either of the two key words commitd or abortd to commit or abort the current depending task and stop it from being triggered.

void check_add()
{
    if (ec1 == 1) {
        if (ec2 == 1) {
           s = s1 + s2;
           commitd;
        }
    }
    abort;
}

The check_add() will commit and stop being triggered if both ec1 and ec2 are 1s and the sum operation is done, and abort but keep "watching" if either is not 1.

In this way, although the order of commit by both tasks are undefined, check_add() is guaranteed to be executed only when the value of both ec1 and ec2 change to 1, equivalent to the circumstance that both tasks have committed their tasks.

2.4. Standalone

standalone is a special keyword in Puc. It is for declaration of global variables which are in the Shared Region (SR). When a global variable is declared standalone, it always monopolizes one or many memory pages. Since the basic unit of memory space management is a page, in this way, using standalone can avoid false sharing on global variables and reduce conflicts of commit on different global variables in multiple concurrent tasks.

standalone long s1;
standalone long s2;
standalone long s;
standalone long a11;
standalone long a12;
standalone long a21;
standalone long a22;
standalone long ec1;
standalone long ec2;

In this program, if task add1() modifies s1 and task add2() modifies s2, there will be no conflict between the commit of both tasks.

3. Appendix

The Puc source code for the illustrated example in this tutorial is shown as follows.

standalone long s1;
standalone long s2;
standalone long s;
standalone long a11;
standalone long a12;
standalone long a21;
standalone long a22;
standalone long ec1;
standalone long ec2;

void add1()
{
    ec1 = 0;
    a11 = 1;
    a12 = 2;
    s1 = a11 + a12;
    ec1 = ec1 + 1;
    commit;
}

void add2()
{
    ec2 = 0;
    a21 = 3;
    a22 = 4;
    s2 = a21 + a22;
    ec2 = ec2 + 1;
    commit;
}

void check_add()
{
    if (ec1 == 1) {
        if (ec2 == 1) {
           s = s1 + s2;
           commitd;
        }
    }
    abort;
}

void main()
{
    runner add1()
           using a11, a12, s1, ec1;

    runner add2()
           using a21, a22, s2, ec2;

    runner check_add()
           using s, s1, s2
           watching ec1, ec2;

    commit;
}

4. What's Next

We have learned the basic model and concepts, including task and depending task, in Puc programming. To have a deeper and broader understanding of the underlying execution framework in D-thinker, please refer to Puc Programmer Guide.