CM Under the Hood

Felipe Teixeira
Oct 12, 2022
5 min read

Updated: Oct 13, 2022

Howdy friends!

Things have been wild around here and, as you can see, that has hit the plans and cadence for the Blog quite heavily.

I'm back now!

I know I haven't wrapped up the series about Closures, but I'd like to kickstart another one right now. This one will cover the internals and the behind-the-scenes of CET, which I don't think many people look at very frequently.

I intend to use the series to explore how our code in CM affects the instructions and the code compiled and used inside the platform.

To do that, we will start small and slow. We will get there - together!

TAC, ISNS, and ASM

The first thing we need to understand before diving deeper is what we will be looking at. Let's explore two pieces of information: CET's TAC and the generated ASM.

TAC and ISNS

Here's Wikipedia's description for TAC:

In computer science, three-address code... is an intermediate code used by optimizing compilers to aid in the implementation of code-improving transformations. Each TAC instruction has at most three operands and is typically a combination of assignment and a binary operator.
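To make the "at most three operands" idea concrete, here is a minimal Python sketch that lowers a small arithmetic expression into generic three-address instructions. It is not CET's TAC format; the function name `to_tac` and the temporaries `t1`, `t2` are made up for illustration.

```python
import ast


def to_tac(expression: str) -> list[str]:
    """Lower an arithmetic expression into generic three-address instructions."""
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}
    code: list[str] = []

    def lower(node: ast.expr) -> str:
        # Names and constants are already valid operands.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        # A binary operation becomes one instruction: temp = left op right.
        if isinstance(node, ast.BinOp):
            left = lower(node.left)
            right = lower(node.right)
            temp = f"t{len(code) + 1}"
            code.append(f"{temp} = {left} {ops[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    lower(ast.parse(expression, mode="eval").body)
    return code


if __name__ == "__main__":
    for instruction in to_tac("a + b * c"):
        print(instruction)
```

Each emitted line has one result and at most two inputs, which is the shape you will recognize when reading the instructions CET generates.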

To see the ISNSs for a given TAC instruction generated for any CM Function, you can do the following:

ASM

And here's what Wikipedia says about ASM:

In computer programming, assembly language ... is any low-level programming language with a very strong correspondence between the instructions in the language and the architecture's machine code instructions.

Similarly, to inspect the ASM of a CM function, all you need to do is:

Function-Scoped const

Maybe you were unaware you could do this, but CM allows you to declare variables as const.

As it turns out, Function-scoped const is not the same as regular const. But fear not, there's hope, and I've got you covered.

First, let's see what actually happens when you declare a Function-scoped const.

Reading through ISNS and TAC

Let's explore a very simple function that leverages the const keyword.

To understand both const and static declarations in TAC form, let's first go through the instructions that both of them emit when compiled.

The first instruction we see here is the prologue, and the last one is the epilogue. They are common Assembly sequences that prepare the stack and registers for whatever is about to happen or, in the case of the epilogue, clean up after what has happened. More on them here.

Next, we have some calls to the CM compiler's method "plnTracing" (line 5); the value it returns is stored in line 6 and used for comparison in line 7. If the result of "plnTracing" is true, then we get a subsequent call to "printScrRef"; otherwise, we jump to line 11, which is where the label for "noTrace" is.

Inside the "noTrace" label (from lines 12 through 16) is the actual body of our method.
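The control flow of that shared part can be modeled in a few lines of Python. This is only a sketch of the branching described above, not CET's implementation: `emit_pln`, its parameters, and the format of the trace line are all hypothetical.

```python
def emit_pln(text, source_ref, tracing_enabled, out):
    """Model of the shared prelude: check tracing, then run the body."""
    # Corresponds to the "plnTracing" check: is tracing switched on?
    if tracing_enabled:
        # Corresponds to "printScrRef": report where the call came from.
        out.append(f"{source_ref} - trace pln/pnn source")
    # "noTrace" label: the actual body of the method starts here.
    out.append(text)
    return out


if __name__ == "__main__":
    print(emit_pln("A", "function.cm:98:14", True, []))
    print(emit_pln("A", "function.cm:98:14", False, []))
```

When tracing is off, the check falls straight through to the body, which is why the extra instructions cost very little in the normal case.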

Const

For the "const" declaration, we can see that the value "something" is pushed straight onto the stack, and immediately after that, we call "printStr" in line 14, using that value as a parameter (by calling "pop"). Last, in line 16, we just print a new line, and we're done.

Static

On the other hand, if we change just the keyword from "const" to "static," the generated TAC changes considerably. Let's take a look:

It's pretty clear that the beginning of the code is already completely different. Let's go line by line and understand what's going on.

In line 4, we're initializing and setting the value of a local (0x4) to be the value (if any) of the static value allocated into a specific spot in memory (0x00007FF48839E858).

Next, we check if the value in the local int is equal to zero, and if it is NOT (meaning it was previously set to 1 in line 6), we jump to line 8 (to the "skip" label section).

This, my friends, is the main difference between the two listings. Effectively, the check that starts in line 4 prevents the value "something" (stored at 0x00007FF488A03928) from being assigned again. It ensures that lines 6 and 7 are executed only once per session.

The section between lines 8 and 15 is exactly like the one we previously explored in the code for const (related to plnTracing).

The other difference we can see here is that in line 18, we are no longer pushing the literal value "something"; instead, we're pushing whatever value is stored at 0x00007FF488A03928, which was put there in line 7.

Some extra exploration

The findings we've made in this article help clarify why you can't assign a value to a function-scoped const. The reason is that the compiler doesn't generate a reference in memory that you could use. Instead, the generated code just pushes the value forward.

Conversely, you can change the value of a function-scoped static variable, although I think it behaves in a way that is not obvious if you don't understand what's going on behind the scenes.

Look at this code and try to guess the output once we run it.

Here it is:

run("c:/CetDev/release13.0/base/cm/compiler/test/runtime/function.cm");
First try...
something
something else
Second try...
something else
something else
cm>

Cool, huh?!

What happened here results from that first check, performed before the actual function body runs. During the first execution of foo, both values get the chance to be stored and retrieved.

The first allocation for "something" is done because there's nothing in memory for the str "myValue." The assignment for "something else" is performed through a simple call to "set" (set [static local 0x0.. str] "something else").

Things get a little bit funky the second time around. That's because the value for "myValue" is not restored to "something." This, as you might have guessed, is because the check right before the beginning of the method body tells us that there's already a value stored in memory for that particular address. Hence, the code moves along without changing anything.
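The same behavior can be reproduced with a small Python model of the guard check and the storage slot. This is a sketch under assumptions: the body of foo is inferred from the output above (print the static, assign "something else", print again), and `make_foo` and the `slot` dictionary are hypothetical stand-ins for the two memory locations discussed earlier.

```python
def make_foo():
    """Build a foo() that models a function-scoped static named myValue."""
    # Two pieces of state outlive each call: a guard flag and the value slot.
    slot = {"initialized": False, "value": None}

    def foo():
        printed = []
        # Guard check: run the initializer only while the flag is unset.
        if not slot["initialized"]:
            slot["initialized"] = True
            slot["value"] = "something"
        # "skip" label: the function body reads from the slot, not a literal.
        printed.append(slot["value"])
        # A plain "set" on the slot; nothing resets it on the next call.
        slot["value"] = "something else"
        printed.append(slot["value"])
        return printed

    return foo


if __name__ == "__main__":
    foo = make_foo()
    print("First try...", *foo(), sep="\n")
    print("Second try...", *foo(), sep="\n")
```

Because the initializer sits behind the guard, the second call never sees "something" again; it starts from whatever the previous call left in the slot.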

plnTracing

The check performed in line 7 enables you to trace all the existing calls to "pln" inside your code. Here's an example of that (look at line 8).

Will output:

run("c:/CetDev/release13.0/base/cm/compiler/test/runtime/function.cm");
ERROR c:\CetDev\release13.0\base\cm\compiler\test\runtime\function.cm:98:14 -  trace pln/pnn source
A
ERROR c:\CetDev\release13.0\base\cm\compiler\test\runtime\function.cm:99:14 -  trace pln/pnn source
B
ERROR c:\CetDev\release13.0\base\cm\compiler\test\runtime\function.cm:100:14 -  trace pln/pnn source
C
cm>

Wrapping up

Although not very obvious, the differences between function-scoped const and static cannot be ignored. Hopefully, with what you've learned today, you'll be able to choose the best approach for each case and, if necessary, explore other alternatives by looking into the generated TAC.