We don't need no Yoda's syntax

Yoda's syntax

There is an urban myth in the programmers' community that the so-called "Yoda's syntax" performs better because it saves an operation cycle when checking an object for nullity.

Let's not digress on the benefit that saving even *one* operation cycle in the Common Intermediate Language (CIL) would bring. Let's rather see whether this urban myth is justified and, if not, debunk it by showing that in the end, that is, in the CIL code, the order in which the members of an equality comparison are written makes no real difference.

So, first of all, what is "Yoda's syntax"? If you have ever watched Star Wars, Yoda is the character who speaks in object-subject-verb word order ("The apple is green" in Yodic would be "Green, the apple is"; try it yourself on the Yoda-speak generator).

In C#, the equivalent of a Yoda expression is to ask whether the object is equal to the subject, rather than the natural English way round. In practice, the "if" expression for checking whether an object is null would typically be written as follows in plain English order:

if (obj == null)
{
    Console.WriteLine("obj == null");
}

Whereas, by adopting Yoda's syntax, this expression would be:

if (null == obj)
{
    Console.WriteLine("null == obj");
}

Does it annoy you to read the expression this way? Possibly. Let me give another example. Instead of checking for null, let's check for a number. Instead of:

if (count == 5)
{
    Console.WriteLine("count == 5");
}

Let's speak Yoda's logic:

if (5 == count)
{
    Console.WriteLine("count == 5");
}

Sounds quite unnatural, doesn't it?

So, why would someone decide to adopt Yoda's syntax? Well, besides worshipping Star Wars, the official reason is that the compiler, when encountering the expression "null == obj", would supposedly optimise the CIL code and skip loading a value for the null member onto the stack, as a null value is no value, after all. In other words, instead of comparing two memory locations holding "null" and "obj", the compiler would be smart enough to understand that null is not a value, and would save the operation cycle needed to load one member of the comparison.

This justification is irrelevant when comparing to numbers or valued objects, so at best it would apply only to comparisons with null. But I'll tell you more: the C# compiler is even smarter, and makes the order in which the members of an equality (or inequality) are written irrelevant.

Disassembling Yoda

We can prove this by looking at the intermediate language generated by the C# compiler for the following simple statements.

using System;

public class Yoda
{
    public void TestYodaSyntax(string obj)
    {
        // Yoda order: the null literal comes first
        if (null == obj)
        {
            Console.WriteLine("null == obj");
        }

        // Natural order: the variable comes first
        if (obj == null)
        {
            Console.WriteLine("obj == null");
        }
    }
}

Testing Yoda Syntax

To see the CIL for this C# code, we need a disassembler. The "IL Disassembler" (ILDASM, ildasm.exe) that ships with Visual Studio is perfect for the purpose.

This is the result of disassembling the assembly generated by the C# compiler for the TestYodaSyntax method in the Yoda class in the example above.

.method public hidebysig instance void TestYodaSyntax(string obj) cil managed
{
  // Code size       50 (0x32)
  .maxstack  2
  .locals init ([0] bool CS$4$0000)
  IL_0000:  nop
  IL_0001:  ldnull
  IL_0002:  ldarg.1
  IL_0003:  ceq
  IL_0005:  ldc.i4.0
  IL_0006:  ceq
  IL_0008:  stloc.0
  IL_0009:  ldloc.0
  IL_000a:  brtrue.s   IL_0019
  IL_000c:  nop
  IL_000d:  ldstr      "null == obj"
  IL_0012:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_0017:  nop
  IL_0018:  nop
  IL_0019:  ldarg.1
  IL_001a:  ldnull
  IL_001b:  ceq
  IL_001d:  ldc.i4.0
  IL_001e:  ceq
  IL_0020:  stloc.0
  IL_0021:  ldloc.0
  IL_0022:  brtrue.s   IL_0031
  IL_0024:  nop
  IL_0025:  ldstr      "obj == null"
  IL_002a:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_002f:  nop
  IL_0030:  nop
  IL_0031:  ret
} // end of method Yoda::TestYodaSyntax

The lines of our investigation are the two comparison sequences: IL_0001 to IL_0012 for the first "if" block, and IL_0019 to IL_002a for the second.

But before we start analysing the CIL code, let's take a step back and try to understand better what CIL actually is and how it works (well, at least at a very high level, for the purpose of our demonstration).

The Common Intermediate Language (CIL for friends) is the lowest-level human-readable programming language used by the .NET Framework before it gets translated into native code for the platform on which it is running, or is executed by a CLI-compliant (Common Language Infrastructure) virtual machine. CIL itself is a platform-independent instruction set. CIL was initially known as Microsoft Intermediate Language or MSIL, but because of the standardization of C# and the Common Language Infrastructure, the bytecode is now officially known as CIL. CIL is object-oriented and entirely stack-based; that means that data are pushed on a stack instead of pulled from registers.

Let's analyse lines IL_0001 to IL_0012 of the CIL code (the labels are byte offsets, written in hexadecimal) and let's assume that we're passing a null reference for the obj string in the input parameter; these lines correspond to the first "if" block, i.e. if (null == obj).

IL_0001: Load a null reference onto the stack.

IL_0002: Load argument 1 of this method (obj, which is null in our example) onto the stack.

IL_0003: Pop two values off the stack, compare them, and push 1 (one) onto the stack if they are equal, else push 0 (zero).

In our example both values are null, so after the first three lines of CIL the stack holds a single value: 1.

IL_0005: Push 0 (zero) onto the stack. We now need to check the result of the equality comparison in the previous line, so we verify whether that result is 0 or 1 by comparing the value already on the stack to 0 (CIL does not have a compare-non-equal instruction).

IL_0006: Compare the top two values on the stack again, which are now 1 and 0.

These two values are different, so we expect the "ceq" comparison to push a 0 (zero) onto the stack.

IL_0008: Pop a value from the stack into local variable 0. We are removing the last value pushed onto the stack (a stack is a LIFO, Last In First Out, structure).

IL_0009: Load local variable 0 back onto the stack, so the stack again holds a single value: 0.

IL_000a: Jump to line IL_0019 if the value on top of the stack is non-zero (true). This is the condition to enter or skip the "if" block; in our example the value on top of the stack is zero, so no jump happens, the "if" block is executed, and the flow carries on to the next line.

IL_000d: Load the "null == obj" string onto the stack.

IL_0012: Call the .NET method System.Console::WriteLine, passing as input the value on top of the stack; this is just the print to the console.

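"nop" stands for "no operation". The compiler emits these instructions in debug builds so that breakpoints can be placed on source lines that generate no code of their own, such as opening and closing braces. "nop" instructions are removed when compiling in release mode.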
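To make the walkthrough above easier to follow, here is a minimal sketch that replays the same stack operations in plain C#. It is not how the CLR executes CIL; the class name CeqStackDemo and the helper methods EntersIfBlock and Ceq are hypothetical names introduced only for this illustration, and the evaluation stack is approximated with a Stack<object>.

```csharp
using System;
using System.Collections.Generic;

// Illustration only: a toy replay of the CIL sequence discussed above.
public static class CeqStackDemo
{
    // loadNullFirst = true mimics "null == obj" (ldnull, then ldarg.1);
    // loadNullFirst = false mimics "obj == null" (ldarg.1, then ldnull).
    // Returns true when the body of the "if" block would be executed.
    public static bool EntersIfBlock(object obj, bool loadNullFirst)
    {
        var stack = new Stack<object>();

        if (loadNullFirst)
        {
            stack.Push(null);   // ldnull
            stack.Push(obj);    // ldarg.1
        }
        else
        {
            stack.Push(obj);    // ldarg.1
            stack.Push(null);   // ldnull
        }

        Ceq(stack);                     // ceq: 1 if the two references are equal
        stack.Push(0);                  // ldc.i4.0
        Ceq(stack);                     // ceq: inverts the previous result
        object local0 = stack.Pop();    // stloc.0
        stack.Push(local0);             // ldloc.0

        // brtrue.s: jump past the "if" block when the top value is non-zero
        bool jump = (int)stack.Pop() != 0;
        return !jump;
    }

    // Pops two values and pushes 1 if they are equal, else 0.
    private static void Ceq(Stack<object> stack)
    {
        object b = stack.Pop();
        object a = stack.Pop();
        bool equal = (a is int x && b is int y) ? x == y : ReferenceEquals(a, b);
        stack.Push(equal ? 1 : 0);
    }

    public static void Main()
    {
        foreach (string obj in new[] { null, "World, hello!" })
        {
            Console.WriteLine("null == obj enters the block: " + EntersIfBlock(obj, true));
            Console.WriteLine("obj == null enters the block: " + EntersIfBlock(obj, false));
        }
    }
}
```

For each input, the two orderings go through the same number of steps and report the same answer; only the order of the first two pushes changes.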

Right, eleven lines of CIL code... but wait a minute, didn't we say initially that the alleged benefit of Yoda's syntax is that the compiler spots the null value upfront and skips loading it? If that were true, I would expect the ldnull instruction at line IL_0001 to be missing. It is still there.

Let's now have a look at the second block of code, the one related to the second "if" statement. You can easily see that its eleven lines of CIL are exactly the same instructions as in the first block, the only difference being that lines IL_0019 and IL_001a are swapped compared to their equivalents IL_0001 and IL_0002. In the second "if" statement, obj is pushed onto the stack first and the null reference second, exactly as we would expect from the compiler turning the expression if (obj == null) into CIL code.

In short, there is absolutely no difference in the operations executed at lower level for comparing an object to null or a null reference to an object.

Prevent assignment errors

Another point made by Yoda's supporters is that using this unnatural order of members would prevent accidental assignments of a null reference to the object, instead of the expected comparison in the "if" statement.

Basically, they are saying that it is easier to write accidentally something like:

if (obj = null)
{
    Console.WriteLine("obj == null");
}

Please note the single "=" sign. This is an assignment operator instead of a comparison operator: the obj variable would be set to null before the condition is evaluated, so the outcome of the "if" statement would no longer depend on the original value of obj. Therefore, by inverting the position of null and obj, Yoda's fans claim that this kind of error can never occur, as it is not possible to assign an object to a null reference:

if (null = obj)
{
    Console.WriteLine("null == obj");
}

When this kind of typo happens, the compiler immediately reports that the left-hand side of an assignment must be a variable, property or indexer, and refuses to compile the above code.

Good point. But in this case too, the C# compiler is smarter than you might think: it also rejects the otherwise valid assignment "obj = null" when it appears as the condition of an "if" statement, reporting the error "Cannot implicitly convert type 'string' to 'bool'" (string is the type of the assignment expression, and bool is the type expected by the "if" statement).
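The two typos discussed above can be tried out with a small sketch like the following. The class name AssignmentTypoDemo and the method Check are hypothetical names used only for this illustration; the two offending lines are left commented out, with the compiler error each one is expected to produce when uncommented, so that the rest of the program still compiles.

```csharp
using System;

// Illustration only: both single-"=" typos are rejected at compile time
// when obj is a string, whichever side null is written on.
public static class AssignmentTypoDemo
{
    public static string Check(string obj)
    {
        // Uncommenting either line below stops the build:
        // if (null = obj) { }  // error CS0131: the left-hand side of an assignment
        //                      // must be a variable, property or indexer
        // if (obj = null) { }  // error CS0029: cannot implicitly convert
        //                      // type 'string' to 'bool'

        if (obj == null)
        {
            return "obj == null";
        }

        return "obj != null";
    }

    public static void Main()
    {
        Console.WriteLine(Check(null));
        Console.WriteLine(Check("World, hello!"));
    }
}
```

Since both typos fail to compile for a string, the Yoda ordering adds no extra protection in this example.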

Sorry Yoda, you really are not a good programmer!

World, hello!

To invoke the TestYodaSyntax method that we have analysed so far, simply use a console application. Obviously, test it with the most classic example in any programming language: Hello World! Or should I say "World, hello!"? :)

public class Program
{
    static void Main(string[] args)
    {
        Yoda yoda = new Yoda();
        yoda.TestYodaSyntax("World, hello!");
    }
}

Comments

Paul

On 26 Oct 2020 at 11:10

I thought the reason for the Yoda syntax was to catch incorrect assignments so if (5 = count) { Console.WriteLine("count == 5"); } would produce a compiler warning whereas if ( count = 5 ) { Console.WriteLine("count == 5"); } would compile to valid code and may not be so easily caught in the code review. In fact writing it the other way around is so alien that it makes you think why you are doing it so the chances of a single '=' being written is reduced.

Neil

On 11 Oct 2020 at 17:56

Thanks for the interesting analysis. I also learnt a bit more about CIL, but I am curious about one thing. What is the purpose of the lines IL_0008 and IL_0009? They seem to just store the value from the stack into local variable 0, and then immediately reverse that operation by loading the value back into the stack, so why do they need to be there?

Source Code

Project Name: YodaUrbanMyth