Sunday, July 25, 2010

The cost of accessing object fields (part 1)

The common sense make us believe that adding more code and/or more variables leads to bigger programs. Looking at the generated code of one example in the previous post, the addition of one variable made the executable smaller. This occurs because fpc is smart enough to reuse registers (in this case eax).

This week, while fixing one Lazarus bug i noticed the following pattern in the generated code of method TDBEdit.DataChange:


movl 12(%ebx),%eax
movl 24(%eax),%eax


Basically this is the code to access FDataLink.Field property (the first instruction get the FDataLink address and the second get the Field address). So what would happen if this field was "buffered" in a TField local variable?

Before:


procedure TDBEdit.DataChange(Sender: TObject);
begin
if FDataLink.Field <> nil then begin
Alignment := FDataLink.Field.Alignment;
[..]


After:


procedure TDBEdit.DataChange(Sender: TObject);
var
DataLinkField: TField;
begin
DataLinkField := FDataLink.Field;
if DataLinkField <> nil then begin
Alignment := DataLinkField.Alignment;
[..]


This simple change lead to these differences.

As expected the code became smaller but two things surprised me:
  • There's no increase in the temporary memory allocated
  • The variable assignment did cost nothing (not even one instruction)

    The above test was done with a "clone" of TDBEdit.DataChange in a test project. To make sure there are no confounding factors i also tested with the original code to confirm the differences. Notice that in this case, although the code is also smaller, the addition of the variable increase the temporary memory allocated as well the variable assignment requires one extra instruction. Bad.

    But there was one last hope: compile LCL with -O2 option (i assumed that LCL was already compiled with that optimization turned on). Seems that my assumption was wrong. The -O2 option did the trick: the same result as before.

    In the next post i will play with a few more scenarios.

    And remember: don't forget to put -O2 in LCL build options when doing a release, it makes difference.
  • Wednesday, July 21, 2010

    Condition check versus a type map

    Often, the programmer is faced with the need to translate from one type to another, e.g., given a boolean variable return a corresponding integer value. As a real world example see a piece of Lazarus code:


    if NewWordWrap then
    gtk_text_view_set_wrap_mode(AGtkTextView, GTK_WRAP_WORD)
    else
    gtk_text_view_set_wrap_mode(AGtkTextView, GTK_WRAP_NONE);

    NewWordWrap is a boolean variable, but the gtk function expects an integer. To translate from type to another a condition check is done.

    Another way to handle this would be creating a map array with the type to be translated. Lazarus also has an example of this technique:


    const
    WidgetDirection : array[boolean] of longint = (GTK_TEXT_DIR_LTR, GTK_TEXT_DIR_RTL);
    [..]
    gtk_widget_set_direction(AGtkWidget, WidgetDirection[UseRightToLeftAlign]);

    Here is the same pattern: UseRightToLeftAlign is a boolean variable and the gtk function expects a integer, but instead of checking for the variable value a boolean to integer map (WidgetDirection) is used.

    While the map approach seems faster because avoids a check, it adds an additional constant. I decided to look at the generated code to see the actual benefits.

    Check the condition code:


    if B then
    DoIt(CONST_1)
    else
    DoIt(CONST_2);

    Map code:


    const
    BoolMap: array[Boolean] of Integer = (CONST_2, CONST_1);

    DoIt(BoolMap[B])

    Here is the generated code. This shows a clear advantage to the map approach. Notice that in this small example the size of executables were the same.

    I also tested a more complex type than boolean: an enumerated.

    Check the condition code:


    case E of
    EnumA: DoIt(CONST_1);
    EnumB: DoIt(CONST_2);
    EnumC: DoIt(CONST_3);
    end;

    Map code:


    const
    EnumMap: array[TMyEnum] of Integer = (CONST_1, CONST_2, CONST_3);

    DoIt(EnumMap[E])

    The result.

    Now with a slight optimized code for the condition check...


    var
    I: Integer;

    case E of
    EnumA: I := CONST_1;
    EnumB: I := CONST_2;
    EnumC: I := CONST_3;
    end;
    DoIt(I);

    ... i got this.