C++: The Basics - eps1.18_Scope | embeddingchris.com

Hello and welcome. Hey, there is something we should talk about. First of all, sit down calmly. It’s like this…

… Now that we can break our programs into smaller units using functions, it’s time we had a serious talk about the scope of variables. It is very important to know when a variable exists, who can see or use it, and where it is not valid.

You will notice this, at the latest, with your first "was not declared in this scope" error.

So far we have ignored this topic. But it is extremely important and belongs to the foundation of your programming knowledge. If you don’t pay attention to the scope, it can make your whole application unusable. It is a bit complex and not so easy to describe because of its abstract character.

Nevertheless, I will try to illustrate scope as tangibly as possible. I waited until now because what we have learned in the meantime, particularly about pointers and memory addresses, makes scope easier to understand.

Scope and visibility

For a long time I thought about the scope of variables and tried to get a picture of it that is both clear and understandable. But it is not so easy to make it tangible. If you look it up in the literature, it is mostly described only through its effects and rules.

The scope of a variable starts at its declaration and ends at the end of the block in which it was declared.

Do you remember what a block is? It is an area within the program code that is delimited by curly braces {}. Think for example of the block of a loop or of a function.

// listing1: scope block

{
  int anInt;
}

Outside the block the variable is not valid. There the compiler does not recognize the name at all. This is the visibility of the variable.

We have already seen that blocks can be nested. A declared variable is also valid in all inner blocks. But this does not work the other way round, because the validity of the variable ends with the closing curly brace } of its block.

listing2 and listing3 show how the compiler sees a variable from different places in the code.

// listing2: in scope

#include <iostream>

int main()
{
  {
    int anInt = 7;

    std::cout << "I can see anInt(" << anInt << ")" << std::endl;

    {
      std::cout << "I can still see anInt(" << anInt << ")" << std::endl;  
    }
  }

  return 0;
}

# Output

I can see anInt(7)
I can still see anInt(7)

// listing3: not in scope

#include <iostream>

int main()
{
  {
    int anInt = 7;
  }
  
  // will not compile; anInt is out of scope
  std::cout << "I can not see anInt(" << anInt << ")" << std::endl;

  return 0;
}

# Output

In function 'int main()':
error: 'anInt' was not declared in this scope
std::cout << "I can not see anInt(" << anInt << ")" << std::endl;

Local and global variables

The variable anInt in the examples is a local variable. It exists only within its scope and is visible only there. Outside of it, it cannot be seen. Nesting is different: the inner block can see the local variables of its enclosing block (see listing2).

But be careful! The name of the variable plays a decisive role in visibility. Visibility can be restricted by declaring a new variable with the same name in the inner block. From that declaration on, the outer variable is invisible inside the inner block.

This is connected to the way local variables are handled in memory. When a scope is entered, its local data is typically placed in a special memory area, the so-called stack. The stack works on the last in, first out principle: the data placed on it last is the first to be removed again. When the scope is left, all of its local data on the stack is released. A local variable therefore exists only within its block.

This has the advantage that we do not have to take care of memory management at the beginning and end of a block. It also means that variables with the same name can be declared in different scopes, each visible only in its own. If a nested block and its enclosing block both declare the same name, the inner block sees its own variable from the point of its declaration on, while the outer block sees only its own. You can see this well in listing4.

// listing4: variable name in scope

#include <iostream>

int main()
{
  int anInt = 3;

  {
    std::cout << "I can see anInt(" << anInt << ")" << std::endl;

    int anInt = 7;

    std::cout << "I can still see anInt(" << anInt << "), right?" << std::endl;
  }

  std::cout << "I am sure anInt(" << anInt << ") is still the same" << std::endl;

  return 0;
}

# Output

I can see anInt(3)
I can still see anInt(7), right?
I am sure anInt(3) is still the same

Now you know what to look out for so that your local variables are visible in the current block. However, if you have a value that you would like to use again and again in different scopes, it would be easier if the variable could be seen from everywhere. So not only locally, but globally visible.

For this purpose C++ lets you create global variables. You declare them outside of all functions, and from that point on they can be seen from everywhere below.

// listing5: global variable

#include <iostream>

int globalInt = 3;

int main()
{
  {
    int localInt = 5;

    std::cout << "I can see globalInt(" << globalInt << ")" << std::endl;
    std::cout << "I can see localInt("  << localInt  << ")" << std::endl;
 
    int globalInt = 7;

    std::cout << "I can still see globalInt(" << globalInt << "), right?" << std::endl;
  }
  
  std::cout << "I can see globalInt(" << globalInt << ")" << std::endl;
  // std::cout << "I can not see localInt("  << localInt  << ")" << std::endl;
  
  return 0;
}

# Output

I can see globalInt(3)
I can see localInt(5)
I can still see globalInt(7), right?
I can see globalInt(3)

Global variables can therefore be read from any location and values can be assigned to them.

Very practical. Why shouldn’t I use them more often? It saves time after all, I can create all variables bundled in one place, and no information is lost when leaving a block.

Sounds tempting at first, and you will sometimes see it in older code, but global variables have their disadvantages: they can be changed from anywhere by anyone. So you are never absolutely sure whether a global variable carries the expected value or whether it was changed somewhere else without your knowledge. Global variables are therefore unpredictable, and the risk they carry grows with larger projects, with several programmers, and with multithreading (parallel execution of functions).

See more thanks to namespaces

Global variables are subject to the same hiding rule: they become invisible if a local variable uses the same name. However, you can counteract this with namespaces.

We have already become familiar with the std namespace from the C++ standard library. Now we will create our own namespace.

A namespace can only be defined outside of functions, so not inside your main function. Using the keyword namespace followed by a name of your choice, you create a block {} that holds all variables and functions that should carry this name as a prefix.

In listing6 you can see very nicely how the visibility of globalInt is increased. Feel free to compare it with listing5.

// listing6: namespace variable

#include <iostream>

namespace myNamespace{
  int globalInt = 5;
}

int main()
{
  {
    int localInt = 5;

    std::cout << "I can see globalInt(" << myNamespace::globalInt << ")" << std::endl;
    std::cout << "I can see localInt("  << localInt  << ")" << std::endl;

    int globalInt = 7;

    std::cout << "And I can see another globalInt(" << globalInt << "), declared in this scope" << std::endl;
    std::cout << "Meanwhile, I can still see globalInt(" << myNamespace::globalInt << ") with its namespace" << std::endl;
  }

  return 0;
}

# Output

I can see globalInt(5)
I can see localInt(5)
And I can see another globalInt(7), declared in this scope
Meanwhile, I can still see globalInt(5) with its namespace

Namespaces are particularly interesting when you make your code available to others. Names quickly end up being used more than once, but thanks to the namespace your names can still be addressed unambiguously from every scope.

Functions, references, pointers and their scope

Now it’s time to unpack your knowledge about functions, references and pointers. We’ll now take a closer look at scope in functions.

A function has its own scope, bounded by curly braces {}, has local variables, and has access to global variables. So far nothing new. In addition, there are the arguments from the signature of the function, which behave differently depending on their form.

// listing7: function scope

#include <iostream>

int globalInt = 3;

int myFunction(int arg)
{
  int localInt = 5;
  int localSum = localInt + globalInt + arg;

  return localSum;
}

int main()
{
  int value = 5;
  std::cout << "Function returns: " << myFunction(value) << std::endl;
  
  return 0;
}

# Output

Function returns: 13

The function myFunction in listing7 has a total of three local variables localInt, localSum and arg. The special thing about the arg argument is that it is a local copy of the passed value. The reverse is true for localSum. The return statement makes a copy of the local variable and provides it as return value. This way values can be given to and received from a function. The scope is as we expect.

However, if the arguments are references or pointers, it is suddenly not so easy to tell who can get at the variable. Let’s define three functions. The first gets the argument as a copy, the second gets it as a reference, and in the third the argument is a pointer (see listing8).

// listing8: different types of function arguments

#include <iostream>

// function argument as copy
int myFunction(int argument)
{
  // cannot see local variable of main function
  // std::cout << "Value of argument in main: " << value;
  argument = 7; // can only see local copy of it, passed as argument

  std::cout << "Value of argument in my function: "
            << argument << std::endl;
  std::cout << "Address of argument in my function: "
            << &argument << std::endl;

  int localInt = 5;
  int localSum = localInt + argument;

  return localSum;
}

// function argument as reference
int myFunctionReference(int &argument)
{
  // cannot see local variable of main function
  // std::cout << "Value of argument in main: " << value;
  argument = 7; // can only see the reference to it, passed as argument
  // But I can change the value of it as I have access to the memory address

  std::cout << "Value of argument in my function as reference: "
            << argument << std::endl;
  std::cout << "Address of argument in my function as reference: "
            << &argument << std::endl;

  int localInt = 5;
  int localSum = localInt + argument;

  return localSum;
}

// function argument as pointer
int myFunctionPointer(int *argument)
{
  // cannot see local variable of main function
  // std::cout << "Value of argument in main: " << value;
  *argument = 5; // can only see a pointer to it, passed as argument
  // But I can change the value of it as I have access to the memory address

  std::cout << "Value of argument in my function as pointer: "
            << *argument << std::endl;
  std::cout << "Address of argument in my function as pointer: "
            << argument << std::endl;

  int localInt = 5;
  int localSum = localInt + *argument;

  return localSum;
}

int main()
{
  int value = 5;

  std::cout << "Value of argument in main: " << value << std::endl;
  std::cout << "Address of argument in main: "<< &value << std::endl;

  int returnValue = myFunction(value);
  std::cout << "My function returns " << returnValue << std::endl << std::endl;

  std::cout << "Value of argument in main: " << value << std::endl;
  std::cout << "Address of argument in main: "<< &value << std::endl;

  returnValue = myFunctionReference(value);
  std::cout << "My function returns " << returnValue << std::endl << std::endl;

  std::cout << "Value of argument in main: " << value << std::endl;
  std::cout << "Address of argument in main: "<< &value << std::endl;

  returnValue = myFunctionPointer(&value);
  std::cout << "My function returns " << returnValue << std::endl << std::endl;
  std::cout << "Value of argument in main: " << value << std::endl;

  return 0;
}

# Output

Value of argument in main: 5
Address of argument in main: 0x7fff93758140
Value of argument in my function: 7
Address of argument in my function: 0x7fff9375811c
My function returns 12

Value of argument in main: 5
Address of argument in main: 0x7fff93758140
Value of argument in my function as reference: 7
Address of argument in my function as reference: 0x7fff93758140
My function returns 12

Value of argument in main: 7
Address of argument in main: 0x7fff93758140
Value of argument in my function as pointer: 5
Address of argument in my function as pointer: 0x7fff93758140
My function returns 10

Value of argument in main: 5

You see that none of the three functions can see the local variable value from the scope of the main function. However, references and pointers give access to its memory address and can change the value anyway. If you don’t want this, you can make the reference read-only with const int &argument. The same works for pointers: with const int *argument the function can read the value but can no longer write through the pointer.

That sticks

This time we took a closer look at the validity (scope) and the lifetime of variables. In order to properly determine visibility and access to our data in any situation, we learned about different types of scopes and their rules.

With the declaration, we introduce a variable into its scope. A local variable is declared within a block or function. Its scope extends from the declaration to the end of the block. You can recognize a block by the pair of curly braces {} that enclose it. The arguments of a function also behave like local variables.

Variables declared outside of all functions are visible from anywhere and are called global variables. Since they can be changed from anywhere, we are never quite sure of their state. And thus they increase the risk of errors in our code.

A name can be uniquely identified if it is defined in a namespace {} outside of your functions. While its scope also extends from the point of declaration to the end of its block, it can be used elsewhere with the namespace name as a prefix: myNamespace::myVar.

References and pointers do not increase the visibility of your variables, but they give access to the memory address and can thus still change the value.

In general, remember that every variable and every function has to be declared before you can use it, and that variables are destroyed again at the end of their scope. For global objects, the time of destruction is the end of the program. Objects created with new, however, live until you destroy them yourself with delete.

As you can see, there are some things you have to pay attention to when creating your variables and using your functions, so that they are available to you exactly when you need them.

I wish you maximum success!
