Perl 6 Information Model

This describes the Information Model of Perl 6, as set forth in the original Apocalypses and worked over for many years since.

This is a purely theoretical description. It aims for economy of description, i.e. it is “elegant”: it minimises the number of rules and concepts introduced, and avoids special cases. However, any real implementation that actually worked this way would be rather inefficient. Instead, real implementations will have many special-case rules and a far more complex design, so that common operations run much more efficiently. As long as an implementation gets the right answer in the end, it is “correct”.

Knowing the formal model is helpful in understanding the language as a whole. What’s good for the formal description (economy of expression and lack of special cases) can be good for your brain too. The full richness of many Perl 6 features emerges from this design, so keeping it in mind will help you understand why many details naturally are the way they are.

You also need to know this if you want to do something that relies on more subtle, less common interactions of features. However, be warned that if you stray from the well-worn idioms you may run into non-conformities in the implementation, or cause the implementation to drag out the full formal framework and run slower than the common optimized cases.

Overview

The best way to approach it is with an illustration.

[Figure: Illustration of the basic information model]

Here you can see that a symbol table (shown in green) is the entry point into the collection of objects, and that objects may refer to other objects.

Some of the objects are “collections” which have a central role in the language. But at the highest level, everything is just objects, and objects can contain native values like numbers and strings or refer to other objects.

Primitive Object References

The red squares in the drawing are “cells” which somehow refer to other objects. This referencing is shown with arrows. What these cells actually are and how they work is below the level of the Perl 6 language, and the details are up to the implementation. To Perl 6, they are opaque, accessible only through built-in methods that are supplied for that purpose.

These “cells” exist within built-in types that serve as containers, and as “slots” in classes generated using the native object system (called p6opaque). The built-in container classes have methods called FETCH and STORE, and objects made using p6opaque have private accessors generated for their slots.

You can think of these cells and primitive references as some kind of machine language pointer. A real implementation can achieve the effect however it likes. How the cells and the chasing of the arrows work is completely unknown and unknowable to a standard Perl 6 program. Of course, each implementation can choose to offer extensions that expose the details of how it chose to do it; that is how “built ins” and external hooks can be written for that implementation.
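As a rough illustration only, here is a model of a built-in container written in Python rather than Perl 6. The class name OpaqueContainer is hypothetical; only the method names FETCH and STORE come from the text above. The point is that the cell is reached solely through those two methods.

```python
class OpaqueContainer:
    """Hypothetical model of a built-in container that holds exactly one cell."""

    def __init__(self, target):
        # The double underscore triggers Python name mangling, standing in for
        # "the cell itself is below the level of the language".
        self.__cell = target

    def FETCH(self):
        """Follow the arrow: return the object the cell currently refers to."""
        return self.__cell

    def STORE(self, target):
        """Re-point the arrow at a different object."""
        self.__cell = target


box = OpaqueContainer("Hello")
before = box.FETCH()
box.STORE("Goodbye")
after = box.FETCH()
print(before, after)
```

A real implementation is free to do something entirely different, as long as FETCH and STORE behave the same way from the outside.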

Symbol Tables

Because objects are linked together, once you have an object you can get to other objects. But where do you get that first one from? In general, a symbol table provides object lookup by name, and is used by the compiler to get an object when its name is mentioned in the source code.

At its simplest, a symbol table maps strings to objects, just like a Hash. Not shown here are additional details such as other information associated with the name and the relationship between different symbol tables.
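In its simplest form this can be sketched with an ordinary mapping. The following is a Python model, not Perl 6; the names symbol_table and lookup are hypothetical, and the extra information associated with each name is deliberately left out.

```python
# The objects themselves exist independently of any name.
array_1 = [42, 99, -1]
greeting = "Hello"

# Hypothetical symbol table: a plain mapping from name to object.
symbol_table = {"@a": array_1, "$y": greeting}


def lookup(name):
    """Model of what the compiler does when a name appears in the source."""
    return symbol_table[name]


found = lookup("@a")
print(found is array_1)
```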

Binding

The cell located in the symbol table for a particular name is called a binding.

For example, @a is bound to the Array instance labeled #1 in the drawing. Names are often bound to objects that serve as containers for other objects, but that is not necessary. Names using the “$” sigil can be bound to any object.
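A small Python model (an assumption-laden sketch, not Perl 6) may help separate the binding from the object it points to. The second name @b is hypothetical and does not appear in the drawing; it is there only to show that rebinding one name changes a cell in the symbol table and nothing else.

```python
symbol_table = {}

array_1 = [42, 99, -1]
symbol_table["@a"] = array_1              # bind @a to Array #1
symbol_table["@b"] = symbol_table["@a"]   # hypothetical second name, same object

same_object = symbol_table["@a"] is symbol_table["@b"]

# Rebinding @b replaces only the cell in the symbol table;
# Array #1 and the binding of @a are untouched.
symbol_table["@b"] = [0]
a_after_rebind = symbol_table["@a"]
print(same_object, a_after_rebind)
```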

Containers

Normally in a programming language, variables are used to associate a particular value (an object) with a name known to the compiler. For example, you would casually think about "Hello" being stored in $x, and three individual values (42, 99, and -1) stored in @a. However, in the drawing you can see that this casual usage leaves out the existence of the containers.

You might, at first, consider it overly pedantic to point out that @a is bound to an Array (object #1) and Array #1 holds 3 objects of type Int, which contain the values 42, 99, and -1 respectively.

But when you are confused as to why you can’t push onto something, or when you are trying to alias a formal parameter to the caller’s argument so that you can assign to it, knowing these details makes it all clear. So, know these details, but abbreviate in common use when you don’t need that level of detail.
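The “why can’t I push” case can be imitated in Python, with the caveat that this is only an analogy: a Python list plays the part of Array #1, and a tuple plays the part of a group of values with no mutable container around them. Both variable names are made up for the example.

```python
with_container = [42, 99, -1]      # a list stands in for Array #1
without_container = (42, 99, -1)   # a tuple stands in for values with no container

with_container.append(7)           # "push" works: the container can change

try:
    without_container.append(7)    # there is nothing mutable to push onto
    push_failed = False
except AttributeError:
    push_failed = True

print(with_container, push_failed)
```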

Item Containers

In the drawing above, you can see that $x is bound to a Scalar that contains a Str, while $y is bound directly to a Str without the use of a container. Containers are not required for variables named using the “$” sigil. When they are used, they function as a container-of-1. In Perl 6, value types such as Int and Str are immutable. The object holding "Hello" will never change to represent a different string. Meanwhile, container types are mutable, holding different objects at different times.

A container may have its contents changed. So a single-item container confers this changeability onto a single item. $y cannot refer to a different Str without rebinding the name. But $x can have its contents changed by changing what’s in the container.
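The two morphologies can be modelled side by side. This is a Python sketch under stated assumptions, not Perl 6: the Scalar class here is a hypothetical stand-in for the real container, and the symbol table is again a plain dictionary.

```python
class Scalar:
    """Hypothetical container-of-1 with the FETCH/STORE methods named earlier."""

    def __init__(self, item):
        self._item = item

    def FETCH(self):
        return self._item

    def STORE(self, item):
        self._item = item


symbol_table = {
    "$x": Scalar("Hello"),   # bound to a container that holds a Str
    "$y": "Hello",           # bound directly to a Str
}

# Changing $x: the binding stays put, the container's contents change.
x_container = symbol_table["$x"]
x_container.STORE("World")
x_binding_unchanged = symbol_table["$x"] is x_container

# Changing $y: the only option is to rebind the name to another Str.
symbol_table["$y"] = "World"

print(symbol_table["$x"].FETCH(), symbol_table["$y"], x_binding_unchanged)
```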

So what’s the point, you ask? The item container will provide for lvalues and aliasing so that a function can change the caller’s objects. In fact, assignment is implemented for containers (including those single-item containers), not for value objects. For variables that are mentioned directly in the source code, you could use := instead of =, but when you want to alias another variable in such a way that you can assign to it, the difference is apparent. From an information model point of view, they are all just slots.
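To see why aliasing needs the container, consider this Python model (again a sketch, not Perl 6). Binding is modelled as storing the same object under a second name, and assignment as a call to STORE. The name $alias and the Scalar class are hypothetical.

```python
class Scalar:
    """Hypothetical container-of-1; assignment is modelled as STORE."""

    def __init__(self, item):
        self._item = item

    def FETCH(self):
        return self._item

    def STORE(self, item):
        self._item = item


symbol_table = {"$x": Scalar("Just Another")}

# Model of binding (:=): the new name shares the very same container.
symbol_table["$alias"] = symbol_table["$x"]

# Model of assignment (=) through the alias: the container's contents change,
# so the change is visible through both names.
symbol_table["$alias"].STORE("Perl Hacker")
seen_through_x = symbol_table["$x"].FETCH()
print(seen_through_x)
```

Had $x been bound directly to the Str, the alias would share only an immutable value, and there would be nothing to assign into.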

From a language point of view, the item containers keep out of the way, staying hidden but providing their capabilities to the information model when needed. Just as you don’t think about the existence of the container when you contemplate that $x holds “Hello”, so it is within the Perl 6 language. In fact, you’d have a hard time telling the difference between the different morphologies of $x and $y. $x.print(); will output “Hello”, and print substr($x,Chars(2),Chars(2)); will output “ll”. The $x variable seems to be a Str, so where’s the Scalar?
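One way to picture the container keeping out of the way is a dispatcher that looks through it. The following Python model is an assumption about how that could be expressed, not a description of any implementation; call_method and Scalar are hypothetical names.

```python
class Scalar:
    """Hypothetical container-of-1."""

    def __init__(self, item):
        self._item = item

    def FETCH(self):
        return self._item

    def STORE(self, item):
        self._item = item


def call_method(bound_object, method_name, *args):
    """Call a method on the contained item if given a Scalar, else on the object itself."""
    if isinstance(bound_object, Scalar):
        target = bound_object.FETCH()
    else:
        target = bound_object
    return getattr(target, method_name)(*args)


x = Scalar("Hello")   # like $x: container holding a Str
y = "Hello"           # like $y: bound directly to a Str

via_container = call_method(x, "upper")
direct = call_method(y, "upper")
print(via_container, direct)
```

Both calls give the same result, which is why the two bindings are hard to tell apart in ordinary use.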

Almost every time you use an item container, it uses the contained item instead. So you generally won’t notice it unless you need to draw upon the features it provides. You can’t assign to $y using the = operator, for example. More interestingly, you can’t pass $y to a function that wants to treat the parameter as rw.
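The rw case can be sketched in the same Python model. The routine make_greeting_louder is invented for the example, and raising TypeError is just one assumed way of modelling “you can’t pass $y”; the actual failure mode in Perl 6 is up to the language and implementation.

```python
class Scalar:
    """Hypothetical container-of-1."""

    def __init__(self, item):
        self._item = item

    def FETCH(self):
        return self._item

    def STORE(self, item):
        self._item = item


def make_greeting_louder(param):
    """Hypothetical routine with an rw-style parameter: it needs a container to STORE into."""
    if not isinstance(param, Scalar):
        raise TypeError("rw parameter needs a container")
    param.STORE(param.FETCH() + "!")


x = Scalar("Hello")   # like $x
y = "Hello"           # like $y, no container

make_greeting_louder(x)       # the caller's container is changed in place

try:
    make_greeting_louder(y)   # nothing to assign into
    y_accepted = True
except TypeError:
    y_accepted = False

print(x.FETCH(), y, y_accepted)
```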

To continue

To continue, read the next page which concerns lvalues, and their role in parameter passing.

TODO: More on the nearly-invisible item container on another page.

TODO: read about Containers.

Or, see the Table of Contents for other subjects.

Also, here is a drawing on assignment that was prepared in September and not used anywhere.

Orthodoxy

Technical details checked as of May 2009.

This is exactly as described in the Synopses as of S02<168>, S03<166>, and S12<82>. In particular:

From S02:

$x may be bound to any object, including any object that can be bound to any other sigil.

Perl variables have two associated types: their "value type" and their "implementation type". (More generally, any container has an implementation type, including subroutines and modules.) The value type is stored as its of property, while the implementation type of the container is just the object type of the container itself.

   my $spot is Scalar;             # this is the default

Plus descriptions of Mutable and Immutable types, Value types, Implementation types, and many details that come into play to a lesser degree.

From S03:

A new form of assignment is present in Perl 6, called binding, used in place of typeglob assignment. It is performed with the := operator. Instead of replacing the value in a container like normal assignment, it replaces the container itself. For instance:

   my $x = 'Just Another';
   my $y := $x;
   $y = 'Perl Hacker';

From S12:

Method calls on mutable scalars always go to the object contained in the scalar (autoboxing value types as necessary):

   $result = $object.doit();
   $length = "mystring".codes;

Method calls on non-scalar variables just calls the Array, Hash or Code object bound to the variable:

   $elems = @array.elems;
   @keys  = %hash.keys;
   $sig   = &sub.signature;

Use the prefix VAR macro on a scalar variable to get at its underlying Scalar object:

   if VAR($scalar).readonly {...}

This series of detailed descriptions with illustrations and use cases may “explain everything”, or may turn up shortcomings that need to be addressed. Furthermore, newer changes to the specification may come back to influence some of this basic stuff, and people with some experience at implementation will have insight to issues that are not in the specification. Finally, Larry is still having ideas.

So not only do I expect things to change, but I see this as being preparatory to discussions that effect such changes.