Units
Modules - Concept
- Equivalent to a package in Go
- Contains one or more FSPL source files
- Uniqued by a UUIDv4
- Depends on zero or more other units
- Source files in a module can access functionality of dependencies
Addressing
When compiling source files, depending on a module, etc. an address is used to
refer to a unit, which is a module or file. An addresser is anything that
addresses a unit, the addressee. An addresser can be a module, a user invoking
the compiler, or something else. An address is represented by a string. If the
string ends in .fspl, the address refers to an FSPL source file. If not, the
address refers to a module.
If the address begins with /, ./ or ../, it is interpreted as a path starting
from the filesystem root, the current directory of the addresser, or the
directory above that, respectively. Otherwise, the unit is searched for within
a set of standard or configured paths.
For example, if the search path is /usr/include/fspl, and the address is foo,
then the unit will be located at /usr/include/fspl/foo. If the address is
foo/bar, then the unit will be located at /usr/include/fspl/foo/bar. If there
is an additional directory in the search path, such as
/usr/local/include/fspl, then the unit will be searched for in each one (in
order) until it is found.
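The lookup described above can be sketched in Python. This is only an illustration, not the compiler's implementation: the function names (classify, candidates, locate) are hypothetical, paths are treated as POSIX paths, and relative addresses are assumed to resolve against the directory of whoever is doing the addressing.

```python
import os
import posixpath


def classify(address):
    """An address ending in .fspl names a source file, anything else a module."""
    return "file" if address.endswith(".fspl") else "module"


def candidates(address, addresser_dir, search_paths):
    """List the locations to try for an address, in order."""
    if address.startswith("/"):
        # Rooted at the filesystem root.
        return [posixpath.normpath(address)]
    if address.startswith("./") or address.startswith("../"):
        # Assumption: relative to the directory of the addresser.
        return [posixpath.normpath(posixpath.join(addresser_dir, address))]
    # Otherwise, try each search path in order.
    return [posixpath.join(path, address) for path in search_paths]


def locate(address, addresser_dir, search_paths, exists=os.path.exists):
    """Return the first candidate that exists, or None if the unit is not found."""
    for candidate in candidates(address, addresser_dir, search_paths):
        if exists(candidate):
            return candidate
    return None
```

With the search path from the example, candidates("foo/bar", "/work", ["/usr/include/fspl"]) yields the single location /usr/include/fspl/foo/bar. The exists parameter is there so the search can be exercised without touching the real filesystem.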
There are standard paths that the compiler will search for units in. These are, in order of preference:
$HOME/.local/src/fspl
$HOME/.local/include/fspl
/usr/local/src/fspl
/usr/local/include/fspl
/usr/src/fspl
/usr/include/fspl
On Windows, these are used instead:
%LOCALAPPDATA%\fspl\src
%LOCALAPPDATA%\fspl\include
%ALLUSERSPROFILE%\fspl\src
%ALLUSERSPROFILE%\fspl\include
%ProgramFiles%\fspl\src
%ProgramFiles%\fspl\include
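As a rough sketch, the two lists above could be generated like this. The function name and its parameters are hypothetical, and the environment is passed in explicitly so the result does not depend on the machine running it. How the compiler behaves when a variable is unset is not specified here; this sketch simply raises KeyError.

```python
def standard_search_paths(windows, env):
    """Build the standard search paths in order of preference.

    `env` is a mapping of environment variables (for example, os.environ).
    Within each location, src comes before include.
    """
    kinds = ("src", "include")
    if windows:
        roots = [env["LOCALAPPDATA"], env["ALLUSERSPROFILE"], env["ProgramFiles"]]
        return [root + "\\fspl\\" + kind for root in roots for kind in kinds]
    roots = [env["HOME"] + "/.local", "/usr/local", "/usr"]
    return [root + "/" + kind + "/fspl" for root in roots for kind in kinds]
```

Calling standard_search_paths(False, {"HOME": "/home/u"}) produces the six Unix paths listed above with $HOME expanded to /home/u.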
Files in include directories should not include program code, and should
only define types and external functions and methods, similar to header files in
C. They may have a corresponding shared object file that programs can
dynamically link against.
Files in src directories may contain program code, and may be compiled into
an object file if the user wishes to link them statically. Because of FSPL's
ability to "skim" units (discussed later in this document), files in src may
be used in the same way that files in include are. Thus, src files are
effectively more "complete" versions of include files with extended
capability, and that is why they are searched first.
Uniqueness
Each unit is uniqued by a UUID. Most users will never directly use UUIDs, but they are essential in order to prevent name collisions within the compiler or linker. For modules, the UUID is specified in the metadata file. For other units, the UUID is a UUIDv3 (md5) generated using the zero-UUID as a namespace and the basename of the file (with the extension) as the data.
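A minimal sketch of the UUID derivation for non-module units, using Python's standard uuid module. The function name is hypothetical, and the basename is assumed to be hashed as UTF-8 text, which is what uuid.uuid3 does with a string.

```python
import posixpath
import uuid

# The zero-UUID, used here as the UUIDv3 namespace.
ZERO_UUID = uuid.UUID(int=0)


def file_unit_uuid(path):
    """Derive a UUIDv3 (MD5) from the file's basename, extension included."""
    return uuid.uuid3(ZERO_UUID, posixpath.basename(path))
```

Because only the basename is hashed, file_unit_uuid("/a/bird.fspl") and file_unit_uuid("/b/bird.fspl") return the same UUID.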
When creating a module, a UUID should be randomly generated for it. Keep in mind that altering the UUID of a library will cause programs that used to dynamically load it to no longer function until they are re-compiled. Therefore, UUIDs should only be altered if you are introducing breaking ABI changes to your library. If you are forking an existing module and making changes to it, a similar rule applies: only keep the same UUID if you intend to keep an entirely backwards-compatible ABI.
Built-in entities that can be accessed globally from any module (such as the
String type) are given a zero-UUID, which represents the "global" unit.
Anything that is a part of this unit is accessible from any other unit, without
having to use a nickname to refer to it.
When generating code, top-level entities must be named like this if their link name was not specified manually:
<uuid>::<name>
Where <uuid> is the base64 encoding of the UUID. For example, the built-in
String type would be assigned the following link name:
AAAAAAAAAAAAAAAAAAAAAA==::String
And a type Bird in a lone source file with the name bird.fspl would be:
eT/CnSopFDlFwpDCnSEAThjDsBw=::Bird
Methods are named as follows:
<uuid>::<name>.<method>
Where <uuid> and <name> correspond to the base64 UUID of the unit and the
name of the method's owner type respectively, and <method> corresponds to the
method name.
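The naming scheme for both top-level entities and methods can be sketched as follows. The link_name function is hypothetical, and it assumes the base64 form is the standard padded alphabet applied to the UUID's 16 raw bytes, which is the form the zero-UUID takes in the String example.

```python
import base64
import uuid


def link_name(unit_id, name, method=None):
    """Build <uuid>::<name>, or <uuid>::<name>.<method> for a method."""
    # Assumption: standard base64 with padding over the 16 raw UUID bytes.
    prefix = base64.b64encode(unit_id.bytes).decode("ascii")
    result = prefix + "::" + name
    if method is not None:
        result += "." + method
    return result
```

For the global unit, link_name(uuid.UUID(int=0), "String") produces a prefix of 22 "A" characters followed by "==", then "::String".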
Module Structure
Each module is represented by a directory, which contains source files along
with a metadata file called fspl.mod. The metadata file is of the form:
<file> -> <UUIDv4> <directive>*
<directive> -> <dependency>
<dependency> -> "+" <stringLiteral> [<ident>]
<UUIDv4> -> <stringLiteral>
Metadata files only make use of tokens defined in the FSPL lexer, and are designed to make use of the same parsing and lexing infrastructure used to parse and tokenize source files. A sample metadata file might look like:
'5a8353f8-cad8-4604-be60-29a2575996bc'
+ 'io'
+ '../io' customIo
The UUID is represented as a string, and so are addresses. When depending on a unit, it may be "nicknamed" by supplying an identifier after the address. This changes how the unit is referred to within the module.
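To make the grammar concrete, here is a small stand-alone parser sketch in Python. It is not the FSPL lexer: the token rules (single-quoted strings without escape sequences, ASCII identifiers) are simplifying assumptions, and the names Dependency, tokenize and parse_metadata are hypothetical.

```python
import re
import uuid
from typing import NamedTuple, Optional

# Simplified tokens: 'string', +, identifier. Escape sequences are not handled.
TOKEN = re.compile(
    r"\s*(?:'(?P<string>[^']*)'|(?P<plus>\+)|(?P<ident>[A-Za-z_][A-Za-z0-9_]*))"
)


class Dependency(NamedTuple):
    address: str
    nickname: Optional[str]


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("unexpected input at offset %d" % pos)
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def parse_metadata(text):
    """Parse <file> -> <UUIDv4> <directive>* into (UUID, [Dependency])."""
    tokens = tokenize(text)
    if not tokens or tokens[0][0] != "string":
        raise ValueError("expected a UUID string literal")
    module_id = uuid.UUID(tokens[0][1])
    dependencies, i = [], 1
    while i < len(tokens):
        if tokens[i][0] != "plus":
            raise ValueError("expected '+' to start a dependency")
        if i + 1 >= len(tokens) or tokens[i + 1][0] != "string":
            raise ValueError("expected an address string after '+'")
        address, nickname = tokens[i + 1][1], None
        i += 2
        # The nickname is optional: [<ident>]
        if i < len(tokens) and tokens[i][0] == "ident":
            nickname = tokens[i][1]
            i += 1
        dependencies.append(Dependency(address, nickname))
    return module_id, dependencies
```

Run on the sample metadata file, this returns the module UUID and two dependencies: io with no nickname, and ../io nicknamed customIo.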
Referencing Units
Compiled by itself, an FSPL source file has no access to other units. However, when compiling a module as a whole, all source files within the module have access to units depended on by the module's metadata file. Note that no actual data or code is imported into the module from the units it depends on, because all methods and functions defined within them are automatically turned into prototypes. The module must be linked either statically or dynamically to the unit's object code after compilation. This is why the FSPL compiler outputs object files by default.
FSPL source files may reference functions or types from dependencies by
prefixing them with a unit name and a double colon (::), like this:
reader: io::Reader = x
data: *:Byte = io::[readAll reader]
The name of a unit depends on the associated dependency directive used in the module metadata file. If a nickname is listed, then that is used as the unit name. Otherwise, the unit name is the basename of the address, which is normalized and formatted into a valid identifier by the following rules:
- If the name contains at least one dot, the last dot and everything after it are removed
- All non-alphabetical and non-numeric characters are removed, and any alphabetical characters that were directly after them are converted to uppercase
- All numeric digits at the start of the string are removed
- The first character is converted to lowercase
For example:
100-bottles-of-glue_test
Picture.jpg
Just a straight up sentence
Would become:
bottlesOfGlueTest
picture
justAStraightUpSentence
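The four rules can be sketched as a short Python function. The function names are hypothetical, and treating "alphabetical" and "numeric" as ASCII-only is an assumption rather than something the rules state.

```python
import posixpath


def unit_name(basename):
    """Normalize a basename into a unit name by applying the rules in order."""
    # Rule 1: drop the last dot and everything after it.
    if "." in basename:
        basename = basename[: basename.rindex(".")]
    # Rule 2: remove other characters, uppercasing a letter that directly follows.
    out, upper_next = [], False
    for ch in basename:
        if ch.isascii() and ch.isalnum():
            out.append(ch.upper() if upper_next and ch.isalpha() else ch)
            upper_next = False
        else:
            upper_next = True
    name = "".join(out)
    # Rule 3: strip leading digits.
    name = name.lstrip("0123456789")
    # Rule 4: lowercase the first character.
    return name[:1].lower() + name[1:]


def dependency_unit_name(address, nickname=None):
    """A nickname wins; otherwise the address's basename is normalized."""
    if nickname is not None:
        return nickname
    return unit_name(posixpath.basename(address.rstrip("/")))
```

Applied to the three inputs above, unit_name returns bottlesOfGlueTest, picture and justAStraightUpSentence. An input such as "123" normalizes to an empty string, which is the case the compiler rejects.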
If the unit name is still not a valid identifier or is empty, the compiler will refuse to process the module and it is up to the user to either nickname the unit, or change the unit's basename to something workable.
The compiler will also refuse to process the module if one or more units end up
with the same unit name. However, a function or a variable may have the same
name as a unit because units are only ever used within the context of their own
special syntax (::).
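The two rejection cases (an invalid or empty unit name, and two units sharing a name) could be checked along these lines. This is a sketch only: check_unit_names is a hypothetical helper, and the identifier pattern is an assumption about what FSPL accepts.

```python
import re
from collections import defaultdict

# Assumed shape of a valid identifier; the real FSPL rule may differ.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def check_unit_names(names):
    """`names` maps each dependency address to its resolved unit name.

    Raises ValueError for an invalid or empty name, or for a name shared
    by more than one address.
    """
    by_name = defaultdict(list)
    for address, name in names.items():
        if not IDENT.match(name):
            raise ValueError("%r has no usable unit name; nickname it" % address)
        by_name[name].append(address)
    for name, addresses in by_name.items():
        if len(addresses) > 1:
            raise ValueError("unit name %r is shared by %s" % (name, addresses))
```

With the sample metadata file, {"io": "io", "../io": "customIo"} passes, whereas dropping the nickname would leave both addresses named io and raise an error.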
Future Work
Addresses do not necessarily have to refer to units. They could also refer to
arbitrary blobs to embed into a compiled program, similarly to how Go's embed
system works. There of course would need to be a distinction between depending
on units and embedding data, because someone might want to embed an FSPL source
file. Thus, there would need to be a separate metadata file directive, possibly
starting with an ! or something like that.