# Units

## Modules - Concept

- Equivalent to a package in Go
- Contains one or more FSPL source files
- Uniqued by a UUIDv4
- Depends on zero or more other units
- Source files in a module can access functionality of dependencies

## Addressing

When compiling source files, depending on a module, etc., an *address* is used
to refer to a *unit*, which is a module or file. An *addresser* is anything that
addresses a unit, the *addressee*. An addresser can be a module, a user invoking
the compiler, or something else. An address is represented by a string. If the
string ends in `.fspl`, the address refers to an FSPL source file. If not, the
address refers to a module.

If the address begins with `/`, `./`, or `../`, it is interpreted as a path
starting from the filesystem root, the directory of the addresser, or the
directory above that, respectively. Otherwise, the unit is searched for within a
set of standard or configured paths.

For example, if the search path is `/usr/include/fspl`, and the address is
`foo`, then the unit will be located at `/usr/include/fspl/foo`. If the address
is `foo/bar`, then the unit will be located at `/usr/include/fspl/foo/bar`. If
there is an additional directory in the search path, such as
`/usr/local/include/fspl`, then the unit will be searched for in each one (in
order) until it is found.

There are standard paths that the compiler will search for units in. These are,
in order of preference:

- `$HOME/.local/src/fspl`
- `$HOME/.local/include/fspl`
- `/usr/local/src/fspl`
- `/usr/local/include/fspl`
- `/usr/src/fspl`
- `/usr/include/fspl`

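
The addressing rules described in this section can be sketched in Go as follows.
This is a minimal illustration under stated assumptions, not the compiler's
actual implementation: the names `isSourceFile`, `standardSearchPaths`, and
`candidates` are hypothetical, and the sketch only computes candidate locations
without touching the filesystem. It uses the `path` package, so it assumes
Unix-style, slash-separated addresses.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isSourceFile reports whether an address refers to an FSPL source file
// rather than a module.
func isSourceFile(address string) bool {
	return strings.HasSuffix(address, ".fspl")
}

// standardSearchPaths lists the standard directories in order of preference.
// home stands in for the value of $HOME.
func standardSearchPaths(home string) []string {
	return []string{
		path.Join(home, ".local/src/fspl"),
		path.Join(home, ".local/include/fspl"),
		"/usr/local/src/fspl",
		"/usr/local/include/fspl",
		"/usr/src/fspl",
		"/usr/include/fspl",
	}
}

// candidates returns, in search order, every location where the unit could
// live. baseDir (a hypothetical parameter) is the directory that `./` and
// `../` addresses are resolved against.
func candidates(address, baseDir string, searchPaths []string) []string {
	switch {
	case strings.HasPrefix(address, "/"):
		return []string{path.Clean(address)}
	case strings.HasPrefix(address, "./"), strings.HasPrefix(address, "../"):
		return []string{path.Join(baseDir, address)}
	}
	found := make([]string, 0, len(searchPaths))
	for _, dir := range searchPaths {
		found = append(found, path.Join(dir, address))
	}
	return found
}

func main() {
	searchPaths := standardSearchPaths("/home/user")
	fmt.Println(isSourceFile("bird.fspl"))
	fmt.Println(candidates("../io", "/home/user/project", searchPaths))
	for _, location := range candidates("foo/bar", "/home/user/project", searchPaths) {
		fmt.Println(location)
	}
}
```

A real resolver would then probe each candidate in order and stop at the first
one that exists.
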
Files in `include` directories should *not* include program code, and should
only define types and external functions and methods, similar to header files in
C. They may have a corresponding shared object file that programs can
dynamically link against.

Files in `src` directories *may* contain program code, and may be compiled into
an object file if the user wishes to link them statically. Because of FSPL's
ability to "skim" units (discussed later in this document), files in `src` may
be used in the same way that files in `include` are. Thus, `src` files are
effectively more "complete" versions of `include` files with extended
capability, and that is why they are searched first.

## Uniqueness

Each unit is uniqued by a UUID. Most users will never directly use UUIDs, but
they are essential in order to prevent name collisions within the compiler or
linker. For modules, the UUID is specified in the metadata file. For other
units, the UUID is a UUIDv3 (MD5) generated using the zero-UUID as a namespace
and the basename of the file (with the extension) as the data.

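
A file unit's UUID could be derived as sketched below. Go's standard library has
no UUID package, so this sketch builds the version 3 UUID by hand as RFC 4122
describes it: an MD5 hash over the namespace bytes followed by the name, with
the version and variant bits overwritten. The `UUID` type and the
`fileUnitUUID` function are hypothetical names chosen for illustration, not the
compiler's API.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"path"
)

// UUID is a raw 128-bit identifier. Its zero value is the zero-UUID.
type UUID [16]byte

// fileUnitUUID derives a UUIDv3 from the basename of a source file
// (extension included), using the zero-UUID as the namespace.
func fileUnitUUID(filePath string) UUID {
	var namespace UUID
	data := append(namespace[:], path.Base(filePath)...)
	id := UUID(md5.Sum(data))
	id[6] = id[6]&0x0f | 0x30 // set the version field to 3
	id[8] = id[8]&0x3f | 0x80 // set the RFC 4122 variant bits
	return id
}

// String formats the UUID in the usual 8-4-4-4-12 hexadecimal form.
func (id UUID) String() string {
	return fmt.Sprintf("%x-%x-%x-%x-%x", id[0:4], id[4:6], id[6:8], id[8:10], id[10:16])
}

func main() {
	// Only the basename matters, so both calls print the same UUID.
	fmt.Println(fileUnitUUID("bird.fspl"))
	fmt.Println(fileUnitUUID("/some/project/bird.fspl"))
}
```

Because only the basename is hashed, two lone source files with the same name
in different directories receive the same UUID under this scheme.
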
When creating a module, a UUID should be randomly generated for it. Keep in mind
that altering the UUID of a library will cause programs that used to dynamically
load it to no longer function until they are re-compiled. Therefore, UUIDs
should only be altered if you are introducing breaking ABI changes to your
library. If you are forking an existing module and making changes to it, a
similar rule applies: only keep the same UUID if you intend on keeping an
entirely backwards-compatible ABI.

Built-in entities that can be accessed globally from any module (such as the
`String` type) are given a zero-UUID, which represents the "global" unit.
Anything that is a part of this unit is accessible from any other unit, without
having to use a nickname to refer to it.

When generating code, top-level entities must be named like this if their link
name was not specified manually:

`<uuid>::<name>`

Where `<uuid>` is the base64 encoding of the UUID. For example, the built-in
String type would be assigned the following link name:

`AAAAAAAAAAAAAAAAAAAAAA==::String`

And a type `Bird` in a lone source file with the name `bird.fspl` would be:

`eT/CnSopFDlFwpDCnSEAThjDsBw=::Bird`

Methods are named as follows:

`<uuid>::<name>.<method>`

Where `<uuid>` and `<name>` correspond to the base64 UUID of the unit and the
name of the method's owner type respectively, and `<method>` corresponds to the
method name.

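
The link naming scheme can be sketched in Go as follows. The sketch assumes the
standard padded base64 alphabet (`base64.StdEncoding`), which is consistent with
the `==` padding in the zero-UUID example. The names `linkName` and
`methodLinkName`, and the `length` method used in `main`, are hypothetical and
exist only for illustration.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// UUID is a raw 128-bit identifier. Its zero value is the zero-UUID,
// which represents the global unit.
type UUID [16]byte

// linkName builds the link name of a top-level entity: <uuid>::<name>.
func linkName(unit UUID, name string) string {
	return base64.StdEncoding.EncodeToString(unit[:]) + "::" + name
}

// methodLinkName builds the link name of a method: <uuid>::<name>.<method>.
func methodLinkName(unit UUID, owner, method string) string {
	return linkName(unit, owner) + "." + method
}

func main() {
	var global UUID // the zero-UUID
	fmt.Println(linkName(global, "String"))
	// "length" is a made-up method name, used only to show the format.
	fmt.Println(methodLinkName(global, "String", "length"))
}
```
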
## Module Structure

Each module is represented by a directory, which contains source files along
with a metadata file called `fspl.mod`. The metadata file is of the form:

```
<file> -> <UUIDv4> <directive>*
<directive> -> <dependency>
<dependency> -> "+" <stringLiteral> [<ident>]
<UUIDv4> -> <stringLiteral>
```

Metadata files only make use of tokens defined in the FSPL lexer, and are
designed to make use of the same parsing and lexing infrastructure used to parse
and tokenize source files. A sample metadata file might look like:

```
'5a8353f8-cad8-4604-be60-29a2575996bc'
+ 'io'
+ '../io' customIo
```

The UUID is represented as a string, and so are addresses. When depending on a
unit, it may be "nicknamed" by supplying an identifier after the address. This
changes how the unit is referred to within the module.

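
To make the metadata grammar concrete, here is a rough Go sketch of a parser for
it. It is only an approximation under stated assumptions: it uses its own tiny
tokenizer instead of the FSPL lexer, treats string literals as single-quoted
text with no escape sequences, and does not validate the UUID. The `Metadata`
and `Dependency` types and the `parseMetadata` function are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Dependency is one "+" directive. Nickname is empty when none is given.
type Dependency struct {
	Address  string
	Nickname string
}

// Metadata is the parsed content of a metadata file.
type Metadata struct {
	UUID         string
	Dependencies []Dependency
}

// tokenize splits the input into "+" tokens, identifiers, and string
// literals. String tokens keep their opening quote as a marker.
func tokenize(src string) ([]string, error) {
	var tokens []string
	for i := 0; i < len(src); {
		switch c := src[i]; {
		case c == ' ' || c == '\t' || c == '\n' || c == '\r':
			i++
		case c == '+':
			tokens = append(tokens, "+")
			i++
		case c == '\'':
			end := strings.IndexByte(src[i+1:], '\'')
			if end < 0 {
				return nil, errors.New("unterminated string literal")
			}
			tokens = append(tokens, src[i:i+1+end])
			i += end + 2 // skip both quotes and the content
		default:
			start := i
			for i < len(src) && !strings.ContainsRune(" \t\r\n+'", rune(src[i])) {
				i++
			}
			tokens = append(tokens, src[start:i])
		}
	}
	return tokens, nil
}

// parseMetadata follows the grammar: a UUID string, then zero or more
// dependencies of the form "+" <stringLiteral> [<ident>].
func parseMetadata(src string) (Metadata, error) {
	tokens, err := tokenize(src)
	if err != nil {
		return Metadata{}, err
	}
	isString := func(token string) bool { return strings.HasPrefix(token, "'") }
	if len(tokens) == 0 || !isString(tokens[0]) {
		return Metadata{}, errors.New("expected a UUID string literal")
	}
	meta := Metadata{UUID: tokens[0][1:]}
	for i := 1; i < len(tokens); {
		if tokens[i] != "+" || i+1 >= len(tokens) || !isString(tokens[i+1]) {
			return Metadata{}, errors.New("expected '+' followed by an address string")
		}
		dep := Dependency{Address: tokens[i+1][1:]}
		i += 2
		if i < len(tokens) && tokens[i] != "+" && !isString(tokens[i]) {
			dep.Nickname = tokens[i] // optional nickname identifier
			i++
		}
		meta.Dependencies = append(meta.Dependencies, dep)
	}
	return meta, nil
}

func main() {
	meta, err := parseMetadata("'5a8353f8-cad8-4604-be60-29a2575996bc'\n+ 'io'\n+ '../io' customIo\n")
	fmt.Println(meta, err)
}
```
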
## Referencing Units

Compiled by itself, an FSPL source file has no access to other units. However,
when compiling a module as a whole, all source files within the module have
access to units depended on by the module's metadata file. Note that no actual
data or code is imported into the module from the units it depends on, because
all methods and functions defined within them are automatically turned into
prototypes. The module must be linked either statically or dynamically to the
unit's object code after compilation. This is why the FSPL compiler outputs
object files by default.

FSPL source files may reference functions or types from dependencies by
prefixing them with a unit name and a double colon (`::`), like this:

```
reader: io::Reader = x
data: *:Byte = io::[readAll reader]
```

The name of a unit depends on the associated dependency directive used in the
module metadata file. If a nickname is listed, then that is used as the unit
name. Otherwise, the unit name is the basename of the address, which is
normalized and formatted into a valid identifier by the following rules:

- If the name contains at least one dot, the last dot and everything after it
  are removed
- All non-alphabetical and non-numeric characters are removed, and any
  alphabetical characters that were directly after them are converted to
  uppercase
- All numeric digits at the start of the string are removed
- The first character is converted to lowercase

For example:

- `100-bottles-of-glue_test`
- `Picture.jpg`
- `Just a straight up sentence`

Would become:

- `bottlesOfGlueTest`
- `picture`
- `justAStraightUpSentence`

If the unit name is still not a valid identifier or is empty, the compiler will
refuse to process the module and it is up to the user to either nickname the
unit, or change the unit's basename to something workable.

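
The normalization rules above can be sketched in Go as follows. The function
name `unitName` is hypothetical, and the sketch assumes that "alphabetical" and
"numeric" mean ASCII letters and digits, so any other byte is treated as a
character to remove. It reports an error only for the empty case and does not
check the result against the full FSPL identifier rules.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

func isLetter(c byte) bool { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') }
func isDigit(c byte) bool  { return c >= '0' && c <= '9' }

// unitName turns the basename of an address into a unit name.
func unitName(basename string) (string, error) {
	// Remove the last dot and everything after it.
	if dot := strings.LastIndexByte(basename, '.'); dot >= 0 {
		basename = basename[:dot]
	}

	// Drop characters that are neither letters nor digits, and uppercase
	// a letter that directly follows a dropped character.
	var b strings.Builder
	capitalize := false
	for i := 0; i < len(basename); i++ {
		c := basename[i]
		switch {
		case isLetter(c):
			if capitalize && c >= 'a' && c <= 'z' {
				c -= 'a' - 'A'
			}
			b.WriteByte(c)
			capitalize = false
		case isDigit(c):
			b.WriteByte(c)
			capitalize = false
		default:
			capitalize = true
		}
	}

	// Remove digits at the start of the string.
	name := strings.TrimLeft(b.String(), "0123456789")
	if name == "" {
		return "", errors.New("unit name is empty: nickname the unit or rename it")
	}

	// Convert the first character to lowercase.
	return strings.ToLower(name[:1]) + name[1:], nil
}

func main() {
	for _, basename := range []string{
		"100-bottles-of-glue_test",
		"Picture.jpg",
		"Just a straight up sentence",
	} {
		name, err := unitName(basename)
		fmt.Println(name, err)
	}
}
```
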
The compiler will also refuse to process the module if two or more units end up
with the same unit name. However, a function or a variable may have the same
name as a unit because units are only ever used within the context of their own
special syntax (`::`).

## Future Work

Addresses do not necessarily have to refer to units. They could also refer to
arbitrary blobs to embed into a compiled program, similarly to how Go's embed
system works. There of course would need to be a distinction between depending
on units and embedding data, because someone might want to embed an FSPL source
file. Thus, there would need to be a separate metadata file directive, possibly
starting with an `!` or something like that.