Merge pull request 'document-packages' (#30) from document-packages into main

Reviewed-on: sashakoshka/fspl#30
Sasha Koshka 2024-02-10 23:52:17 +00:00
commit 1593ecef7b
8 changed files with 370 additions and 85 deletions

analyzer/README.md Normal file

@ -0,0 +1,89 @@
# analyzer
## Responsibilities
- Define the semantic tree type that contains analyzed entities
- Check abstract syntax tree entities for semantic correctness and fill in their
semantic fields
## Organization
The entry point for all logic defined in this package is the Tree type. On this
type, the Analyze() method is defined. This method checks the semantic
correctness of an AST, fills in semantic fields within its data structures, and
arranges them into the Tree.
Tree contains a scopeContextManager. The job of scopeContextManager is to manage
a stack of scopeContexts, which are each tied to a function or method that is
currently being analyzed. In turn, each scopeContext manages stacks of
entity.Scopes and entity.Loops. This allows for greedy/recursive analysis of
functions and methods.
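The stack discipline described above can be sketched roughly as follows. This is a simplified illustration, not the package's actual code: the method names `pushScopeContext`, `popScopeContext` and `current`, the `owner` field, and the use of strings in place of entity.Scope and entity.Loop are all assumptions made for the example.

```go
package main

import "fmt"

// scopeContext is a simplified stand-in. It tracks the scopes and loops of
// one function or method that is currently being analyzed.
type scopeContext struct {
	owner  string   // hypothetical: name of the function or method
	scopes []string // stand-in for a stack of entity.Scope
	loops  []string // stand-in for a stack of entity.Loop
}

// scopeContextManager holds a stack of scopeContexts.
type scopeContextManager struct {
	contexts []*scopeContext
}

// pushScopeContext begins analysis of a new function or method.
func (m *scopeContextManager) pushScopeContext(owner string) {
	m.contexts = append(m.contexts, &scopeContext{owner: owner})
}

// popScopeContext ends analysis of the innermost function or method.
func (m *scopeContextManager) popScopeContext() {
	m.contexts = m.contexts[:len(m.contexts)-1]
}

// current returns the context of the innermost function under analysis.
func (m *scopeContextManager) current() *scopeContext {
	return m.contexts[len(m.contexts)-1]
}

func main() {
	var m scopeContextManager
	m.pushScopeContext("main")
	m.current().scopes = append(m.current().scopes, "body of main")

	// Analysis of main reaches a call to helper, so helper is analyzed now.
	m.pushScopeContext("helper")
	fmt.Println("analyzing:", m.current().owner)
	m.popScopeContext()

	// Back in main, whose scope stack was left untouched.
	fmt.Println("resumed:", m.current().owner, "scopes:", len(m.current().scopes))
}
```

Because each function gets its own context, analyzing a callee in the middle of analyzing its caller does not disturb the caller's scopes or loops.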
## Operation
When the Analyze() method is called, several hidden fields in the Tree are filled
out. Tree.ensure() instantiates data that can persist between analyses, which
consists of map initialization and merging the data in the builtinTypes map into
Tree.Types.
After Tree.ensure completes, Tree.assembleRawMaps() takes top-level entities
from the AST and organizes them into rawTypes, rawFunctions, and rawMethods. It
does this so that top-level entities can be indexed by name. While doing this, it
ensures that function and type names are unique, and method names are unique
within the type they are defined on.
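A minimal sketch of the indexing step, assuming a much simpler entity type than the real one. The names `function` and `assembleRawFunctions` are hypothetical and only cover functions. Types and methods would be handled the same way, with method names checked per type.

```go
package main

import "fmt"

// function is a simplified stand-in for a top-level function entity.
type function struct {
	name string
}

// assembleRawFunctions indexes functions by name and reports the first
// duplicate name it encounters.
func assembleRawFunctions(declarations []*function) (map[string]*function, error) {
	raw := make(map[string]*function)
	for _, declaration := range declarations {
		if _, exists := raw[declaration.name]; exists {
			return nil, fmt.Errorf("function %s redeclared", declaration.name)
		}
		raw[declaration.name] = declaration
	}
	return raw, nil
}

func main() {
	_, err := assembleRawFunctions([]*function{{name: "main"}, {name: "main"}})
	fmt.Println(err)
}
```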
Next, Tree.analyzeDeclarations() is called. This is the entry point for the
actual analysis logic. For each item in the raw top-level entity maps, it calls
a specific analysis routine, which is one of:
- Tree.analyzeTypedef()
- Tree.analyzeFunction()
- Tree.analyzeMethod()
These routines all have two crucial properties that make them very useful:
- They refer to top-level entities by name instead of by memory location
- If the entity has already been analyzed, they return that entity instead of
analyzing it again
Because of this, they are also used as accessors for top-level entities within
more specific analysis routines. For example, the routine Tree.analyzeCall()
will call Tree.analyzeFunction() in order to get information about the function
that is being called. If the function has not yet been analyzed, it is analyzed
(making use of scopeContextManager to push a new scopeContext), and other
routines (including Tree.analyzeDeclarations()) will not have to analyze it all
over again. After a top-level entity has been analyzed, these routines will
always return the same pointer to the one instance of the analyzed entity.
## Expression Analysis and Assignment
Since expressions make up the bulk of FSPL, expression analysis makes up the
bulk of the semantic analyzer. Whenever an expression needs to be analyzed,
Tree.analyzeExpression() is called. This activates a switch to call one of many
specialized analysis routines based on the expression entity's concrete type.
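A dispatch of this kind is typically written as a Go type switch. The sketch below shows the shape only. The expression types and the strings returned are invented for illustration and do not correspond to the real entity package.

```go
package main

import "fmt"

// expression and the types below are simplified stand-ins for the expression
// entities of the syntax tree.
type expression interface{}
type literalInt struct{ value int }
type variable struct{ name string }
type call struct{ function string }

// analyzeExpression selects a specialized routine based on the concrete type
// of the expression.
func analyzeExpression(expr expression) (string, error) {
	switch expr := expr.(type) {
	case *literalInt:
		return fmt.Sprintf("integer literal %d", expr.value), nil
	case *variable:
		return "variable " + expr.name, nil
	case *call:
		return "call to " + expr.function, nil
	default:
		return "", fmt.Errorf("unknown expression type %T", expr)
	}
}

func main() {
	for _, expr := range []expression{
		&literalInt{value: 5},
		&variable{name: "x"},
		&call{function: "puts"},
	} {
		description, err := analyzeExpression(expr)
		fmt.Println(description, err)
	}
}
```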
Much of expression analysis consists of the analyzer checking to see if the
result of one expression can be assigned to the input of another. To this end,
assignment rules are used. There are five different assignment modes:
- Strict: Structural equivalence, but named types are treated as opaque and are
not tested. This applies to the root of the type, and to types enclosed as
members, elements, etc. This is the assignment mode most often used.
- Weak: Like strict, but the root types specifically are compared as if they
were not named. analyzer.ReduceToBase() is used to accomplish this.
- Structural: Full structural equivalence, and named types are always reduced.
- Coerce: Data of the source type must be convert-able to the destination type.
This is used in value casts.
- Force: All assignment rules are ignored. This is only used in bit casts.
All expression analysis routines take in as a parameter the type that the result
expression is being assigned to, and the assignment mode. To figure out whether
or not they can be assigned, they in turn (usually) call Tree.canAssign().
Tree.canAssign() is used to determine whether data of a source type can be
assigned to a destination type, given an assignment mode. However, it is not
called automatically by Tree.analyzeExpression() because:
- Determining the source type is sometimes non-trivial (see
Tree.analyzeOperation())
- Literals have their own very weak assignment rules, and are designed to be
assignable to a wide range of data types
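To make the difference between the modes concrete, here is a toy model of the strict, weak, structural and force modes. It is a sketch under heavy assumptions: the type model (`primitive`, `named`, `pointer`), the helper `equivalent`, and the lowercase `canAssign` and `reduceToBase` functions are all invented for the example, and coerce is left out because the conversion rules are not described here.

```go
package main

import "fmt"

// mode is a stand-in for the assignment mode. Coerce is omitted.
type mode int

const (
	modeStrict mode = iota
	modeWeak
	modeStructural
	modeForce
)

// typ is a toy type model: a primitive, a named type wrapping another type,
// or a pointer to another type.
type typ interface{}
type primitive struct{ name string }
type named struct {
	name string
	base typ
}
type pointer struct{ to typ }

// reduceToBase strips named types from the root of a type.
func reduceToBase(t typ) typ {
	for {
		n, ok := t.(named)
		if !ok {
			return t
		}
		t = n.base
	}
}

// equivalent compares two types structurally. If reduce is false, named types
// are opaque and only their names are compared.
func equivalent(a, b typ, reduce bool) bool {
	if reduce {
		a, b = reduceToBase(a), reduceToBase(b)
	}
	switch a := a.(type) {
	case primitive:
		b, ok := b.(primitive)
		return ok && a.name == b.name
	case named:
		b, ok := b.(named)
		return ok && a.name == b.name
	case pointer:
		b, ok := b.(pointer)
		return ok && equivalent(a.to, b.to, reduce)
	}
	return false
}

// canAssign reports whether source may be assigned to destination.
func canAssign(destination, source typ, m mode) bool {
	switch m {
	case modeForce:
		return true
	case modeStructural:
		return equivalent(destination, source, true)
	case modeWeak:
		// Only the roots are reduced. Enclosed types are compared strictly.
		return equivalent(reduceToBase(destination), reduceToBase(source), false)
	default:
		return equivalent(destination, source, false)
	}
}

func main() {
	integer := primitive{name: "Int"}
	meters := named{name: "Meters", base: integer}
	fmt.Println(canAssign(integer, meters, modeStrict))
	fmt.Println(canAssign(integer, meters, modeWeak))
	fmt.Println(canAssign(pointer{to: integer}, pointer{to: meters}, modeWeak))
	fmt.Println(canAssign(pointer{to: integer}, pointer{to: meters}, modeStructural))
}
```

In this model a named type over Int is rejected in strict mode but accepted in weak mode, while a pointer to that named type is accepted only once structural mode reduces the enclosed type as well.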


@ -7,16 +7,20 @@ import "git.tebibyte.media/sashakoshka/fspl/entity"
 import "git.tebibyte.media/sashakoshka/fspl/integer"
 type strictness int; const (
-	// name equivalence
+	// Structural equivalence, but named types are treated as opaque and are
+	// not tested. This applies to the root of the type, and to types
+	// enclosed as members, elements, etc. This is the assignment mode most
+	// often used.
 	strict strictness = iota
-	// structural equivalence up until the first base type, then name
-	// equivalence applies to the parts of the type
+	// Like strict, but the root types specifically are compared as if they
+	// were not named. analyzer.ReduceToBase() is used to accomplish this.
 	weak
-	// structural equivalence
+	// Full structural equivalence, and named types are always reduced.
 	structural
-	// allow if values can be converted
+	// Data of the source type must be convert-able to the destination type.
+	// This is used in value casts.
 	coerce
-	// assignment rules are completely ignored and everything is accepted
+	// All assignment rules are ignored. This is only used in bit casts.
 	force
 )


@ -145,7 +145,7 @@ referring to usually the name of a function. The result of a call may be assigne
 any type matching the function's return type. Since it contains inherent type
 information, it may be directly assigned to an interface.
 ### Method call
-Method call calls upon the method of the variable before the dot that is
+Method call calls upon the method (of the expression before the dot) that is
 specified by the first argument, passing the rest of the arguments to the
 method. The first argument must be a method name. The result of a call may be
 assigned to any type matching the method's return type. Since it contains
@ -258,12 +258,15 @@ does not return anything, the return statement does not accept a value. In all
 cases, return statements have no value and may not be assigned to anything.
 ### Assignment
 Assignment allows assigning the result of one expression to one or more location
-expressions. The assignment statement itself has no value and may not be
+expressions. The assignment expression itself has no value and may not be
 assigned to anything.
 # Syntax entities
-Below is a rough syntax description of the language.
+Below is a rough syntax description of the language. Note that `<assignment>`
+is right-associative, and `<memberAccess>` and `<methodCall>` are
+left-associative. I invite you to torture yourself by attempting to implement
+this without hand-writing a parser.
 ```
 <file> -> (<typedef> | <function> | <method>)*