Ok now slices actually
parent
53c0bb7171
commit
57e41e48ff
|
@ -0,0 +1,80 @@
|
|||
# Analyzer
|
||||
|
||||
Package analyzer:
|
||||
- Ensures an AST is semantically correct
|
||||
- Fills in semantic information within the AST
|
||||
- Gives named types a pointer to the defined type they reference
|
||||
- Gives variables a pointer to the declaration they reference
|
||||
- Gives all expressions, including literals, a type even if it is void
|
||||
- Organizes keyed entities into maps
|
||||
- Checks to make sure there are no duplicate key names
|
||||
- Returns the resulting AST as a Tree
|
||||
|
||||
## Types
|
||||
|
||||
### Scoped
|
||||
- Variable (name string) entity.Declaration
|
||||
- AddVariable (entity.Declaration)
|
||||
|
||||
Functions, methods,
|
||||
|
||||
### Tree
|
||||
Tree acts as a semantic tree to contain entities from the entity package that
|
||||
have semantic information filled in.
|
||||
|
||||
#### Data
|
||||
- Raw map of names -> types
|
||||
- Raw map of names -> functions
|
||||
- Raw map of type.names -> methods
|
||||
- Completed map of names -> types
|
||||
- Completed map of names -> functions
|
||||
- Scope breadcrumb trail
|
||||
- Every expression is assigned a Type when its type is determined
|
||||
- Methods are moved into a map within their types
|
||||
|
||||
#### Methods
|
||||
##### Analyze(parser.Tree) error
|
||||
Analyze takes in an AST and analyzes it within its own context. If this method
|
||||
is called multiple times, it will analyze all of the trees as if they were one.
|
||||
This method returns an error if the tree has a semantic error and cannot be
|
||||
turned into a proper semantic tree.
|
||||
|
||||
First, add all top level AST declarations to quick access maps and make sure
|
||||
their names are unique.
|
||||
|
||||
For each top level declaration, call one of the analysis functions with its
|
||||
name.
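
The first step above (collecting top-level declarations into maps and rejecting duplicate names) might look roughly like the sketch below. `Typedef` and `registerTypedefs` are hypothetical names used only for illustration; the real analyzer works on types from the entity package.

```go
package main

import "fmt"

// Typedef is a hypothetical stand-in for a top-level type definition.
type Typedef struct {
	Name string
}

// registerTypedefs adds each declaration to a map keyed by its name and
// returns an error as soon as a name is declared twice.
func registerTypedefs(decls []*Typedef) (map[string]*Typedef, error) {
	raw := make(map[string]*Typedef, len(decls))
	for _, decl := range decls {
		if _, exists := raw[decl.Name]; exists {
			return nil, fmt.Errorf("%s is already declared", decl.Name)
		}
		raw[decl.Name] = decl
	}
	return raw, nil
}

func main() {
	_, err := registerTypedefs([]*Typedef{{Name: "Point"}, {Name: "Point"}})
	fmt.Println(err)
}
```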
|
||||
|
||||
##### pushScope (Scoped)
|
||||
Pushes a scope onto the scope trail
|
||||
|
||||
##### popScope ()
|
||||
Removes the last scope from the scope trail
|
||||
|
||||
##### variable (string) entity.Declaration
|
||||
Returns a named variable, and nil if it doesn't exist. Goes through all scopes
|
||||
from the top of the trail to the bottom until it finds one.
|
||||
|
||||
##### addVariable (string, entity.Declaration)
|
||||
Adds a variable to the top scope.
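
As a rough illustration of the scope trail described above, the sketch below models it as a stack of maps. It is simplified: `pushScope` here takes no argument instead of a `Scoped`, and `Declaration` is a hypothetical stand-in for `entity.Declaration`.

```go
package main

import "fmt"

// Declaration is a hypothetical stand-in for entity.Declaration.
type Declaration struct {
	Name string
	Type string
}

// scope maps variable names to their declarations.
type scope map[string]*Declaration

// scopeTrail is a stack of scopes; the last element is the top.
type scopeTrail struct {
	scopes []scope
}

// pushScope pushes a fresh scope onto the trail.
func (t *scopeTrail) pushScope() {
	t.scopes = append(t.scopes, scope{})
}

// popScope removes the last scope from the trail.
func (t *scopeTrail) popScope() {
	t.scopes = t.scopes[:len(t.scopes)-1]
}

// variable searches from the top of the trail to the bottom and returns
// the first match, or nil if no scope declares the name.
func (t *scopeTrail) variable(name string) *Declaration {
	for i := len(t.scopes) - 1; i >= 0; i-- {
		if decl, ok := t.scopes[i][name]; ok {
			return decl
		}
	}
	return nil
}

// addVariable adds a variable to the top scope.
func (t *scopeTrail) addVariable(name string, decl *Declaration) {
	t.scopes[len(t.scopes)-1][name] = decl
}

func main() {
	trail := &scopeTrail{}
	trail.pushScope()
	trail.addVariable("x", &Declaration{Name: "x", Type: "Int"})
	trail.pushScope()
	trail.addVariable("x", &Declaration{Name: "x", Type: "F32"})
	fmt.Println(trail.variable("x").Type) // the inner declaration shadows the outer one
	trail.popScope()
	fmt.Println(trail.variable("x").Type) // the outer declaration is visible again
}
```

Searching from the top down means an inner declaration shadows an outer one with the same name.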
|
||||
|
||||
##### analyzeTypedef
|
||||
- If already analyzed return what has been analyzed
|
||||
- Get typedef from raw map
|
||||
- Analyze type
|
||||
- Store analyzed type in completed map
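
The steps above amount to a memoized lookup: check the completed map first, and only analyze the raw entry on a miss. A minimal sketch, assuming hypothetical `RawTypedef` and `AnalyzedTypedef` types and a placeholder "analysis" that just copies fields:

```go
package main

import "fmt"

// RawTypedef and AnalyzedTypedef are hypothetical stand-ins for a typedef
// before and after semantic analysis.
type RawTypedef struct{ Name, Type string }
type AnalyzedTypedef struct{ Name, Type string }

// Tree holds the raw and completed maps described in the Data section.
type Tree struct {
	rawTypes map[string]*RawTypedef
	types    map[string]*AnalyzedTypedef
	analyses int // counts how often real analysis work was done
}

// analyzeTypedef returns the cached result if the typedef was already
// analyzed; otherwise it analyzes the raw entry and stores the result.
func (t *Tree) analyzeTypedef(name string) (*AnalyzedTypedef, error) {
	if done, ok := t.types[name]; ok {
		return done, nil
	}
	raw, ok := t.rawTypes[name]
	if !ok {
		return nil, fmt.Errorf("no type named %s", name)
	}
	t.analyses++
	// Placeholder for the real type analysis.
	analyzed := &AnalyzedTypedef{Name: raw.Name, Type: raw.Type}
	t.types[name] = analyzed
	return analyzed, nil
}

func main() {
	tree := &Tree{
		rawTypes: map[string]*RawTypedef{"Meters": {Name: "Meters", Type: "F64"}},
		types:    map[string]*AnalyzedTypedef{},
	}
	first, _ := tree.analyzeTypedef("Meters")
	second, _ := tree.analyzeTypedef("Meters")
	fmt.Println(first == second, tree.analyses)
}
```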
|
||||
|
||||
##### analyzeFunction
|
||||
- If already analyzed return what has been analyzed
|
||||
- Get function from raw map
|
||||
- Analyze signature
|
||||
- Analyze expression
|
||||
- Store analyzed function in completed map
|
||||
|
||||
##### analyzeMethod
|
||||
- Analyze type
|
||||
- If already analyzed return what has been analyzed
|
||||
- Get method from raw map
|
||||
- Analyze signature
|
||||
- Analyze expression
|
||||
- Store analyzed method in type
|
|
@ -0,0 +1,118 @@
|
|||
# Main
|
||||
|
||||
Package main:
|
||||
- Parses CLI arguments
|
||||
- Displays help
|
||||
- Compiles file(s)
|
||||
- Displays any errors in a readable way
|
||||
|
||||
## Subroutines
|
||||
|
||||
### Compile (inputFiles ...string, outputFile string) error
|
||||
- Create AST
|
||||
- For each input file:
|
||||
- Feed file into AST
|
||||
- If error, print error and terminate
|
||||
- Feed AST from parser into analyzer
|
||||
- If error, print error and terminate
|
||||
- Open temporary IR output pipe
|
||||
- Feed semantic tree from analyzer and the output pipe into generator
|
||||
- If error, print error and terminate
|
||||
- Invoke LLVM on temporary IR output pipe, instruct it to output to output
|
||||
file
|
||||
- Close temporary IR output pipe
|
||||
|
||||
# Entity
|
||||
|
||||
Package entity defines types to represent language entities, as well as some
|
||||
convenience methods for dealing with them.
|
||||
|
||||
## Types
|
||||
|
||||
- TopLevel
|
||||
- Function
|
||||
- Typedef
|
||||
- Method
|
||||
- Type
|
||||
- TypeNamed
|
||||
- TypePointer
|
||||
- TypeArray
|
||||
- TypeStruct
|
||||
- TypeInterface
|
||||
- Expression
|
||||
- LiteralInt
|
||||
- LiteralFloat
|
||||
- LiteralArray
|
||||
- LiteralStruct
|
||||
- Variable
|
||||
- Declaration
|
||||
- Call
|
||||
- Subscript
|
||||
- Dereference
|
||||
- Reference
|
||||
- ValueCast
|
||||
- BitCast
|
||||
- Operation
|
||||
- Block
|
||||
- MemberAccess
|
||||
- IfElse
|
||||
- Loop
|
||||
- Break
|
||||
- Return
|
||||
- Assignment
|
||||
- Member
|
||||
- Signature
|
||||
|
||||
# Parser
|
||||
|
||||
Package parser parses data in an input io.Reader into an AST.
|
||||
|
||||
## Types
|
||||
|
||||
### Tree
|
||||
Tree acts as an AST to contain top-level declarations from the entity package.
|
||||
#### Methods
|
||||
##### Parse(name string, file io.Reader) error
|
||||
Parse takes in a file name, and an io.Reader and parses its contents into the
|
||||
tree. The name is only used for error reporting purposes; this method does not
|
||||
open any files. It returns an error if the syntax read from the input reader is
|
||||
erroneous and cannot be parsed.
|
||||
##### ParseFile(name string) error
|
||||
ParseFile is like Parse, except it automatically opens and closes the file
|
||||
specified by the given name.
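
The relationship between the two methods can be sketched as follows. This is only an illustration: the `Tree` here is a hypothetical stand-in whose `Parse` just records the source text rather than building an AST.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// Tree is a hypothetical stand-in for parser.Tree. Instead of building an
// AST, it only records the text read under each name.
type Tree struct {
	sources map[string]string
}

// Parse reads everything from file. The name is used only for error
// reporting; no file is opened here.
func (t *Tree) Parse(name string, file io.Reader) error {
	data, err := io.ReadAll(file)
	if err != nil {
		return fmt.Errorf("%s: %w", name, err)
	}
	if t.sources == nil {
		t.sources = make(map[string]string)
	}
	t.sources[name] = string(data)
	return nil
}

// ParseFile opens the named file, hands it to Parse, and closes it again.
func (t *Tree) ParseFile(name string) error {
	file, err := os.Open(name)
	if err != nil {
		return err
	}
	defer file.Close()
	return t.Parse(name, file)
}

func main() {
	tree := &Tree{}
	// Assumes no file with this name exists in the working directory.
	err := tree.ParseFile("does-not-exist.fspl")
	fmt.Println(err != nil)
}
```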
|
||||
|
||||
# Analyzer
|
||||
|
||||
Package analyzer:
|
||||
- Ensures an AST is semantically correct
|
||||
- Fills in semantic information within the AST
|
||||
- Gives named types a pointer to the defined type they reference
|
||||
- Gives variables a pointer to the declaration they reference
|
||||
- Gives all expressions, including literals, a type even if it is void
|
||||
- Organizes keyed entities into maps
|
||||
- Checks to make sure there are no duplicate key names
|
||||
- Returns the resulting AST as a Tree
|
||||
|
||||
## Types
|
||||
|
||||
### Tree
|
||||
Tree acts as a semantic tree to contain entities from the entity package that
|
||||
have semantic information filled in.
|
||||
#### Methods
|
||||
##### Analyze(parser.Tree) error
|
||||
Analyze takes in an AST and analyzes it within its own context. If this method
|
||||
is called multiple times, it will analyze all of the trees as if they were one.
|
||||
This method returns an error if the tree has a semantic error and cannot be
|
||||
turned into a proper semantic tree.
|
||||
|
||||
# Generator
|
||||
|
||||
Package generator turns semantic trees into LLVM IR and outputs it to an
|
||||
io.Writer.
|
||||
|
||||
## Subroutines
|
||||
|
||||
### Generate (analyzer.Tree, io.Writer) error
|
||||
Generate takes in a semantic tree and writes corresponding LLVM IR to the given
|
||||
io.Writer. It returns an error in case there is something wrong with the
|
||||
semantic tree that prevents the code generation process from occurring.
|
|
@ -0,0 +1,274 @@
|
|||
# Semantic entities
|
||||
|
||||
## Top level
|
||||
### Type definition
|
||||
Type definitions bind a type to a global identifier.
|
||||
### Function
|
||||
Functions bind a global identifier and argument list to an expression which is
|
||||
evaluated each time the function is called. If no expression is specified, the
|
||||
function is marked as external. Functions have an argument list, where each
|
||||
argument is passed as a separate variable. They return one value. All of these
|
||||
are typed.
|
||||
### Method
|
||||
A method is like a function, except localized to a defined type. Methods are
|
||||
called on an instance of that type, and receive a pointer to that instance via
|
||||
the "this" variable. Method names are not globally unique, but are unique within
|
||||
the type they are defined on.
|
||||
|
||||
## Types
|
||||
### Named
|
||||
Named refers to a user-defined or built in named type.
|
||||
### Pointer
|
||||
Pointer is a pointer to another type.
|
||||
### Array
|
||||
Array is a group of values of a given type stored next to each other. The length
|
||||
of an array is fixed and is part of its type. Arrays are passed by value unless
|
||||
a pointer is used.
|
||||
### Struct
|
||||
Struct is a composite type that stores keyed values. The positions of the values
|
||||
within the struct are decided at compile time, based on the order they are
|
||||
specified in. Structs are passed by value unless a pointer is used.
|
||||
### Interface
|
||||
Interface is a polymorphic pointer that allows any value of any type through,
|
||||
except it must have at least the methods defined within the interface.
|
||||
Interfaces are always passed by reference. When assigning a value to an
|
||||
interface, it will be referenced automatically. When assigning a pointer to an
|
||||
interface, the pointer's reference will be used instead.
|
||||
|
||||
## Expressions
|
||||
### Location expressions
|
||||
Location expressions are special expressions that only refer to the location of
|
||||
a memory address. An expression is only a location expression if its value
|
||||
originates from another location expression. Such expressions are marked here
|
||||
with a star (*).
|
||||
### Literals
|
||||
#### Integer
|
||||
An integer literal specifies an integer value. It can be assigned to any type
|
||||
that is derived from an integer, or a float.
|
||||
It cannot be directly assigned to an interface because it contains no inherent
|
||||
type information. A value cast may be used for this purpose.
|
||||
#### Float
|
||||
A float literal specifies a floating point value. It can be assigned to any type
|
||||
that is derived from a float.
|
||||
It cannot be directly assigned to an interface because it contains no inherent
|
||||
type information. A value cast may be used for this purpose.
|
||||
#### Array
|
||||
Array is a composite array literal. It can contain any number of values. It can
|
||||
be assigned to any array type that:
|
||||
1. has an identical length, and
|
||||
2. whose element type can be assigned to by all the element values in the
|
||||
literal.
|
||||
|
||||
It cannot be directly assigned to an interface because it contains no inherent
|
||||
type information. A value cast may be used for this purpose.
|
||||
#### Struct
|
||||
Struct is a composite structure literal. It can contain any number of name:value
|
||||
pairs. It can be assigned to any struct type that:
|
||||
1. has at least the members specified in the literal
|
||||
2. whose member types can be assigned to by the corresponding member values in
|
||||
the literal.
|
||||
|
||||
It cannot be directly assigned to an interface because it contains no inherent
|
||||
type information. A value cast may be used for this purpose.
|
||||
### Variable *
|
||||
Variable specifies a named variable. It can be assigned to a type matching the
|
||||
variable declaration's type. Since it contains inherent type information, it may
|
||||
be directly assigned to an interface.
|
||||
### Declaration *
|
||||
Declaration binds a local identifier to a typed variable, but also acts as a
|
||||
variable expression allowing the variable to be used the moment it is defined.
|
||||
Since it contains inherent type information, it may be directly assigned to an
|
||||
interface.
|
||||
### Block
|
||||
Block is an ordered collection of expressions that are evaluated sequentially.
|
||||
It has its own scope. The last expression in the block specifies the block's
|
||||
value, and any assignment rules of the block are equivalent to those of its last
|
||||
expression.
|
||||
### Call
|
||||
Call calls upon the function specified by the first argument, and passes the
|
||||
rest of the arguments to the function. The first argument must be a function
|
||||
type, usually the name of a function. The result of a call may be assigned to
|
||||
any type matching the function's return type. Since it contains inherent type
|
||||
information, it may be directly assigned to an interface.
|
||||
### Member access *
|
||||
Member access allows referring to a specific member of a value with a struct
|
||||
type. It accepts any struct type that contains the specified member name, and
|
||||
may be assigned to any type that matches the type of the selected member. Since
|
||||
it contains inherent type information, it may be directly assigned to an
|
||||
interface.
|
||||
### Method access
|
||||
Method access allows referring to a specific method of a type, or a behavior of
|
||||
an interface. It can only be assigned to the first argument of a call.
|
||||
### Array subscript *
|
||||
Array subscripting allows referring to a specific element of an array. It
|
||||
accepts any array, and any offset of type Size. It may be assigned to any type
|
||||
matching the array's element type. Since it contains inherent type information,
|
||||
it may be directly assigned to an interface.
|
||||
### Pointer dereference *
|
||||
Pointer dereferencing allows retrieving the value of a pointer. It accepts any
|
||||
pointer. It may be assigned to any type matching the pointer's pointed type.
|
||||
Since it contains inherent type information, it may be directly assigned to an
|
||||
interface.
|
||||
### Value reference
|
||||
Value referencing allows retrieving the location of a value in memory. It
|
||||
accepts any location expression, and can be assigned to any type that is a
|
||||
pointer to the location expression's type. Since it contains inherent type
|
||||
information, it can be directly assigned to an interface, although it doesn't
|
||||
make a whole lot of sense to do so because assigning a value to an interface
|
||||
automatically references it anyway.
|
||||
### Bit casting
|
||||
Bit casting takes the raw data in memory of a certain value and re-interprets it
|
||||
as a value of another type. Since it contains inherent type information, it may
|
||||
be directly assigned to an interface.
|
||||
### Value casting
|
||||
Value casting converts a value of a certain type to another type. Since it
|
||||
contains inherent type information, it may be directly assigned to an interface.
|
||||
### Operations
|
||||
Operations perform math, logic, or bit manipulation on values. They accept
|
||||
values of the same type as the type they are being assigned to, except in
|
||||
special cases. Since they contain no inherent type information, they may not be
|
||||
assigned to interfaces.
|
||||
#### Math
|
||||
Mathematical operations perform math on numeric values.
|
||||
- `+` Returns the sum of all arguments
|
||||
- `++` Returns the sum of all arguments, plus 1
|
||||
- `-` Returns all arguments after the first subtracted from the first
|
||||
- `--` Returns all arguments after the first subtracted from the first, minus 1
|
||||
- `*` Returns the product of all arguments
|
||||
- `/` Returns A0 / A1 / ... / An
|
||||
- `%` Returns the remainder of the first argument divided by the second.
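
To make the variadic semantics of `++`, `-`, and `/` concrete, here is a small Go sketch that folds the arguments from left to right. The function names are made up for illustration, and using `int` (with integer division) is an assumption; the operators themselves apply to any numeric type.

```go
package main

import "fmt"

// sumPlusOne mirrors `++`: the sum of all arguments, plus 1.
func sumPlusOne(args ...int) int {
	total := 1
	for _, arg := range args {
		total += arg
	}
	return total
}

// subtract mirrors `-`: every argument after the first is subtracted
// from the first.
func subtract(first int, rest ...int) int {
	result := first
	for _, arg := range rest {
		result -= arg
	}
	return result
}

// divide mirrors `/`: A0 / A1 / ... / An, evaluated left to right.
// It panics on division by zero, like any Go integer division.
func divide(first int, rest ...int) int {
	result := first
	for _, arg := range rest {
		result /= arg
	}
	return result
}

func main() {
	fmt.Println(sumPlusOne(1, 2, 3)) // 7
	fmt.Println(subtract(10, 3, 2))  // 5
	fmt.Println(divide(100, 5, 2))   // 10
}
```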
|
||||
#### Logic
|
||||
Logic operations perform logic on booleans.
|
||||
- `!` Returns the logical negation of the argument
|
||||
- `|` Returns the logical OR of all arguments
|
||||
- `&` Returns the logical AND of all arguments
|
||||
- `^` Returns the logical XOR of all arguments
|
||||
#### Bit manipulation
|
||||
Bit manipulation allows for manipulating values at the binary level. These work
|
||||
on all types except reference types.
|
||||
- `!!` Returns the bitwise negation of the argument
|
||||
- `||` Returns the bitwise OR of all arguments
|
||||
- `&&` Returns the bitwise AND of all arguments
|
||||
- `^^` Returns the bitwise XOR of all arguments
|
||||
- `<<` Returns the first argument bit-shifted to the left by the second
|
||||
argument. The second argument must be an integer.
|
||||
- `>>` Returns the first argument bit-shifted to the right by the second
|
||||
argument. The second argument must be an integer.
|
||||
#### Comparison
|
||||
Comparison operations compare two or more values and return a boolean.
|
||||
- `<` Returns if all operands are in ascending order from left to right.
|
||||
- `>` Returns if all operands are in descending order from left to right.
|
||||
- `<=` Returns if all operands are in ascending order from left to right,
|
||||
allowing equal operands.
|
||||
- `>=` Returns if all operands are in descending order from left to right,
|
||||
allowing equal operands.
|
||||
- `=` Returns if all operands are equal to each other.
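
One way to read "in ascending order from left to right" is that every adjacent pair of operands must satisfy the comparison. The Go sketch below works under that assumption; the function names are hypothetical and `int` is used only to keep the example short.

```go
package main

import "fmt"

// chain reports whether holds(a, b) is true for every adjacent pair of
// operands, which is the assumed meaning of a multi-operand comparison.
func chain(holds func(a, b int) bool, operands ...int) bool {
	for i := 1; i < len(operands); i++ {
		if !holds(operands[i-1], operands[i]) {
			return false
		}
	}
	return true
}

// ascending mirrors `<`.
func ascending(operands ...int) bool {
	return chain(func(a, b int) bool { return a < b }, operands...)
}

// ascendingOrEqual mirrors `<=`.
func ascendingOrEqual(operands ...int) bool {
	return chain(func(a, b int) bool { return a <= b }, operands...)
}

// allEqual mirrors `=`.
func allEqual(operands ...int) bool {
	return chain(func(a, b int) bool { return a == b }, operands...)
}

func main() {
	fmt.Println(ascending(1, 2, 3))        // true
	fmt.Println(ascending(1, 3, 2))        // false
	fmt.Println(ascendingOrEqual(1, 1, 2)) // true
	fmt.Println(allEqual(2, 2, 2))         // true
}
```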
|
||||
|
||||
Comparison operations are the only constructs in FSPL which are allowed to infer
|
||||
their argument types. The rules for this are as follows:
|
||||
- If at least one argument has type information, that type is used for all
|
||||
arguments that do not.
|
||||
- Else, fail. Optionally call the user an idiot if this is because they directly
|
||||
compared two literals.
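
The inference rules above can be sketched in Go as follows. `Arg` and `inferComparisonTypes` are hypothetical names, an empty `Type` string stands for "no inherent type information", and picking the first typed argument when several carry a type is an assumption the rules do not spell out.

```go
package main

import (
	"errors"
	"fmt"
)

// Arg is a hypothetical comparison argument. An empty Type means the
// argument (for example a literal) has no inherent type information.
type Arg struct {
	Text string
	Type string
}

// inferComparisonTypes gives every untyped argument the type of the first
// typed argument, and fails if no argument carries type information.
func inferComparisonTypes(args []Arg) ([]Arg, error) {
	known := ""
	for _, arg := range args {
		if arg.Type != "" {
			known = arg.Type
			break
		}
	}
	if known == "" {
		return nil, errors.New("comparison has no argument with type information")
	}
	out := make([]Arg, len(args))
	for i, arg := range args {
		if arg.Type == "" {
			arg.Type = known
		}
		out[i] = arg
	}
	return out, nil
}

func main() {
	args, err := inferComparisonTypes([]Arg{{Text: "x", Type: "Int"}, {Text: "5"}})
	fmt.Println(args, err)
	_, err = inferComparisonTypes([]Arg{{Text: "3"}, {Text: "5"}})
	fmt.Println(err)
}
```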
|
||||
### If/else
|
||||
If/else is a control flow branching expression that executes one of two
|
||||
expressions depending on a boolean value. If the value of the if/else is unused,
|
||||
the else expression need not be specified. It may be assigned to any type that
|
||||
satisfies the assignment rules of both the true and false expressions.
|
||||
### Loop
|
||||
Loop is a control flow expression that repeats an expression until a break
|
||||
statement is called from within it. The break statement must be given a value
|
||||
if the value of the loop is used. Otherwise, it need not even have a break
|
||||
statement. The result of the loop may be assigned to any type that satisfies the
|
||||
assignment rules of all of its break statements. Loops may be nested, and break
|
||||
statements only apply to the closest containing loop. The value of the loop's
|
||||
expression is never used.
|
||||
### Break
|
||||
Break allows breaking out of loops. It has no value and may not be assigned to
|
||||
anything.
|
||||
### Return
|
||||
Return allows terminating functions before they have reached their end. It
|
||||
accepts values that may be assigned to the function's return type. If a function
|
||||
does not return anything, the return statement does not accept a value. In all
|
||||
cases, return statements have no value and may not be assigned to anything.
|
||||
### Assignment
|
||||
Assignment allows assigning the result of one expression to one or more location
|
||||
expressions. The assignment statement itself has no value and may not be
|
||||
assigned to anything.
|
||||
|
||||
# Syntax entities
|
||||
|
||||
```
|
||||
<file> -> (<typedef> | <function> | <method>)*
|
||||
<typedef> -> <identifier> ":" <type>
|
||||
<function> -> <signature> ["=" <expression>]
|
||||
<method> -> <identifier> "." <function>
|
||||
|
||||
<type> -> <namedType>
|
||||
| <pointerType>
|
||||
| <arrayType>
|
||||
| <structType>
|
||||
| <interfaceType>
|
||||
<namedType> -> <identifier>
|
||||
<pointerType> -> "*" <type>
|
||||
<arrayType> -> <intLiteral> "x" <type>
|
||||
<structType> -> "(" <declaration>* ")"
|
||||
<interfaceType> -> "(" <signature> ")"
|
||||
|
||||
<expression> -> <intLiteral>
|
||||
| <floatLiteral>
|
||||
| <arrayLiteral>
|
||||
| <structLiteral>
|
||||
| <variable>
|
||||
| <declaration>
|
||||
| <call>
|
||||
| <subscript>
|
||||
| <dereference>
|
||||
| <reference>
|
||||
| <valueCast>
|
||||
| <bitCast>
|
||||
| <operation>
|
||||
| <block>
|
||||
| <memberAccess>
|
||||
| <ifelse>
|
||||
| <loop>
|
||||
| <break>
|
||||
| <return>
|
||||
<statement> -> <expression> | <assignment>
|
||||
<variable> -> <identifier>
|
||||
<declaration> -> <identifier> ":" <type>
|
||||
<call> -> "[" <expression>+ "]"
|
||||
<subscript> -> "[" "." <expression> <expression> "]"
|
||||
<dereference> -> "[" "." <expression> "]"
|
||||
<reference> -> "[" "@" <expression> "]"
|
||||
<valueCast> -> "[" "cast" <type> <expression> "]"
|
||||
<bitCast> -> "[" "bitcast" <type> <expression> "]"
|
||||
<operation> -> "[" <operator> <expression>* "]"
|
||||
<block> -> "{" <statement>* "}"
|
||||
<memberAccess> -> <variable> "." <identifier>
|
||||
<methodAccess> -> <variable> "::" <identifier>
|
||||
<ifelse> -> "if" <expression>
|
||||
"then" <expression>
|
||||
["else" <expression>]
|
||||
<loop> -> "loop" <expression>
|
||||
<break> -> "[" "break" [<expression>] "]"
|
||||
<return> -> "[" "return" [<expression>] "]"
|
||||
<assignment> -> <expression> "=" <expression>
|
||||
|
||||
<intLiteral> -> /-?[1-9][0-9]*/
|
||||
| /-?0[0-7]*/
|
||||
| /-?0x[0-9a-fA-F]*/
|
||||
| /-?0b[0-1]*/
|
||||
<floatLiteral> -> /-?[0-9]*\.[0-9]+/
|
||||
<arrayLiteral> -> "(*" <expression>* ")"
|
||||
<structLiteral> -> "(" <member>* ")"
|
||||
|
||||
<member> -> <identifier> ":" <expression>
|
||||
<signature> -> "[" <identifier> <declaration>* "]" [":" <type>]
|
||||
<identifier> -> /[A-Za-z]+/
|
||||
<operator> -> "+" | "++" | "-" | "--" | "*" | "/" | "%"
|
||||
| "!!" | "||" | "&&" | "^^"
|
||||
| "!" | "|" | "&" | "^" | "<<" | ">>"
|
||||
| "<" | ">" | "<=" | ">=" | "="
|
||||
```
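
As a quick check of the four `<intLiteral>` forms in the grammar above, the Go sketch below compiles each pattern (anchored to the whole token) and reports which base a literal falls under. The helper name `intLiteralBase` is hypothetical. Note that, as written, the grammar classifies a lone `0` as octal and accepts `0x` and `0b` with no digits.

```go
package main

import (
	"fmt"
	"regexp"
)

// intLiteralForms pairs each <intLiteral> pattern from the grammar with
// the numeric base it denotes. Anchors are added to match whole tokens.
var intLiteralForms = []struct {
	base    int
	pattern *regexp.Regexp
}{
	{10, regexp.MustCompile(`^-?[1-9][0-9]*$`)},
	{8, regexp.MustCompile(`^-?0[0-7]*$`)},
	{16, regexp.MustCompile(`^-?0x[0-9a-fA-F]*$`)},
	{2, regexp.MustCompile(`^-?0b[0-1]*$`)},
}

// intLiteralBase returns the base of the first form that matches text,
// or false if text is not an integer literal under this grammar.
func intLiteralBase(text string) (base int, ok bool) {
	for _, form := range intLiteralForms {
		if form.pattern.MatchString(text) {
			return form.base, true
		}
	}
	return 0, false
}

func main() {
	for _, text := range []string{"-42", "017", "0x1F", "0b101", "08"} {
		base, ok := intLiteralBase(text)
		fmt.Println(text, base, ok)
	}
}
```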
|
||||
|
|
@ -78,7 +78,7 @@ type Subscript struct {
|
|||
func (*Subscript) expression(){}
|
||||
func (*Subscript) statement(){}
|
||||
func (this *Subscript) String () string {
|
||||
return fmt.Sprint("[.", this.Array, this.Offset, "]")
|
||||
return fmt.Sprint("[.", this.Array, " ", this.Offset, "]")
|
||||
}
|
||||
|
||||
// Slice adjusts the start and end points of a slice relative to its current
|
||||
|
@ -87,9 +87,22 @@ func (this *Subscript) String () string {
|
|||
// is never a valid location expression.
|
||||
type Slice struct {
|
||||
Pos lexer.Position
|
||||
Slice Expression `parser:" '[' '\\' @@ "`
|
||||
Start Expression `parser:" @@? "`
|
||||
End Expression `parser:" ':' @@? ']' "`
|
||||
Slice Expression `parser:" '[' '\\\\' @@ "`
|
||||
Start Expression `parser:" @@? "`
|
||||
End Expression `parser:" ':' @@? ']' "`
|
||||
}
|
||||
func (*Slice) expression(){}
|
||||
func (*Slice) statement(){}
|
||||
func (this *Slice) String () string {
|
||||
out := fmt.Sprint("[\\", this.Slice, " ")
|
||||
if this.Start != nil {
|
||||
out += fmt.Sprint(this.Start)
|
||||
}
|
||||
out += ":"
|
||||
if this.End != nil {
|
||||
out += fmt.Sprint(this.End)
|
||||
}
|
||||
return out + "]"
|
||||
}
|
||||
|
||||
// Pointer dereferencing allows retrieving the value of a pointer. It accepts
|
||||
|
|
|
@ -39,7 +39,8 @@ testString (test,
|
|||
[call] = [print 3 1]
|
||||
[emptyBlock] = {}
|
||||
[nestedBlock] = {342 93.34 {3948 32}}
|
||||
[subscript] = [.arr 3]
|
||||
[subscript] = [.(* 3 1 2 4) 3]
|
||||
[slice] = [\(* 1 2 3 4 5 6 7 8 9) consumed:]
|
||||
[dereference] = [.ptr]
|
||||
[reference] = [@val]
|
||||
[valueCast] = [cast F32 someValue]
|
||||
|
@ -65,7 +66,8 @@ testString (test,
|
|||
32
|
||||
}
|
||||
}
|
||||
[subscript] = [. arr 3]
|
||||
[subscript] = [. (* 3 1 2 4) 3]
|
||||
[slice] = [\ (* 1 2 3 4 5 6 7 8 9) consumed:]
|
||||
[dereference] = [. ptr]
|
||||
[reference] = [@ val]
|
||||
[valueCast] = [cast F32 someValue]
|
||||
|
|