Schemta API

[module] schemta

The Scheme Multi Target Assembler

[record] asm-target

[constructor] (make-asm-target #!key (ENDIAN 'little) (REGISTERS '()) (REGISTER-SETS '()) (ADDRESSING-MODES '()) (FLAGS '()) (FLAG-SETS '()) (EXTRA '()) (INSTRUCTIONS (make-hash-table)))
[predicate] asm-target?
implementation: defstruct

| field | getter | setter | default | type |
| --- | --- | --- | --- | --- |
| endian | asm-target-endian | asm-target-endian-set! | 'little | symbol |
| registers | asm-target-registers | asm-target-registers-set! | '() | (list-of (list symbol fixnum)) |
| register-sets | asm-target-register-sets | asm-target-register-sets-set! | '() | (list-of (list symbol (list-of symbol))) |
| addressing-modes | asm-target-addressing-modes | asm-target-addressing-modes-set! | '() | (list-of (list symbol procedure)) |
| flags | asm-target-flags | asm-target-flags-set! | '() | (list-of (list symbol fixnum)) |
| flag-sets | asm-target-flag-sets | asm-target-flag-sets-set! | '() | (list-of (list symbol (list-of symbol))) |
| extra | asm-target-extra | asm-target-extra-set! | '() | (list-of (list symbol *)) |
| instructions | asm-target-instructions | asm-target-instructions-set! | (make-hash-table) | hash-table |
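
As a rough illustration of the record interface listed above, the following sketch builds a target by hand and reads a field back. The register names and numeric values are placeholders invented for this example (their meaning depends on the target definition), and the sketch assumes that the constructor and accessors are exported from the schemta module as documented.

```scheme
(import schemta)

;; Hypothetical toy target: the register entries below are placeholders
;; and do not describe any real CPU.
(define toy-target
  (make-asm-target #:endian 'big
                   #:registers '((a 0) (b 1))
                   #:register-sets '((r8 (a b)))))

;; Fields that are not given fall back to the defaults from the table.
(print (asm-target? toy-target))
(print (asm-target-endian toy-target))
(print (asm-target-flags toy-target))

;; Setters follow the FIELD-set! naming shown in the table.
(asm-target-endian-set! toy-target 'little)
```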

[variable] *schemta-include-path*

type: string
default: "mdal-targets/"

[procedure] (set-schemta-include-path! P)

[variable] register

default: #f

[variable] address

default: #f

[variable] flag

default: #f

[variable] extras

default: #f

[variable] flag-value

default: #f

[variable] register-value

default: #f

[variable] numeric

default: #f

[variable] current-origin

default: #f

[variable] symbol

default: #f

[variable] symbol-ref

default: #f

[variable] defined?

default: #f

[procedure] (lsb N)

Return the least significant byte of N.

[procedure] (msb N)

Return the most significant byte of word N. If N is too large to be represented in a word (2 bytes), return the msb of N & 0xffff.

[procedure] (lsw N)

Return the least significant word of N.

[procedure] (msw N)

Return the most significant word of N.

[procedure] (int->bytes I NUMBER-OF-BYTES ENDIAN)

Convert the integer I into a list of bytes, capped at NUMBER-OF-BYTES and respecting ENDIANness.
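
The byte and word helpers can be pictured with a small stand-alone sketch in plain Scheme. The `sketch-` procedures below are hypothetical re-implementations written for this illustration and are not schemta's own code. They assume that a word is 16 bits, that msw treats N as a 32-bit value, that ENDIAN is either 'little or 'big, and that the result always has exactly NUMBER-OF-BYTES elements.

```scheme
;; Illustrative stand-ins only; the sketch- names are not part of schemta.

;; Least significant byte: N modulo 256.
(define (sketch-lsb n) (modulo n #x100))

;; Most significant byte of the low 16-bit word of N.
(define (sketch-msb n) (quotient (modulo n #x10000) #x100))

;; Least significant 16-bit word.
(define (sketch-lsw n) (modulo n #x10000))

;; Most significant word, assuming N is treated as a 32-bit value.
(define (sketch-msw n) (quotient (modulo n #x100000000) #x10000))

;; Split I into NUMBER-OF-BYTES bytes, dropping any higher-order bits.
(define (sketch-int->bytes i number-of-bytes endian)
  (let loop ((rest (modulo i (expt 256 number-of-bytes)))
             (count number-of-bytes)
             (bytes '()))
    (if (= count 0)
        ;; BYTES was accumulated most significant byte first.
        (if (eq? endian 'little) (reverse bytes) bytes)
        (loop (quotient rest 256)
              (- count 1)
              (cons (modulo rest 256) bytes)))))

(sketch-msb #x123456)                   ; => 52, i.e. #x34
(sketch-int->bytes #x1234 2 'little)    ; => (52 18), i.e. (#x34 #x12)
(sketch-int->bytes #x1234 2 'big)       ; => (18 52), i.e. (#x12 #x34)
```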

Instruction Parser

[procedure] (required-symbols EXPR)

Get the list of symbols needed to resolve the s-expression directive EXPR.

[procedure] (list-required-symbols OPERANDS)

List the symbols required to evaluate the given OPERANDS.

[procedure] (match-operands OPERANDS PARSER-OPTIONS)

Match operands against instruction parser options.

[procedure] (parse-operands OPERANDS TARGET)

[procedure] (resolve-operands OPERANDS OPTION-LST TARGET)

Match the operands of an instruction against the options in the target instruction table, and return a list containing the parsed operands in car, and the output composition in cadr.

[procedure] (resolve-instruction OPCODE OPERANDS TARGET)

Returns either an asm-instruction structure, or a list of bytes if the instruction can be resolved immediately.

[variable] horizontal-whitespace

default: (string->char-set " \t")

[procedure] (a-string-parser CHAR-SET CONVERT-FN)

[variable] a-quoted-string


(as-string (sequence (is #\")
              (any-of (preceded-by (is #\\)
                           (is #\"))
                  (in (char-set-difference
                       (char-set-union char-set:graphic
                       ;; TODO for some reason this line fails
                       ;; when emitting a .types file.
                       (->char-set #\")))))
             (is #\")))

[procedure] (in-parens PARSER)

[procedure] (in-brackets PARSER)

[procedure] (preceded-by* PARSER PRECEDING-PARSER)

Like comparse#preceded-by, but takes only one preceding parser. Provided to enable simple sequences using generated parsers (register, address) usable from instruction set definitions.

[procedure] (followed-by* PARSER FOLLOW-UP-PARSER)

Like comparse#followed-by, but takes only one follow-up parser and consumes its input.

[variable] a-decimal


(bind (as-string (sequence (maybe (is #\-))
                   (one-or-more (in char-set:digit))))
      (lambda (n) (result (string->number n))))

[variable] a-char


(bind (enclosed-by (is #\')
               (in char-set:graphic)
               (is #\'))
      (lambda (r) (result (char->integer r))))

[procedure] (number-parser PREFIX RADIX CHARSET)

[variable] a-hexadecimal

default: (number-parser (any-of (is #\$) (char-seq "0x")) 16 char-set:hex-digit)

[variable] a-octal

default: (number-parser (char-seq "0o") 8 (string->char-set "01234567"))

[variable] a-binary

default: (number-parser (is #\%) 2 (string->char-set "01"))

[variable] a-number

default: (any-of a-hexadecimal a-octal a-binary a-decimal a-char)
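
The literal forms accepted by a-number can be read off the parser definitions above: `$` or `0x` for hexadecimal, `0o` for octal, `%` for binary, an optional leading `-` for decimal, and a single quoted character. The following sketch mimics that mapping on whole strings in plain Scheme. It is not how schemta parses input (schemta uses comparse parsers); the `sketch-` names are hypothetical helpers made up for this illustration.

```scheme
;; Hypothetical helper: does STR start with PREFIX?
(define (sketch-prefix? prefix str)
  (let ((n (string-length prefix)))
    (and (>= (string-length str) n)
         (string=? prefix (substring str 0 n)))))

;; Map a literal in one of the notations listed above to its integer value.
;; Returns #f if the digits do not fit the selected radix.
(define (sketch-parse-number str)
  (define (rest-from k) (substring str k (string-length str)))
  (cond ((sketch-prefix? "$" str)  (string->number (rest-from 1) 16))
        ((sketch-prefix? "0x" str) (string->number (rest-from 2) 16))
        ((sketch-prefix? "0o" str) (string->number (rest-from 2) 8))
        ((sketch-prefix? "%" str)  (string->number (rest-from 1) 2))
        ;; A single character in single quotes yields its character code.
        ((and (= (string-length str) 3)
              (char=? (string-ref str 0) #\')
              (char=? (string-ref str 2) #\'))
         (char->integer (string-ref str 1)))
        (else (string->number str 10))))

(sketch-parse-number "$ff")     ; => 255
(sketch-parse-number "0o17")    ; => 15
(sketch-parse-number "%1010")   ; => 10
(sketch-parse-number "'A'")     ; => 65
```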

[procedure] (check-limit N MIN MAX)

[procedure] (signed-number-range BITS)

[procedure] (unsigned-number-range BITS)

[variable] a-comment


(sequence (zero-or-more (in horizontal-whitespace))
                  (is #\;)
                  (zero-or-more (in (char-set-union

[variable] a-symbol-name


     (zero-or-more (in (char-set-union char-set:letter+digit
                       (string->char-set "_-+*/!?%.:=")))))

[procedure] (a-label TARGET)

[variable] a-local-label


(bind (as-string (preceded-by (is #\_) a-symbol-name))
      (lambda (r)
        (result (list 'local-label
               (string-downcase (string-append "_" r)))))))

[procedure] (a-symbol TARGET)

[variable] a-sexp-directive


(sequence* ((_ (zero-or-more (in horizontal-whitespace)))
        (_ (is #\.))
        (sexp a-sexp))
           (result (list 'sexp-directive sexp (required-symbols sexp))))

[variable] a-sexp-directive-string


(sequence* ((_ (zero-or-more (in horizontal-whitespace)))
        (_ (is #\.))
        (sexp a-sexp-string))
           (result (string-append "." sexp)))

[procedure] (an-opcode TARGET)

[variable] an-operand


     (zero-or-more (in horizontal-whitespace))
     (any-of a-sexp-directive-string
         (bind (as-string
            (sequence (maybe (is #\,))
                   (in (char-set-difference
                    (char-set-union char-set:graphic
                            (string->char-set " "))
                    (string->char-set ";,"))))))
           (lambda (r)
             ;; (print "parsed " r)
             (result (string-downcase r)))))
     (maybe (is #\,)))

[procedure] (an-instruction TARGET)

[variable] a-atom


       (any-of a-quoted-string
        (in (char-set-union char-set:letter+digit
       (zero-or-more (in char-set:whitespace))))

[variable] a-toplevel-atom


       (any-of a-quoted-string
        (in (char-set-union char-set:letter+digit
       (zero-or-more (in horizontal-whitespace))))

[variable] a-cons


     (as-string (sequence (maybe (is #\'))
              (is #\()
              (zero-or-more (any-of a-atom a-cons))
              (is #\))
              (zero-or-more (in char-set:whitespace)))))

[variable] a-toplevel-cons


(as-string (sequence (maybe (is #\'))
             (is #\()
             (zero-or-more (any-of a-atom a-cons))
             (is #\))
             (zero-or-more (in horizontal-whitespace))))

[variable] a-sexp-string

default: (as-string (any-of a-toplevel-atom a-toplevel-cons))

[variable] a-sexp

default: (bind a-sexp-string (lambda (s) (result (with-input-from-string s read))))

[procedure] (a-numeric TARGET)

[procedure] (a-directive-using-string-operand ID)

[variable] include-directive

default: (a-directive-using-string-operand 'include)

[variable] incbin-directive

default: (a-directive-using-string-operand 'incbin)

[variable] cpu-directive

default: (a-directive-using-string-operand 'cpu)

[procedure] (org-directive TARGET)

[procedure] (numeric-operands TARGET)

[procedure] (a-directive-using-max-2-numeric-operands ID TARGET)

[procedure] (align-directive TARGET)

[procedure] (ds-directive TARGET)

[procedure] (a-directive-using-multiple-numeric-operands ID TARGET)

[procedure] (dw-directive TARGET)

[procedure] (dl-directive TARGET)

[procedure] (db-directive TARGET)

[procedure] (a-directive TARGET)

[procedure] (a-assign TARGET)


[procedure] (an-element TARGET)

[variable] a-blank-line


(bind (followed-by (sequence (zero-or-more (in horizontal-whitespace))
                 (maybe a-comment))
               (any-of (is #\newline)
      (lambda (r) (result '())))

[procedure] (a-line TARGET)

[procedure] (count-newlines STR)

Assembly Procedures

[procedure] (is-local-symbol? SYM)

[procedure] (have-all-symbols? REQUIRED-SYMBOLS STATE)

[procedure] (symbol-lookup S STATE DEFAULT)

[procedure] (eval-operand OP STATE)

[procedure] (do-instruction NODE STATE)

[procedure] (do-assign NODE STATE)

Execute .equ directive.

[procedure] (do-label NODE STATE)

Create a global symbol and set it to the current origin. This will also set the current local-namespace.

[procedure] (do-local-label NODE STATE)

Create a local symbol and set it to the current origin. This will also create a global symbol which prefixes the current local-namespace.

[procedure] (get-fill-param NODE STATE)

Get the fill byte value for align/ds nodes.

[procedure] (word->bytes W TARGET)

[procedure] (long->bytes L TARGET)

[procedure] (string->bytes STR)

[procedure] (do-directive NODE STATE)

Execute an asm directive.

[procedure] (do-sexp-directive NODE STATE)

Execute a sexp-directive.

[procedure] (do-md-result NODE STATE)

[procedure] (do-swap-namespace NODE STATE)

[procedure] (do-swap-target NODE STATE)

[procedure] (assemble-node AST-NODE STATE)

Dispatch AST-NODE to evaluator procedures.

[procedure] (ast->bytes AST)

[variable] target-cache


(let ((cache '()))
  (lambda args
    (if (null? args)
        cache
        (case (car args)
          ((add) (alist-update! (cadr args) (caddr args) cache))
          ((get) (alist-ref (cadr args) cache))
          (else (error 'target-cache
                       (string-append "Invalid command "
                                      (->string args))))))))

[procedure] (construct-target #!key ENDIAN (REGISTERS '()) (REGISTER-SETS '()) (ADDRESSING-MODES '()) (FLAGS '()) (FLAG-SETS '()) (EXTRA '()) INSTRUCTIONS)

Low-level interface for make-target.

[procedure] (make-target TARGET-NAME)

Creates an asm-target struct for the given TARGET-NAME, which must be a symbol.

[procedure] (strip-source SOURCE)

Remove comments, trailing whitespace, and empty lines.

[procedure] (split-source-lines SOURCE)

Split raw assembly source code into lines. Helper for throw-syntax-error.

[procedure] (throw-syntax-error SOURCE REMAINDER)

Throw an exception for a syntax error.

[procedure] (parse-source SOURCE TARGET)

Parse the given assembly SOURCE and output the abstract syntax tree. SOURCE must be a string.


Internal helper function. Do not use this directly unless you know what you're doing.

[procedure] (make-assembly TARGET-CPU SOURCE #!optional (ORG 0) (EXTRA-SYMBOLS '()))

Parses the assembly source code string SOURCE and returns an assembly object. TARGET-CPU must be a symbol identifying the initial target CPU architecture.

The resulting assembly object ASM can be called as follows:

(ASM 'done?)

Returns #t if no more passes are required to complete the assembly.

(ASM 'ast [AST])

With no additional arguments, returns the current abstract syntax tree. Otherwise, set the AST to the list of nodes given as the second argument.

(ASM 'local-namespace)

Returns the current local assembly namespace. Mostly used internally.

(ASM 'target)

Returns an asm-target struct defining the current target CPU architecture.

(ASM 'assemble MAX-PASSES)

Assemble the source, using a maximum of MAX-PASSES passes.

(ASM 'current-origin)

Returns the current origin. This will be equal to the initial origin if no assembler passes have been performed yet. Otherwise it will be the address of the first byte after the assembled output if known, or #f in any other case.

(ASM 'result)

If no more passes are required, returns the assembled machine code as a list of integer values. Otherwise, returns #f.

(ASM 'set-base-origin ADDRESS)

Change the base origin to ADDRESS. This is useful for parsing the source without specifying an origin address. It has no effect once one or more assembly passes have been performed.

(ASM 'symbols [SYMBOLS])

With no additional arguments, returns the list of currently defined symbols. Otherwise, update the symbol table with the given alist of SYMBOLS. Previously existing symbols will have their definition updated, and new symbols are added to the table.

(ASM 'copy)

Create a fresh copy of the assembly object in its current state.
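
The message protocol above might be used roughly as follows. This is a sketch under assumptions: it presumes that a target named z80 can be loaded from *schemta-include-path* and that the two-line source is valid for that target. Only the messages documented above are used.

```scheme
(import schemta)

;; Assumption: a z80 target definition is available, and this source is
;; valid for it. ORG is passed as the optional third argument.
(define asm (make-assembly 'z80 "    nop\n    ret\n" #x8000))

;; Run at most 3 passes over the source.
(asm 'assemble 3)

;; 'result only yields machine code once no more passes are required.
(if (asm 'done?)
    (print "bytes: " (asm 'result))
    (print "still unresolved; symbols so far: " (asm 'symbols)))
```

Working with the assembly object directly, rather than calling assemble, keeps the intermediate state (symbols, AST, current origin) available for inspection between passes.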

[procedure] (assemble TARGET-CPU SOURCE #!key (ORG 0) (EXTRA-SYMBOLS '()) (MAX-PASSES 3))

Assemble the string SOURCE, returning a list of byte values. TARGET-CPU must be a symbol identifying the instruction set to use. #:org takes a start address (origin, defaults to 0). #:extra-symbols takes a list of key, value pairs that will be defined as assembly-level symbols. Note that the values may be arbitrary types. You can change the maximum number of assembler passes to run by specifying #:max-passes.
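
A minimal call might look like the sketch below. It assumes that a target named z80 can be loaded from *schemta-include-path* and that the source shown is valid for it; the origin and pass count are arbitrary example values.

```scheme
(import schemta)

;; Assumption: a z80 target definition is available, and this source is
;; valid for it.
(define code
  (assemble 'z80
            "    nop\n    ret\n"
            #:org #x8000
            #:max-passes 2))

;; CODE is a list of byte values.
(print code)
```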

[procedure] (asm-file->bytes TARGET-CPU FILENAME #!key (ORG 0) (EXTRA-SYMBOLS '()) (MAX-PASSES 3))

Read and assemble the source file FILENAME, returning a list of byte values. See assemble for further details.

[procedure] (asm-file->bin-file TARGET-CPU INFILENAME #!key (OUTFILENAME (string-append infilename ".bin")) (ORG 0) (EXTRA-SYMBOLS '()) (MAX-PASSES 3) EMIT-SYMBOLS EMIT-LISTING)

Read and assemble INFILENAME, and write the result to OUTFILENAME. The EMIT-SYMBOLS and EMIT-LISTING keys are currently unused. See assemble for further details.