147 commits

Author SHA1 Message Date
Jack Palevich
8968e8e115 Change assignment code gen to use leaR0 .. storeR0ToTOS.
This is another step towards being able to handle lval / rvals.

Improved storeR0ToTOS to convert R0 to the proper type to store into
*TOS. (This is something that storeR0 was already doing.)

Removed storeR0 as it is no longer being used.
2009-07-30 16:57:33 -07:00
Jack Palevich
8f361faffc Fix bad ARM code generation for '||' and '&&' operators.
Add tests of '&', '&&', '|' and '||' operators.
2009-07-30 16:19:43 -07:00
Jack Palevich
9f51a26961 Load function symbols using lea syntax.
Use a common code path for ordinary, forward, and indirect calls.
2009-07-29 16:22:26 -07:00
Jack Palevich
a7813bda4a Remove loadR0 in favor of lea + loadR0FromR0.
(This is another small step towards lval/rval.)

+ Use strd to store local doubles.
+ loadR0FromR0 now handles pointers.
2009-07-29 11:36:04 -07:00
Jack Palevich
ddf7c9c141 Implement inc/dec in a more lval-friendly way. 2009-07-29 10:28:18 -07:00
Jack Palevich
7fcdf1c5f8 Adjust stack alignment for local variables to work more like arguments.
This makes it easier to generate frame-pointer-relative addresses for ARM.

Prior to this we had stored char sized local variables in the highest
address of the 4-byte stack allocation. Now we store "char"s in the
lowest address of the 4-byte stack allocation, just like chars are
passed as arguments.

We now store global chars on byte boundaries.
2009-07-27 17:54:10 -07:00
Jack Palevich
2ff5c22e96 Keep track of the current arena.
This means we don't have to pass it around as an argument.

This change was made because I'm about to start creating pointer types
during expression evaluation, and I didn't want to add an arena
argument to all the expression functions.
2009-07-23 15:11:22 -07:00
Jack Palevich
89baa2083f Fix the ARM postdecrement operator.
Add a test for ++ and -- so this bug won't happen again.
2009-07-23 11:45:15 -07:00
Jack Palevich
58c30eef99 Code generator cleanup
Factor ARM integer binary operation setup code into a function.

Don't pass redundant pType information into loadR0FromR0, storeR0ToTOS,
gcmp, gUnaryCmp, li.

Separate inc/dec from variable loading. Generates worse code, but now
we handle pointer inc/dec and char inc/dec.
2009-07-17 16:35:23 -07:00
Jack Palevich
b40367bde1 Remove unused logging code. 2009-07-17 13:51:51 -07:00
Jack Palevich
ba929a4ffa Track lvalues vs. rvalues. 2009-07-17 10:20:32 -07:00
Jack Palevich
3377bfd845 Report error (rather than crashing) when a declaration name is missing.
Repro case:

void main()
{
   int );
}
2009-07-16 19:05:07 -07:00
Jack Palevich
8148c5be54 Coerce R0 to destination type before storing it into a variable. 2009-07-16 18:24:47 -07:00
Jack Palevich
dc45646238 Implement a "#line" directive. 2009-07-16 16:50:56 -07:00
Jack Palevich
b1544cad42 Detect assignments to undeclared variables.
Previously we only detected reading from undefined variables.
2009-07-16 15:09:20 -07:00
Jack Palevich
ce105a9082 If the compile failed, return NULL from symbol lookups. 2009-07-16 14:30:33 -07:00
Jack Palevich
d1f57e689b Improve error handling
Don't segfault if the right-hand operand of a binary operator is missing.
Don't segfault if a semicolon is missing at the end of a forward
declaration.
2009-07-15 18:23:22 -07:00
Jack Palevich
2aaf21f1be Improve numerical constant parsing. 2009-07-15 16:16:37 -07:00
Jack Palevich
8c246a9dc2 Add accRegisterSymbolCallback API to control external symbol linkage.
Until now dlsym was used to lookup external symbols. Now you can
register your own function to be called when an undefined symbol is
used.
2009-07-14 21:14:10 -07:00
Jack Palevich
fd3db48e2e Add test for passing floats and doubles as ints, floats, and doubles. 2009-07-14 19:39:36 -07:00
Jack Palevich
37c54bd22e Make forward declarations of external symbols really work.
Until now we had always been treating external variables as "int",
and external functions as int (...);
2009-07-14 18:35:36 -07:00
Jack Palevich
7ecc5556ae Remove unused variable. 2009-07-14 16:24:55 -07:00
Jack Palevich
a8f427f606 Implement pointer arithmetic. 2009-07-13 18:40:08 -07:00
Jack Palevich
25c0ccaed4 Implement support for "char" local and global variables. 2009-07-13 16:56:28 -07:00
Jack Palevich
45431bc252 Implement general casts and pointer dereferencing.
Prior to this casts and pointer dereferencing were special-cased.
2009-07-13 15:57:26 -07:00
Jack Palevich
59178c0a3d Run tests on both ARM and x86 2009-07-13 14:15:18 -07:00
Jack Palevich
b7718b973c Implement floating point for ARM. 2009-07-09 22:00:24 -07:00
Jack Palevich
bab8064203 Add x86 floating point test. 2009-07-09 13:54:54 -07:00
Jack Palevich
2a4e1a9f88 Finish implementing x86 floating point
Support floating-point if/while statements: if(1.0) { ... }
Support reading values from float and double pointers.

And some additional error checking.
Detect malformed "return" statements
Detect passing the results of "void" functions as arguments.
2009-07-09 13:34:25 -07:00
Jack Palevich
a39749f641 Implement x86 floating point operations
+ unary floating point operation: -
+ unary floating point compare: !
+ binary floating point operations: + - * /
+ binary floating point comparisons: < <= == != >= >
2009-07-08 20:40:31 -07:00
Marco Nelissen
eea5ae9ceb Class with virtual methods should have virtual destructors too. 2009-07-08 17:43:17 -07:00
Jack Palevich
9cbd226960 Implement global, local, and stack based float and double variables. 2009-07-08 16:48:41 -07:00
Jack Palevich
128ad2d204 Implement x86 int <--> float. 2009-07-08 14:51:31 -07:00
Jack Palevich
1a539db23c Some x86 floating point code works.
We now check the types and number of arguments to a function.
2009-07-08 13:04:41 -07:00
Jack Palevich
8df4619e09 Start tracking types in expressions. 2009-07-07 14:48:51 -07:00
Jack Palevich
1aeb87b52b Parse floating point (and double) constants. 2009-07-06 18:33:20 -07:00
Jack Palevich
9eed7a2c7b Start teaching the code generator about types.
Remove the concept of "R1" from the code generator API, R1 is used
internally as needed.
2009-07-06 17:24:34 -07:00
Jack Palevich
95727a0b05 Initial support for float, double. 2009-07-06 12:07:15 -07:00
Jack Palevich
3f22649d98 Implement our hard casts using our type system.
Doesn't add any new capabilities, but we now generate error
messages if you attempt casts we don't support,
and we also found and fixed some bugs in declaration parsing.
2009-07-02 14:46:19 -07:00
Jack Palevich
40600de143 Clean up expression code.
Concatenate adjacent string literals.
2009-07-01 15:32:35 -07:00
Jack Palevich
8635198c57 Add a type system.
We now track the declared type of variables and functions.
2009-06-30 18:09:56 -07:00
Jack Palevich
569f135862 Implement a token table and an arena allocator.
+ Tokens are now simple IDs, rather than ids or maybe pointers.
+ We can now allocate data that's freed automatically when
  compilation ends or when a block goes out of scope.
+ Renamed our Array utility class to Vector, and made its
  API work a little more like the STL vector template class.
2009-06-30 10:16:43 -07:00
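The arena idea described in commit 569f135862 (data freed automatically at the end of compilation or when a block goes out of scope) can be sketched roughly as below. This is only an illustration, not the libacc implementation: the `Arena` struct and the `arena_*` function names are hypothetical, and the real allocator may grow dynamically rather than use a fixed buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed-size bump allocator; names are illustrative only. */
typedef struct Arena {
    char *base;       /* start of the backing buffer */
    size_t used;      /* bytes handed out so far */
    size_t capacity;  /* total size of the backing buffer */
} Arena;

/* Returns 0 on success, -1 if the backing buffer cannot be allocated. */
static int arena_init(Arena *a, size_t capacity) {
    a->base = malloc(capacity);
    a->used = 0;
    a->capacity = a->base ? capacity : 0;
    return a->base ? 0 : -1;
}

/* Hand out `size` bytes, rounded up to a multiple of 8; NULL if full. */
static void *arena_alloc(Arena *a, size_t size) {
    if (size > SIZE_MAX - 7) {
        return NULL;
    }
    size = (size + 7) & ~(size_t)7;
    if (size > a->capacity - a->used) {
        return NULL;
    }
    void *p = a->base + a->used;
    a->used += size;
    return p;
}

/* Remember the current position, e.g. when a block scope is entered. */
static size_t arena_mark(const Arena *a) {
    return a->used;
}

/* Free everything allocated since `mark`, e.g. when the block ends. */
static void arena_release_to(Arena *a, size_t mark) {
    assert(mark <= a->used);
    a->used = mark;
}

/* Free the whole arena, e.g. when compilation ends. */
static void arena_destroy(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->used = 0;
    a->capacity = 0;
}
```

With a scheme like this, leaving a block is a single `arena_release_to` call, so individual objects never need to be freed one by one.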
Jack Palevich
609c994f7b Rewrite compiler test using python.
Nice because we're now checking the output for success/failure
automatically rather than by eye.
2009-06-25 13:55:12 -07:00
-b master
422972cb12 Align ARM stack pointer to an 8-byte boundary when calling functions.
This is required by the ARM EABI standard.
2009-06-17 19:13:52 -07:00
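The 8-byte alignment requirement mentioned in commit 422972cb12 comes down to simple rounding arithmetic. The sketch below is an assumption about how such padding could be computed, not the code generator's actual logic; `align_up` and `call_padding` are hypothetical helpers, and the sketch assumes the stack pointer was already 8-byte aligned before the arguments were pushed.

```c
#include <assert.h>
#include <stdint.h>

/* Round `size` up to the next multiple of `alignment` (a power of two). */
static uint32_t align_up(uint32_t size, uint32_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Extra bytes to reserve so that pushing `arg_bytes` of outgoing
 * arguments leaves the stack pointer on an 8-byte boundary. */
static uint32_t call_padding(uint32_t arg_bytes) {
    return align_up(arg_bytes, 8) - arg_bytes;
}
```

For example, three 4-byte arguments occupy 12 bytes, so 4 bytes of padding would be needed before the call.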
Jack Palevich
a1804ddeba Allow local variables to be declared anywhere in a block. 2009-06-12 14:40:04 -07:00
Jack Palevich
d7461a7342 Support variable initialization.
Global variables can only be initialized to integer constants.

Local variables can be initialized to arbitrary expressions.
2009-06-12 14:26:58 -07:00
Jack Palevich
f1728bec74 Reserve all C99 keywords.
And improve checks/error messages around using non-symbols where a
symbol is expected.
2009-06-12 13:53:51 -07:00
Jack Palevich
22e3e8e1a6 Handle end-of-file inside of comments, local declarations. 2009-06-12 13:12:55 -07:00
Jack Palevich
b4758ff1de Implement string and character backslash constants.
Until now we only supported '\n'. Now we support everything, including
octal ('\033') and hex '\x27' constants.
2009-06-12 12:49:14 -07:00
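The escape handling described in commit b4758ff1de (simple escapes, octal such as '\033', and hex such as '\x27') might look something like the following. This is a minimal sketch with a hypothetical `decode_escape` helper, not the parser from libacc; it follows the usual C rules of at most three octal digits and any number of hex digits.

```c
#include <assert.h>
#include <ctype.h>

/* Decode one escape sequence. *pp points just past the backslash and is
 * advanced past the characters consumed. Returns the character value. */
static int decode_escape(const char **pp) {
    const char *p = *pp;
    int value = 0;
    assert(*p != '\0');
    if (*p >= '0' && *p <= '7') {
        /* Octal: up to three digits. */
        for (int i = 0; i < 3 && *p >= '0' && *p <= '7'; i++) {
            value = value * 8 + (*p++ - '0');
        }
    } else if (*p == 'x') {
        /* Hex: consume every following hex digit. */
        p++;
        while (isxdigit((unsigned char)*p)) {
            int c = tolower((unsigned char)*p++);
            value = value * 16 + (isdigit(c) ? c - '0' : c - 'a' + 10);
        }
    } else {
        char c = *p++;
        switch (c) {
        case 'a': value = '\a'; break;
        case 'b': value = '\b'; break;
        case 'f': value = '\f'; break;
        case 'n': value = '\n'; break;
        case 'r': value = '\r'; break;
        case 't': value = '\t'; break;
        case 'v': value = '\v'; break;
        default:  value = (unsigned char)c; break; /* \\, \', \", \? */
        }
    }
    *pp = p;
    return value;
}
```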
Jack Palevich
2ccc40d096 Make #define work again. (Had accidentally omitted the keyword.) 2009-06-12 11:53:07 -07:00
Jack Palevich
a6baa23f08 Improve symbol-related error checking
+ Duplicate symbols.
+ Undefined variables.
+ Forward-referenced functions that were never defined.
2009-06-12 11:25:59 -07:00
Jack Palevich
61d22dc763 Improve nested variable test.
Test that we can have two levels of local variables.
2009-06-11 22:03:24 -07:00
Jack Palevich
b67b18f7c2 Add code generator tracer. 2009-06-11 21:50:17 -07:00
Jack Palevich
303d8ffca9 Improve local variable scoping.
Until now we faked local variables -- they only worked correctly if
there was no overlap between local variables and global variables.

Use a symbol table stack instead of a string list.

Fix bug with looking up undefined symbols.
2009-06-11 21:47:57 -07:00
Jack Palevich
2db168f12f Use a separate table for keywords. 2009-06-11 14:29:47 -07:00
Jack Palevich
0a280a0dde Remove use of setjmp/longjmp from libacc compiler.
It makes it harder to deal with memory allocation.

Also fix bug in the otcc-ansi.c test, where the wrong part of the
code buffer was being mprotected, so that if the code buffer happened
to be allocated across a page boundary, some code would not receive
execute permission.
2009-06-11 10:53:51 -07:00
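The page-boundary bug fixed in commit 0a280a0dde is easy to hit because memory protection applies to whole pages. One way to avoid it, sketched below under stated assumptions, is to round the buffer's start down and its end up to page boundaries before changing permissions. `PageRange` and `covering_pages` are hypothetical names, and this is not the code from otcc-ansi.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A page-aligned address range. */
typedef struct {
    uintptr_t start;  /* rounded down to a page boundary */
    size_t length;    /* multiple of the page size */
} PageRange;

/* Compute the smallest run of whole pages that covers the buffer
 * [addr, addr + size). `page_size` must be a power of two. */
static PageRange covering_pages(uintptr_t addr, size_t size, size_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    uintptr_t mask = ~(uintptr_t)(page_size - 1);
    uintptr_t start = addr & mask;
    uintptr_t end = (addr + size + page_size - 1) & mask;
    PageRange r = { start, (size_t)(end - start) };
    return r;
}
```

On a POSIX system the result could then be passed to `mprotect((void *)r.start, r.length, PROT_READ | PROT_WRITE | PROT_EXEC)`, with the page size taken from `sysconf(_SC_PAGESIZE)`. A 32-byte buffer that starts 16 bytes before a page boundary yields a two-page range, so both halves of the code receive execute permission.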
Jack Palevich
8dc662efe9 Make otcc code work in x64 based system with 32-bit chroot.
Set execute permission on code before running it.
Handle negative relative offsets for global variables.
Add printfs to report the progress of nested compiles.
Change way we detect whether we can run the host compiler
or not. We used to check if we were running on a 32-bit
Linux. Now we check if the executable is a 32-bit Linux
executable.
2009-06-09 22:59:04 +00:00
Jack Palevich
36d9414f72 Make a host version of acc for testing.
Don't run the code we've compiled unless the -R option is present.
2009-06-08 15:55:32 -07:00
Jack Palevich
2d11dfba27 Move macros into their own table.
Catch attempts to define macros with parens (not supported.)
2009-06-08 14:34:26 -07:00
Jack Palevich
b7c81e9952 Switch to ANSI C style C function declarations.
main(argc, argv) --> int main(int argc, char** argv)

Although we accept int, void, and char types, and pointers to same,
we actually still treat everything as an int.
2009-06-04 19:56:13 -07:00
Jack Palevich
eedf9d2083 Add support for #pragma foo(bar)
Report unsupported/unknown preprocessor directives.
Report line number of error rather than character offset.
2009-06-04 16:38:35 -07:00
Jack Palevich
f1f39cca30 Make sure we don't overflow various internal compiler buffers.
We may replace some of these tables with dynamically growing data
structures, but in the meantime we will not trash memory.
2009-05-29 18:03:15 -07:00
Jack Palevich
ac0e95eb60 Improve ACC error reporting.
Now return an error code and an error message, rather than just
printing to stderr or calling exit().

Check to see we don't exceed our code size.
2009-05-29 13:53:44 -07:00
Jack Palevich
653f42da92 Pointer-ize the acc front end.
The ACC compiler used to be able to compile itself. This was a neat
feature, but because ACC only supports ints, pointers are stored as
ints, and cast to pointers when used.

This checkin turns many ints that are really pointers back into
pointers, so that the code is clearer.

 int ch;
 char* glo;
 char* sym_stack;
 char* dstk;
 char* dptr;
 int dch;
 char* last_id;
2009-05-29 09:32:14 -07:00
Jack Palevich
09555c7a18 Fix symbol lookup logic, squelch LOG output. 2009-05-27 12:25:55 -07:00
Jack Palevich
1cdef20774 Convert libacc into a shared library.
Document internal CodeGenerator interface.

Move license to a separate license file.

Define a public API for calling libacc.

Update the "acc" test program to use the public API.
Move "main.cpp" and test scripts into the tests subdirectory.
Move test data from tests to tests/data
Remove stale test data.
2009-05-22 12:09:55 -07:00
Jack Palevich
8b0624c3d3 Fix x64 int / pointer warnings. 2009-05-20 12:12:06 -07:00
Jack Palevich
e7b590666d Implement architecture-dependent defaults.
If libacc is built on x86, then x86 is the default code generator.
If libacc is built on ARM, then ARM is the default code generator.
And so on for future architectures.

The 64-bit x64 machine has no working code generator currently.
We may add one to support the simulator builds.

Improved the test program so we don't try to run tests if the
compile failed. Also avoid running tests that don't work on
a given platform.
2009-05-20 11:27:04 -07:00
Jack Palevich
274663bf67 Add a test script for testing the libacc compiler on ARM. 2009-05-19 14:07:41 -07:00
Jack Palevich
3d474a74a7 ACC ARM codegen: implement /, % 2009-05-15 15:12:38 -07:00
Jack Palevich
7810bc9abd ACC ARM codegen: Implement calling indirect functions. 2009-05-15 14:31:47 -07:00
Jack Palevich
4d93f30bef ACC ARM code gen: Implement global variables.
Collapsed the inc/dec codegen into the loadEAX function, because it
allows us to generate better code for inc/dec'ing a global variable
on ARM, because we don't have to load the variable's address twice.
2009-05-15 13:30:00 -07:00
Jack Palevich
9918d0a2ee Add license, document language changes. 2009-05-15 11:01:21 -07:00
Jack Palevich
bd894904f7 ACC: Arm code gen improvements ++/--, &, odds and ends
Added C++-style "//..end-of-line" comments, since I kept trying to use them when writing
test programs.

The biggest known missing piece is global variables.
2009-05-14 19:35:31 -07:00
Jack Palevich
8de461dc9e Implement <, >, ==, !=, >=, <=, &&, and ||. 2009-05-14 17:21:45 -07:00
Jack Palevich
69796b6c84 ACC ARM code gen: Implement local variables, function args
+ Fix prolog and epilog code.
2009-05-14 16:43:18 -07:00
Jack Palevich
cb1c9ef38c ACC ARM code gen improvements. printf("Hello, world\n"); works!
+ Improved li to handle all 32-bit values.
+ Implemented push/pop of temp registers during evaluation
+ Implemented the unary and binary easy math operators (+,-,*,<<,>>,|,&,^,~)
+ Implemented global function calling.
2009-05-14 14:27:06 -07:00
Jack Palevich
a653561097 ARM codegen: Add disassembler, implement return
This program works:

    main() { return 42; }

The disassembler was borrowed from codeflinger, and just modified enough to compile
under C++ without warnings.

Implemented gsym
Implemented a hack version of li, only works for -256..255
Implemented gjmp
2009-05-13 19:51:03 -07:00
Jack Palevich
546b2249ef Begin filling in ARM code generator.
We can now call functions that have no arguments (and return from them too!)
2009-05-13 15:10:04 -07:00
Jack Palevich
2230513fc0 Add stub Arm code generator. 2009-05-13 10:58:45 -07:00
Jack Palevich
bf42c9c163 Move all x86-specific knowledge into the X86CodeGenerator. 2009-05-12 13:46:16 -07:00
Jack Palevich
21a15a2416 Various C++ improvements
+ Changed indentation, causes the stats to say there's a lot of lines changed.
+ Should be able to compile multiple times with the same compiler object.
+ Create a CodeBuffer class to hold the code.
+ Create a CodeGenerator class to encapsulate knowledge of the CPU instruction set.
+ Started filling in the code generator.
2009-05-11 18:49:27 -07:00
Fabrice Bellard
a96930572c Document acc language features.
Original text from http://www.ioccc.org/2001/bellard.hint
2009-05-11 14:51:47 -07:00
Jack Palevich
bbf8ab504a Added command-line option "-t" to allow run-time switching between running and dumping.
Fixed some C++ warnings reported by g++.
Verified that the compiler actually works when run on 32-bit Linux.
2009-05-11 11:54:30 -07:00
Jack Palevich
77ae76eea9 converted to C++
Base address of constant table changed, so had to update the "-orig" files.
2009-05-10 19:59:24 -07:00
Jack Palevich
f6b5a531d8 Remove all gcc warnings. 2009-05-10 19:16:42 -07:00
Jack Palevich
e27bf3eb29 Replace acc.c with the contents of otccn.c, update tests.
We are no longer checking if the constant data is the same, just the
generated code.
2009-05-10 14:09:03 -07:00
Fabrice Bellard
16134598fb Original version of otccn.c from http://bellard.org/otcc/otccn.c 2009-05-10 14:01:59 -07:00
Jack Palevich
431055cc9b More deobfuscation. 2009-05-08 20:30:47 -07:00
Jack Palevich
7448a2ebb7 Converted code constants from decimal to hexadecimal. 2009-05-08 20:30:47 -07:00
Jack Palevich
f0cbc92fc0 More deobfuscation. 2009-05-08 20:30:47 -07:00
Jack Palevich
50791f5466 Make our global variables static. 2009-05-08 20:30:47 -07:00
Jack Palevich
d160530a58 Got rid of all warnings. (Yeah, right.) 2009-05-08 20:30:47 -07:00
Jack Palevich
f54db02e5d Add a simple regression test framework. 2009-05-08 20:30:47 -07:00
Jack Palevich
ae54f1fba8 Continue deobfuscation.
Add a license.

Indent the code.
Add includes for external functions.
Improve function prototypes.
Start adding the correct type to things that are pointers.
Start fixing compiler warnings.

Instead of directly executing the compiled code (which only works on x86 Linux),
write the internal compiler state to stdout. This makes it easier to test whether
or not our refactoring is breaking anything.

Return a zero status on success.

Add error checking for a missing input file.
2009-05-08 20:30:47 -07:00
Jack Palevich
883114867a Start de-obfuscation process. 2009-05-08 20:30:47 -07:00
Fabrice Bellard
38aa39a200 Original freeware Obfuscated Tiny C Compiler.
From http://bellard.org/otcc/otcc.c

License from http://bellard.org/otcc:

The obfuscated OTCC and OTCCELF are freeware.
2009-05-08 20:30:47 -07:00