I finally got a nice block of time this weekend and was able to finish the CSharpDom spec and implementation. The parser now parses into this structure and then regenerates C# from it. I haven't done much recompile testing, and there are probably still a few lingering bugs, but this part is mostly done. There is also a visual test harness application (drag and drop C# files to see the various stages of parsing, as well as the tree), with fairly complete API documentation for the C#Dom part.

You can download the application here (just unzip and run, then drag C# files into it):

http://www.debreuil.com/CSharp/CSharpDom_03.zip

The most exciting part is I started generating swf files from this in the last few days (and yes, that is exciting to me -- and yes, I lead a pretty quiet life). There is a fair amount of work to do here, but it is moving along pretty quickly.

What is a post without a screenshot - hopefully I can get a few crude swfs up shortly as well...

posted on Monday, April 12, 2004 5:26 AM
Feedback
  • # re: C# Dom - parser now regenerates C#
    monsieurfil
    Posted @ 4/13/2004 4:02 AM
    That's very very nice!
    I dropped in some .cs and the generated code is quite correct.
    Do you plan to parse or generate .swf?


  • # re: C# Dom - parser now regenerates C#
    Robin Debreuil
    Posted @ 4/13/2004 5:27 AM
    Great to hear : )

    I've found a bug with trailing comments since posting, I'll update shortly (I'm sure there are more, but this was more an antlr 'thing' I didn't fully understand).

    This will **100% FOR SURE** do swfs, both parse (to C#Dom and a generic graphics/timeline format) and generate (graphic + code). I might actually do decompiling swf actions into C# next -- I have that mostly written already. I was going into a different codeDom type format before, but it wasn't working out, so I bit the bullet and made this. It shouldn't be hard to switch as they were similar enough...

    The big decision now is whether to generate swfs using the existing structures for things like classes, inheritance, scope etc, or just to go to a much more basic structure (a bunch of scope tables as arrays or something). The advantage of the first way is that it is really simple and fast, but it differs from the C# spec in subtle ways, and there are even very minor bugs with swfs (like with super). The second method might make for faster code, and certainly code that is easier to obfuscate, but it might be a challenge to make sure it is 100% correct (a challenge for me that is, it wouldn't be mathematically impossible or anything). I would really like the resulting programs to work in both swfs and .net exe's, and I'm hoping the edge differences can be ironed out to make that possible.

    Ultimately this will be a command line compiler for swfs, and then an extra Visual Studio plugin to do the same. It will compile using the MS compiler first, which takes a huge burden off this compiler, as it knows all code coming in will be 'correct'. So it is quite important that there aren't many subtle differences caused by things like the __proto__ inheritance in swf, etc.

    Anyway, tonight I have to concentrate on the gin and tonic I'm drinking mostly. The level keeps changing, very strange.

    Thanks for the feedback : ) - if you see anything really wrong in the csdom please let me know!

  • # re: C# Dom - parser now regenerates C#
    Niklas Petterson
    Posted @ 4/13/2004 8:32 AM
    Nice work Robin. Will the source for all this be available? I can only find an exe in there and I'm looking for a C# parsing solution in C#.

    I need to instrument C# code to insert tracing statements and it seems you have all the needed components here (C# grammar => C# parser, C# AST representation, C# pretty printer from AST).


  • # re: C# Dom - parser now regenerates C#
    Robin Debreuil
    Posted @ 4/13/2004 5:51 PM
    Yeah, I am making a home for it eventually. It won't be 'open source', but the source will be available and it will be free to use (in most situations at least)... There are still some adjustments to make that will change it a bit, though hopefully the basic 'public' structure is about there. I'm doing type attribution now, that should be the last step before getting a 1.0 version. You can run this version from the console (filename [showtree, console, walktree, gencode]).

    If you want to try an early copy let me know - the interface to using it is a bit rough though.


  • # re: C# Dom - parser now regenerates C#
    Bent Rasmussen
    Posted @ 4/15/2004 6:16 AM
    This is utterly cool. I can't wait to see or, better, use the end result.

  • # re: C# Dom - parser now regenerates C#
    Jesse Ezell
    Posted @ 4/24/2004 3:29 AM
    Amazingly cool. I was just thinking about doing c#->swf, but looks like you beat me to it. I have a complete .net swf library in dev (supports reading and writing of swf files). Would it be OK to integrate this parser with my lib for a C#->Actionscript bytecode compiler?

  • # re: C# Dom - parser now regenerates C#
    Jesse Ezell
    Posted @ 4/24/2004 11:20 PM
    I guess it isn't "complete" yet?

    I noticed a few things while playing with this today. I'm sure there are more, but I have a few fixes (if you haven't already fixed this stuff in your local version):

    MethodRef doesn't set name.

    Dom.Param should have member Expression?

    Dom.MethodInvokeExpr should have ParamCollection, not ExpressionCollection for Parameters? (same goes for constructors, etc.).


  • # re: C# Dom - parser now regenerates C#
    Robin Debreuil
    Posted @ 4/26/2004 6:35 PM
    Hey Jesse,

    Sorry I was out for the weekend... First, thanks for having a look. The local copy is mostly complete, and yeah, there were quite a few things unfinished, and more bugs uncovered when doing the type attribution. That is pretty much done: I have all the types and scopes laid out, and most of the typeRef stuff is now looking up and linking to an actual definition rather than just text. There are still some issues to go, though, one being mapping to the correct overloaded method when the argument must be converted (Meth(long l), Meth(uint s) - resolve Meth(5);).
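    To make the overload issue concrete, here is a minimal sketch of the Meth(5) case. The class name OverloadDemo and the string return values are made up for illustration and are not part of the C#Dom code. As far as the C# rules go, the constant 5 converts implicitly to both long and uint, and uint wins as the better conversion because uint converts implicitly to long but not the other way around. A non-constant int only converts implicitly to long, so the same-looking call binds differently.

```csharp
using System;

static class OverloadDemo
{
    // The two overloads from the comment above; the return values are
    // illustrative so the chosen overload is visible.
    public static string Meth(long l) { return "Meth(long)"; }
    public static string Meth(uint s) { return "Meth(uint)"; }

    static void Main()
    {
        int i = 5;

        // Constant 5: both overloads are applicable, uint is the better conversion.
        Console.WriteLine(Meth(5)); // prints "Meth(uint)"

        // Variable of type int: there is no implicit int -> uint conversion,
        // so only Meth(long) is applicable.
        Console.WriteLine(Meth(i)); // prints "Meth(long)"
    }
}
```

    A type attribution pass would need to tell these two call sites apart, which means tracking whether an argument is a constant expression and not just its type.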

    The bad news is I'm going to have to think about how to release this. I don't think this compiler will ever be bulletproof unless it gets somewhat commercial. Right now I can only pick away at parts I need in spare time, other work is taking a priority. Obviously there are a lot of uses for a C# compiler that aren't generating swfs or .net multimedia (which are my goals), so that is great by me if people use it elsewhere. However if there are free swf emitters based on this exact parser, it will be pretty hard to convince people they need to buy a copy I assume.

    I guess what I'll do is finish the emitter and then take a realistic look at how much potential this has for commercial application (I have a .net swf parser/generator built already too, also svg and emf/.net). If it doesn't look good (and given my love of marketing that is likely), then I would probably just open it up to unrestricted use. If it actually is something that could be sustainable, then I would probably put some restrictions on use - so I won't have to compete with myself at least.

    Of course then you could just run out and make your own C# parser (which someone like you could obviously do easily), so I don't know... Are you making an open source project with this or something? Maybe we could figure something out...

    Doing it again I would probably work directly from the IL to swf. That is actually a much more emitter friendly structure to work with.. I might try that yet, I'm not 100% happy with antlr for things like error detection, speed, and working with its trees gives a very 'translated java' bent to your code, that is hard to shake off in one pass..

    Also, as you've probably thought about, there are subtle differences between the C# specs and how actionscript tags work, especially in the type graphs. Other things like operator overloading, indexers, auto conversion of types, unmatchable built-in types etc. mean, I think, that you have to avoid most of the swf type stuff and just emit code that is more like 'procedural assembly' (if that is a word). I'm thinking that is best done with arrays of functions - which should be pretty straightforward if you build it off the scope and symbol tables. I'm thinking a control flow graph will probably be needed for this too, or at least worth doing just to get all the ducks in a row.

    Other things like structs are strange because you don't really have such control over the sizes and layouts of bytes, and treating it as a value type in all cases (eg passing copies to methods) might be tricky. Then again, Ref and Out modifiers will require that type of thing anyway. Enums and delegates require some custom code, attributes will require some type of built in reflection system... I guess it depends on how compatible you want to make it to the C# spec - I'm really hoping to get it where the program can run the same way on either, but there may have to be restrictions. At minimum restricted expectations - what is faster/more efficient in C# isn't always going to hold when going to swf. I'm worried that even a simple cast from int to uint could be tricky to get right 100% of the time.
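    As a rough sketch of why the int to uint cast is worrying: in C# an unchecked cast just reinterprets the 32 bits, while a target that stores every number as a double (which is the assumption made here about the swf side) has to wrap the value by hand. The helper name ToUInt32ViaDouble is hypothetical, and the wrapping rule follows ECMAScript's ToUint32 rather than anything taken from the compiler itself.

```csharp
using System;

static class CastDemo
{
    // Hypothetical helper: wraps a double-valued number into the unsigned
    // 32-bit range, the way a double-only runtime would have to emulate
    // C#'s unchecked (uint) cast. Result stays a double, as it would there.
    public static double ToUInt32ViaDouble(double value)
    {
        if (double.IsNaN(value) || double.IsInfinity(value)) return 0;

        double truncated = Math.Truncate(value);     // drop any fraction
        double wrapped = truncated % 4294967296.0;   // % keeps the sign of the dividend
        if (wrapped < 0) wrapped += 4294967296.0;    // shift negatives into [0, 2^32)
        return wrapped;
    }

    static void Main()
    {
        int negative = -1;

        uint viaCast = unchecked((uint)negative);        // plain C# cast
        double viaDouble = ToUInt32ViaDouble(negative);  // emulated version

        Console.WriteLine(viaCast);   // prints 4294967295
        Console.WriteLine(viaDouble); // prints 4294967295
    }
}
```

    The two agree here, but every cast site in the emitted code would have to go through something like this, which is where the '100% of the time' worry comes from.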

    Anyway, we should keep in touch, swf bytecode enthusiasts form a pretty small world : ).

  • # re: C# Dom - parser now regenerates C#
    enricolu
    Posted @ 5/10/2004 11:45 PM
    Can I ask which library I must reference so I can use the sample code in version 2? For example:
    FileStream s = new FileStream(args[0], FileMode.Open, FileAccess.Read);
    CSharpLexer lexer = new CSharpLexer(s);
    lexer.setFilename(fileName);
    CSharpParser parser = new CSharpParser(lexer);
    parser.setFilename(fileName);

  • # re: C# Dom - parser now regenerates C#
    Chris
    Posted @ 6/27/2004 2:07 AM
    It would be great if you can make this a Reflector add-in. This is exactly the code I'm looking for except for it should be working in Reflector.
