http://www.debreuil.com/CSharp

** UPDATE ** The old parser is gone, but there is a new, completely handwritten (and thus much faster) parser at the above link. Best news of all, no ANTLR, yeeay!

Now back to the old, irrelevant post.

I've been pretty busy recently, working on some interesting projects, and in between working on a C# parser with ANTLR. The parser (grammar file) is now pretty much done. It generates an AST that is modeled fairly closely on the CodeDom, plus some extra pieces. It should be easy to generate CodeDom objects or text from it; that is on the list anyway.

It should parse everything except unsafe code and the #if preprocessor stuff (next up). It is very slow on long, deeply nested expressions (like you sometimes get in ANTLR-generated lines), but reasonable on more normal code. It has not been optimized yet, and it doesn't verify much as far as code integrity goes.

Next up is swf export. I have the action generation stuff written, so it is a matter of munging the AST into that. I suspect it will be fairly easy to do poorly, and very hard to do well. I think I'll start on 'poorly' : ).

posted on Friday, March 05, 2004 4:44 PM
Feedback
  • # re: C# Parser completed
    Burak KALAYCI
    Posted @ 3/5/2004 5:08 PM
    Congrats!

    Best regards,

  • # re: C# Parser completed
    Ahmet Zorlu
    Posted @ 3/7/2004 11:16 AM
    Great stuff,

    Even the default XML serialization of the syntax tree is inspiring enough:

    // AST t = parser.getAST();
    antlr.BaseAST t = (antlr.BaseAST) parser.getAST();
    if (t != null)
    {
        // Write the default XML serialization of the tree next to the source file.
        string ser = fileName + ".ser";
        using (StreamWriter swr = new StreamWriter(ser))
        {
            t.xmlSerialize(swr);
        }
        //...
    }

    Well, this output lacks a root node, but it can be wrapped in an XML file written with XmlTextWriter, or there may be better solutions. The docs say we have to override the XML serialization methods in the base class in order to customize the output. (Disclaimer: I'm new to ANTLR.)
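
    A minimal sketch of the wrapping idea mentioned above, assuming the serialized tree is already available as a string. The names AstXmlWrapper and WrapInRoot and the root element name "ast" are made up for illustration; only the XmlTextWriter calls are real .NET API.

    ```csharp
    using System.IO;
    using System.Xml;

    public static class AstXmlWrapper
    {
        // Wraps an XML fragment (for example the text produced by xmlSerialize)
        // in a single root element so the result is a well-formed document.
        // Hypothetical helper, not part of ANTLR or of the parser discussed here.
        public static string WrapInRoot(string fragment, string rootName)
        {
            StringWriter sw = new StringWriter();
            XmlTextWriter xw = new XmlTextWriter(sw);
            xw.WriteStartElement(rootName);
            xw.WriteRaw(fragment);      // copied as-is, no escaping is applied
            xw.WriteEndElement();
            xw.Flush();
            string result = sw.ToString();
            xw.Close();
            return result;
        }
    }
    ```

    Calling AstXmlWrapper.WrapInRoot(text, "ast") on the serialized text gives a document with one root element. WriteRaw does no validation, so the fragment itself still has to be well formed.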

    Many side projects can be developed: code conversion assistants, UML generators, etc.

    As far as I understand, at the end, we will be able to generate SWF with C#. If so, this also solves the IDE problem.
    Is there a grammar for SWF binary files too? Will it be possible to parse SWF to generate C# code?

    Best of luck,

  • # re: C# Parser completed
    Robin Debreuil
    Posted @ 3/7/2004 3:17 PM
    Cool to look at the XML, thanks for pointing that out : ). I am moving it into a set of .NET-specific classes, mostly because of Java translation issues (getType vs GetType) that keep it from being CLS-compliant. It will look a lot like a CodeDom tree, with a dot structure rather than an AST tree, though it will still be 'walkable'. As you probably noticed, there are a lot of nodes that could be optimized, so that will happen at that stage.

    I'm pretty new to ANTLR too, but between the power of that tool and the scope of .NET, a lot of things become possible, eh?

    This will go to swf, though I'm not sure about swf to C#. I think Burak has done more than everything there already ; ). I didn't use a grammar for the swf part, I just did a hand parser. You can parse binary files with ANTLR, though I'm not sure about reading in tags from their tag length. It would be possible with embedded code, but it might be harder in the end than just doing it by hand, not sure. I have a swf parser for graphics and action tags done, and a generator for the action tags. So now it is just a matter of getting an emitter from the AST. I hope that doesn't take too long...

    I think instead of trying to get a 'good' compiler (one that debugs etc.) I will make .NET stub classes that map to the built-in classes of swf (like movieclip). Then you can build using the Microsoft csc compiler, and as a second step export to swf. That way, you can be sure you have correct code, which makes life quite a bit easier. Eventually I'd like to improve the compiler, but there is so much else to do first... Also, I'd like to fill in the stub code so that you can actually export a swf clip as a .NET program.

    Anyway, lots to do. I'll post more code when it starts going to swf, or doing something useful in general. If you have any opinions on structure etc., I'm all ears : ).

    Cheers,
    Robin

  • # re: C# Parser completed
    Florian Krüsch
    Posted @ 3/10/2004 4:13 PM
    This is very cool! I can imagine that C# maps very nicely to Flash bytecode... maybe with some exceptions like overloaded methods.
    It would be cool to even borrow the .NET event model, with events and delegates as well, maybe using an underlying (hidden) base library.

    best
    Florian

  • # re: C# Parser completed
    Robin Debreuil
    Posted @ 3/11/2004 4:29 AM
    Yeah, mapping IL to bytecode would be even closer I think, very neat. I don't think I'll map it too closely though; so far I'm planning on using array tables (and registers) for pretty much everything. I'm hoping this will be faster, perhaps easier, and certainly obfuscated'er. I haven't got it running yet, but basically I was thinking scopes, local vars, functions, and maybe even registry dumps could be stored in arrays, and then just mixed and matched as appropriate. It should be possible to set up an interface to that which might be IL friendly too, perhaps for down the road.

    I totally agree on the events and delegates point you make. Hopefully it will eventually support 100% of CSharp. Even things like unsafe code have a parallel in swf (calls to actionscript/bytecode based swfs). Type Indexers, operator overloading, enums, nested classes... Structs will be a problem in that they likely won't be true value types, but maybe they could be stored in some numeric/string encoded format...? The size and 'valueness' would be preserved, but the performance would likely be worse, rather than better. Anyway, lots to think about.

    I'm just finishing converting the AST to a more useable 'CSharpDom', maybe one more night. I can hardly wait to start generating swfs! So hard not to just skip ahead : ).

  • # re: C# Parser completed
    Todd H.
    Posted @ 3/24/2004 8:12 PM
    Good stuff. Although it doesn't seem to be able to parse complex expressions which involve arrays of a type within multiple levels of a class structure (something which happens with proxy references and typed datasets). Here's the simplest example I can come up with which illustrates this:

    using System;

    namespace Test
    {
        public struct myStruct
        {
            public struct myLevel2Struct
            {
            }
        }

        public class TestClass
        {
            myStruct.myLevel2Struct[] structs = null;
        }
    }

    The parser will return this error:

    Parsing ..\..\TestClass.cs
    ..\..\TestClass.cs
    ..\..\TestClass.cs:12:9: unexpected token: ["public",<193>,line=12,col=9]
    ..\..\TestClass.cs:12:16: unexpected token: ["class",<192>,line=12,col=16]
    ..\..\TestClass.cs:12:22: expecting "RBRACE", found 'TestClass'
    ..\..\TestClass.cs:14:41: unexpected token: ["]",<120>,line=14,col=41]
    ..\..\TestClass.cs:14:43: unexpected token: ["structs",<93>,line=14,col=43]

    Cheers,
    -TH

  • # re: C# Parser completed
    Robin Debreuil
    Posted @ 3/25/2004 12:18 AM
    Right you are... I have a new version pretty much ready, but it has the same problem, so double thanks : ).

    It was the structure a.b.c[] x = null; that it had a problem with. I've had a hard time with arrays vs regular variables, ugh...

    Thanks again!

  • # re: C# Parser completed
    dfh
    Posted @ 4/13/2004 9:35 AM
    what's a c# parser

  • # re: C# Parser completed
    Robin Debreuil
    Posted @ 4/13/2004 2:28 PM
    A parser, in this case, takes the text of a C# program as input and converts it to a structure that is 'meaningful'. For example, there is a namespace element, and it has a collection of classes, interfaces, structs, etc. Then each class would have a collection of fields, methods, properties, etc. Each method has a collection of statements, each statement can have a collection of expressions...

    With this structure, you can do things like emit a series of bytes that a processor understands (make an exe) or re-emit text for another language (or the same language). You can also do code analysis (find errors, generate UML), or in this case, generate actions for swf files.
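
    The nesting described above can be sketched roughly as follows. The node types here (NamespaceNode, ClassNode, MethodNode, TreeDump) are hypothetical and much simpler than anything a real parser would produce; they only illustrate the "collection of collections" shape and a trivial walk over it.

    ```csharp
    using System.Collections.Generic;

    // Hypothetical node types, not the classes used by the actual parser.
    public class MethodNode
    {
        public string Name;
        public List<string> Statements = new List<string>();
        public MethodNode(string name) { Name = name; }
    }

    public class ClassNode
    {
        public string Name;
        public List<string> Fields = new List<string>();
        public List<MethodNode> Methods = new List<MethodNode>();
        public ClassNode(string name) { Name = name; }
    }

    public class NamespaceNode
    {
        public string Name;
        public List<ClassNode> Classes = new List<ClassNode>();
        public NamespaceNode(string name) { Name = name; }
    }

    public static class TreeDump
    {
        // Walks the tree and lists each method as Namespace.Class.Method.
        public static List<string> MethodNames(NamespaceNode ns)
        {
            List<string> names = new List<string>();
            foreach (ClassNode c in ns.Classes)
                foreach (MethodNode m in c.Methods)
                    names.Add(ns.Name + "." + c.Name + "." + m.Name);
            return names;
        }
    }
    ```

    An emitter or an analysis tool would walk the same tree, but instead of collecting names it would write bytes, text, or swf actions for each node it visits.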

    Not terribly exciting stuff, but a compiler is probably the base of all computer work. Hmmm, well at least it is the base interface between people and machines : ).

  • # re: C# Parser completed
    Daniel, a fast typing developer
    Posted @ 8/3/2004 8:03 PM
    I wish VB had something like this. It's another reason I look forward to the completed version of Visual C# Express 2005.

  • # re: C# Parser completed
    Ashraf Shahen
    Posted @ 12/5/2004 5:55 PM
    Hi, can I find (C# Compiler - Version 0.8)?
    Best regards,
