Saturday, 15 November 2008

Programming language idea

Well, it's an idea about a programming language without a programming language.

Think about writing a script that generates assembly. It's doable. Sometimes it's even an *almost* sane thing to do, e.g. when working on a spellchecker I had a crazy idea about generating code that does dictionary lookup for a hardcoded word set (it's not rocket science, just compare, jump, compare, jump...). But it's so low-level and not cross-platform.
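
To make the compare-and-jump idea concrete, here is a minimal sketch of such a generator. It emits C instead of assembly so that the result stays portable, and it assumes the words are plain ASCII without quotes or backslashes. The names `build_trie`, `emit`, `generate_lookup` and `is_known_word` are made up for this illustration.

```python
def build_trie(words):
    """Nested dicts keyed by character; the key "" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return root


def emit(node, depth, out):
    """Emit one C switch per trie level: each case is a compare,
    each nested switch is the jump to the next character."""
    pad = "    " * (depth + 1)
    out.append(f"{pad}switch (s[{depth}]) {{")
    for ch in sorted(node):
        if ch == "":
            # The string ends exactly where a dictionary word ends.
            out.append(f"{pad}case '\\0': return 1;")
        else:
            out.append(f"{pad}case '{ch}':")
            emit(node[ch], depth + 1, out)
    out.append(f"{pad}}}")
    # No case matched at this level, so the word is unknown.
    out.append(f"{pad}return 0;")


def generate_lookup(words, name="is_known_word"):
    """Return C source for a function that tests membership in `words`."""
    out = [f"int {name}(const char *s) {{"]
    emit(build_trie(words), 0, out)
    out.append("}")
    return "\n".join(out)


if __name__ == "__main__":
    print(generate_lookup(["cat", "car", "dog"]))
```

The generated function never reads past the terminating `'\0'`, because it only looks at the next character after the current one has matched a non-terminator case. For a real dictionary the output gets huge, which is exactly where a C compiler starts to struggle.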

I've generated C from Python several times (e.g. to make last year's April Fools' joke, which is on this blog). The problem is that C is often not flexible enough (GCC would choke on a 100MB file for the dictionary; TCO, call/cc and other fancy stuff are hard...). Not C, not assembly...

There is LLVM - a perfect target.

Moving on: a scripting language that would generate all this mess.

It should be a very, very meta one. Like Lisp, but with syntax. A pluggable one.

My idea is to write a language with a pluggable lexer (one lexer would switch to another when it encounters some sequence of characters - think of reader macros on steroids), then a huge layer of macros (tons of macros, like Nemerle). Add lispy semantics on top of it (a few orthogonal concepts that combine nicely) and my evil plan is 10% complete.
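
The lexer-switching part can be sketched in a few lines. This is only a toy under my own assumptions, not a design for the actual language: the `Lexer` class, the `tokenize` function and the `#re{ ... }` trigger syntax are all invented for the example.

```python
import re


class Lexer:
    """A list of (regex, token kind) rules, plus trigger strings that
    hand control to another lexer and an optional end marker."""

    def __init__(self, name, rules, switches=None, end=None):
        self.name = name
        self.rules = [(re.compile(p), kind) for p, kind in rules]
        self.switches = switches or {}  # trigger text -> lexer to switch to
        self.end = end                  # text that returns to the previous lexer


def tokenize(text, lexers, start="main"):
    stack = [lexers[start]]
    pos, tokens = 0, []
    while pos < len(text):
        lexer = stack[-1]
        # 1. The active lexer ends here: go back to the previous one.
        if lexer.end and text.startswith(lexer.end, pos):
            stack.pop()
            pos += len(lexer.end)
            continue
        # 2. A trigger sequence switches to another lexer.
        for trigger, target in lexer.switches.items():
            if text.startswith(trigger, pos):
                stack.append(lexers[target])
                pos += len(trigger)
                break
        else:
            # 3. Otherwise apply the active lexer's ordinary rules.
            for pattern, kind in lexer.rules:
                m = pattern.match(text, pos)
                if m and m.end() > pos:
                    if kind is not None:  # kind None means "skip" (whitespace)
                        tokens.append((lexer.name, kind, m.group()))
                    pos = m.end()
                    break
            else:
                raise SyntaxError(f"{lexer.name}: unexpected {text[pos]!r} at {pos}")
    return tokens


LEXERS = {
    "main": Lexer(
        "main",
        rules=[(r"\s+", None), (r"[A-Za-z_]\w*", "ident"),
               (r"\d+", "number"), (r"[=+*()-]", "op")],
        switches={"#re{": "regex"},
    ),
    # Inside #re{ ... } everything up to the closing brace is one raw token.
    "regex": Lexer("regex", rules=[(r"[^}]+", "raw")], end="}"),
}

if __name__ == "__main__":
    for token in tokenize(r"pattern = #re{[a-z]+\d*} + 1", LEXERS):
        print(token)
```

The main lexer has no rule for `[`, so the regex literal only lexes because `#re{` hands the input over to a different lexer. A script could register new entries in `LEXERS` while it is being read, which is roughly the "pluggable" part.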

You would be able to create literal syntax for your new, shiny DSL inside your script, then semantics that would generate optimal machine code for it and boom!

It would enable some crazy stuff. Think of all the super-duper assertions a library creator would be able to enforce at compile time with it! You like that D lets you have optional purity control? You can add it to your script.

It wouldn't be a general-purpose tool (it would probably be too weird and demanding for everyday work), but I can see applications where you have to be extremely fast (OS kernels, games) or where you are doing very domain-specific stuff (think of all those code-generating tools like lex, bison, antlr).

More or less related things to do/blog about in the future:
- DBus + gvim - a possible step towards a headless IDE
- Actor-based OS

random stuff

Beep, beep! I'm still (more or less) alive.

University takes up lots of my time, which is my primary excuse for not blogging
for so long. Projects are like gas - they fill up all available space^H^H^H^H^H time.

Interesting (for a geeky enough definition of "interesting") stuff I've done recently:

  • Spellchecker. It's quite a smart one -- written in Python and C, it uses a trie (a ternary search tree, TST) for storing the dictionary. It's super-fast to load data (there are no pointers, so loading it is just one read of a binary file, and you can use it).

    Then you can use the TST to quickly (it's quicker than a hashmap) check if a word belongs to the vocabulary or retrieve (it's still blazing fast) a list of words that are within a given Levenshtein (edit) distance of the misspelled one.

    Then it uses the longest common subsequence algorithm to find the parts that don't match and compares those parts using knowledge about typical spelling errors in Polish. It can correct "grzegrzułka", "zomp" and "fzhut". In summary: it's cool.

  • I'm preparing a talk about Scala. It's a work in progress. You can see some slides here. I'll give this talk on 3rd December as part of BIWAK.

  • Oh, yes, BIWAK. We (BIT science club) have started a series of talks called BIWAK.

  • Oh, yes, science club. I've done some work on a platform game with cool physics, but there is nothing cool to show off yet.

  • I've published some of my .rc files

  • Hooray new swimming pool! Hooray hiking! Hooray birthdays and weddings. Hooray real life.
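
Back to the spellchecker for a moment: the "no pointers, one read" trick can be illustrated with a small sketch. This is not the actual Python and C code, just a toy under my own assumptions. The `FlatTST` class and its array layout are invented here, and it covers only exact lookup, not the edit-distance search.

```python
from array import array


class FlatTST:
    """Ternary search tree kept in flat integer arrays. Children are
    array indices (-1 means "no child"), so there are no pointers."""

    def __init__(self):
        self.ch = array("i")   # code point stored in each node
        self.lo = array("i")   # child for smaller characters
        self.eq = array("i")   # child for the next character of the word
        self.hi = array("i")   # child for larger characters
        self.end = array("i")  # 1 if a word ends at this node

    def _new(self, c):
        for arr, value in ((self.ch, c), (self.lo, -1), (self.eq, -1),
                           (self.hi, -1), (self.end, 0)):
            arr.append(value)
        return len(self.ch) - 1

    def add(self, word):
        if not word:
            return
        codes = [ord(c) for c in word]
        if not self.ch:
            self._new(codes[0])
        node, i = 0, 0
        while True:
            c = codes[i]
            if c < self.ch[node]:
                if self.lo[node] == -1:
                    self.lo[node] = self._new(c)
                node = self.lo[node]
            elif c > self.ch[node]:
                if self.hi[node] == -1:
                    self.hi[node] = self._new(c)
                node = self.hi[node]
            else:
                i += 1
                if i == len(codes):
                    self.end[node] = 1
                    return
                if self.eq[node] == -1:
                    self.eq[node] = self._new(codes[i])
                node = self.eq[node]

    def __contains__(self, word):
        if not word or not self.ch:
            return False
        node, i = 0, 0
        while node != -1:
            c = ord(word[i])
            if c < self.ch[node]:
                node = self.lo[node]
            elif c > self.ch[node]:
                node = self.hi[node]
            else:
                i += 1
                if i == len(word):
                    return self.end[node] == 1
                node = self.eq[node]
        return False

    def dump(self):
        """Serialize as five equally sized raw arrays, back to back."""
        return b"".join(a.tobytes() for a in
                        (self.ch, self.lo, self.eq, self.hi, self.end))

    @classmethod
    def load(cls, blob):
        """Rebuild the tree from raw bytes; nothing needs to be relinked."""
        tree = cls()
        arrays = (tree.ch, tree.lo, tree.eq, tree.hi, tree.end)
        size = len(blob) // len(arrays)
        for k, arr in enumerate(arrays):
            arr.frombytes(blob[k * size:(k + 1) * size])
        return tree


if __name__ == "__main__":
    tree = FlatTST()
    for w in ["gżegżółka", "ząb", "wschód"]:  # sample words only
        tree.add(w)
    copy = FlatTST.load(tree.dump())
    print("ząb" in copy, "zomp" in copy)
```

Because the links are indices instead of addresses, the bytes on disk are already the data structure. In C the same layout could be read (or memory-mapped) straight into an array of structs.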