Emerald Editor Discussion
Author Topic: Esoteric syntax formats?  (Read 8466 times)
Arantor
Site Administrator
Administrator
Master Jeweller
*****
Posts: 618



« on: January 09, 2007, 09:11:27 pm »

I've been thinking about syntax formats today, and I had a random thought: how unusual are the formats that we could be supporting?

I'm not talking about supporting binary formats instead of text-based ones, that's not the point of EE. But how unusual are the formats we would be supporting? For example, this really strange part of me wants to support - of all things - AMOS Basic syntax (ASCII mode), from the Amiga.

I can see the uses for the obvious (HTML, PHP, SQL, Ruby, Python, Perl, C/C++ et al.), but how unusual are the formats we would support?

Because the model is intended to be totally extensible, we could provide hundreds or even thousands of downloadable syntaxes, so any format no matter how bizarre or unusual, could be supported.

What's the most bizarre format you'd like to see it support?
Logged

"Cleverly disguised as a responsible adult!"
John Yeung
Senior Miner
***
Posts: 85


« Reply #1 on: January 10, 2007, 06:58:17 am »

Quote from: Arantor
I've been thinking about syntax formats today, and I had a random thought: how unusual are the formats that we could be supporting?  For example, this really strange part of me wants to support - of all things - AMOS Basic syntax (ASCII mode), from the Amiga.  I can see the uses for the obvious (HTML, PHP, SQL, Ruby, Python, Perl, C/C++ et al.), but how unusual are the formats we would support?

I can't imagine AMOS Basic is any more unusual than Ruby or Perl, if it's a dialect of Basic.  (I don't know AMOS myself but I do have an Amiga, and I know someone who wrote games in it.)  By "unusual" I mean structurally different than familiar syntaxes.  For example, if you made a language whose blocks began with "%%^%%" and ended with "&&^&&" and all statements had to end with "__xXx__", it would sure look crazy and unusual, but structurally it could be identical to C.  To me, this language would qualify as not unusual at all.  This is exactly the kind of thing I mean by "settings-based" syntax specification.

Quote from: Arantor
Because the model is intended to be totally extensible, we could provide hundreds or even thousands of downloadable syntaxes, so any format no matter how bizarre or unusual, could be supported.

What's the most bizarre format you'd like to see it support?

I would like to see Emerald support any context-free language; that is, any that can be specified using BNF notation.  I realize most users do not know how to use BNF or its variants/successors, but for them we could provide a less daunting conventional syntax facility based on tweaking particular settings of a predefined language like C++, Perl, or Python.  (For example, start with C++, then replace "block-begin={" with "block-begin=%%^%%", etc.)  These could be translated almost trivially to Emerald's flavor of BNF.
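
To make the "tweak the settings" idea concrete, here is a minimal Python sketch. It is not Emerald's actual format: the setting names (block_begin, block_end, statement_end) and the make_tokenizer helper are made up for illustration. It only shows that the C-style delimiters and the "%%^%%" ones produce the same token structure.

```python
import re

def make_tokenizer(settings):
    """Build a tokenizer from delimiter settings (hypothetical setting names)."""
    # One named group per setting; re.escape keeps odd delimiters literal.
    pattern = re.compile(
        "|".join(
            f"(?P<{name}>{re.escape(text)})" for name, text in settings.items()
        )
    )

    def tokenize(source):
        tokens = []
        pos = 0
        for match in pattern.finditer(source):
            # Anything between two delimiters is plain text.
            text = source[pos:match.start()].strip()
            if text:
                tokens.append(("text", text))
            tokens.append((match.lastgroup, match.group()))
            pos = match.end()
        tail = source[pos:].strip()
        if tail:
            tokens.append(("text", tail))
        return tokens

    return tokenize

def shape(tokens):
    """Keep only the token kinds, dropping the literal text."""
    return [kind for kind, _ in tokens]

# Start from C-like settings, then swap in the unusual delimiters.
c_like = {"block_begin": "{", "block_end": "}", "statement_end": ";"}
odd = dict(c_like, block_begin="%%^%%", block_end="&&^&&", statement_end="__xXx__")

c_tokens = make_tokenizer(c_like)("main() { x = 1; }")
odd_tokens = make_tokenizer(odd)("main() %%^%% x = 1__xXx__ &&^&&")
```

Comparing shape(c_tokens) with shape(odd_tokens) gives the same list of token kinds for both, which is the sense in which the odd-looking language is structurally not unusual.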

That said, I would be plenty happy enough to see Emerald use a purely conventional syntax scheme like Crimson's or SciTE's or Notepad++'s (all of these are roughly equivalent, in that they all have fixed notions of structure, and the user can only twiddle with settings) as long as it has "enough" settings and is generous with lists.  By "list generosity" I mean things like allowing arbitrary numbers of keyword groups--not just 2 or 4 or 10, but however many the user would like to define and has the visual acuity to detect the colors for.  Don't allow just 2 or 3 different kinds of comments, but however many kinds of comments, preferably all understood by Emerald to be comments (and thus equivalent to whitespace in certain contexts).
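
A rough sketch of what "list generosity" could look like, assuming a made-up build_classifier helper and made-up group names (none of this comes from Crimson, SciTE, Notepad++ or Emerald). The point is only that keyword groups and comment markers are open-ended collections rather than a fixed number of slots. String literals are ignored to keep it short.

```python
import re

def build_classifier(keyword_groups, line_comment_markers):
    """keyword_groups: {group name: words}; any number of groups is allowed.
    line_comment_markers: any number of markers that start a line comment."""
    lookup = {}
    for group, words in keyword_groups.items():
        for word in words:
            lookup[word] = group
    markers = tuple(line_comment_markers)

    def classify_line(line):
        # The earliest comment marker on the line wins.
        cut = len(line)
        for marker in markers:
            index = line.find(marker)
            if index != -1 and index < cut:
                cut = index
        code, comment = line[:cut], line[cut:]
        result = [(word, lookup.get(word, "plain")) for word in re.findall(r"\w+", code)]
        if comment:
            # Every comment style ends up with the same "comment" class.
            result.append((comment, "comment"))
        return result

    return classify_line

classify = build_classifier(
    {"control": ["if", "while"], "types": ["int"], "storage": ["static"]},
    ["//", "#", "--"],
)
example = classify("static int x // note")
```

Adding a fourth keyword group or a fourth comment marker is just one more entry in the dict or the list; nothing else changes.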

I know a big feature for Web developers (and I am not one) is nested syntax types.  And with the proliferation of scripting languages, it would be best to allow an arbitrary number of them to be active in a single document.  (I don't think it's a priority to have arbitrary levels of nesting, though.)

I'm sorry if you were trying to get us to name specific languages or formats, but in my mind the whole point of giving the user as much flexibility as possible is that we cannot know what the user is going to need.  A lot of people out there use Crimson or other colorizing editors as log-reading tools.  Some people are literally creating their own programming languages.  I would love for Emerald to be able to serve all of them.

John


P.S.  OK, I came up with a specific example of an existing language which is not only uncommon, but structurally unusual:  RPG IV.  It's partially column-based, with provisions for somewhat free-form code.  Full support would probably involve not just parsing of the syntax but also a quite complicated tab-stop scheme.  This has to be the ugliest Frankenstein of a programming language I have ever seen or heard of.  (For the record, I personally do not think it is at all important for Emerald to fully support this language.)
« Last Edit: January 10, 2007, 07:02:54 am by John Yeung » Logged
Arantor
Site Administrator
Administrator
Master Jeweller
*****
Posts: 618



« Reply #2 on: January 11, 2007, 11:48:20 pm »

It just occurred to me: of all the human- and machine-readable formats, there is one which we wouldn't really be able to write a lexer for - Inform 7. Darn it.

But if you want esoteric formats, that's a nice one. :)
Logged

"Cleverly disguised as a responsible adult!"
Szandor
Senior Miner
***
Posts: 92



« Reply #3 on: January 12, 2007, 12:10:39 am »

Quote from: Arantor
It just occurred to me: of all the human- and machine-readable formats, there is one which we wouldn't really be able to write a lexer for - Inform 7. Darn it.

But if you want esoteric formats, that's a nice one. :)

Yes, we would. It would be one hell of an SDD to write - several hundred definitions and almost a full vocabulary of keywords - but it is in no way impossible. I admire the people who put Inform together...

Actually, it would be cool to have an SDD for English. That too would be kinda nightmarish to write, though.
Logged

"Cleverly disguised as an original signature..."
John Yeung
Senior Miner
***
Posts: 85


« Reply #4 on: January 12, 2007, 02:04:03 am »

Quote from: Arantor
It just occurred to me: of all the human- and machine-readable formats, there is one which we wouldn't really be able to write a lexer for - Inform 7. Darn it.

Quote from: Szandor
Actually, it would be cool to have an SDD for English. That too would be kinda nightmarish to write, though.

I don't think Inform 7 would be impossible--after all, if it is already machine-readable, then we should be able to replicate it--but I do think it would be harder than most, and the benefit would be less than most.  It looks like the whole point of its design is so that you can just read it like English; colorizing seems on the superfluous side.  It might even be more distracting than helpful.

Actual English, on the other hand, does not lend itself to machine lexing.  It's a natural language, and one of the more chaotic ones at that, so it would require something far more powerful than any grammar in formal language theory.  Not to mention the fact that it is inherently ambiguous, and a single sentence can have multiple valid parsings.  It is beyond nightmarish and knocking on the door of impossibility, if not breaking it down.

John
Logged
Szandor
Senior Miner
***
Posts: 92



« Reply #5 on: January 12, 2007, 07:57:25 am »

Quote from: John Yeung
Actual English, on the other hand, does not lend itself to machine lexing.  It's a natural language, and one of the more chaotic ones at that, so it would require something far more powerful than any grammar in formal language theory.  Not to mention the fact that it is inherently ambiguous, and a single sentence can have multiple valid parsings.  It is beyond nightmarish and knocking on the door of impossibility, if not breaking it down.

It's true, although the plugin system should - in theory - make it possible to implement context sensitivity through speech recognition based on intonation. In theory. And with a bunch of crazy programmers without a life. Would be cool though.

How about Esperanto? Klingon or Sindarin should work too. Since they are constructed languages, they aren't as chaotic.
Logged

"Cleverly disguised as an original signature..."
Szandor
Senior Miner
***
Posts: 92



« Reply #6 on: February 02, 2007, 12:18:34 pm »

Hey! The beginnings of some SDD definitions for RPG IV.
Code:
: definitions
file_description | sequence type file_name file_type
sequence | *(/cp = 1-5)
type | "F"(/cp = 6)
file_name | *(/cp = 7-14)
file_type | [ I : O : U ](/cp = 15)
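
For comparison, the same column rules can be sketched in plain Python. This mirrors the column ranges in the definitions above exactly as posted (1-5, 6, 7-14, 15); they have not been checked against the RPG IV reference, and parse_file_description is a made-up name, not part of any SDD tooling.

```python
def parse_file_description(line):
    """Split a fixed-column line using the ranges from the definitions above.

    Columns are 1-based in the post, so column N is index N - 1 here.
    Returns None when the line is not a file description.
    """
    padded = line.ljust(15)  # short lines are padded so slicing is safe
    if padded[5] != "F":  # column 6 must hold the literal "F"
        return None
    file_type = padded[14]  # column 15
    if file_type not in ("I", "O", "U"):
        return None
    return {
        "sequence": padded[0:5],  # columns 1-5
        "type": "F",
        "file_name": padded[6:14].rstrip(),  # columns 7-14
        "file_type": file_type,
    }

# Made-up sample line laid out to fit the columns above.
parsed = parse_file_description("00010FCUSTMASTI")
```

Here parsed holds the sequence "00010", the file name "CUSTMAST" and the file type "I"; a line with anything other than "F" in column 6 comes back as None.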

And an extremely basic SDD for Inform 7!
Code:
: definitions
statement | object " is " .relationship object "."
object | *

: key relationship
on under in beside over
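
As a rough cross-check, that statement rule can be approximated with a Python regular expression. This is only a sketch: it assumes a single space between the relationship and the second object, treats "*" as "any text", and covers nothing of Inform 7 beyond this one sentence shape. The names STATEMENT and parse_statement are made up.

```python
import re

# The keyword list from ": key relationship" above.
RELATIONSHIPS = ["on", "under", "in", "beside", "over"]

# statement | object " is " .relationship object "."
STATEMENT = re.compile(
    r"^(?P<subject>.+?) is (?P<relationship>"
    + "|".join(RELATIONSHIPS)
    + r") (?P<object>.+)\.$"
)

def parse_statement(text):
    """Return the parts of a matching statement, or None if it does not match."""
    match = STATEMENT.match(text.strip())
    return match.groupdict() if match else None

lamp = parse_statement("The lamp is on the table.")
```

For the sample sentence, lamp contains the subject "The lamp", the relationship "on" and the object "the table"; a sentence such as "The lamp is bright." returns None because "bright" is not in the relationship list.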

Not impossible at all.
Logged

"Cleverly disguised as an original signature..."
corelon
Miner
**
Posts: 10


« Reply #7 on: February 02, 2007, 01:56:31 pm »

Hello,

I think that the syntax format used by Intype and e Text Editor/TextMate seems capable of parsing pretty much any language. I really like its scoping. Maybe we could check it out.

Cheers,

Nick
Logged
Szandor
Senior Miner
***
Posts: 92



« Reply #8 on: February 02, 2007, 03:51:52 pm »

Quote from: corelon
I think that the syntax format used by Intype and e Text Editor/TextMate seems capable of parsing pretty much any language. I really like its scoping. Maybe we could check it out.

It might be powerful, but it's pretty darn cumbersome XML. It's hard to read and takes a lot of text to say anything. I don't believe this is the way to go. Simplicity and power should be combined.
Logged

"Cleverly disguised as an original signature..."