Emerald Editor Discussion
Question: Exactly which option should we take for EE's syntax format?
Leave it the way it is (use CE format) - 1 (5%)
Use CE's as a base but extend it. - 2 (10%)
Create a new custom format. - 4 (20%)
Use an XML structure. - 13 (65%)
Use another public language... - 0 (0%)
Total Voters: 19

Author Topic: Which syntax format should we use?  (Read 9425 times)
Site Administrator
Master Jeweller
Posts: 618

« on: August 23, 2006, 06:14:44 pm »

There has been a lot of debate about exactly which format we should use for EE's syntax parsing system. We have several options:

1. Leave it the way it is (use CE format)
This would mean we immediately have access to the hundreds (if not thousands, by now) of syntaxes available for Crimson. However, the extra features we've talked about don't get added.

2. Use CE's as a base but extend it.
This would mean taking the base we already have and bolting on extras. Doing so would add functionality but make parsing a little more awkward, since the existing language has known limitations that any new implementation would have to work around. On the plus side, we should still be able to harness the existing syntax base.

3. Create a new custom format.
There is nothing to stop us creating a new custom format, similar to CE's original, but with support for the goodies we wanted. On the plus side, we can make it minimal and handle only what needs to be handled, so it should be faster and more compact when implemented. On the downside, we can't make use of existing tools (e.g. dedicated XML editors for XML), and we'd have to write our own parser. XML, by contrast, already has parsers available, and one such library is already included with wxWidgets, which we use in EE.

4. Use an XML structure.
Having an XML structure would make it extensible: new features would only mean additions to the XML backend, not rewrites to handle them. Plus we get to use dedicated editors, like xmlSpy. On the downside, it is far less lean, both in space and in processing speed. This is the approach I originally took, but I am considering revisiting it.

5. Use another public language...
There may be another language out there that suits what we're trying to do. If anyone has any ideas... please post them here as replies.
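
To make the extensibility argument for option 4 concrete, here is a minimal sketch in Python. The element and attribute names (`syntax`, `linecomment`, `keywords`, `kw`, `folding`) are invented for illustration and are not an agreed EE format. The point is only that a loader can skip elements it does not know, so a file written for a newer feature set still loads.

```python
import xml.etree.ElementTree as ET

# Hypothetical syntax definition; every element and attribute name is made up.
SYNTAX_XML = """
<syntax name="demo" casesensitive="yes">
  <linecomment start="#"/>
  <keywords group="control">
    <kw>if</kw>
    <kw>else</kw>
    <kw>while</kw>
  </keywords>
  <folding open="{" close="}"/>
</syntax>
"""

def load_syntax(xml_text):
    """Read the elements this loader understands and ignore the rest."""
    root = ET.fromstring(xml_text)
    syntax = {
        "name": root.get("name"),
        "case_sensitive": root.get("casesensitive", "yes") == "yes",
        "line_comment": None,
        "keywords": {},
    }
    for child in root:
        if child.tag == "linecomment":
            syntax["line_comment"] = child.get("start")
        elif child.tag == "keywords":
            group = child.get("group", "default")
            syntax["keywords"][group] = [kw.text for kw in child.findall("kw")]
        # Unknown elements (here <folding>) are skipped, not treated as errors.
    return syntax

demo_syntax = load_syntax(SYNTAX_XML)
```

Supporting `<folding>` later would mean adding one more branch to the loop, while existing files keep working unchanged.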

"Cleverly disguised as a responsible adult!"
Senior Miner
Posts: 95

Maintainer of Obscure and Unused Ports

« Reply #1 on: August 23, 2006, 07:17:02 pm »

I'm torn between #3 and #4, but ultimately I think the benefits of using something standard (such as XML) outweigh the negatives.  Primarily, I believe it is important that we do not tie ourselves to CE's format, either outright or as a basis for a new format; I think we'd be best served by learning from its mistakes and creating some kind of converter for whatever new format we settle on.

-->>  This Space 4 Rent  <<--
Derek Parnell
Lead Architect
Posts: 36

« Reply #2 on: August 24, 2006, 07:30:15 am »

Definitely not #1 or #2. We can always write a conversion tool to translate the CE syntax files to the new format.

#3 is enticing because of the potential to make it more accessible and faster (compared to XML). But on the other hand, syntax files are not read very much by the editor.

#4 is a standard format that doesn't need a lot of justifying. If it turns out that XML is too slow to process, we could write a 'compiler' to convert the XML form into a faster format for the editor to read. And as for making it easy to maintain, we can write a GUI tool to do that if required. This could be a good plug-in actually. And if all else fails, it can be edited in EE in raw form. XML does lend itself to extending the syntax and capabilities of the EE syntax files without having to update a whole lot of stuff.

#5 is just the same as #3 really, except that we would be dependent on a third party for support, updates, and enhancements. Not really a good idea.

So .... my vote would be for XML, even though #3 would be a whole lot more fun, but pragmatism is required here.

Derek Parnell
"Down with Mediocrity!"
Posts: 17

« Reply #3 on: August 24, 2006, 10:36:55 am »

I think that either 3 or 4 would suit our purposes best. I voted for 4 finally.
Gem Cutter
Posts: 107

« Reply #4 on: September 02, 2006, 03:58:45 pm »

I'd say use an XML format. To give a speed and performance boost, use object serialization to store the parsed XML objects in binary form, which is then read the rest of the time instead of the XML. The binaries only count as out-of-date when EE detects that the XML file has been touched.
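
The caching idea in Reply #4 can be sketched roughly as follows. This is a minimal Python illustration, not EE code: the `.cache` file suffix, the function names, and the tiny `<syntax>`/`<kw>` layout are all assumptions, and `pickle` stands in for whatever serialization EE would actually use.

```python
import os
import pickle
import xml.etree.ElementTree as ET

def parse_syntax_xml(xml_path):
    """Slow path: parse the XML into a plain dict (hypothetical layout)."""
    root = ET.parse(xml_path).getroot()
    return {
        "name": root.get("name"),
        "keywords": [kw.text for kw in root.iter("kw")],
    }

def load_syntax_cached(xml_path):
    """Return (syntax, from_cache); rebuild the cache if the XML is newer."""
    cache_path = xml_path + ".cache"  # assumed naming convention
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(xml_path)):
        with open(cache_path, "rb") as f:
            return pickle.load(f), True
    syntax = parse_syntax_xml(xml_path)
    with open(cache_path, "wb") as f:
        pickle.dump(syntax, f)
    return syntax, False
```

Comparing modification times is the simplest staleness check, but it can miss an edit made within the file system's timestamp resolution; storing a hash of the XML in the cache would be a stricter alternative.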
Tom B
Posts: 3

« Reply #5 on: September 12, 2006, 11:52:30 pm »

Custom format. See my post here: http://forum.emeraldeditor.com/index.php?topic=62.msg1052#msg1052

I think we should try to use as little disk space as possible. Using XML will require more disk space than a custom format. EE would be useful to run from a floppy disk or USB stick and so not use unnecessary space.
Posts: 1

« Reply #6 on: September 13, 2006, 08:40:57 am »

Quote: "I think we should try to use as little disk space as possible. Using XML will require more disk space than a custom format. EE would be useful to run from a floppy disk or USB stick and so not use unnecessary space."
Not much of a reason. Floppies aren't terribly useful anymore. USB sticks and/or CDs are far cheaper and hold much more. Of course you could always store the XML syntax files compressed. ;)
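
As a rough illustration of the compression suggestion, the Python sketch below writes an XML syntax file through gzip and parses it straight back from the compressed stream. The function names and the XML layout are hypothetical; gzip is simply one readily available choice of compressor.

```python
import gzip
import xml.etree.ElementTree as ET

def save_compressed(xml_text, path):
    """Write the XML text to a gzip-compressed file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        f.write(xml_text)

def load_compressed(path):
    """Parse XML directly from the gzip stream, with no temporary file."""
    with gzip.open(path, "rb") as f:
        return ET.parse(f).getroot()
```

Keyword lists are highly repetitive markup, so they tend to compress well, at the cost of the file no longer being directly editable in a text editor.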
Posts: 1

« Reply #7 on: October 06, 2006, 07:53:05 pm »

How about BNF, or a slightly modified version.  In the past, Paul Mann distributed LOTS of syntax files with his parser generator "LALR"C (that's supposed to be a copyright symbol in case I messed it up).  I believe these syntax files would (maybe) be available, without having to be re-written.

The advantage of using BNF is that you can get FULL syntax highlighting (EXACTLY as the language itself will do it), error correction (assuming the user wants that, he can toggle a switch or some such thing), even the ability to recognize and properly parse user defined types -- leading to the ability to emulate "IntelliSense" currently available within the Visual Studio (and maybe other) editor(s).  The "view" utility that Paul shipped with his product a decade ago did exactly that: it displayed a drop-down list of the next legal token set.

The way Paul's system worked was that it had a stand-alone language-independent parser, and a set of parser tables that defined the language itself.  This approach allows the system to have any number of syntax recognition engines available by simply repointing the table pointers to the appropriate set of tables.  Language rules are handled by tables of function pointers.  For an editor, those functions would be much simpler than for an actual compiler, merely turning on and off certain font attributes, running macros, and from time-to-time logging special symbols (user-defined types) into a table.  Of course, all this assumes that Emerald will have a built-in "parser generator".  That is, some way of converting the syntax input files into sets of parser arrays within memory. 

A) Does XML give you that power?  Perhaps it does; I know very little about XML.

B) Where does the parser table generator come from?  Regardless of the input format, you still have to convert the input into transition sets, accept sets, look-ahead sets, etc.  Isn't that right???  Or are you taking an entirely different approach...
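
The split described in Reply #7, a language-independent engine driven by per-language tables plus a table of functions for the actions, can be sketched in Python as below. This is a deliberately simplified character-level state machine, not an LALR parser and not Paul Mann's system; every table, state name, and markup string here is invented for illustration.

```python
# Per-language table 1: character -> character class.
DEMO_CLASSES = {'"': "quote", "#": "comment_start", "\n": "newline"}

# Per-language table 2: (state, class) -> (next state, style for this char).
# Missing entries mean "stay in the current state, styled as that state".
DEMO_TRANSITIONS = {
    ("code", "quote"): ("string", "string"),
    ("code", "comment_start"): ("comment", "comment"),
    ("string", "quote"): ("code", "string"),
    ("comment", "newline"): ("code", "comment"),
}

# Table of action functions: style -> function applied to each run of text.
# An editor would switch font attributes here; this sketch wraps the text.
DEMO_ACTIONS = {
    "code": lambda s: s,
    "string": lambda s: "<green>" + s + "</green>",
    "comment": lambda s: "<grey>" + s + "</grey>",
}

def highlight(text, classes, transitions, actions, start="code"):
    """Generic engine: all language knowledge lives in the three tables."""
    state = start
    runs = []  # list of [style, text] pairs
    for ch in text:
        ch_class = classes.get(ch, "other")
        next_state, style = transitions.get((state, ch_class), (state, state))
        if runs and runs[-1][0] == style:
            runs[-1][1] += ch
        else:
            runs.append([style, ch])
        state = next_state
    return [actions[style](chunk) for style, chunk in runs]
```

Switching language means passing a different set of tables to the same `highlight` function. Where those tables come from (hand-written, generated from BNF, or loaded from XML) is exactly the open question B) raises.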