And Now For Something Completely Different

(Not really, just a different perspective.)

Hello! It's been a hot minute. Over the past two weeks I've tried several approaches to getting bytecode that can be executed, most of them with fairly limited success. But! I think I've finally made some progress in a direction that I'm going to be able to use. If you want to read the highlights of the in-between steps, feel free to keep reading full steam ahead. If you'd like to skip to the current approach and what I've managed to get working, jump down to Modifying eval.

The Simple[sic] Way

I started off by working on isolating the chunk of code that the perl6 executable uses to set up the system to run. One thing that was pretty useful in sorting this out was a recommendation by timotimo to set the environment variable MVM_COVERAGE_LOG=~/mvm.log. The output written to that log includes the line number and file name of every line that gets encountered when running a program using perl6 or nqp. For example, when I run perl6 -e "say 'Hello world!';", the output looks something like this:

The file has a total of 5293 lines. Of these 5293 lines, the first 5218 occur before the -e is handled. In those 5218 lines, 202 different files are referenced. Setting everything up is far from a simple handful of lines. I spun on this for a while, reading through everything that would be needed and trying to separate what was needed from what wasn't. But the sheer mass of material that would need to be gathered to set up the system made me realize that this first plan would likely both result in an unwieldy and unnecessarily large ELF, and be much more of a headache than it was worth. So I scrapped this plan and turned to some of the suggestions made in response to my previous post.
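Tallies like the ones above can be reproduced with a few lines of scripting. The following is a minimal Python sketch, not the method actually used for this post. It assumes each log entry has the shape "HIT <file> <line>" separated by whitespace, and that code passed with -e is reported under the file name "-e". Both are assumptions about the log format, and the function name is made up for illustration.

```python
from collections import Counter


def summarize_coverage(lines, marker_file="-e"):
    """Count log entries and the distinct files seen before marker_file first appears."""
    total = 0
    before_marker = None  # number of entries seen before the first marker entry
    files_before = Counter()
    for raw in lines:
        parts = raw.split()
        # Assumed entry shape: "HIT <file> <line>"; anything else is skipped.
        # File names containing spaces would not survive this simple split.
        if len(parts) != 3 or parts[0] != "HIT":
            continue
        filename = parts[1]
        if before_marker is None and filename == marker_file:
            before_marker = total
        total += 1
        if before_marker is None:
            files_before[filename] += 1
    return {
        "total": total,
        "before_marker": total if before_marker is None else before_marker,
        "distinct_files_before": len(files_before),
    }
```

Calling it as summarize_coverage(open(path)) on the log file gives the three counts in one pass. If the real log uses a different layout, only the split-and-check lines need to change.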

The Rewrite

Between the sheer massiveness of the startup code I would need to grab and some points brought up by patrickb, it was time to switch gears again. Because it's necessary during the set-up process to locate the perl6 libraries, and that information is encoded in the perl6 executable and can't be found (or at least, is very difficult to get a hold of) in other places, I needed to develop an approach that enabled me to use the perl6 executable instead of the moar executable. Additionally, approaching it from the other end, by first implementing something that can run a pre-compiled Perl 6 script (as suggested by nine) rather than starting with an executable and figuring out something that can run it, should save me a lot of headaches from the bizarre errors I was running into before switching gears. It also has the benefit of making this feature potentially usable by implementations of the language that don't use MoarVM as their VM.

This approach started me looking at how to modify rakudo/src/main.nqp to make something that would be able to run pre-compiled bytecode. The next step in doing that was tracking down where command_line is defined. While figuring out where that was located, I also stumbled across a few functions (eval and command_eval) that I think should prove pretty useful. command_eval is where it's determined whether the program to be run is in a file or being typed directly into the command line. eval is where the way the program is being run is customized based on the user-provided flags. command_eval is also defined in Perl6/Compiler.nqp, but calls the parent version specified in HLL/Compiler.nqp.

So, the steps from this point on were as follows:
  1. I needed to add a command line option to set when I wanted to use my bytecode-processing version of eval.
  2. I needed to modify eval to do something when that option was set.
  3. I needed to modify eval to be able to run the bytecode without attempting to send it to the compiler to be compiled down to bytecode.
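The shape of those three steps can be modelled in a few lines. This is a toy Python sketch, not NQP and not the actual change to eval: the function names, the "perl6-sketch" program name, and the stage labels are all hypothetical, and it only records which stages would run rather than compiling or executing anything.

```python
import argparse


def parse_options(argv):
    """Step 1: accept --bytecode / -b alongside the program argument."""
    parser = argparse.ArgumentParser(prog="perl6-sketch")  # hypothetical name
    parser.add_argument("-b", "--bytecode", action="store_true")
    parser.add_argument("program")
    return parser.parse_args(argv)


def eval_program(payload, bytecode=False):
    """Steps 2 and 3: report which stages would run for this payload."""
    stages = []
    if not bytecode:
        # Source code has to go through the compiler first.
        stages.append("compile")
    # Bytecode, whether freshly compiled or handed in, is then run.
    stages.append("run")
    return stages


opts = parse_options(["--bytecode", "foo.moarvm"])
print(eval_program(opts.program, bytecode=opts.bytecode))  # ['run']
```

The point of the sketch is only that the option short-circuits the compile stage. Where that branch should live in the real code is a separate question.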


It was at about this point that brrt suggested seeing whether my ELF file generator could work for MBC generated from an NQP source file, since NQP doesn't require as much set up as Perl 6. The conclusion of this excursion was that running nqp --target=mbc --output=foo.moarvm foo.nqp to generate the bytecode, and then running it by calling moar foo.moarvm, didn't work, because moar complains that it can't find the ModuleLoader.moarvm file. However, if you run that same file with moar --libpath=<the appropriate libpath> foo.moarvm, it executes the program successfully. This indicates that the ELF file approach can work, as long as we can manage to have the libpaths set up properly. This was pretty good news, as it confirms that embedding bytecode in the ELF file is something that can actually work in the future.
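For anyone scripting this experiment, the two invocations can be assembled as argument lists. This is a small Python sketch using only the flags quoted above. The helper names are made up, and the libpath in the usage lines is a placeholder that has to be replaced with the appropriate directory for your install.

```python
def nqp_compile_command(source, output):
    """Build the nqp invocation that writes MoarVM bytecode to `output`."""
    # --target comes before --output, matching the invocation quoted above.
    return ["nqp", "--target=mbc", f"--output={output}", source]


def moar_run_command(bytecode, libpath=None):
    """Build the moar invocation, adding --libpath only when one is given."""
    cmd = ["moar"]
    if libpath is not None:
        cmd.append(f"--libpath={libpath}")
    cmd.append(bytecode)
    return cmd


print(" ".join(nqp_compile_command("foo.nqp", "foo.moarvm")))
# "/path/to/libpath" is a placeholder, not a real location.
print(" ".join(moar_run_command("foo.moarvm", libpath="/path/to/libpath")))
```

Either list can be handed to subprocess.run once nqp and moar are on the PATH.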

An interesting note on something that might be worth fixing (I may look and see if it's a quick thing at some point this week, but if someone else wants to take a swing at it, go for it): nqp --target=mbc --output=foo.moarvm foo.nqp works, but nqp --output=foo.moarvm --target=mbc foo.nqp does not. Not a big thing, just interesting that switching the order of the parameters would cause a problem.

I also decided that it was worth pulling the most recent versions of NQP, Rakudo, and MoarVM at this point, and ran into an issue indicating that the version of MoarVM I was using was too old for the version of NQP. It turns out that the problem was that I had used '~' in the path to the directory I was using as my prefix, which did not play nicely with the /usr/bin/perl -MExtUtils::Command -e mkpath command used as part of make install. I've updated the setup instructions in my earlier blog post to hopefully help others avoid running into the same issue in the future.
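One plausible reading of that failure, offered as an assumption rather than a diagnosis, is that the tilde reached mkpath unexpanded and was treated as a literal directory name. A tilde is only turned into the home directory when a shell or the program itself expands it. A minimal Python sketch of resolving a prefix up front, with a hypothetical helper name:

```python
import os


def resolve_prefix(prefix):
    """Expand a leading '~' and return an absolute path for use as an install prefix."""
    return os.path.abspath(os.path.expanduser(prefix))


# Prints the prefix with the home directory substituted for '~'.
print(resolve_prefix("~/rakudo"))
```

Passing the resolved absolute path (or writing $HOME out in full) sidesteps the question of which tool, if any, performs the expansion.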

Modifying eval

Here's where we get to the exciting bit: what I've managed to get working and will be keeping. I added the command line options --bytecode and -b, and modified eval to actually do something when that option is set!

To add the command line options, I found a pull request where an option was being added, to see what parts got modified. This indicated that I needed to add the option (bytecode|b) to this line, and that I could modify eval to contain an if statement that would print "hello!" when the option was used. I made the appropriate changes, recompiled nqp, and was greeted with this error message when I ran perl6 --bytecode:

I fiddled around with this for a while, assuming I had missed some other place I needed to modify to be allowed to have the new option, since removing the option made everything work again. I realized after a bit that the issue was much simpler than that, which I would have realized if I had read the error message a little more closely. I just needed to rebuild Rakudo as well. Once both NQP and Rakudo had been rebuilt, it worked like a charm, resulting in this output:

Now that the --bytecode option exists and is capable of being useful, I've been working on making it actually serve its intended purpose. This has brought up a bit of a philosophical implementation question: should you need to use the --bytecode flag to tell perl6 that you want it to run bytecode, or should you be able to just hand it an MBC file and have perl6 know to treat it as bytecode? That decision doesn't affect the modification that needs to be made, just where the modification should live in the code, so I'm deferring it until I can ask someone more qualified to decide.

Right now, I'm tracking down exactly how to bypass all the bits that actually compile the .pl6 program that perl6 expects to be handed, as well as the bits that check that the precompiled code matches that program, and skip straight to invoking the bytecode. patrickb found the point at which the bytecode becomes executable from the Perl 6 side of things, and brrt found the point at which bytecode is loaded from the MoarVM side of things, both of which have been very helpful in figuring things out. My approach right now is to make my --bytecode option duplicate the procedure for executing files, which calls evalfiles before calling eval. From my version of evalfiles, I'm ripping out everything besides reading the file and calling eval, simplifying it down to the technique used to evaluate programs passed with -e. From there, I'm cutting anything in my version of eval that isn't simply running the bytecode. Hopefully, I'll have an update on how that's working in a day or two!

Other Related Highlights

lizmat and jnthn had an interesting conversation the other day on the usefulness of multi-threaded pre-compilation, or a server which generated bytecode, to speed up module loading. I found it interesting, and will be interested to see in what ways my work might eventually hook in and be impacted by those potential approaches.

